This repository was archived by the owner on Mar 2, 2022. It is now read-only.

Tags: WukLab/LegoOS

v0.1.1-excache

Better ExCache (pcache)

Recently we added two optimizations to pcache:
- Use free list instead of bitmap
- Add piggyback for non-victim cache eviction

The free list keeps performing well when associativity is high.
Although it may add some lock contention, this is the best we can
do in software.

We used to piggyback only on victim cache flush. We recently made
per-set eviction our default, and we now piggyback on it as well.

Both are pure optimizations. But they do make the code a little
more complex, especially the path that fills pcache from remote.
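
To make the bitmap versus free list trade-off concrete, here is a minimal user-space sketch. It is not the LegoOS code: `struct pset`, `NR_WAYS`, and all function names are hypothetical, the two allocators are shown side by side only for comparison, and the locking that a real free list needs is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_WAYS 64  /* assumed associativity, for illustration only */

/* Hypothetical per-set metadata; not the real pcache layout. */
struct pset {
    uint64_t bitmap;              /* scheme A: bit i set => way i in use */
    int      next_free[NR_WAYS];  /* scheme B: linked list of free ways */
    int      free_head;           /* scheme B: -1 when the set is full */
};

static void pset_init(struct pset *s)
{
    s->bitmap = 0;
    for (int i = 0; i < NR_WAYS; i++)
        s->next_free[i] = (i + 1 < NR_WAYS) ? i + 1 : -1;
    s->free_head = 0;
}

/* Bitmap allocation: scan for a clear bit, O(NR_WAYS) in the worst case. */
static int pset_alloc_bitmap(struct pset *s)
{
    for (int i = 0; i < NR_WAYS; i++) {
        if (!(s->bitmap & (UINT64_C(1) << i))) {
            s->bitmap |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1;  /* set is full */
}

/* Free-list allocation: pop the head, O(1) regardless of associativity. */
static int pset_alloc_freelist(struct pset *s)
{
    int way = s->free_head;

    if (way >= 0)
        s->free_head = s->next_free[way];
    return way;  /* -1 when the set is full */
}

/* Return a way to the free list; the caller must own it. */
static void pset_free_freelist(struct pset *s, int way)
{
    s->next_free[way] = s->free_head;
    s->free_head = way;
}
```

The scan cost of the bitmap grows with the number of ways, while the pop stays constant. In a concurrent setting the free list head would need a lock, which is where the extra contention mentioned above would come from.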

v0.1.0-ImageNet

Able to run ResNet with ImageNet

After fixing a memory-side memory leak, we are able to
train ImageNet using TensorFlow ResNet. We tried
"--batch_size=1024", which uses around 70GB of resident memory.

Who runs this on CPU??
A LegoOS that currently does not support a GPU monitor.
But just so you know, writing a GPU monitor is DOABLE.

v0.0.9-stable-net

Kind of a stable net layer

Oh well, we've fixed some bugs at the FIT layer. Basically we ended up posting
the recv wr to the QP twice, every single time. And the worst thing is, there
was no decent error checking for ib_post_recv(). My bad. Damn. I should have
gone through this.

I've been trying to keep decent error checking all along. But as the repo
grows and gains more contributors, it is sometimes hard to control.
After all, we are all human.
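
As an illustration of the kind of checking that would have caught this, here is a small user-space sketch. It does not call the real verbs API: `post_recv_fn` is a hypothetical stand-in for the posting call, and `struct recv_ring`, `post_recv_checked`, `recv_completed`, and the two stubs are made-up names. The idea is simply to track which receive slots are outstanding and to never ignore a return value.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_RECV_SLOTS 8  /* assumed receive queue depth */

/* Hypothetical stand-in for the posting call; returns 0 on success. */
typedef int (*post_recv_fn)(int slot);

struct recv_ring {
    bool posted[NR_RECV_SLOTS];  /* true while a WR for this slot is outstanding */
};

/*
 * Post one receive buffer. Returns 0 on success, -1 for an invalid or
 * already-posted slot, or the error code returned by post().
 */
static int post_recv_checked(struct recv_ring *r, int slot, post_recv_fn post)
{
    int ret;

    if (slot < 0 || slot >= NR_RECV_SLOTS)
        return -1;
    if (r->posted[slot]) {
        fprintf(stderr, "recv slot %d posted twice\n", slot);
        return -1;
    }
    ret = post(slot);
    if (ret) {
        fprintf(stderr, "post recv failed on slot %d: %d\n", slot, ret);
        return ret;
    }
    r->posted[slot] = true;
    return 0;
}

/* Called once the completion for this slot has been polled. */
static void recv_completed(struct recv_ring *r, int slot)
{
    r->posted[slot] = false;
}

/* Stubs for demonstration only. */
static int fake_post_ok(int slot)   { (void)slot; return 0; }
static int fake_post_fail(int slot) { (void)slot; return -22; }
```

With this wrapper, a second post of the same slot before its completion is rejected loudly instead of silently consuming queue entries.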

Anyway, the recent patches fixed this issue on both the lego and linux-fit sides.
Along with that, we've also added some decent rpc profiling code, which can profile
different message sizes and emulate highly contended multi-threaded rpc.

There is still A HUGE ROOM for improvement in our net code. But let us hope
this at least gives us a stable net layer. Lesson learned, last week.

v0.0.8-victim-double-free-sched-irq

Fixed victim double free bug in victim hit code path

Added back the pi_lock so that we have an irq-disabled spinlock

v0.0.7-ib-pci

Merge pull request #79: Infiniband, PCI, and DMA update

This pull request includes three major parts

PCI
Ported PCI core subsystem from Linux. Not everything is ported, only the major data structures.

DMA
We reframed the DMA APIs. Underneath, we use the x86 pci-nommu DMA ops. We did not port drivers/iommu/intel-iommu.c. Hopefully this simple nommu path works everywhere.

Infiniband
Walked through ib_core, mlx4_core, and mlx4_ib. I think we have a solid IB stack.

It has been tested with

1P and 1M: RAMFS microbenchmark
1P, 1M, and 1S: TF MNIST

Reviewed by:
Yutong
Yiying

v0.0.6-osdi-eval

OSDI Eval Commit Point

All OSDI experiments were carried out before this commit.
This includes all the recent bug fixes, the network thread model,
piggy-backed flush/miss, and cache-aware VA allocation.

v0.0.5-zerofll-vnode-diryflush

Various Updates on this pre-release: zerofill, vNode, DirtyFlush, and a few other fixes

v0.0.4-pte/pmd-lock

Processor: User pgtable use per-PTE, per-PMD lock

Before this, all pgtable operations were protected by one spinlock in mm.
This is bad for multi-threaded applications. We now use one lock per PTE
page and one lock per PMD page.

The spinlock is embedded within `struct page`. The spinlock is 4 bytes.
And as long as `struct page` is not larger than 64 bytes, we are fine.

This optimization applies to Processor only, since it is the one that
manipulates the user pgtables.

Memory probably needs something similar. Later.
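
The size constraint above can be checked at compile time. The following is a user-space sketch under stated assumptions, not the LegoOS definition: this `struct page` has made-up fields, `ptl_t` is a hypothetical 4-byte spinlock built on C11 atomics, and `set_pte_locked` is an illustrative helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 4-byte spinlock standing in for the kernel spinlock. */
typedef struct {
    atomic_uint locked;   /* 0 = free, 1 = held */
} ptl_t;

/* Hypothetical page descriptor; the real struct page has other fields. */
struct page {
    unsigned long flags;
    void         *mapping;
    unsigned long private_data;
    atomic_int    refcount;
    ptl_t         ptl;    /* protects the page table stored in this page */
};

/* Fail the build if the assumptions from the text stop holding. */
static_assert(sizeof(ptl_t) == 4, "lock is expected to be 4 bytes");
static_assert(sizeof(struct page) <= 64, "struct page must stay within 64 bytes");

static void ptl_lock(ptl_t *l)
{
    /* Spin until we are the one who flips 0 -> 1. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;
}

static void ptl_unlock(ptl_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Update one entry under the lock of the page that holds the table. */
static void set_pte_locked(struct page *table_page, uint64_t *ptep, uint64_t val)
{
    ptl_lock(&table_page->ptl);
    *ptep = val;
    ptl_unlock(&table_page->ptl);
}
```

Because each page table page carries its own lock, threads touching different PTE or PMD pages no longer serialize on a single lock in mm.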

v0.0.3-munmap-mremap-loader-bug-fixed

This tag marks a milestone where:

- munmap/mremap behaviour changed; rmap_get_pte_locked can now catch bugs reliably.
- processor-side loader bug fixed; the execv() syscall can be used
- besides, the basic environment is hooked up with pcache: creation and cleanup are decent

We should be able to run any program at this point. Pcache should not be a concern any longer.

v0.0.2-pcache-evict-ref

pcache: sync refcount between evict and normal users

This is really a nasty fix. But it can cover most race conditions.
The problem can be described as: two threads share the same pcm; one of
them is trying to evict it while the other is still using it.

The evicting side is pcache_evict_line. The other side can be munmap, mremap, or
the wp handler. So how do we sync between these? We need help from two spinlocks
(the pte lock and the pcache lock) and the pcache refcount.

The pte lock ensures that the other parties (munmap, mremap, wp) see a safe
pcm. The rule is: once you drop the pte lock and acquire it again, you must
check whether the pte has changed, because it can be unmapped by eviction in
the meantime.

Background eviction against live usage is really hard to do. Unlike the Java GC,
which tracks references to an object and will not reclaim live objects,
here we run the danger of reclaiming a live pcm (one used by other threads concurrently).
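
The two rules described above (take a reference only on a live line, and re-check the pte after re-acquiring the lock) can be sketched as follows. This is a simplified user-space illustration, not the actual fix: `struct pcm` here has a single made-up field, all function names are hypothetical, and the pte lock and pcache lock are assumed to be held by the callers rather than shown.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pcache line metadata; the field is illustrative. */
struct pcm {
    atomic_int refcount;   /* 0 means the line is being or has been reclaimed */
};

/* Take a reference only if the line is still live. */
static bool pcm_get_unless_zero(struct pcm *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool pcm_put(struct pcm *p)
{
    return atomic_fetch_sub(&p->refcount, 1) == 1;
}

/*
 * Rule from the text: after dropping and re-taking the pte lock,
 * compare the pte with the value seen earlier before trusting it.
 */
static bool pte_still_valid(const uint64_t *ptep, uint64_t seen)
{
    return *ptep == seen;
}

/* Eviction side, assumed to run with the pte lock held. */
static bool evict_line(uint64_t *ptep, struct pcm *p)
{
    *ptep = 0;             /* users that re-check the pte now see the unmap */
    return pcm_put(p);     /* true => no concurrent user, safe to reclaim */
}
```

A user such as munmap would call `pcm_get_unless_zero` under the pte lock, and after any window in which the lock was dropped it would call `pte_still_valid` before touching the line. The evictor reclaims the line only when `evict_line` reports that it dropped the last reference; otherwise the last user to call `pcm_put` does.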