Hello, I've recently been studying how xv6 works and have benefited a lot from the clear comments and the textbook. I think there may be a bug in the logging layer of the file system. The details are below.
The relevant code is install_trans() in kernel/log.c.
In that function, the line bunpin(dbuf) unpins dbuf in the buffer cache so that the block can be evicted from memory after brelse(dbuf). This works correctly in the normal case: log_write() calls bpin(), which pins the block in the buffer cache by incrementing its reference count, and the bunpin() here decrements it again.
However, if a crash happens while a transaction is being installed, then after reboot xv6 calls recover_from_log() from initlog(), which calls install_trans() to copy the logged blocks to their home blocks. On this path there is no preceding bpin(), so the bunpin() decrements the reference count to zero, and the decrement in brelse() then underflows it to UINT_MAX, because refcnt is a uint. As a result, the block stays in the buffer cache "forever".
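The two paths can be traced on a toy model of the reference count. This is only a sketch: the `toy_*` names and the two path functions are hypothetical stand-ins that model nothing but the arithmetic on refcnt, and they are not the xv6 functions themselves.

```c
#include <limits.h>

/* Toy stand-in for xv6's struct buf: only the reference count is modeled. */
struct toy_buf {
  unsigned int refcnt;
};

/* Each helper mimics only the effect of its xv6 counterpart on refcnt. */
static void toy_bread(struct toy_buf *b)  { b->refcnt++; }  /* take a reference */
static void toy_bpin(struct toy_buf *b)   { b->refcnt++; }  /* pin */
static void toy_bunpin(struct toy_buf *b) { b->refcnt--; }  /* unpin */
static void toy_brelse(struct toy_buf *b) { b->refcnt--; }  /* drop a reference */

/* Normal commit: log_write() pinned the block before install_trans() runs. */
unsigned int normal_path(void)
{
  struct toy_buf b = { 0 };

  toy_bread(&b);   /* caller reads the block:  refcnt 1 */
  toy_bpin(&b);    /* log_write() pins it:     refcnt 2 */
  toy_brelse(&b);  /* caller releases it:      refcnt 1 */

  toy_bread(&b);   /* install_trans() reads:   refcnt 2 */
  toy_bunpin(&b);  /* install_trans() unpins:  refcnt 1 */
  toy_brelse(&b);  /* install_trans() release: refcnt 0 */
  return b.refcnt;
}

/* Recovery after a crash: install_trans() runs with no earlier pin. */
unsigned int recovery_path(void)
{
  struct toy_buf b = { 0 };

  toy_bread(&b);   /* install_trans() reads:   refcnt 1 */
  toy_bunpin(&b);  /* unpin with no pin:       refcnt 0 */
  toy_brelse(&b);  /* unsigned wrap-around:    refcnt UINT_MAX */
  return b.refcnt;
}
```

In this model normal_path() returns 0, while recovery_path() returns UINT_MAX, since unsigned arithmetic in C wraps around instead of going negative.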
We can expose this bug by adding two lines to xv6:
First, we emulate a crash by adding a panic() at the end of install_trans(), like this:
         brelse(dbuf);
       }
    +  if (log.lh.n != 0) panic("crash");
     }
Here the condition log.lh.n != 0 ensures that the panic() is not triggered when the log is empty, as it is when the recovery procedure runs on a file system with nothing to install.
Second, we add an assertion to brelse():
       acquire(&bcache.lock);
    +  if (b->refcnt == 0) panic("bug");
       b->refcnt--;
       if (b->refcnt == 0) {
Before decrementing the reference count, b->refcnt >= 1 should always hold.
When we boot xv6 for the first time, it hits the emulated "crash"; after rebooting, xv6 panics at panic("bug"). I made a disk image, fs-bug.img, that captures the state of xv6 after the crash from the first step. Loading this image reproduces the bug directly, and the assertion in brelse() detects it. The disk image is available here.
A possible fix is to add a bpin() in read_head() so that the bunpin() during recovery has a corresponding bpin().
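The effect of such a matching pin can be checked on a toy model of the reference counts. This is a sketch under assumptions, not a patch: the `sketch_*` helpers, NBLOCKS, and recovery_with_pin_is_balanced() are hypothetical names. Note also that read_head() as written only reads the header block, so pinning there would presumably mean looping over the home blocks listed in the header, which is what the first loop imitates.

```c
#define NBLOCKS 3  /* hypothetical number of blocks recorded in the log header */

/* Toy stand-in for a cached buffer: only the reference count is modeled. */
struct sketch_buf {
  unsigned int refcnt;
};

static void sketch_get(struct sketch_buf *b)     { b->refcnt++; }  /* like bread()  */
static void sketch_pin(struct sketch_buf *b)     { b->refcnt++; }  /* like bpin()   */
static void sketch_unpin(struct sketch_buf *b)   { b->refcnt--; }  /* like bunpin() */
static void sketch_release(struct sketch_buf *b) { b->refcnt--; }  /* like brelse() */

/* Returns 1 if every home block ends recovery with a reference count of 0. */
int recovery_with_pin_is_balanced(void)
{
  struct sketch_buf home[NBLOCKS] = {{ 0 }};
  int i;

  /* Proposed extra step during recovery: pin each home block once. */
  for (i = 0; i < NBLOCKS; i++) {
    sketch_get(&home[i]);      /* refcnt 1 */
    sketch_pin(&home[i]);      /* refcnt 2 */
    sketch_release(&home[i]);  /* refcnt 1, block stays pinned */
  }

  /* The install_trans() sequence, unchanged. */
  for (i = 0; i < NBLOCKS; i++) {
    sketch_get(&home[i]);      /* refcnt 2 */
    sketch_unpin(&home[i]);    /* refcnt 1 */
    sketch_release(&home[i]);  /* refcnt 0, no underflow */
  }

  for (i = 0; i < NBLOCKS; i++) {
    if (home[i].refcnt != 0)
      return 0;
  }
  return 1;
}
```

With the extra pin in place, every count returns to zero and the model function returns 1. An alternative with the same effect on the counts would be to skip the bunpin() on the recovery path.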
I hope these suggestions are helpful. Further discussion is welcome. Thanks!