runtime: fix block leak due to race in span set
The span set data structure may leak blocks due to a race in the logic
to check whether it's safe to free a block. The simplest example of this
race is between two poppers:

1. Popper A claims slot spanSetBlockEntries-2.
2. Popper B claims slot spanSetBlockEntries-1.
3. Popper A gets descheduled before it subtracts from block.used.
4. Popper B subtracts from block.used, sees that it claimed
   spanSetBlockEntries-1, but also that block.used != 0, so it returns.
5. Popper A comes back and subtracts from block.used, but it didn't
   claim spanSetBlockEntries-1 so it also returns.

The spine is left with a stale block pointer; later pushes overwrite
that pointer, and the block is leaked, never to be reused.

The problem here is that we designate the claimer of slot
spanSetEntries-1 to be the one who frees the block, but that may not be
the thread that actually does the last subtraction from block.used.

Fixing this is tricky; the fundamental problem is that block.used is
not stable: it may be observed to be zero, but that doesn't necessarily
mean you're the last popper!
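
To make the race concrete, here is a sketch of the old check (a
hypothetical simplification, not the runtime code: sync/atomic stands
in for the runtime-internal atomics, and the spine and slot-claiming
logic are elided):

    package sketch

    import "sync/atomic"

    const spanSetBlockEntries = 512 // block capacity (512 in mspanset.go)

    type spanSetBlock struct {
        used uint32 // slots currently in use (the old scheme)
    }

    // finishPop models the tail of the old pop path. bottom is the slot
    // this popper claimed; free recycles the block.
    func finishPop(block *spanSetBlock, bottom uint32, free func(*spanSetBlock)) {
        // AddUint32 with ^uint32(0) subtracts 1 atomically.
        used := atomic.AddUint32(&block.used, ^uint32(0))
        // Racy: the popper that claimed the last slot may observe
        // used != 0 (another popper's decrement is still pending), while
        // the straggler observes used == 0 but fails the bottom check.
        // Neither branch frees the block, so it leaks.
        if used == 0 && bottom == spanSetBlockEntries-1 {
            free(block)
        }
    }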

Do something simpler: instead of block.used, keep a counter of how many
pops have happened to a given block. This counter increases
monotonically, and only when a pop is _completely done_. Because of
that, we know for sure that whichever popper is the last to increase it
(i.e. brings its value to spanSetBlockEntries) is also the last popper
in the block. Because the race described above still exists, the last
popper may not be the one which claimed the last slot in the block, but
we know for certain that nobody else is popping from that block
anymore, so we can safely free it. Finally, because pops serialize with
pushes to the same slot, we need not worry about concurrent pushers at
all.
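
A matching sketch of the new check (same hypothetical scaffolding as
above, with a popped field in place of used):

    type spanSetBlock struct {
        popped uint32 // pops completed on this block (the new scheme)
    }

    // finishPop models the tail of the new pop path: popped only moves
    // forward, and only after this popper's pop is completely done.
    func finishPop(block *spanSetBlock, free func(*spanSetBlock)) {
        // Exactly one popper observes popped reach spanSetBlockEntries,
        // no matter which slot it claimed, so exactly one popper frees
        // the block; no bottom check is needed.
        if atomic.AddUint32(&block.popped, 1) == spanSetBlockEntries {
            free(block)
        }
    }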

Updates #37487.

Change-Id: I6697219372774c8ca7d8ee6895eaa230a64ce9e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/230497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
mknyszek committed Apr 28, 2020
1 parent 0ddde4a commit 39380e8
Showing 1 changed file with 31 additions and 12 deletions.
diff --git a/src/runtime/mspanset.go b/src/runtime/mspanset.go
--- a/src/runtime/mspanset.go
+++ b/src/runtime/mspanset.go
@@ -60,12 +60,10 @@ type spanSetBlock struct {
 	// Free spanSetBlocks are managed via a lock-free stack.
 	lfnode
 
-	// used represents the number of slots in the spans array which are
-	// currently in use. This number is used to help determine when a
-	// block may be safely recycled.
-	//
-	// Accessed and updated atomically.
-	used uint32
+	// popped is the number of pop operations that have occurred on
+	// this block. This number is used to help determine when a block
+	// may be safely recycled.
+	popped uint32
 
 	// spans is the set of spans in this block.
 	spans [spanSetBlockEntries]*mspan
@@ -135,7 +133,6 @@ retry:
 
 	// We have a block. Insert the span atomically, since there may be
 	// concurrent readers via the block API.
-	atomic.Xadd(&block.used, 1)
 	atomic.StorepNoWB(unsafe.Pointer(&block.spans[bottom]), unsafe.Pointer(s))
 }
 
@@ -202,8 +199,19 @@ claimLoop:
 	// corruption. This way, we'll get a nil pointer access instead.
 	atomic.StorepNoWB(unsafe.Pointer(&block.spans[bottom]), nil)
 
-	// If we're the last possible popper in the block, free the block.
-	if used := atomic.Xadd(&block.used, -1); used == 0 && bottom == spanSetBlockEntries-1 {
+	// Increase the popped count. If we are the last possible popper
+	// in the block (note that bottom need not equal spanSetBlockEntries-1
+	// due to races) then it's our responsibility to free the block.
+	//
+	// If we increment popped to spanSetBlockEntries, we can be sure that
+	// we're the last popper for this block, and it's thus safe to free it.
+	// Every other popper must have crossed this barrier (and thus finished
+	// popping its corresponding mspan) by the time we get here. Because
+	// we're the last popper, we also don't have to worry about concurrent
+	// pushers (there can't be any). Note that we may not be the popper
+	// which claimed the last slot in the block, we're just the last one
+	// to finish popping.
+	if atomic.Xadd(&block.popped, 1) == spanSetBlockEntries {
 		// Clear the block's pointer.
 		atomic.StorepNoWB(blockp, nil)
 
@@ -236,10 +244,20 @@ func (b *spanSet) reset() {
 		blockp := (**spanSetBlock)(add(b.spine, sys.PtrSize*uintptr(top)))
 		block := *blockp
 		if block != nil {
-			// Sanity check the used value.
-			if block.used != 0 {
-				throw("found used block in empty span set")
+			// Sanity check the popped value.
+			if block.popped == 0 {
+				// popped should never be zero because that means we have
+				// pushed at least one value but not yet popped if this
+				// block pointer is not nil.
+				throw("span set block with unpopped elements found in reset")
+			}
+			if block.popped == spanSetBlockEntries {
+				// popped should also never be equal to spanSetBlockEntries
+				// because the last popper should have made the block pointer
+				// in this slot nil.
+				throw("fully empty unfreed span set block found in reset")
 			}
 
 			// Clear the pointer to the block.
 			atomic.StorepNoWB(unsafe.Pointer(blockp), nil)
 
@@ -270,6 +288,7 @@ func (p *spanSetBlockAlloc) alloc() *spanSetBlock {
 
 // free returns a spanSetBlock back to the pool.
 func (p *spanSetBlockAlloc) free(block *spanSetBlock) {
+	atomic.Store(&block.popped, 0)
 	p.stack.push(&block.lfnode)
 }
 