PR Summary: Medium Risk. Reviewed by Cursor Bugbot for commit d59d81a.
```go
// MapSubscriber receives lifecycle notifications from the sandbox Map.
//
// Callbacks are invoked synchronously from the goroutine that performed the
// state change. Implementations must be non-blocking; if async work is needed,
// the subscriber is responsible for dispatching it.
type MapSubscriber interface {
	// OnInsert is triggered when a sandbox transitions to the running state.
	OnInsert(ctx context.Context, sandbox *Sandbox)
```
🟣 NFSHandler.OnNetworkRelease performs blocking I/O (blocking on <-doneCh inside chroot.Close()), which violates the new non-blocking contract added to MapSubscriber in this PR (map.go:14-21). This is a pre-existing issue: NetworkReleased was already invoked synchronously before this PR and NFSHandler was already blocking, but the PR replaces the old 'synchronous, caller can rely on completion' comment with a new 'must be non-blocking' requirement without updating NFSHandler to comply. A slow chroot teardown will stall the goroutine calling NetworkReleased, delaying sandbox network slot recycling.
Extended reasoning...
What the bug is
The PR adds an interface-level contract to MapSubscriber (map.go:14-18):
Callbacks are invoked synchronously from the goroutine that performed the state change. Implementations must be non-blocking; if async work is needed, the subscriber is responsible for dispatching it.
At the same time, the PR removes the old NetworkReleased comment that said "Subscribers are invoked synchronously so the caller can rely on them having completed before taking any follow-up action." The old contract explicitly allowed blocking; the new contract explicitly forbids it.
The code path that triggers the violation
NFSHandler.OnNetworkRelease (nfsproxy/chroot/nfs.go:93-113) acquires h.mu.Lock(), removes entries from an internal map, releases the lock, then calls chroot.Close() in a loop for each removed entry. chroot.Close() calls fs.ns.Close() (chroot.go:119-121), which dispatches to mountNS.Close() (mountns.go:76-99). mountNS.Close() closes the stopCh channel to signal a dedicated OS thread to restore the original mount namespace, then blocks on <-doneCh (mountns.go:91), waiting for that thread to call unix.Setns() and signal completion. This is a genuine synchronous blocking operation that can take non-trivial time for each chroot being torn down.
Why existing code doesn't prevent it
NetworkReleased calls m.trigger() synchronously (no goroutine dispatch), and trigger() iterates all subscribers calling OnNetworkRelease in sequence. There is no timeout, context cancellation check, or goroutine wrapper around the subscriber calls. OnNetworkRelease is called directly on the same goroutine that initiated the network release.
Impact
Each chroot teardown in OnNetworkRelease stalls the goroutine calling NetworkReleased for the duration of the mount namespace restoration syscall. When multiple network slots are released concurrently or sequentially, these stalls accumulate, delaying sandbox network slot recycling. Future contributors reading the MapSubscriber interface will see "must be non-blocking" and assume compliance, potentially introducing race conditions or deadlocks if they rely on that guarantee.
Proof by example
- A sandbox is stopped; its IP slot is released via NetworkReleased(ctx, ip).
- NetworkReleased calls m.trigger(), which iterates subscribers and calls NFSHandler.OnNetworkRelease(ctx, sbx).
- NFSHandler.OnNetworkRelease iterates its chroot map and calls chroot.Close() for each entry.
- Each chroot.Close() → mountNS.Close() closes stopCh and then does <-doneCh.
- The dedicated OS goroutine receives the stop signal, calls unix.Setns() to restore the original namespace, then closes doneCh.
- Only after doneCh is closed does OnNetworkRelease return, unblocking NetworkReleased.
- The network slot recycling (and any follow-up logic) is delayed by the sum of all Setns syscall latencies.
How to fix
Wrap the blocking work in a goroutine inside NFSHandler.OnNetworkRelease, or provide a separate async teardown path. For example:
```go
func (h *NFSHandler) OnNetworkRelease(ctx context.Context, sbx *sandbox.Sandbox) {
	chroots := h.collectChroots(sbx) // non-blocking extraction
	go func() {
		for _, c := range chroots {
			c.Close()
		}
	}()
}
```

This would satisfy the new non-blocking contract while preserving the teardown semantics.
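One design caveat with a bare go func() is that in-flight teardowns become untracked at process shutdown. A variant, sketched here with hypothetical types (chrootHandle, handler, and collectChroots stand-ins are assumptions), keeps OnNetworkRelease non-blocking while using a sync.WaitGroup so a shutdown path can still wait for teardowns to finish:

```go
package main

import (
	"fmt"
	"sync"
)

// chrootHandle (hypothetical) stands in for a chroot whose Close does
// the blocking mount-namespace restoration.
type chrootHandle struct{ closed bool }

func (c *chrootHandle) Close() { c.closed = true }

type handler struct {
	wg sync.WaitGroup
}

// onNetworkRelease returns immediately; the blocking Close calls run
// on a tracked background goroutine.
func (h *handler) onNetworkRelease(chroots []*chrootHandle) {
	h.wg.Add(1)
	go func() {
		defer h.wg.Done()
		for _, c := range chroots {
			c.Close() // blocking work, off the caller's goroutine
		}
	}()
}

// Shutdown waits for any in-flight teardowns before the process exits.
func (h *handler) Shutdown() { h.wg.Wait() }

func main() {
	h := &handler{}
	cs := []*chrootHandle{{}, {}}
	h.onNetworkRelease(cs) // non-blocking
	h.Shutdown()           // deterministically waits for the teardown goroutine
	fmt.Println(cs[0].closed && cs[1].closed)
}
```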
Make OnInsert sync