fix(envd): fix fan-out deadlock when process subscriber disconnects #2579
Reviewed commit: 62d63c2d5f
Code Review
The unbuffered channel returned by Fork can cause a deadlock during bootstrap event writes if the consumer goroutine exits early. Providing a buffer of at least 1 ensures the initial write completes without hanging the request handler.

Iterating over the channels slice without holding a mutex is unsafe because the remove method modifies the underlying array in-place. This race condition can cause the fan-out loop to skip subscribers or process them multiple times, so the loop should use a shallow copy of the slice.
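The buffering point in this review can be illustrated with a minimal sketch. The names `fork` and `trySend` are hypothetical stand-ins, not the envd API, and the `default` branch is only a test-friendly substitute for "this send would block forever":

```go
package main

import "fmt"

// fork is a hypothetical stand-in for the Fork method discussed in the review.
// It returns a channel with the requested buffer size.
func fork(buffer int) chan string {
	return make(chan string, buffer)
}

// trySend reports whether a send completes when nobody is receiving.
// The default branch stands in for a send that would otherwise block.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	// Unbuffered: with no receiver, the bootstrap write cannot complete.
	fmt.Println(trySend(fork(0), "start")) // false
	// Buffer of 1: the first write completes even if the consumer is gone.
	fmt.Println(trySend(fork(1), "start")) // true
}
```

A buffer of 1 only covers the single bootstrap write; later sends to a consumer that has exited still need a cancellation path.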
62d63c2 to 2b1c1d4
The fan-out loop sent to unbuffered subscriber channels while holding
RLock. If a subscriber stopped reading (e.g. client disconnect), the
send blocked forever, preventing remove() from acquiring the write lock.
This froze the output stream for all subscribers and hung any new
Connect RPC to that process.
Each subscriber now carries a done channel; fan-out delivers via
select { case s.ch <- v: case <-s.done: } so a cancelled subscriber
never wedges the loop. Includes regression tests.
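The delivery pattern described in this commit message might look roughly like the sketch below. The `subscriber` type and the `deliver` and `fanOut` helpers are assumptions made for illustration; only the `select` on `s.ch` and `s.done` comes from the message itself, and locking is omitted to keep the example short:

```go
package main

import "fmt"

// subscriber is a hypothetical type: a data channel plus a done channel
// that is closed when the subscriber goes away (e.g. client disconnect).
type subscriber struct {
	ch   chan string
	done chan struct{}
}

// deliver sends v to s unless s has been cancelled.
// It reports whether the value was actually delivered.
func deliver(s *subscriber, v string) bool {
	select {
	case s.ch <- v:
		return true
	case <-s.done:
		return false
	}
}

// fanOut delivers v to every subscriber and returns the delivery count.
func fanOut(subs []*subscriber, v string) int {
	n := 0
	for _, s := range subs {
		if deliver(s, v) {
			n++
		}
	}
	return n
}

func main() {
	live := &subscriber{ch: make(chan string), done: make(chan struct{})}
	gone := &subscriber{ch: make(chan string), done: make(chan struct{})}
	close(gone.done) // nobody reads gone.ch any more

	got := make(chan string)
	go func() { got <- <-live.ch }()

	// The cancelled subscriber is skipped instead of blocking the loop.
	n := fanOut([]*subscriber{gone, live}, "hello")
	fmt.Println(n, <-got) // 1 hello
}
```

Because a closed `done` channel is always ready to receive, the send to a cancelled subscriber returns immediately and the remaining subscribers still get their data.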
2b1c1d4 to a7ab522
…eration run() copied only the slice header under RLock, sharing the backing array with remove(). When remove() shifted elements in-place via append(channels[:i], channels[i+1:]...), the fan-out's stale snapshot would skip one subscriber and deliver to another twice — silent stdout/stderr corruption. Deep-copy the slice under RLock so the iteration is immune to concurrent mutations. Fixes: a67f983 ("fix(envd): fix fan-out deadlock when process subscriber disconnects (#2579)")
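The aliasing problem this commit describes can be reproduced without any concurrency. In the sketch below, strings stand in for subscribers, and `removeAt` and `snapshot` are hypothetical helpers; in the real code the copy would be taken while holding RLock. The "deep copy" here means copying the slice elements into a fresh backing array, not cloning what they point to:

```go
package main

import "fmt"

// removeAt mirrors the in-place removal pattern from the commit message:
// append(s[:i], s[i+1:]...) shifts elements within the same backing array.
func removeAt(s []string, i int) []string {
	return append(s[:i], s[i+1:]...)
}

// snapshot copies the elements into a new backing array, so later
// in-place removals on the original cannot affect the copy.
func snapshot(s []string) []string {
	out := make([]string, len(s))
	copy(out, s)
	return out
}

func main() {
	subs := []string{"a", "b", "c"}

	stale := subs          // copies only the slice header
	safe := snapshot(subs) // copies the elements

	subs = removeAt(subs, 0)

	fmt.Println(stale) // [b c c]
	fmt.Println(safe)  // [a b c]
}
```

The stale header never sees "a" and sees "c" twice, which matches the skipped and duplicated deliveries described above, while the element copy is unaffected.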
…2b-dev#2579) The fan-out loop sent to unbuffered subscriber channels while holding RLock. If a subscriber stopped reading (e.g. client disconnect), the send blocked forever, preventing remove() from acquiring the write lock. This froze the output stream for all subscribers and hung any new Connect RPC to that process. Each subscriber now carries a done channel; fan-out delivers via `select { case s.ch <- v: case <-s.done: }` so a cancelled subscriber never wedges the loop. Includes regression tests. Also bumps envd version to 0.5.16.
