Skip to content

fix(browser): resolve flaky TestManagedBackendErrorPropagation (CDP send race)#116

Merged
cnjack merged 1 commit into
mainfrom
fix/cdp-send-prefer-response-over-close
Jul 5, 2026
Merged

fix(browser): resolve flaky TestManagedBackendErrorPropagation (CDP send race)#116
cnjack merged 1 commit into
mainfrom
fix/cdp-send-prefer-response-over-close

Conversation

@cnjack

@cnjack cnjack commented Jul 5, 2026

Copy link
Copy Markdown
Owner

Problem

TestManagedBackendErrorPropagation in internal/browser fails intermittently on CI:

cdp_test.go: expected boom error, got cdp connection closed during Target.getTargets
FAIL github.com/cnjack/jcode/internal/browser

(Seen blocking an unrelated PR — the failure has nothing to do with that PR's diff.)

Root cause

wsCDP.send (cdp.go) waits on a select over three cases: the per-request response channel ch, ctx.Done(), and c.closed. When a server writes a result/error frame and immediately drops the socket, the read loop:

  1. delivers the frame into the buffered ch (ch <- msg), then
  2. on the next read sees the closed socket, closes any still-pending channels, and close(c.closed).

That leaves both <-ch and <-c.closed ready at once. Go's select picks a ready case at random, so ~50% of the time send returns "cdp connection closed during <method>" and discards the response that had already arrived. This is both the test flake and a latent correctness bug (a real CDP response racing a socket close could be dropped).

CI runs go test ./... without -race, so it only manifested as the occasional assertion failure, not a race warning.

Fix

In the c.closed branch, do a non-blocking receive on ch first and return the buffered response if one is present. The read loop always delivers-or-closes every pending channel before signaling c.closed, so this recovery is decisive — no new races, no busy-waiting.

Verification

  • Reproduced the exact CI failure pre-fix under scheduling pressure: GOMAXPROCS=2 go test ./internal/browser -run TestManagedBackendErrorPropagation -count=3000 → multiple expected boom error, got cdp connection closed failures.
  • Post-fix: same 3000× stress passes; focused browser tests pass under -race.
  • Hardened the test to loop the boom-then-close scenario 50× so a regression fails reliably instead of ~half the time.

Note (out of scope)

Running the full internal/browser package under -race also surfaces a separate, pre-existing data race in TestBridgeKeepAlivePing (it mutates the package globals keepAlivePing/keepAliveWait that a keepAlive() goroutine reads at bridge.go:178). It exists on main independent of this change and CI doesn't run -race, so it's left for a follow-up.

Generated with Jack AI bot

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability when a browser connection closes at the same time a command error is returned, so the original error is preserved more consistently.
    • Made error handling during this race condition more deterministic, reducing cases where a generic connection-closed message could mask the real failure.
  • Tests
    • Strengthened coverage for this close/error race to ensure the expected error is consistently reported.

wsCDP.send selected between the response channel and the connection-closed
signal. When a server writes a result/error frame and immediately drops the
socket, the read loop buffers the frame into the request's channel *and*
closes c.closed, leaving both select cases ready — so Go picked between them
at random and send would intermittently report "cdp connection closed during
<method>" instead of returning the actual response. This surfaced as a flaky
TestManagedBackendErrorPropagation on CI.

Recover the buffered response in the c.closed branch before reporting the
close (the read loop always delivers or closes every pending channel before
signaling c.closed, so a non-blocking receive is decisive). Harden the test
to loop the boom-then-close scenario so the race fails reliably instead of
~half the time; verified it reproduces the CI failure pre-fix (GOMAXPROCS=2,
count=3000) and passes post-fix.

Generated with Jack AI bot
@coderabbitai

coderabbitai Bot commented Jul 5, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5fef5cdf-c57b-4b92-9b17-903718852099

📥 Commits

Reviewing files that changed from the base of the PR and between 8b4f961 and 10b7226.

📒 Files selected for processing (2)
  • internal/browser/cdp.go
  • internal/browser/cdp_test.go

📝 Walkthrough

Walkthrough

The wsCDP.send method's close-handling path is modified to first check for an already-available pending response before falling back to the closed-connection error. The corresponding test is updated to write an error frame and immediately close the connection, looping 50 times to deterministically verify error propagation.

Changes

CDP send close/error race fix

Layer / File(s) Summary
Prefer pending response over closed signal
internal/browser/cdp.go
On the closed-channel path in wsCDP.send, a non-blocking check of the pending response channel is added to return the real response or error if already available, before falling back to the "connection closed" error.
Deterministic race test
internal/browser/cdp_test.go
The fake server now writes a "boom" error frame and immediately closes the socket; the test loops 50 times, each time creating a backend and calling send, asserting the returned error contains "boom".

Estimated code review effort: 3 (Moderate) | ~20 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant wsCDPsend as wsCDP.send
  participant PendingChannel as ch
  participant ClosedChannel as c.closed

  Caller->>wsCDPsend: send(command)
  ClosedChannel->>wsCDPsend: closed signal ready
  wsCDPsend->>PendingChannel: non-blocking read
  alt response available
    PendingChannel-->>wsCDPsend: response frame or msg.Error
    wsCDPsend-->>Caller: return response/error
  else no response
    wsCDPsend-->>Caller: return connection closed error
  end
Loading

Related issues: None specified.

Related PRs: None specified.

Suggested labels: bug, tests

Suggested reviewers: None specified.

🐰 A race between "closed" and "boom",
resolved now within the send room,
fifty loops confirm the fix,
no more flaky error tricks,
the rabbit hops away, review's done.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main fix: a CDP send race causing a flaky browser test.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/cdp-send-prefer-response-over-close

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@cnjack cnjack merged commit df4a9b6 into main Jul 5, 2026
3 checks passed
@cnjack cnjack deleted the fix/cdp-send-prefer-response-over-close branch July 5, 2026 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant