Stronger session isolation and faster E2E tests with parallelization by canercidam · Pull Request #374 · flashbots/builder-playground

canercidam · 2026-02-18T14:01:00Z

Port selection in sessions now have an offset of 5 (per port selected) like port = port + concurrent_sessions × 5
Session directories are now isolated by their own names instead of defaulting to a devnet directory:
- The main playground dir is still like $HOME/.local/state/builder-playground
- The new session paths are like <playground-dir>/sessions/<session-id>
- There are two new symlinks:
  - <playground-dir>/devnet (keeping old experience to access to the latest dir)
  - <playground-dir>/sessions/latest (new symlink for keeping reference to the latest dir)
The docker bridge networks used for each session is now isolated by name as well, like builder-playground-<session-id>
Creating new E2E test suite which can:
- run the playground binary to start a session per test case (in parallel)
- apply side effects to the services of sessions
- make checks on the state of the running services
The component/integration tests are abandoned in favor of the more complete E2E tests

With these changes, now it is also possible to start multiple sessions on a machine for testing and connect services of different sessions.

Fixes #333

claude · 2026-02-18T14:10:48Z

Claude finished @canercidam's task in 2m 39s —— View job

PR Review: Stronger session isolation and faster E2E tests

Read and analyze all changed files
Review code quality and patterns
Check for bugs, edge cases, and security issues
Post review feedback with inline comments

Summary

The PR introduces session isolation (ports, directories, networks) and replaces integration tests with a new E2E test framework. The overall direction is good, but there are several issues worth addressing.

Issues Found

Bugs / Correctness:

cmdErrChan consumed in waitForOutput, unavailable to waitForReady (e2e_test.go:152-158): The channel is read in waitForOutput to detect early process exit. Once consumed, waitForReady can no longer detect if the process exited, potentially hanging until timeout.
resp.StatusCode only checked for last endpoint (recipes_test.go:130): The TestBuilderNet test checks status code outside the loop, only validating the final endpoint. Also, defer resp.Body.Close() inside a loop defers all closes until function return, leaking connections during the loop.

Race Conditions / Reliability:

TOCTOU race in CountConcurrentPlaygroundSessions (process.go:15): The ps-based count is a point-in-time snapshot. Two sessions starting simultaneously both see count=0 and get the same offset, defeating the isolation goal.
portOffsetMultiplier = 5 may be too small (local_runner.go:186): A recipe like buildernet has 6+ services, each with 1-4 ports. With an offset of only 5 per concurrent session, port ranges overlap quickly. The reservePort probing provides a fallback, but it undermines the deterministic offset intent.
time.Now().UnixNano() for test dir uniqueness (e2e_test.go:82): Can collide on VMs with coarse clock resolution. t.TempDir() would be both unique and auto-cleaned.

Code Quality:

Fragile process matching (process.go:36): Simple substring matching on ps output can produce false positives (e.g., editing a file whose path contains "builder-playground" and "start").
startupMu serialization pattern (e2e_test.go:24,112-136): The mutex serializes all playground startups but is released as soon as "Waiting for services to get healthy" appears. This is a pragmatic workaround for port conflicts, but it's worth a comment explaining why full parallelism isn't safe (the ps-based counting + offset).

Minor Observations

The CI workflow (checks.yaml) was significantly simplified, which is a positive change. The separation of unit tests (go test ./playground/...) from E2E tests (make e2e-test) is clean.
The Makefile E2E target correctly builds the binary before running tests.
Session directory isolation via sessionID in NewOutput and network naming via builder-playground-<session-id> are straightforward and well-implemented.
The deleted components_test.go (343 lines) contained integration tests that are now covered by the E2E suite with better isolation. Good tradeoff.

utils/process.go

e2e/e2e_test.go

e2e/recipes_test.go

playground/local_runner.go

e2e/e2e_test.go

claude · 2026-02-18T22:08:17Z

Claude finished @canercidam's task in 2m 53s —— View job

PR Review (Re-review after updates)

Analyze new commits since last review
Review all changed files for bugs, security, and code quality
Post inline comments on issues found
Provide summary feedback

Changes Since Last Review

The new commits (c794643 through b44871b) address several issues from the initial review:

Fixed:

cmdErrChan consumption bug — Replaced with processCtx/processErr mutex pattern. waitForOutput and waitForReady now both use p.processCtx.Done() to detect process exit. Clean fix.
resp.StatusCode only checked for last endpoint — Status code is now asserted inside the loop per endpoint.
defer resp.Body.Close() in loop — Changed to explicit resp.Body.Close() after processing each response.
time.Now().UnixNano() for dir uniqueness — Now uses t.TempDir().
Session directory isolation — Session dirs are now under sessions/<sessionID> with latest and devnet symlinks for backward compatibility.

Still Present (3 inline comments posted):

Fragile process matching (utils/process.go:34-38) — Substring matching on ps output can produce false positives. The test suite also lacks false-positive test cases. Suggested tightening the matching to verify start/cook appears as a distinct argument token.
TOCTOU race in CountConcurrentPlaygroundSessions (utils/process.go:13-20) — Two sessions starting simultaneously can both see count=0 and get the same offset. The E2E startupMu mitigates this in tests, but production usage has no coordination. The reservePort probe provides a safety net, but the deterministic offset intent is undermined.
portOffsetMultiplier = 5 is too small (local_runner.go:43) — A recipe like buildernet has 6+ services with multiple ports each. With only a 5-port offset between concurrent sessions, ranges overlap quickly. Consider increasing to ~100.

Other Observations

The E2E test framework is well-structured. The playgroundInstance abstraction with startPlayground/runPlayground/waitForReady is clean.
The startupMu serialization in launchPlayground is a pragmatic workaround. The comment at line 22 (ensures only one playground starts at a time) is minimal — it would benefit from explaining why (the ps-based counting needs the startup phase to be serialized).
CI workflow simplification (separating unit tests from E2E) is a good structural change.
The cleanup() method correctly handles t.TempDir() output dirs — the redundant os.RemoveAll at line 109 is harmless since t.TempDir() auto-cleans, but could be removed.

utils/process.go

playground/local_runner.go

claude · 2026-02-18T22:19:20Z

Claude finished @canercidam's task in 3m 22s —— View job

PR Review (3rd pass — after latest push)

Read and analyze all changed files
Check previous review comments and author responses
Review for bugs, edge cases, and code quality issues
Post review feedback with inline comments

Previous Issues — Status

Most issues from the first two reviews have been addressed:

Issue	Status
`cmdErrChan` consumed by `waitForOutput`	Fixed — replaced with `processCtx`/`processErr` mutex pattern
`resp.StatusCode` only checked for last endpoint	Fixed — assertion moved inside the loop
`defer resp.Body.Close()` in loop	Fixed — explicit `Close()` after each response
`time.Now().UnixNano()` for dir uniqueness	Fixed — uses `t.TempDir()`
`portOffsetMultiplier = 5` too small	Resolved — author clarified the offset applies per-port, not per-session range. This is reasonable.
TOCTOU race in `CountConcurrentPlaygroundSessions`	Acknowledged — author accepts the limitation
Fragile process matching	Still present — re-flagged with suggestion to add false-positive test cases

New Issues (3 inline comments)

lineBuffer.Contains checks Write buffers individually (e2e/e2e_test.go:35) — If a target string spans two Write calls (due to OS stdout buffering), the match fails. Low risk in practice, but easily fixable by checking the concatenated output.
Redundant os.RemoveAll after t.TempDir() (e2e/e2e_test.go:109) — t.TempDir() auto-cleans on test completion. The explicit removal is unnecessary.
Fragile process matching (utils/process.go:34-37) — Re-flagged. The test suite lacks false-positive test cases to document the known limitation.

Overall

The PR is in good shape after the revisions. The session isolation (directories, networks, ports) is well-implemented. The E2E test framework with playgroundInstance is clean and the startupMu serialization is a pragmatic approach. The remaining inline comments are minor improvements rather than blockers.

e2e/e2e_test.go

utils/process.go

canercidam added 4 commits February 17, 2026 21:28

isolate session dirs

ce8c191

isolate pids of parallel sessions

207b9e2

isolate sessions further and introduce new e2e tests

e943531

drop integration tests in favor of new e2e tests

42790c7

claude bot reviewed Feb 18, 2026

View reviewed changes

utils/process.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

utils/process.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/e2e_test.go Outdated Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/e2e_test.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/recipes_test.go Outdated Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

playground/local_runner.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/e2e_test.go Show resolved Hide resolved

use temp dir in e2e tests

87ab4a8

canercidam mentioned this pull request Feb 18, 2026

Test components/recipes in parallel #259

Closed

canercidam added 5 commits February 18, 2026 18:34

fix status code check and close req bodies earlier

c794643

better wait funcs with context

ab4e36c

keep devnet and add latest, as symlinks to the latest session dir

8b16483

add sessions segment to session dir path

74b0676

fix paths

b44871b

claude bot reviewed Feb 18, 2026

View reviewed changes

utils/process.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

utils/process.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

playground/local_runner.go Show resolved Hide resolved

fix linter error

a8d4a8e

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/e2e_test.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

e2e/e2e_test.go Show resolved Hide resolved

claude bot reviewed Feb 18, 2026

View reviewed changes

utils/process.go Show resolved Hide resolved

canercidam merged commit 10b23bc into main Feb 18, 2026
16 checks passed

canercidam deleted the caner/parallel-e2e branch February 18, 2026 22:27

canercidam mentioned this pull request Feb 18, 2026

The e2e tests in CI taking 6 minutes seems really long #330

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stronger session isolation and faster E2E tests with parallelization#374

Stronger session isolation and faster E2E tests with parallelization#374
canercidam merged 11 commits intomainfrom
caner/parallel-e2e

canercidam commented Feb 18, 2026 •

edited

Loading

Uh oh!

claude bot commented Feb 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude bot commented Feb 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude bot commented Feb 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

canercidam commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Stronger session isolation and faster E2E tests

Summary

Issues Found

Minor Observations

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review (Re-review after updates)

Changes Since Last Review

Other Observations

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review (3rd pass — after latest push)

Previous Issues — Status

New Issues (3 inline comments)

Overall

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

canercidam commented Feb 18, 2026 •

edited

Loading

claude bot commented Feb 18, 2026 •

edited

Loading

claude bot commented Feb 18, 2026 •

edited

Loading

claude bot commented Feb 18, 2026 •

edited

Loading