fix(internal/syncwriter): stop corrupting test output on sink write errors #226
Coverage Report for CI Build 1 (Coveralls): coverage increased (+0.008%) to 96.034%. No uncovered changes found. No coverage regressions found.
Force-pushed from 4302134 to 93b5136.
Worth flagging that the single-write change to That suppression is a judgment call, so I'd like your preference between:
For what it's worth, I scoped the current suppression to the true "reader gone" errors and deliberately left
mafredri
left a comment
The println change is fine, but I can't make a judgment call about pipe handling without seeing the code that triggered it.
Force-pushed from caf495c to 7674226.
The error handler used the builtin println, which writes to fd 2 unsynchronized with the test stream and emits the message and its trailing newline as two separate syscalls. Under concurrency a test framework's "--- PASS:"/"--- FAIL:" line can land between those two writes, so test2json never records that test's terminal action and tools like gotestsum report an innocent, passing test as unknown or failed. Format the message and its newline into one fmt.Fprintf to os.Stderr so the write can't be split, which keeps the result marker at the start of a line. Fixes #225.
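A minimal sketch of the single-write idea from the commit message above. The names reportError, recordingWriter and errDemo, and the message format, are hypothetical and not taken from the slog source; the only point shown is that fmt.Fprintf formats the message and its newline into one buffer and hands it to the writer in a single Write call.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// errDemo is a hypothetical sink error used only for illustration.
var errDemo = errors.New("demo failure")

// recordingWriter is a hypothetical io.Writer that remembers each Write
// call separately, so the number of writes can be inspected.
type recordingWriter struct {
	writes []string
}

func (r *recordingWriter) Write(p []byte) (int, error) {
	r.writes = append(r.writes, string(p))
	return len(p), nil
}

// reportError is a hypothetical stand-in for the default error handler.
// The message and its trailing newline go through one Fprintf, which
// results in one Write on w.
func reportError(w io.Writer, name, op string, err error) {
	fmt.Fprintf(w, "syncwriter: %s %s: %v\n", name, op, err)
}

func main() {
	rec := &recordingWriter{}
	reportError(rec, "sink", "write", errDemo)
	fmt.Printf("writes=%d text=%q\n", len(rec.writes), rec.writes[0])

	// In the real handler the destination would be os.Stderr.
	reportError(os.Stderr, "sink", "write", errDemo)
}
```

Taking an io.Writer parameter is a choice made for this sketch so the write count can be observed; the change described here writes to os.Stderr directly.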
Force-pushed from 7674226 to d24a8a4.
The problem
internal/syncwriter reported sink write and sync failures with the builtin println. That call writes raw bytes to fd 2, and it emits the message and its trailing newline as two separate, unsynchronized syscalls. Under concurrency, a test framework's --- PASS: line can land between those two writes, so test2json never records the terminal action and tools like gotestsum flag an innocent, passing test as failed or unknown. We hit this in coder/coder, where a passing TestRetryWithInterval was reported as a failure.
Fixes #225.
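The interleave described above can be modelled deterministically. This is a simplified sketch, not the real test2json logic: countResultLines, splitInterleave and singleWriteInterleave are hypothetical helpers, and the check only looks for top-level result markers at the start of a line (indented subtest markers are ignored).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countResultLines counts lines that begin with a Go test result marker.
// Simplification: a marker only counts when it starts the line.
func countResultLines(out string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "--- PASS:") || strings.HasPrefix(line, "--- FAIL:") {
			n++
		}
	}
	return n
}

// splitInterleave models the old behaviour: the message and its newline
// are two writes, and another writer's result line lands between them.
func splitInterleave(msg, marker string) string {
	var sb strings.Builder
	sb.WriteString(msg)    // first write: message without newline
	sb.WriteString(marker) // concurrent writer gets in between
	sb.WriteString("\n")   // second write: the late newline
	return sb.String()
}

// singleWriteInterleave models the fixed behaviour: message and newline
// arrive together, so the marker can only come before or after them.
func singleWriteInterleave(msg, marker string) string {
	var sb strings.Builder
	sb.WriteString(msg + "\n")
	sb.WriteString(marker)
	return sb.String()
}

func main() {
	msg := "hypothetical sink error"
	marker := "--- PASS: TestRetryWithInterval (0.00s)\n"

	fmt.Println("split: ", countResultLines(splitInterleave(msg, marker)))       // split:  0
	fmt.Println("single:", countResultLines(singleWriteInterleave(msg, marker))) // single: 1
}
```

In the split case the marker ends up glued to the end of the error message, so no line starts with it; in the single-write case the marker keeps its own line regardless of which write happens first.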
The fix
Replace println with one fmt.Fprintf(os.Stderr, ...). The write can't be split across syscalls, so a reported error can no longer wedge into another test's result line, which keeps the result marker at the start of a line.
This is the minimal change. Suppressing the benign "reader gone" errors that triggered the original flake is being handled on the coder/coder side instead (a writer that opts out of those errors), so slog does not hide the fact that a user-assigned sink failed to receive output.
Evidence from a real CI failure
From this coder/coder run (2026-05-28, commit 094fe97): gotestsum marked a passing TestRetryWithInterval and all its subtests (unknown) because a concurrent port-forward -v test's teardown logged a closed-pipe error through the old println, which split the line.
The --- PASS: marker is no longer at the start of a line, so test2json dropped the terminal action. This change keeps the newline attached so the marker always starts a line.
Tests
TestWriter_defaultErrorReportsThroughStderr: asserts that the default handler reports through the os.Stderr variable in one newline-terminated write. The test fails if the handler is reverted to println.
go vet ./..., gofmt/gofumpt, and go test -race ./internal/syncwriter/ are clean, and the full module suite passes.
Picking this up in coder/coder
coder/coder pins cdr.dev/slog/v3 v3.0.0 with no replace directive, so it won't update on its own. Once this merges and a tag like v3.0.1 is cut, a follow-up PR there bumps the dependency with go get cdr.dev/slog/v3@v3.0.1 && go mod tidy.
Investigation and decision log
Root cause (internal/syncwriter/syncwriter.go on main): the New error handler ran println(fmt.Sprintf(...)). println writes to fd 2 out of sync with the test stream and splits the text and newline across two syscalls, which is what lets the interleave drop test2json terminal actions.
Scope: an earlier revision also suppressed benign io.ErrClosedPipe/syscall.EPIPE write errors inside slog. Per maintainer feedback, that was dropped: if a user-assigned sink returns an error, slog should not hide it. The opt-out belongs in the consumer (coder/coder), which is where the closing writer lives.
Test scope: reproducing the interleave itself is inherently racy, so the test asserts a deterministic, revert-sensitive property instead: diagnostics flow through the redirectable os.Stderr variable rather than the raw println builtin.
Generated by Coder Agents on behalf of @EhabY.
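
The revert-sensitive property from the test-scope note can be sketched as follows. This is not the PR's actual test: captureStderr, report and errDemo are hypothetical names, and the sketch assumes the handler's output is small enough to fit in the pipe buffer. It works because code writing through the os.Stderr variable follows a reassignment of that variable, whereas the builtin println writes to fd 2 directly and would leave the capture empty.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// errDemo is a hypothetical sink error used only for illustration.
var errDemo = errors.New("demo failure")

// report is a hypothetical handler that writes through the os.Stderr
// variable with a single newline-terminated Fprintf.
func report(err error) {
	fmt.Fprintf(os.Stderr, "hypothetical handler: %v\n", err)
}

// captureStderr temporarily points os.Stderr at a pipe, runs fn, and
// returns everything fn wrote through the os.Stderr variable.
// Assumption: fn writes less than the pipe buffer, since nothing reads
// from the pipe until fn has returned.
func captureStderr(fn func()) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	orig := os.Stderr
	os.Stderr = w
	fn()
	os.Stderr = orig

	// Close the write end so ReadAll sees EOF.
	if err := w.Close(); err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	out, err := captureStderr(func() { report(errDemo) })
	if err != nil {
		fmt.Println("capture failed:", err)
		return
	}
	fmt.Printf("captured %q\n", out)
}
```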