Summary
internal/syncwriter reports sink write errors by calling the builtin println, which writes unsynchronized bytes to fd 2 (stderr). When slog runs inside a test binary, those raw writes are merged into the stream that go test/test2json parses. Under concurrency they interleave inside another test's --- PASS:/--- FAIL: line, so test2json never records that test's terminal action and tools like gotestsum report the (passing) test as unknown/failed. This causes false-positive CI flakes where the blamed test is an innocent bystander.
We hit this in coder/coder: a passing TestRetryWithInterval was reported as a failure purely due to this corruption.
Where (v3.0.0)
internal/syncwriter/syncwriter.go:
func New(w io.Writer) *Writer {
return &Writer{
w: w,
errorf: func(f string, v ...interface{}) {
println(fmt.Sprintf(f, v...)) // raw, unsynchronized write to fd 2
},
}
}
func (w *Writer) Write(name string, p []byte) {
...
if err != nil {
w.errorf("%v: failed to write entry: %+v", name, err)
}
}
Trigger: any sink whose underlying writer can fail, e.g. a sloghuman.Sink over an io.Pipe whose reader has gone away (io.ErrClosedPipe). In coder this happens with coder port-forward -v (and similar) when background goroutines keep logging after a test closes the pipe.
Evidence (deterministic)
1. slog emits the error into the captured test stream. A sloghuman.Sink over a closed io.Pipe, then one logger.Warn, run under go test -json:
{"Action":"output","Output":"sloghuman: failed to write entry: io: read/write on closed pipe\n"}
2. println writes the text and the newline as separate syscalls, so under concurrency the framework's --- PASS: write lands between them. Real CI output from coder/coder:
sloghuman: failed to write entry: io: read/write on closed pipe--- PASS: TestRetryWithInterval (0.00s)
3. Feeding that exact byte pattern to go tool test2json drops the pass action:
- clean
--- PASS: TestVictim (0.00s) -> pass TestVictim
sloghuman: ...closed pipe--- PASS: TestVictim (0.00s) -> no terminal action (parsed as plain output), so gotestsum reports TestVictim as unknown.
Impact
Any project using slog in tests can have unrelated, passing tests randomly reported as failed/unknown whenever a sink's writer is closed (pipes, network writers, broken stdio, etc.). It is very hard to diagnose because the failing test is not the one that logged.
Proposed fix
Don't write sink diagnostics to raw stderr in a way that can corrupt captured output:
- Minimal: in
Writer.Write, suppress benign "destination closed" errors (io.ErrClosedPipe, os.ErrClosed, syscall.EPIPE), mirroring the existing Sync() suppression of EINVAL/ENOTTY/EBADF.
- Better: make the error handler configurable (default no-op, or route through the logger itself) instead of an unconditional builtin
println to fd 2.
Happy to send a PR.
Filed by Coder Agents on behalf of @EhabY.
Summary
internal/syncwriterreports sink write errors by calling the builtinprintln, which writes unsynchronized bytes to fd 2 (stderr). When slog runs inside a test binary, those raw writes are merged into the stream thatgo test/test2jsonparses. Under concurrency they interleave inside another test's--- PASS:/--- FAIL:line, sotest2jsonnever records that test's terminal action and tools likegotestsumreport the (passing) test asunknown/failed. This causes false-positive CI flakes where the blamed test is an innocent bystander.We hit this in
coder/coder: a passingTestRetryWithIntervalwas reported as a failure purely due to this corruption.Where (v3.0.0)
internal/syncwriter/syncwriter.go:Trigger: any sink whose underlying writer can fail, e.g. a
sloghuman.Sinkover anio.Pipewhose reader has gone away (io.ErrClosedPipe). Incoderthis happens withcoder port-forward -v(and similar) when background goroutines keep logging after a test closes the pipe.Evidence (deterministic)
1. slog emits the error into the captured test stream. A
sloghuman.Sinkover a closedio.Pipe, then onelogger.Warn, run undergo test -json:2.
printlnwrites the text and the newline as separate syscalls, so under concurrency the framework's--- PASS:write lands between them. Real CI output fromcoder/coder:3. Feeding that exact byte pattern to
go tool test2jsondrops the pass action:--- PASS: TestVictim (0.00s)->pass TestVictimsloghuman: ...closed pipe--- PASS: TestVictim (0.00s)-> no terminal action (parsed as plain output), so gotestsum reportsTestVictimasunknown.Impact
Any project using slog in tests can have unrelated, passing tests randomly reported as failed/unknown whenever a sink's writer is closed (pipes, network writers, broken stdio, etc.). It is very hard to diagnose because the failing test is not the one that logged.
Proposed fix
Don't write sink diagnostics to raw stderr in a way that can corrupt captured output:
Writer.Write, suppress benign "destination closed" errors (io.ErrClosedPipe,os.ErrClosed,syscall.EPIPE), mirroring the existingSync()suppression ofEINVAL/ENOTTY/EBADF.printlnto fd 2.Happy to send a PR.
Filed by Coder Agents on behalf of @EhabY.