CI Run: https://github.com/coder/coder/actions/runs/24164877546
Failed Job: test-go-pg-17 (job ID: 70524212886)
Commit: d9544603804c7cd4f6505a6b226c12e9623c7261 (author: Matt Vollmer)
Failure:
=== FAIL: coderd/x/chatd/chatloop TestRun_FirstPartDisarmsStartupTimeout (1.02s)
chatloop_test.go:430:
Error: Not equal:
expected: 1
actual : 2
Test: TestRun_FirstPartDisarmsStartupTimeout
Error analysis:
- The test expects exactly one attempt, with no retry once the first stream part is received, but attempts incremented to 2 (the startup timeout path retried).
- This suggests the startup timeout guard fired despite the first part being emitted, likely due to a timing race around disarming the startup timer vs the delayed delta parts.
- No data race warnings, panic traces, or OOM indicators found in the job logs.
Root cause classification: Flaky test (timing-sensitive startup timeout disarm race).
Assignment analysis:
git log --oneline --follow coderd/x/chatd/chatloop/chatloop_test.go
git show 70f031d793f0cde79f48acc2bd070a0abb7a1655 -- coderd/x/chatd/chatloop/chatloop_test.go
- Commit 70f031d ("feat(coderd/chatd): structured chat error classification and retry hardening") added TestRun_FirstPartDisarmsStartupTimeout and the startup-timeout guard logic/tests.
- Assigning to @ethanndickson as the most recent non-trivial modifier of the failing test and startup-timeout behavior.
Related issues:
- No matching issues found after searching coder/internal for:
- "TestRun_FirstPartDisarmsStartupTimeout"
- "chatloop_test.go"
- "startup timeout" + "chatloop"
- "coderd/x/chatd"
Reproduction (likely flaky):
go test ./coderd/x/chatd/chatloop -run TestRun_FirstPartDisarmsStartupTimeout -count=1