CI Run Link: https://github.com/coder/coder/actions/runs/18962366101
Workflow Job: test-go-pg (macos-latest)
Commit: 9298e7e073970011f7711cf37ffce5c5defe1a8d by Jake Howell
When: 2025-10-31 04:21 UTC
What failed
- Package: github.com/coder/coder/v2/codersdk/toolsdk
- Tests: TestTools/GetTaskLogs subtests [ByUUID, ByIdentifier]
Evidence from logs
=== FAIL: codersdk/toolsdk TestTools/GetTaskLogs/ByUUID (0.86s)
toolsdk_test.go:1570:
Error Trace: /Users/runner/work/coder/coder/codersdk/toolsdk/toolsdk_test.go:1570
Error: Received unexpected error:
get task logs "7218018d-547e-4d68-9793-1799c19d69e1":
github.com/coder/coder/v2/codersdk/toolsdk.init.func32
/Users/runner/work/coder/coder/codersdk/toolsdk/toolsdk.go:2134
- GET http://127.0.0.1:59633/api/experimental/tasks/nervous-tesla6-bvG/7218018d-547e-4d68-9793-1799c19d69e1/logs: unexpected status code 400: Task status must be active.
Error: Task status is "initializing", it must be "active" to interact with the task.
Classification
- Type: Flaky test (timing-dependent; task still initializing when logs are requested)
- Not a matrix cancellation artifact (only macOS job failed; Windows succeeded)
- No data race indicators (no "WARNING: DATA RACE" present)
- No panic/OOM signatures
Precise assignment analysis (Test function blame)
- Failing test block: the "GetTaskLogs" table tests in codersdk/toolsdk/toolsdk_test.go, approximately lines 1493-1580.
Commands used:
- grep -n "GetTaskLogs" codersdk/toolsdk/toolsdk_test.go
- git blame -L 1493,1580 codersdk/toolsdk/toolsdk_test.go
- Recent modification to this exact test section: a1fa58ac17c4 (2025-10-28) "fix: update dbgen and dbfake task creation and toolsdk test fixtures" by Mathias Fredriksson (refactor to use dbfake.WithTask and new task model). This is the last change touching the failing lines.
- Assigning to: @mafredri (last modifier of the failing test function lines per blame).
Root cause hypothesis
- Server endpoint for task logs correctly enforces that task status is "active" before interaction.
- The test appears to create a task and request logs immediately after agent connect, without waiting for the task to progress from "initializing" to "active", causing intermittent 400s.
- Recent changes to task creation/linking in tests (switch to dbfake.WithTask and explicit task IDs) may have narrowed timing margins, increasing likelihood of hitting the initializing window.
Related issues
- None found (searched coder/internal, open and closed, for "GetTaskLogs" and the error text).
Proposed fix
- In TestTools/GetTaskLogs, poll until the created task reaches status "active" before requesting logs. Alternatively, have the test confirm that the agent-side task app is ready before it asserts on logs.
Reproduction hints
- Re-run on macOS runner:
go test ./codersdk/toolsdk -run "TestTools/GetTaskLogs" -count=50
- Expect occasional failures with the 400 "Task status must be active" response if no readiness wait is added.
Quality checklist
- Used grep to identify failing test and lines; blame points to last modifier of the failing function
- Searched for data race/panic/OOM (none)
- Searched coder/internal (open/closed) for duplicates: "GetTaskLogs", error text (none found)
- Assignment based on test ownership (blame), not CI run author