CI Run Link: https://github.com/coder/coder/actions/runs/21235972985
Failing job:
- Workflow: nightly-gauntlet
- Job: test-go-pg (macos-latest)
- Completed at: 2026-01-22T04:36:34Z (run_attempt=1)
Commit:
- 26ce070393466ec05036375992c0ec657b95fec0 (DevCats) — "feat: update doc-check workflow to utilize claude-skills (#21588)"
Failure evidence:
- File: coderd/workspaces_test.go
- Test: TestPostWorkspacesByOrganization/AllProvisionersStale
- Lines: 1392-1393
Observed failure:
workspaces_test.go:1392:
Error: Should be zero, but was 1
Test: TestPostWorkspacesByOrganization/AllProvisionersStale
workspaces_test.go:1393:
Error: Not equal:
expected: 2026-01-22 03:28:27.445925 +0000 UTC
actual : 2026-01-22 04:28:27.364759 +0000 UTC
What the test is doing (relevant snippet):
- Starts coderd with IncludeProvisionerDaemon: true
- Manually backdates provisioner daemons:
newLastSeenAt := dbtime.Now().Add(-time.Hour)
UPDATE provisioner_daemons SET last_seen_at = $1
- Creates a workspace and expects:
MatchedProvisioners.Count == 1
MatchedProvisioners.Available == 0
MatchedProvisioners.MostRecentlySeen.Time == newLastSeenAt
Hypothesis / likely root cause:
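- The in-memory provisioner daemon is still running and can heartbeat/update provisioner_daemons.last_seen_at concurrently with the test’s manual UPDATE.
- If a daemon write to last_seen_at lands after the test sets it stale (but before the workspace creation response is assembled), then:
MostRecentlySeen becomes ~now (not -1h)
Available can become 1 (daemon considered healthy), violating the test’s expectation.
This matches the observed failure: Available was 1, and the actual timestamp is roughly one hour later than expected. The actual value is about 81 ms earlier than the test’s own dbtime.Now() (expected + 1h = 04:28:27.445925), which would be consistent with a daemon write whose timestamp was taken just before the test’s UPDATE but which was applied after it.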
Not a data race / crash:
- No WARNING: DATA RACE, panic:, or OOM indicators seen in the job logs.
Suggested fix direction:
- Make the daemon deterministically stale for the duration of the assertion, e.g.:
- stop/pause the provisioner daemon heartbeat before updating last_seen_at, or
- run the test without a live provisioner daemon and insert a stale provisioner_daemons row directly, or
- add test helpers to control/override daemon last-seen timestamps without racing a background heartbeat.
Duplicate search (coder/internal):
- Searched: AllProvisionersStale, TestPostWorkspacesByOrganization, workspaces_test.go:1392, provisioner_daemons last_seen_at. No matches.
Assignment analysis:
- Recent changes in this area include provisioner operation/test plumbing changes in 3194bcfc (Steven Masley). The existing related flake in TestTemplateVersionDryRun/ImportNotFinished is also owned by the templates/provisioner code.
- Assigning to the most recent meaningful modifier in the provisioner/test integration area.