flake: Goroutine leak in prebuilds reconciler - enterprise/coderd tests #1116

@flake-investigator

CI Run Link: https://github.com/coder/coder/actions/runs/19254608199
Failed Job: test-go-pg (macos-latest)
Commit: c21b3e49b36e8aea1b743e5b76e9ac6d3b8e3339 (Atif Ali)
Date: 2025-11-11

Root cause classification: Flaky test (goroutine leak)

Evidence from logs:

Goroutine leak detected by goleak:

goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 241684 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*MetricsCollector).BackgroundFetch on top of the stack:
  github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*MetricsCollector).BackgroundFetch(0x14017a61d40, {0x10877f388, 0x140443aa320}, 0xdf8475800, 0x2540be400)
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/metricscollector.go:235 +0xac
  github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func2()
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:155 +0x70
  created by github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run in goroutine 241683
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:153 +0x368
 Goroutine 241683 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run on top of the stack:
  github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run(0x1403cd8bb00, {0x10877f350, 0x1403d76bda0})
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:188 +0x4f8
  runtime/pprof.Do({0x10877e6f8?, 0x10b3c2560?}, {{0x14043dfeda0?, 0x1403d75d880?, 0x14043ff6960?}}, 0x1403d77fc20)
    /Users/runner/work/_tool/go/1.24.10/arm64/src/runtime/pprof/runtime.go:51 +0x78
  created by github.com/coder/coder/v2/coderd/pproflabel.Go in goroutine 240105
    /Users/runner/work/coder/coder/coderd/pproflabel/pproflabel.go:10 +0xac
 Goroutine 241685 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func3 on top of the stack:
  github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func3()
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:173 +0xc4
  created by github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run in goroutine 241683
    /Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:171 +0x3d4
]
FAIL	github.com/coder/coder/v2/enterprise/coderd	191.159s

Notes:

  • No data race warnings present (checked for "WARNING: DATA RACE" and "race detected").
  • No panic, OOM, or other process-crash indicators in logs.
  • The failure occurred within minutes of the Slack alert, so the correct run was analyzed.

Best assessment:

  • The enterprise/coderd package tests started a prebuilds StoreReconciler, which in turn started MetricsCollector.BackgroundFetch.
  • These background goroutines were not shut down at test teardown, causing goleak to fail the package.
  • Likely cause: either test cleanup never calls Stop() on the StoreReconciler, or server teardown does not stop the reconciler and its metrics collector.

Assignment analysis:

  • Specific failing test function is not identified; goleak triggers after the package tests complete.
  • The leaked goroutines originate from enterprise/coderd/prebuilds (StoreReconciler, MetricsCollector). Recent ownership/contributors in this area include Susana Ferreira and Sas Swart.
  • Suggest assignment to prebuilds ownership for investigation. If ownership differs, please re-route accordingly.

Proposed next steps:

  • Ensure StoreReconciler.Stop() is called in test teardowns where a reconciler is created or where the server starts it.
  • Consider registering a t.Cleanup to Stop() any reconciler started during tests.
  • Optionally gate metrics registration in tests or use a shorter context/interval to avoid long-lived background goroutines.

Related issues search:

  • Searched coder/internal for duplicates with queries: "goleak", "goroutine leak", "MetricsCollector", "StoreReconciler", "BackgroundFetch" – none found open or recently closed.

Reproduction hint:

  • Run the enterprise/coderd package tests on macOS with -count=1 and check whether goleak triggers intermittently:
    go test ./enterprise/coderd -count=1 -run .

Labels: flake
