Remove Go's testing package from Fleet's production binary #45220

@lucasmrod

Goal

User story
As a Fleet operator,
I want Fleet's production binaries (fleet and fleetctl) to not link Go's testing package,
so that test-only code, helpers, and flags cannot accidentally be reachable in production and the binary is smaller, leaner, and easier to audit.

Original requests

Follow-up to the dependency-graph cleanup tracked in #36087. While auditing transitive imports, we found that the production fleet binary still links testing because several production packages directly import "testing" from non-_test.go files.

Quick check on the current main:

$ go list -deps ./cmd/fleet | grep '^testing$'
testing

$ go tool nm $(go env GOPATH)/bin/fleet | grep '^.* testing\.'
... testing..inittask
... testing.init
... testing.supportedTypes
...

12 first-party packages reachable from cmd/fleet import testing directly:

  • server/config
  • server/dev_mode
  • server/pubsub
  • server/goose (via server/datastore/mysql)
  • server/datastore/mysql
  • server/datastore/redis/redistest
  • server/datastore/s3
  • server/mdm/maintainedapps
  • server/mdm/testing_utils
  • server/platform/mysql/testing_utils
  • server/service/schedule
  • ee/server/service/scep
  • cmd/fleetctl/fleetctl (also affects fleetctl)

The first move has already landed on the refactor-for-production-binary-to-not-include-testing-code branch (6c9e2cc) — server/service test helpers have been moved into a new server/service/svctest package. This story tracks completing the same pattern across the remaining packages.

Why this matters

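  • Security / blast radius: testing defines test-only flags (-test.v, -test.run, -test.coverprofile, …). Since Go 1.13 these are registered on the default flag.CommandLine only when testing.Init() runs, so they surface in production only if some linked code path calls it before flag.Parse(). Dropping the import rules that out at compile time.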
  • Production-reachable test helpers: helpers that take a testing.TB, create insecure HTTP servers, seed fixtures, install bypass routes, or panic on assertion are reachable from production call sites today. Moving them to *test subpackages closes that door at compile time.
  • Binary size + audit clarity: removes a transitive dependency that has no reason to be in cmd/fleet or cmd/fleetctl, and makes "is test code linked into production?" trivially answerable with go list -deps.
  • Aligned with the fleetctl audit in #36087 ("Audit/resolve unnecessary fleetctl transitive dependencies"): same theme (remove unnecessary transitive deps), but scoped specifically to the testing stdlib package, which we can eliminate fully.
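
The flag exposure in the first bullet can be checked with a small standalone program. This is a minimal sketch, not Fleet code: testFlagNames is a hypothetical helper, and the program assumes Go 1.13 or later, where testing flags are registered by testing.Init() rather than at package initialization.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
	"testing"
)

// testFlagNames returns the names of all flags currently registered on
// flag.CommandLine whose name starts with "test.".
func testFlagNames() []string {
	var names []string
	flag.CommandLine.VisitAll(func(f *flag.Flag) {
		if strings.HasPrefix(f.Name, "test.") {
			names = append(names, f.Name)
		}
	})
	return names
}

func main() {
	// Importing "testing" alone links the package into the binary.
	fmt.Println("test flags before testing.Init():", len(testFlagNames()))

	// Any code path that calls testing.Init() registers -test.v,
	// -test.run, etc. on the process-wide flag set.
	testing.Init()
	fmt.Println("test flags after testing.Init():", len(testFlagNames()))
}
```

A binary that does not link testing at all cannot reach the second state, which is the property this story aims for.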

Changes

Engineering

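  • Audit every non-_test.go file in the module that imports "testing". The current set can be listed with go list -deps -f '{{.ImportPath}} {{.Imports}}' ./cmd/fleet | grep -E '[[ ]testing[] ]' (.Imports prints as a bracketed list, so the pattern also has to match testing when it is the first or last entry).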
  • For each package, move the test helpers into a sibling <pkg>test (or <pkg>/testing) subpackage that's only imported from _test.go files. Mirror the server/service → server/service/svctest pattern that already landed.
    • server/config test helpers
    • server/dev_mode
    • server/pubsub
    • server/datastore/mysql + server/goose migration helpers
    • server/datastore/redis/redistest
    • server/datastore/s3
    • server/mdm/maintainedapps
    • server/mdm/testing_utils
    • server/platform/mysql/testing_utils
    • server/service/schedule
    • ee/server/service/scep
    • cmd/fleetctl/fleetctl (also remove testing from the fleetctl binary)
  • Add a CI guard (lint rule or small go test-based check) that fails if a non-_test.go file reachable from ./cmd/fleet/... or ./cmd/fleetctl/... imports "testing". Idea: a tools/check_no_testing_in_prod script wired into make lint-go.
  • Confirm go list -deps ./cmd/fleet and go list -deps ./cmd/fleetctl no longer include testing.
  • Confirm go tool nm $(which fleet) | grep '^.* testing\.' returns no symbols.
  • Record before/after binary sizes for fleet and fleetctl in the PR description.
  • No public API surface changes for end users — this is internal refactoring only.
  • Test plan is finalized
  • This is a premium only feature: No

Product

  • UI changes: No changes
  • CLI (fleetctl) usage changes: No changes (internal-only refactor; flags and outputs unchanged)
  • YAML changes: No changes
  • REST API changes: No changes
  • Fleet's agent (fleetd) changes: No changes
  • Fleet server configuration changes: No changes
  • Exposed, public API endpoint changes: No changes
  • fleetdm.com changes: No changes
  • GitOps mode UI changes: No changes
  • GitOps generation changes: No changes
  • Activity changes: No changes
  • Permissions changes: No changes
  • Changes to paid features or tiers: No changes
  • My device and fleetdm.com/better changes: No changes
  • Usage statistics: No changes
  • Other reference documentation changes: No changes

Risk assessment

  • Requires testing in a hosted environment: No
  • Requires load testing: No
  • Risk level: Medium
  • Risk description: Pure code movement, but it touches many packages and the integration-test suites. Risk is regressions in test setup (e.g. mysql/redis suites failing to spin up, fleetctl test scaffolding breaking) rather than runtime behavior. Mitigated by running the full integration matrix and by the new CI guard.

Test plan

Engineering verification

  • go list -deps ./cmd/fleet | grep -x testing returns nothing
  • go list -deps ./cmd/fleetctl | grep -x testing returns nothing
  • go tool nm $(go env GOPATH)/bin/fleet | grep ' testing\.' returns nothing (or only DWARF/zero-sized symbols)
  • make lint-go passes (including the new "no testing in prod" guard)
  • All existing CI test bundles pass: fast, mysql, service, integration-core, integration-enterprise, integration-mdm, fleetctl, vuln, main

Smoke test

  • make build produces functional fleet and fleetctl binaries
  • make serve brings up the server with no behavior change
  • fleet --help and fleetctl --help show no new/leaked -test.* flags
  • Run a basic flow: login, enroll a host, run a query — same behavior as main

Confirmation

  1. Engineer: Added comment to user story confirming successful completion of test plan.
  2. QA: N/A — no user-visible changes; engineer-only verification.

Labels

  • #g-orchestration: Orchestration product group
  • story: A user story defining an entire feature
  • ~backend: Backend-related issue.
  • ~engineering-initiated: Engineering-initiated story, such as a bug, refactor, or contributor experience improvement.

Projects

Status

🐣 In progress
