Skip to content

init: import whichtests sources#1

Draft
ethanndickson wants to merge 7 commits into
mainfrom
init
Draft

init: import whichtests sources#1
ethanndickson wants to merge 7 commits into
mainfrom
init

Conversation

@ethanndickson
Copy link
Copy Markdown
Member

@ethanndickson ethanndickson commented May 20, 2026

Initial import of the whichtests tool, extracted from coder/coder's scripts/testselect/ and renamed.

whichtests answers: given a PR diff, which Go tests should run? The flake-go workflow uses its output to rerun new/modified tests under shuffle and surface flakes early.

Commits

  • init: import scripts/testselect from coder/coder and split main.go across cohesive files (cli, config, request, gitexec, diff, snapshot, broadening, selection, inventory, plan, publish, githubactions)
  • split: organize tests along source-file boundaries (1:1 with sources)
  • chore: import boilerplate from coder/paralleltestctx (.gitignore, .golangci.yaml, .prettierrc, LICENSE, Makefile, CI workflow, setup-go composite action)
  • fix: address mechanical lint findings (errcheck, govet shadow, staticcheck QF1008, unparam)
  • rename: testselect -> whichtests (module path, Makefile target, CLI doc comment, README)

Outstanding

  • 6 gosec findings (G304/G301) deferred for review. Recommendation is to lower publish.go MkdirAll perms to 0o750 and #nosec annotate the remaining 4 sites (test temp dir, GITHUB_EVENT_PATH, --out-matrix and --out-summary flag values).
  • coder/coder's scripts/testselect/ directory and the flake-go.yaml reference still need to be migrated to use this repo; that's out of scope for this PR.

Relates to CODAGT-381

Imports the testselect Go test-plan generator from coder/coder@bf6a0d953f
(branch ethan/go-test-flake-detector) and splits the single ~1.6k-line
main.go into ten focused files inside package main:

- cli.go: entrypoint, flag parsing, command orchestration
- config.go: config / commandConfig types and defaults
- request.go: runRequest, diffRange, revision validation
- gitexec.go: gitRunner / gitFetcher types and exec.Command impl
- diff.go: git diff parsing, change kinds, hunks, line ranges
- snapshot.go: AST snapshot parsing, fileSnapshot, sharedDecl, fallbacks
- broadening.go: per-kind broadening rules (broadeningScope)
- selection.go: per-change selection logic
- inventory.go: inventoryCache for package/directory test discovery
- plan.go: plan construction, matrix and summary rendering

githubactions.go and publish.go are imported unchanged. Tests
(githubactions_test.go, integration_test.go, main_test.go) are kept as
single files initially.

go vet, go build, go test, and go test -race are all green.
Split main_test.go (2297 lines) into focused *_test.go files mirroring
the source layout, and move two helpers (mergePackageSelection,
packagePattern) from plan.go to selection.go so layering is consistent
(plan depends on selection, not vice versa).

Test files now line up 1:1 with their production counterparts:
  cli_test.go        - run/runCommand end-to-end tests
  diff_test.go       - hunk/change parsing tests
  snapshot_test.go   - parseFileSnapshot/fallback tests
  selection_test.go  - selectTestsForSnapshots and merge tests
  plan_test.go       - buildExecutionPlan/renderSummary tests
  publish_test.go    - publishPlan tests (from githubactions_test.go)
  githubactions_test.go - GitHub event/range tests
  integration_test.go   - real-git integration (unchanged)
  helpers_test.go    - shared line-range + inventory helpers
  gitfake_test.go    - fakeGitRepo mock + dispatcher methods

go vet, go build, go test, and go test -race all pass.
Mirrors the parallel testctx repo's structure: Makefile, golangci-lint
v2 config, prettier config, gitignore, MIT license, CI workflow with
fmt/lint/test jobs, and the composite setup-go action (default Go
version bumped to 1.26.2 to match go.mod).

Drive-by gofumpt fix: blank line between needsOldSnapshot and
addMatchingTests in selection.go.
- publish.go: surface Close errors via named return on appendFile,
  fixing errcheck. Close errors on a write path can indicate lost data,
  so propagating beats silently discarding.
- plan.go: drop `:=` in selectTestPlan loop so the inner err reuses
  the outer binding, fixing govet shadow.
- githubactions.go: drop the redundant cfg.config selector and rely on
  the promoted method, fixing staticcheck QF1008.
- helpers_test.go: drop both dir and packageName parameters from
  mustPackageInventory; all 16 call sites pass the same synthetic
  values, so hardcode them and document the helper, fixing unparam.

go vet, go build, make test, and make fmt are clean. make lint still
reports six gosec findings (G301/G304) covering directory permissions
and reads from user-provided paths; those are policy decisions.
The previous name was honest but generic — many things 'select tests.'
'whichtests' is the question this tool answers: given a Go PR's diff,
which tests should run? Renaming makes the intent obvious in workflow
YAML, CLI help, and the module path.

Touchpoints:
- go.mod module github.com/coder/whichtests
- Makefile builds build/whichtests
- cli.go doc comment
- README.md heading and invocation examples

Internal symbol names like selectTestsForSnapshots are untouched —
they describe what the function does, not the tool's identity.
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for extracting this from coder/coder. The algorithm is small, the git/AST/matrix boundaries are kept clean, and the regex gates on safeTestNameRE/safePackagePatternRE plus the GITHUB_OUTPUT CRLF + 1 MiB guard close the highest-impact injection paths. The two real-git integration tests are a particularly strong asset.

Reviewer context: this binary has exactly one consumer — coder/coder's flake-go.yaml, invoked as whichtests --repo-root . --github-actions --out-matrix "$RUNNER_TEMP/flake-matrix.json" after a per-PR go install from a pinned SHA (no binary caching). Severity is calibrated to that flow.

11 reviewers in parallel cross-checked into 6 P2, 8 P3, 1 perf bundle, 2 nit bundles across 17 inline comments. Most P2s are deletions (dead config, fossil fields, a test-only run() parallel entry that 9 reviewers independently flagged) plus two algorithm findings the smoke test exposed. None block merge.

Comment thread request.go
MergeBaseRef string
Sinks outputSinks
OutputSizeLimit int
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 runRequest.OutputSizeLimit is plumbed through but never written. (Structural Analyst P2, Edge Case Analyst P2, Go Architect P2, Product Reviewer P3)

OutputSizeLimit is declared on runRequest, threaded through publishPlan(... outputSizeLimit int), threaded again into appendGitHubOutput(... outputSizeLimit int), and finally compared inside appendGitHubOutput with a 0-means-default escape (if outputSizeLimit == 0 { outputSizeLimit = defaultGitHubOutputValueLimit }). rg finds zero writers of req.OutputSizeLimit anywhere in production or tests; both invocation sites pass 0.

Delete the field, drop the parameter from publishPlan/appendGitHubOutput, and reference defaultGitHubOutputValueLimit directly inside appendGitHubOutput. The existing publish_test.go tests pass the limit explicitly to appendGitHubOutput, so they keep their existing signatures — the deletion only touches the runRequestpublishPlan plumbing. Three layers of conditional plumbing carry one constant; this is the kind of "looks load-bearing, isn't" hook that traps future maintainers.

Comment thread selection.go Outdated
}

type testDecl struct {
FilePath string
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 testDecl{FilePath, Range} is a set in disguise — both fields are written, neither is read. (Structural Analyst P2, Go Architect P2)

the type is map[string][]testDecl and testDecl{FilePath, Range} is written in inventoryCache.loadPackageInventory and never read. The only readers are hasTest(name) (presence check on the map key) and allTests() (slices.Sorted(maps.Keys(...))). The slice value, including both testDecl fields, is dead. rg -n '\.FilePath�' and rg -n 'testDecl�' confirm no field reads outside the constructors.

Collapse Tests map[string][]testDeclTests map[string]struct{} and delete testDecl. The hasTest/allTests accessors stay identical (they already only use keys), and mustPackageInventory in helpers_test.go simplifies. The struct currently misleads readers into believing the inventory layer knows where each test lives — it doesn't.

Comment thread broadening.go
func (decl sharedDecl) broadeningScopeOnNewSide(oldSnapshot *fileSnapshot) broadeningScope {
switch decl.Kind {
case sharedDeclImport:
return broadeningPackage
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Package-wide broadening on any new import is the dominant degenerate matrix in the smoke test — selectivity is not earning its keep. (Product Reviewer P2)

per context.md's smoke-test note, roughly half of the non-empty matrices were driven by a single stdlib import addition, each then hitting the maxBroadenedTests = 50 cap and falling back to RunAll = true with test_count = 1. End state: the matrix entry runs every test in the touched package exactly once, with no shuffle multiplier. That is approximately the same coverage flake-go would produce by running the touched package unconditionally, so the selector's selectivity is not earning its keep on these PRs.

Two cheap fixes worth considering:

  1. Treat stdlib imports as non-broadening — they cannot introduce init-time side effects via a new module, so the touched test func plus its in-file siblings is sufficient.
  2. When the broadened set blows past maxBroadenedTests, prefer re-narrowing to the tests literally touched in the hunk over the current "drop to RunAll, demote to test_count = 1" path. The current fallback inverts the product value — flake reruns are most useful at test_count = 10, but this is exactly the case where the binary downgrades them to 1×.

Happy to scope this to a follow-up if you'd rather keep this PR a clean lift.

Comment thread broadening.go
)

func broadeningScopeForOldHunk(decls []sharedDecl, candidate lineRange) broadeningScope {
for _, decl := range decls {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 broadeningScopeFor{Old,New}Hunk returns the first overlapping decl's scope rather than the highest; a hunk spanning an Import and an Init under-broadens (Package instead of Directory). (Edge Case Analyst P2)

broadeningScopeForOldHunk iterates decls in source order and returns immediately on first overlap (regardless of whether the result is broadeningPackage or higher). broadeningScopeForNewHunk is slightly better — it skips broadeningNone — but still returns the first non-None match. Go files typically have imports (Package) → vars/types → funcs → init/TestMain in source order, so a hunk that overlaps an Import and an Init returns broadeningPackage and misses the directory-wide broadening that should apply.

--unified=0 is used (diff.go:196), so hunks are tight to changed lines and overlapping two decls requires a contiguous edit across two adjacent decls — uncommon but real. Example: adding a single line between the last import block and func init().

Fix: scan all overlapping decls and return max(scope). Cheap, local, and removes a silent correctness gap. A regression test with the import→init adjacency would pin this down.

Comment thread cli.go Outdated
}

func run(ctx context.Context, cfg config, stdout, stderr io.Writer, git gitRunner) error {
req, err := explicitRunRequest(cfg)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The test-only run() shim is a parallel composition path; runCommand (production) has no end-to-end test, so regressions in fetch wiring or sink wiring would slip past the suite. (Structural Analyst P2, Test Auditor P3, Contract Auditor P3, Edge Case Analyst P3, Performance Analyst P3, Product Reviewer P3, Duplication Checker P3, Go Architect P3, Style Reviewer Nit)

main calls runCommand(...). The thin wrapper run (explicitRunRequest + executeRunRequest with fetch=nil) is the entry every test in cli_test.go and integration_test.go uses; rg -n 'runCommand\(' finds exactly one caller (main). Tests of githubActionsRunRequest stop at the runRequest value — they never feed it through executeRunRequest/publishPlan. So the only thing that pins the production composition together (the if cfg.GitHubActions branch, the call to executeRunRequest, the os.Stdout/os.Stderr wiring, the size-limit defaulting via req.OutputSizeLimit=0) is main, which is never executed in tests.

This is the most-convergent finding in the review — 9 of 11 reviewers flagged it independently. Concrete fix: delete run, migrate the 19 test sites to runCommand(ctx, commandConfig{config: cfg}, ..., nil), and add the missing TestRunCommandGitHubActionsEnd2End that wires runCommand against a fake event payload (Test Auditor specifically called this out as the highest-value glue test currently missing). The deletion + new test collapses both code paths into one and adds end-to-end coverage of the --github-actions path that today only the live consumer exercises.

Comment thread plan.go Outdated
result.Matrix.Include = result.Matrix.Include[:keep]
result.Matrix.Include = append(result.Matrix.Include, matrixEntry{
Package: strings.Join(overflowPackages, " "),
TestCount: runOnceTargetCount,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Matrix overflow's Package field is a space-joined argv-list, not a single path; the contract is implicit at three boundaries and undocumented. (Contract Auditor Obs, Edge Case Analyst P3, Structural Analyst Obs, Go Architect P4)

overflowPackages := strings.Join(overflowPackages, " ") produces a string like "./a ./b ./c". This works because flake-go.yaml will spread it into go test ${{matrix.package}}, where YAML→bash word-splitting expands it into multiple args. If a future consumer ever wraps the value in quotes ("${{matrix.package}}") or maps it to a ${{matrix.package}} field that doesn't undergo splitting (e.g. an actions/upload-artifact name), the overflow target will fail in a confusing way.

I traced the actual path on the coder/coder side to verify: test-go-pg/action.yaml:104 sets TEST_PACKAGES: ${{ inputs.test-packages }} (env var, preserves spaces); Makefile:1470 runs gotestsum --packages="$(TEST_PACKAGES)"; gotestsum itself splits --packages on whitespace. So the contract works today, but the chain is "Go binary emits space-joined string → YAML env var → Make recipe with quoted shell expansion → gotestsum's internal whitespace split" — four implicit assumptions in a row.

Two cleaner shapes: (a) emit packages []string as a JSON array and have the workflow join with ${{ join(matrix.packages, ' ') }}; (b) keep the current shape but add a one-line code comment at line 159 documenting that Package is an argv-list when Broadened && len(targets) > maxMatrixEntries so a future maintainer doesn't "fix" it by quoting on either side.

Comment thread request.go
}

func validateRevision(flagName, revision string) error {
if revision == "" {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 validateRevision is much looser than the GitHub-supplied SHAs it protects, and the fields it validates are named BaseSHA/HeadSHA — the name promises something the validator does not enforce. (Security Reviewer P3, Contract Auditor P3)

the only checks are non-empty, no leading -, no :, no NUL. The pull_request.base.sha, pull_request.head.sha, workflow_dispatch.inputs.* all flow through this validator before being concatenated into revision + ":" + filePath (diff.go:270), revision + "^{commit}" (gitexec.go:25), and positional args to git merge-base, git diff, git ls-tree. No command injection is reachable — arguments don't pass through a shell, and the leading-- gate prevents flag confusion. But the symmetry with validateRef (which forbids @{, .., *, control/whitespace, leading .) is missing for what should be the most rigidly-typed input — a 40-char hex SHA.

Contract Auditor adds: the README's local-debug example uses --base-sha origin/main, so the field happily accepts arbitrary revspecs. The promise from the name ("SHA") is broken by the validator and contradicted by the documentation.

Two equivalent fixes: tighten validateRevision to ^[0-9a-fA-F]{7,64}$ on the GitHub Actions path (where the input is from GitHub-set payload), or rename BaseSHA/HeadSHABaseRev/HeadRev (and --base-sha/--head-sha--base-rev/--head-rev) and document that revspecs are accepted. The README example would have to update either way.

Comment thread inventory.go
}
for testName, declRange := range snapshot.tests {
inventory.Tests[testName] = append(inventory.Tests[testName], testDecl{FilePath: filePath, Range: declRange})
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Performance cluster: loadDirectoryInventories re-reads + re-AST-parses files multiple times, and every readFileAtRevision costs 3 git fork+execs. Not blockers on a one-shot CLI, but real wins if whichtests's build/run latency ever lands on a critical path. (Performance Analyst P2 × 2, Duplication Checker P3)

inventory.go:116-152loadDirectoryInventories reads + AST-parses every _test.go in a directory twice on the same revision (once for the collect-names pass, once per package key in the cache fill), and one extra time per additional package in that directory. For a dir with both foo and foo_test packages that's 3N reads + 3N parses where N+something would suffice. This path fires for sharedDeclInit / sharedDeclTestMain changes — the broadenings that already pay a wide-fanout cost.

Performance Analyst adds (diff.go:258-292 + gitexec.go:24-30): readFileAtRevision calls ensureRevisionExists (a git cat-file -e), then fileExistsAtRevision (a git ls-tree), then git show — three processes per file read. ensureRangeAvailable already verified BaseSHA/HeadSHA exist before selectTestPlan is reachable, so the per-file cat-file is redundant for every subsequent read of the same SHA. Inside loadPackageInventory/loadDirectoryInventories, the files came from a previous ls-tree -r of the same revision, so fileExistsAtRevision cannot fail; the only legitimate caller is the initial readChangeFile for renames/deletes. For a PR touching 5 test files in 3 dirs averaging 15 test files each, the change-file reads alone cost ~30 git processes and the inventory step costs ~135 — cutting the redundant pair drops the total ~3× on the hottest path.

Concrete cleanup that is also a complexity reduction (orchestrator's emphasis): a per-file map[revision�path]fileSnapshot cache on inventoryCache dedupes across loadPackageInventory and loadDirectoryInventories; memoising ensureRevisionExists per (cache, revision) eliminates redundant cat-file invocations. Performance Analyst also flagged git ls-tree -r walking the full subtree when only direct children are needed — sharp edge for any future _test.go at or near the repo root.

Comment thread go.mod
@@ -0,0 +1,14 @@
module github.com/coder/whichtests

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit Modernization cluster (the module is on Go 1.26 and already uses errors.AsType[*exec.ExitError], strings.SplitSeq, slices.Sorted(maps.Keys(...)), cmp.Or, t.Context() — xerrors and the pre-CutPrefix patterns are the lone holdovers). (Modernization Reviewer, Go Architect P4)

  • go.mod:7 + every xerrors.New/xerrors.Errorf call site across cli.go, diff.go, gitexec.go, githubactions.go, inventory.go, plan.go, publish.go, request.go, selection.go, plus tests: drop golang.org/x/xerrors in favor of fmt.Errorf + errors.New. %w chaining has been in fmt.Errorf since Go 1.13; the code never relies on xerrors.Frame/Formatter/As. A package-wide gofmt -r 'xerrors.New(a) -> errors.New(a)' && gofmt -r 'xerrors.Errorf(a) -> fmt.Errorf(a)' handles the migration in one pass and empties the only non-test direct dependency in require. flake-go.yaml does go install per PR (no binary cache), so trimming the dep is small but real.
  • snapshot.go:211-214strings.HasPrefix(name.Name, prefix) + strings.TrimPrefix(name.Name, prefix)rest, ok := strings.CutPrefix(name.Name, prefix); if !ok { return false }. Single scan; the canonical .claude/docs/GO.md pattern.
  • gitfake_test.go:92-93 — same strings.HasSuffix + strings.TrimSuffixstrings.CutSuffix shape.

Comment thread config.go Outdated
defaultHeadSHA = "HEAD"
defaultOutSummary = "-"
defaultTargetCount = "10"
runOnceTargetCount = "1"
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit Naming cluster (linter-uncatchable; pinned to the first occurrence for collapse, none are blockers). (Style Reviewer)

  • config.go:11-12defaultTargetCount/runOnceTargetCount populate a TestCount field serialised as test_count. Rename → defaultTestCount/runOnceTestCount to align the three names.
  • githubactions.go:248fetchSpecs(ctx, req, fetch) performs the fetches, doesn't compute specs, and collides with the fetchSpec type name. runFetches / performFetches reads better and disambiguates from the near-duplicate loop in ensureConcreteRangeAvailable.
  • githubactions.go:340-342remoteTrackingFetchRef returns a refspec (refs/heads/X:refs/remotes/origin/X), not a ref. Sibling branchFetchRef does return a bare ref, so the parallel naming misleads. → remoteTrackingRefspec.
  • request.go:17runRequest.Prepare []fetchSpec could mean anything; every read site uses it as a fetch plan. → Fetches.
  • githubactions.go:108-115expectedHead for event.PullRequest.Head.SHA is misleading; the checked-out HEAD is authoritative, the payload value is the claim being checked. → payloadHead or claimedHead (matches the error text).
  • selection.go:163,224selectTestsForSnapshots / selectSourceRemovalForSnapshots take raw []byte, not snapshots; the ForSnapshots suffix is inaccurate. → selectTestsFromHunks / selectSourceRemoval.
  • snapshot.go:19 vs plan.go:16packagePatternRE (Go package clause) and safePackagePatternRE (./... test-path pattern) share a name root but match unrelated things. Rename → packageClauseRE in snapshot.go.
  • diff.go:17-25changeKind constants are bare letters; a short doc comment on the group ("Letters match git diff --diff-filter codes: A added, D deleted, M modified, R renamed, T type-changed.") disambiguates changeType = "T" from "Go type declaration".

Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 cleanups all landed cleanly at 4acef74. The testDecl flattening, run() shim removal, fallback-parsing deletion, pull_request.head.sha strictness, broadening max-scope correction, summary filename quoting, and xerrors removal are all in place; the resulting code is noticeably smaller and clearer.

Round 2 findings are simplification opportunities in the same vein: dead struct fields (matching the round-1 testDecl pattern), single-use helpers, double-AST-parsing with unreachable fallback branches, and a defaults function pair that risks drift. 9 P3, 4 nits, 1 obs across 14 inline comments. None block merge.

Comment thread gitexec.go
"strings"
)

type gitResult struct {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 gitResult.Stderr and gitResult.ExitCode (and the exitCode() helper) are write-only outside execGit itself; the struct presents a richer surface than callers actually use. (Edge Case Analyst P3, Product Reviewer P3, Go Architect P3, Modernization Reviewer Nit)

After round-1 cleanup, no caller reads either field. Inside execGit the Stderr/ExitCode writes feed only its own local error-message logic (lines 49-52), and the two test fakes (gitFailure and the literal in diff_test.go:63) populate them only because the struct still has those fields. Collapse gitResult to struct { Stdout string }, use stderr.String() and exitCode(err) as locals when building execGit's error, and drop the exitCode parameter from gitFailure.

Edge Case Analyst adds: exitCode(err error) int exists for the sole purpose of populating result.ExitCode. Once that field is gone, the helper is gone with it.

Worth taking now: this is the same shape as round 1's testDecl{FilePath, Range} cleanup, the deletion is mechanical, and it removes a "we surface exit codes" cue from the gitRunner contract that no consumer actually depends on.

Comment thread plan.go
}

type executionAccumulator struct {
Package string
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 executionAccumulator.Package is set on init but never read; pure dead state left over after the round-1 cleanup. (Contract Auditor P3, Structural Analyst P3, Product Reviewer P3)

The struct's Package field is set once at plan.go:95 (entry = &executionAccumulator{Package: packagePath, ...}) and never read again. grep -n '\.Package�' in plan.go returns line 147, but that entry is the matrixEntry loop variable from for _, entry := range result.Matrix.Include[keep:], not the accumulator. The compiler doesn't flag this because the struct literal field assignment counts as "use."

Drop the field and the Package: packagePath init line. No behavior change, no test changes; same pattern the round-1 runRequest.OutputSizeLimit deletion fixed.

Comment thread selection.go
Tests map[string]struct{}
}

func (inventory packageInventory) allTests() []string {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 packageInventory.allTests() sorts its result, but its only consumer dumps that result into an unordered map[string]struct{}. (Edge Case Analyst P3, Performance Analyst P3, Product Reviewer P3, Modernization Reviewer Nit)

allTests() is called from a single site, allPackageTestsSelectionForFiles at selection.go:257:

for _, testName := range inventory.allTests() {
    selection.Tests[testName] = struct{}{}
}

The body inserts into selection.Tests, which is a map[string]struct{}. The slices.Sorted(maps.Keys(inventory.Tests)) work inside allTests() (allocating a slice, sorting it, ranging it) is thrown away. Equivalent and cheaper: maps.Copy(selection.Tests, inventory.Tests) (or maps.Clone to drop the helper entirely while preserving the nil-on-empty behavior).

Worth doing this round. This sits on the broadening fan-out path (directoryWideSelections calls it once per package), exactly the case where the inventory is largest.

Comment thread snapshot.go
tests map[string]lineRange
shared []sharedDecl
sharedKeys map[string]struct{}
testingDotImport bool
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 fileSnapshot.testingDotImport is a struct field for a single in-constructor read. (Structural Analyst P3, Go Architect P3, Style Reviewer Nit)

The field is assigned exactly once (line 53, from hasTestingDotImport(file)) and read in exactly one place (line 82, when classifying funcs in the same constructor). grep -rn testingDotImport *.go confirms zero external readers. The struct-shaped form misleadingly suggests downstream code consults it (e.g. broadening rules) when it's actually scoped to the parse loop.

Demote to a local variable in parseFileSnapshot. Removes one field from the snapshot surface and signals that "dot-import of testing" is purely a parse-time decision, not part of the snapshot anyone else inspects.

Comment thread selection.go
return selection
}

func allDirectoryTestsSelection(inventory packageInventory, filePath string) *packageSelection {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 allDirectoryTestsSelection populates Tests (and the not-yet-needed Broadened) that mergeSelection always discards on the DirectoryWide path. (Duplication Checker P3, Go Architect P3)

DirectoryWide is only set here (line 276) and only read in mergeSelection (line 120). The DirectoryWide branch of mergeSelection (lines 125-134) ignores selection.Tests, selection.Broadened, and selection.Key.Name — it only forwards selection.Key.Dir and selection.Files to cache.directoryWideSelections, which then builds fresh per-package selections via allPackageTestsSelectionForFiles. So the work allDirectoryTestsSelection does — calling allPackageTestsSelection, walking inventory.allTests() to fill a Tests map, and then a selection == nil reconstruction branch — is all write-only.

The whole helper collapses to:

func allDirectoryTestsSelection(dir, filePath string) *packageSelection {
    return &packageSelection{
        Key:           packageKey{Dir: dir},
        Files:         map[string]struct{}{filePath: {}},
        DirectoryWide: true,
    }
}

Callers (selection.go:179, 186, 226) currently pass newInventory/inventory; only .Key.Dir is needed. grep DirectoryWide *_test.go returns nothing, so the deletion is mechanically safe.

Net result: ~10 lines deleted from selection.go, the conditional reconstruction goes away, and the type stops carrying state nobody reads.

Comment thread selection.go
}

var oldKey packageKey
oldKeyOK := oldExists
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit Co-located simplifications in selectChange. (Product Reviewer P3, Style Reviewer)

  1. oldKeyOK := oldExists and newKeyOK := newExists (lines 60, 68) are unconditional copies — if oldExists is true and packageKeyForData fails, the function returns immediately at line 64, so oldKeyOK == oldExists at every subsequent read. Drop the aliases and use oldExists/newExists at the call sites (lines 76, 83, 93).
  2. change.NewPath != "" at line 55 is already implied by the next conjunct: isRunnableTestFilePath("") is false because strings.HasSuffix("", "_test.go") is false. Drop the explicit non-empty check.
  3. packageInventory.hasTest(name) (selection.go:25-28) is a two-line wrapper around _, ok := inv.Tests[name]; ok with two call sites (lines 198, 231). With Tests now a map[string]struct{} after the round-1 flattening, the inline form is obvious; bundle the inlining with the other selection.go simplifications if you touch this file.

Comment thread publish.go
return nil
}

func appendNewline(data []byte) []byte {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit appendNewline is a single-call helper that hides a one-line append. (Performance Analyst Obs, Product Reviewer Obs, Duplication Checker Obs, Go Architect Obs, Modernization Reviewer, Style Reviewer)

The body does make([]byte, 0, len(data)+1); append(_, data...); append(_, ' ') to produce append(data, ' '). The lone caller (publishPlan at publish.go:20) receives data from json.Marshal, which always returns a fresh slice, so mutating-append at the call site is safe. Inline as writeFile(sinks.OutMatrix, append(matrixData, ' ')) and delete the helper.

Six lines plus a named symbol for free.

Comment thread plan.go
return true
}

func unsafeRunRegexTestNames(tests []string) []string {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit unsafeRunRegexTestNames returns a []string that the single caller only counts. (Contract Auditor P3, Go Architect P4)

The only call site at plan.go:118 is if unsafeTestNames := unsafeRunRegexTestNames(tests); len(unsafeTestNames) > 0. The names themselves are never used — the broadening note ("Selected %d test names that cannot be passed safely through RUN…") only uses the count, and they are not logged on stderr or surfaced anywhere. The function allocates a slice on every package even though all that escapes is len > 0.

Either rename to hasUnsafeTestName(tests []string) bool and short-circuit on the first hit, or use slices.ContainsFunc(tests, func(n string) bool { return !safeTestNameRE.MatchString(n) }) inline. Tiny.

Comment thread diff.go
return strings.HasSuffix(path, "_test.go") && isRunnableGoTestPath(path)
}

func isRunnableGoTestPath(path string) bool {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit isRunnableGoTestPath is misnamed; the body has no Go-test-specific logic. (Style Reviewer)

The function name promises a Go-test–specific check, but the body only rejects hidden / underscore-prefixed segments and testdata / vendor directories — the _test.go suffix is enforced by its only caller, isRunnableTestFilePath. Either rename it to reflect what it actually does (e.g. hasExcludedPathSegment returning the inverse) or fold the two functions together; today the pair reads like two distinct gates when it is really one filter split across two names.

Comment thread diff.go
)

type testFileChange struct {
Kind changeKind
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obs testFileChange.Kind is a 5-way enum no consumer distinguishes; listChangedTestFiles collapses A/D/M/T into a single default arm at diff.go:128, and no production code reads change.Kind after construction. (Edge Case Analyst P3, downgraded)

rg '\.Kind' --type go | grep -v _test.go returns zero production reads. The only branch keyed on Kind runs during construction (diff.go:115-145), immediately after parseChangeKind returned the same value as a local; everywhere after, only OldPath / NewPath are consulted. The kind is fully recoverable from the path pair: both set and equal => modified/type, both set and different => renamed, only OldPath => deleted, only NewPath => added.

Why this is Obs not P3: gitfake_test.go does read change.Kind to emit the simulated git diff --name-status output, so the field is not purely dead in the test infrastructure. Collapsing the 5-way enum to a bool isRename (or deriving from paths in the fake) is a real but optional simplification — flagging it because round 2 specifically asked for opportunities to remove ceremony, and this matches the round-1 testDecl smell exactly. Take it or leave it.

Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Redundant follow-up; the round-2 review above already contains all 14 comments.)

Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up to the round-2 review: one P3 simplification that I previously catalogued as an observation. On reflection, since PR #1 is the initial import, every line lands now, so the "future-shape" framing was the wrong lens. This one is a real simplification with clear safety, so promoting it to P3 and posting inline.

Comment thread selection.go
DirectoryWide bool
}

func selectChange(ctx context.Context, cfg config, git gitRunner, cache *inventoryCache, selections map[packageKey]*packageSelection, change testFileChange) error {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 selectChange takes cfg config and git gitRunner alongside cache *inventoryCache, but cache.cfg and cache.git are the same values; the parameter list duplicates state the cache already owns. (Structural Analyst Obs, re-promoted)

plan.go:70-73 builds cache := newInventoryCache(cfg, git), then calls selectChange(ctx, cfg, git, cache, ...). Inside selectChange, cfg/git are used for listDiffHunks (line 39) and the two readChangeFile calls (lines 47, 51); cache is used for loadPackageInventory (lines 77, 94) and (via mergeSelection) directoryWideSelections. Both halves pull from the same underlying state.

The simplest fix: drop the standalone cfg/git parameters from selectChange and mergeSelection, and read them off the cache at the I/O sites, either via exported helpers on inventoryCache (cache.listDiffHunks(...), cache.readChangeFile(...)) or by exposing cache.cfg/cache.git to the package. Either way the signature becomes selectChange(ctx, cache, selections, change), which removes two parameters and removes the implicit invariant that "the cfg passed alongside cache matches cache.cfg."

Caveat already noted by the round-2 reviewers: a separate P3 in this review suggests threading the parsed fileSnapshot from packageKeyForData through selectTestsFromHunks/selectSourceRemoval to delete the double-AST-parse. If both are done, do them together; selectChange's signature gets reshaped either way.

Worth doing now. This is the initial-import PR, so "land it later" means "ship a wider signature into main than the code actually needs."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant