Skip to content

ci(bench): build dist/ before pre-publish benchmark gate#1057

Merged
carlos-alm merged 5 commits into
mainfrom
fix/issue-1055-bench-methodology
May 4, 2026
Merged

ci(bench): build dist/ before pre-publish benchmark gate#1057
carlos-alm merged 5 commits into
mainfrom
fix/issue-1055-bench-methodology

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Pre-publish benchmark ran src/ via --strip-types while every historical baseline was measured against compiled dist/ via the post-publish --npm path. Same code, two load paths — version-to-version deltas were largely measurement artifacts.
  • Adds --dist to bench-config.ts so local mode can opt into dist/. CI now npm run builds and passes --dist to all four benchmark scripts. Local developer flow (no --dist) is unchanged.

Why this addresses #1055

The reported +52% on fnDeps (and the related +2080ms native rebuild flat overhead in #1054) appeared on Linux CI between v3.9.6 and v3.10.0 even though the fnDeps query path code (domain/queries.tsanalysis/dependencies.tsdb/repository/native-repository.tsdb/connection.tsinfrastructure/native.ts) was unchanged across those tags.

What did change is how the benchmark loads the code under test:

  • v3.9.6's row in QUERY-BENCHMARKS.md was produced by the post-publish workflow installing @optave/codegraph@3.9.6 and running benchmarks against dist/ (bench-config.ts line 204 prefers dist/ when present in npm mode).
  • v3.10.0's pre-publish gate (added in ci(release): gate npm publish on benchmark regressions #1040) runs node --strip-types --import ./scripts/ts-resolve-loader.js scripts/benchmark.ts against src/. Same Rust binary, but the JS layer is now type-stripped TypeScript instead of tsc-compiled JavaScript.

This PR aligns the gate with the historical methodology so 3.10.x and later are measured the same way 3.9.6 was.

Test plan

  • npm run build produces a working dist/.
  • node --import ./scripts/ts-resolve-loader.js scripts/query-benchmark.ts --dist runs end-to-end on Mac M1 and produces sensible numbers (fnDeps depth3 = 23.2ms, in line with src/-loaded run at 24.9ms).
  • npx tsc --noEmit clean.
  • Pre-publish benchmark gate runs cleanly on next dispatch and produces v3.10.x numbers comparable to v3.9.6.

Fixes #1055
Refs #1054

Pre-publish benchmark ran against `src/` via `--strip-types`, while every
historical baseline (v3.9.6 and earlier) was measured against compiled JS
in `dist/` via the post-publish `--npm` path. Same code, two load paths —
the load-path delta showed up as version-to-version regressions in the
query/build benchmarks even though the code under test was unchanged.

Adds `--dist` to `bench-config.ts` so local mode can load from `dist/`,
and runs `npm run build` in the pre-publish-benchmark job before each
benchmark invocation. Local developers default to `src/` unchanged.

Fixes #1055
Refs #1054
@claude
Copy link
Copy Markdown

claude Bot commented May 3, 2026

Claude finished @carlos-alm's task —— View job


PR Review Complete ✅

Reviewing PR #1057: ci(bench): build dist/ before pre-publish benchmark gate

Todo List:

  • Phase 1: Root cause analysis - understand the benchmark methodology issue
  • Phase 2: Configuration change detection - review workflow and bench config changes
  • Phase 3: Test integrity verification
  • Phase 4: Code quality review
  • Phase 5: Backlog compliance assessment
  • Provide final assessment and recommendation

Root Cause Analysis

Problem Being Solved: Pre-publish benchmark gate measured against src/ via --strip-types while historical baselines (v3.9.6 and earlier) were measured against compiled dist/ via post-publish --npm path. This load-path discrepancy caused version-to-version performance deltas that were measurement artifacts, not actual code changes.

Why This Approach: The fix aligns measurement methodology by building dist/ first, then using --dist flag to load compiled JavaScript (matching historical methodology). Local developer workflow unchanged (still defaults to src/).

Risk Assessment: Low risk - purely additive flag with error handling for missing dist/, no breaking changes.

Configuration Change Review ⚠️ HIGH SCRUTINY

Workflow Changes (.github/workflows/publish.yml:283-284):

  • ✅ Added npm run build step before benchmarks (necessary)
  • ✅ Added --dist to all 4 benchmark commands (build, resolution, query, incremental)
  • ✅ Clear documentation explaining the reasoning

Benchmark Config Changes (scripts/lib/bench-config.ts):

  • ✅ Added --dist CLI argument parsing
  • ✅ Added dist directory validation with clear error message
  • ✅ Proper fallback: src/ remains default for local development
  • ✅ No relaxation of existing rules or timeouts

Assessment: Configuration changes are justified and well-implemented. No red flags detected.

Test Integrity ✅

  • No test files modified
  • No assertions weakened or removed
  • No timeout increases or coverage threshold reductions
  • Changes are purely additive (new --dist flag)
  • Existing functionality preserved with same defaults

Code Quality ✅

Positive aspects:

  • Clear separation of concerns between CLI parsing and source resolution
  • Robust error handling (fs.existsSync check with helpful error message)
  • Comprehensive documentation in both code comments and commit message
  • Security-conscious: no shell injection risks in new code
  • Follows existing patterns and conventions
  • Lines 95-97 (bench-config.ts): Proper error for missing dist/ with actionable fix suggestion

No issues found:

  • No performance regressions introduced
  • No security vulnerabilities
  • No architectural violations
  • Implementation matches described behavior exactly

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies. Builds on existing fs, path, and build infrastructure
  • Foundation-aligned: ✓ — Aligns with P1 (always-current graph) and P4 (zero-cost core). Improves measurement accuracy without changing core functionality. No cloud API calls or external dependencies.
  • Problem-fit: 3 — Useful infrastructure improvement that ensures consistent benchmark methodology, but doesn't directly address the core "AI agents getting lost in codebases" problem. Enables accurate performance regression detection.
  • Breaking: No — Purely additive. New --dist flag is optional, existing behavior unchanged
  • Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

None identified. This is a well-executed infrastructure fix that addresses a real measurement consistency issue.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Clean implementation of a necessary benchmark methodology fix. Addresses the root cause (load-path inconsistency) rather than symptoms. Well-documented, properly tested approach with good error handling. The only reason this isn't 5/5 is that it's infrastructure work rather than a feature that directly improves user experience.

Minor observations:

  • Excellent use of clear variable naming (distDir, srcDir) in bench-config.ts:93-99
  • Good defensive programming with existence check before using dist/
  • Commit message follows conventional commit format correctly

This PR should be merged after CI passes. It resolves a legitimate measurement artifact issue that was affecting benchmark reliability.


@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 3, 2026

Greptile Summary

This PR fixes a benchmark methodology mismatch in the pre-publish gate: historical baselines were measured against compiled dist/ (via the post-publish --npm path), but the gate introduced in #1040 was loading src/ through --strip-types, making version-to-version deltas partly a measurement artifact rather than real performance changes. The fix adds npm run build before the four benchmark steps and threads --dist through parseArgs() / resolveBenchmarkSource() so CI now benchmarks the same compiled JS that ships to npm.

Confidence Score: 5/5

Safe to merge — no logic bugs, no security issues; changes are confined to CI workflow and the benchmark config helper.

Both changed files are straightforward: the workflow change inserts a build step in the right place and appends --dist to all four benchmark invocations; the TypeScript change is a well-guarded flag addition with clear error messaging. No P0 or P1 findings identified.

No files require special attention.

Important Files Changed

Filename Overview
.github/workflows/publish.yml Adds npm run build step before all four benchmark scripts and passes --dist to each; step ordering and fail-fast behavior look correct.
scripts/lib/bench-config.ts Adds --dist flag to parseArgs() and resolveBenchmarkSource(), with a clear error if dist/ is missing and a warning when combined with --npm; logic is sound.

Sequence Diagram

sequenceDiagram
    participant CI as publish.yml (CI)
    participant Build as npm run build
    participant BenchScript as scripts/benchmark.ts (--strip-types)
    participant BenchConfig as bench-config.ts
    participant Dist as dist/ (compiled JS)

    CI->>Build: npm run build
    Build-->>CI: dist/ ready

    CI->>BenchScript: node --strip-types ... --dist
    BenchScript->>BenchConfig: resolveBenchmarkSource()
    BenchConfig->>Dist: fs.existsSync(distDir) ✓
    Dist-->>BenchConfig: srcDir = dist/
    BenchConfig-->>BenchScript: { srcDir: dist/, version, cleanup }
    BenchScript->>Dist: dynamic import via srcImport()
    Dist-->>BenchScript: codegraph modules (compiled JS)
    BenchScript-->>CI: benchmark-result.json
Loading

Reviews (3): Last reviewed commit: "Merge branch 'fix/issue-1055-bench-metho..." | Re-trigger Greptile

Comment on lines 88 to 91
if (!npm) {
// Local mode — use repo src/, version derived from git state
// Local mode — use repo src/ (or dist/ when --dist), version from git state
const root = path.resolve(path.dirname(new URL(import.meta.url).pathname.replace(/^\/([A-Z]:)/, '$1')), '..', '..');
const pkg = JSON.parse(fs.readFileSync(path.join(root, 'package.json'), 'utf8'));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 When --npm and --dist are both supplied, dist is parsed but silently dropped in resolveBenchmarkSource() because the if (!npm) branch is never entered. A developer running node … --npm --dist would get no feedback that --dist was ignored, and the flag combination doesn't make semantic sense (npm mode always auto-selects dist/ from the installed package anyway). A small warning would prevent confusion.

Suggested change
if (!npm) {
// Local mode — use repo src/, version derived from git state
// Local mode — use repo src/ (or dist/ when --dist), version from git state
const root = path.resolve(path.dirname(new URL(import.meta.url).pathname.replace(/^\/([A-Z]:)/, '$1')), '..', '..');
const pkg = JSON.parse(fs.readFileSync(path.join(root, 'package.json'), 'utf8'));
if (dist && npm) {
console.error('Warning: --dist is ignored in --npm mode (the installed package already uses dist/ automatically).');
}
if (!npm) {
// Local mode — use repo src/ (or dist/ when --dist), version from git state
const root = path.resolve(path.dirname(new URL(import.meta.url).pathname.replace(/^\/([A-Z]:)/, '$1')), '..', '..');
const pkg = JSON.parse(fs.readFileSync(path.join(root, 'package.json'), 'utf8'));

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cfea14c — added a warning when --dist is combined with --npm, since --dist is silently ignored in npm mode.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 3, 2026

Codegraph Impact Analysis

2 functions changed28 callers affected across 9 files

  • parseArgs in scripts/lib/bench-config.ts:58 (25 transitive callers)
  • resolveBenchmarkSource in scripts/lib/bench-config.ts:85 (27 transitive callers)

carlos-alm and others added 3 commits May 3, 2026 22:09
In --npm mode, --dist was silently ignored because the installed package
already auto-selects dist/ from its prebuilt files. Print a clear warning
when both flags are passed so developers don't silently get unexpected
behavior.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit bbed2c1 into main May 4, 2026
14 checks passed
@carlos-alm carlos-alm deleted the fix/issue-1055-bench-methodology branch May 4, 2026 06:53
@github-actions github-actions Bot locked and limited conversation to collaborators May 4, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf(native): fnDeps query 52% slower in v3.10.0 (likely shares root cause with #1054)

1 participant