Skip to content

fix: stabilize benchmark targets across engines and preserve README links#527

Merged
carlos-alm merged 3 commits intomainfrom
fix/review-522
Mar 19, 2026
Merged

fix: stabilize benchmark targets across engines and preserve README links#527
carlos-alm merged 3 commits intomainfrom
fix/review-522

Conversation

@carlos-alm
Copy link
Contributor

Summary

Fixes three issues identified in PR #524 (3.3.0 benchmark update):

  • Inconsistent benchmark targets between engines: selectTargets() ran independently per engine worker, producing different hub/mid/leaf symbols because native and WASM build slightly different graphs (7634 vs 7609 nodes). Now the first engine's targets are propagated to the second via TARGETS_ENV_KEY env var, ensuring apples-to-apples comparison.
  • README link stripping: update-benchmark-report.js hardcoded only 2 benchmark links (build + embedding), dropping query, incremental, and resolution links on every run. Now preserves existing links from the README.
  • Missing regression notes: Added explanation for the v3.1.4 → v3.3.0 +129% native build regression (richer extraction: type inference, receiver tracking, barrel re-exports, 33% more nodes/file).

Test plan

  • Run node scripts/benchmark.js --version dev locally and verify both engines report the same targets in stderr output
  • Run node scripts/update-benchmark-report.js against existing benchmark JSON and verify README links are preserved
  • Verify BUILD-BENCHMARKS.md notes section renders correctly

@claude
Copy link

claude bot commented Mar 19, 2026

Claude finished @carlos-alm's task —— View job


Pull Request Review

Reviewing PR #527: "fix: stabilize benchmark targets across engines and preserve README links"

Root Cause Analysis

Problem Being Solved: Three legitimate benchmark infrastructure issues:

  1. Inconsistent targets: Native/WASM engines produce different graphs (7634 vs 7609 nodes), causing selectTargets() to choose different benchmark symbols, invalidating performance comparisons
  2. README link destruction: update-benchmark-report.js hardcoded only 2 links, dropping query/incremental/resolution links on every update
  3. Missing regression context: No explanation for v3.1.4→v3.3.0 native build regression

Why This Approach: Targeted, appropriate fixes:

  • Target sync via TARGETS_ENV_KEY ensures identical symbols across engines
  • Link preservation by parsing existing README vs hardcoding
  • Added regression notes explaining richer extraction feature impact

Risk Assessment: ✅ LOW RISK - Genuine infrastructure fixes, not symptom masking.

Backlog Compliance

  • Zero-dep: ✅ — No new runtime dependencies added
  • Foundation-aligned: ✅ — Pure infrastructure improvements, no principle violations
  • Problem-fit: 3 — Useful infrastructure fixes but doesn't directly address core "lost AI" problem of token waste/hallucinated deps/blast radius
  • Breaking: No — Purely additive benchmark infrastructure changes
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns (Minor)

Test Integrity: C# parity test (tests/engines/parity.test.js:229) marked skip: true with TODO "re-enable once native engine extracts base_list". This acknowledges a native engine limitation rather than weakening assertions, with clear re-enablement plan.

Code Quality: Strong implementation quality across all changes:

  • Clean env var synchronization with proper cleanup (fork-engine.js)
  • Safe regex-based link preservation with fallback (update-benchmark-report.js)
  • Proper post-processing for C#/Go extractor enhancements
  • No security, performance, or architectural concerns

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Well-executed infrastructure fixes addressing real benchmark reliability issues. Clean implementation, appropriate approach, no concerning configuration changes. The one skipped test is properly tracked with clear remediation path. While not directly addressing core user problems (Problem-fit: 3), these fixes ensure reliable performance measurement for the features that do.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR addresses three follow-up issues from the v3.3.0 benchmark update (PR #524): it stabilises query targets across engines so WASM and native benchmark the same symbols, fixes a link-preservation regression in the README update script, and adds a detailed regression note to the benchmark document.

Key changes:

  • scripts/lib/fork-engine.js — after the first (WASM) engine worker finishes, its targets object is JSON-serialised into process.env[TARGETS_ENV_KEY] and spread into the second (native) child's environment; the env var is cleaned up unconditionally after the native run.
  • scripts/benchmark.js / scripts/query-benchmark.js — both workers now call workerTargets() || selectTargets(), using pre-propagated targets when available and falling back to selectTargets() for the first engine (or when only one engine runs).
  • scripts/update-benchmark-report.js — replaces the hardcoded two-link string with a regex ((.+?)\):) that extracts and re-writes whatever links the README already contains, preventing silent link loss on repeated runs. A two-link default is retained as a fallback for the first-ever run.
  • generated/benchmarks/BUILD-BENCHMARKS.md — documentation-only addition explaining the v3.1.4 → v3.3.0 native build regression.

Confidence Score: 5/5

  • This PR is safe to merge — all changes are confined to benchmark scripts and generated documentation with no production code impact.
  • All three fixes are narrow and well-reasoned. The target-propagation mechanism correctly handles every execution path (WASM-only, native-only, both engines, either engine failing). The regex change was verified to correctly handle markdown links with nested parentheses. The previously-flagged issues (truncating regex and dead || results.native arm) are both resolved. No production code is touched.
  • No files require special attention.

Important Files Changed

Filename Overview
scripts/lib/fork-engine.js Adds TARGETS_ENV_KEY propagation: after the first (WASM) engine finishes, its targets are serialised into process.env and spread into the native child's env; cleaned up unconditionally afterward. Previously-flagged dead-code arm (
scripts/benchmark.js Imports workerTargets, changes target selection to workerTargets()
scripts/query-benchmark.js Mirrors the same workerTargets()
scripts/update-benchmark-report.js Replaces the hardcoded two-link string with a regex that reads existing links from the README and preserves them. Non-greedy (.+?) anchored on the outer ): delimiter correctly handles markdown links with parenthetical URLs. Fallback to a two-link default when no match is found. Correct.
generated/benchmarks/BUILD-BENCHMARKS.md Documentation-only: adds a regression note explaining the v3.1.4→v3.3.0 +129% native build regression with technical justification (type inference, receiver tracking, barrel exports, CFG/complexity growth). No code changes.

Sequence Diagram

sequenceDiagram
    participant P as Parent (forkEngines)
    participant W as WASM Worker
    participant N as Native Worker

    P->>W: fork(env: {__BENCH_ENGINE__: "wasm"})
    W->>W: selectTargets() → {hub, leaf}
    W-->>P: JSON result (includes targets)
    P->>P: process.env[__BENCH_TARGETS__] = JSON.stringify(targets)
    P->>N: fork(env: {__BENCH_ENGINE__: "native", __BENCH_TARGETS__: "..."})
    N->>N: workerTargets() → reuse WASM targets
    N-->>P: JSON result
    P->>P: delete process.env[__BENCH_TARGETS__]
    P-->>P: return {wasm, native}
Loading

Last reviewed commit: "fix: remove dead res..."

Comment on lines +355 to +358
const linksMatch = readme.match(/Self-measured on every release via CI \(([^)]+)\)/);
if (linksMatch) {
benchmarkLinks = linksMatch[1];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Regex truncates markdown links on every run

The pattern [^)]+ stops at the first ) it encounters. Since markdown links have the form [text](url), the closing ) of the URL is encountered before the outer closing paren of the CI (...) construct. As a result, linksMatch[1] captures a truncated, broken string on any README that already contains markdown links.

Trace for a two-link README:

Self-measured on every release via CI ([build benchmarks](generated/benchmarks/BUILD-BENCHMARKS.md) | ...):
  • \( matches ( before [build
  • [^)]+ greedily consumes until the first ) → captures [build benchmarks](generated/benchmarks/BUILD-BENCHMARKS.md
  • \) matches ) after .md

On the next script run benchmarkLinks becomes the truncated/malformed value and gets written back into the README, silently corrupting it every CI release.

A simple fix is to match up to the last ) before the colon at end-of-line instead of using [^)]+:

const linksMatch = readme.match(/Self-measured on every release via CI \((.+?)\):/);

This uses a non-greedy (.+?) followed by ): which anchors to the correct outer closing paren + colon, allowing ) characters inside markdown links to be included in the capture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — switched to non-greedy (.+?): pattern that anchors on the outer ): delimiter, correctly capturing markdown links with parentheses in URLs.

}

// Propagate targets from the first engine to the second
const firstResult = results.wasm || results.native;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Dead code — results.native is always null here

At this point in the sequential flow, results.native has not been assigned yet (native runs on lines 186–188). The || results.native arm is therefore always null and can never contribute a value. The comment above says "Propagate targets from the first engine to the second", which matches only results.wasm.

Keeping the dead || results.native could mislead future readers into thinking both results might be populated here, or that this handles the native-only case (it doesn't — when only native runs there is nothing to propagate, which is the correct behaviour).

Suggested change
const firstResult = results.wasm || results.native;
const firstResult = results.wasm;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed the dead || results.native arm since native hasn't run yet at this point in the flow.

… README links

- Share selectTargets() results from first engine worker to second via env,
  ensuring both engines benchmark the same hub/mid/leaf symbols
- Preserve existing README benchmark links instead of hardcoding a subset
- Add regression notes explaining v3.1.4 → v3.3.0 build performance increase

Impact: 2 functions changed, 4 affected
…ark line

The previous [^)]+ pattern stopped at the first ) inside markdown link
URLs, truncating and corrupting the captured link string on every run.
Switch to (.+?): which anchors on the outer ): delimiter.
results.native is always null at this point since native runs after
target propagation. The || results.native arm was unreachable.

Impact: 1 functions changed, 0 affected
@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit b476409 into main Mar 19, 2026
23 checks passed
@carlos-alm carlos-alm deleted the fix/review-522 branch March 19, 2026 09:28
@github-actions github-actions bot locked and limited conversation to collaborators Mar 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant