Summary
In the v3.9.4 build benchmark (PR #958), the native engine processed 668 files while the WASM engine processed 728 files — a 60-file (~8%) gap. In v3.9.3 both engines ran on the same 727 files, so something regressed between 3.9.3 and 3.9.4.
Flagged by Greptile: #958 (comment)
Why this matters
CLAUDE.md states:
Dual-engine architecture: [...] Both engines must produce identical results. If they diverge, the less-accurate engine has a bug — fix it, don't document the gap.
A divergence in the set of files processed (not just results per file) means the native engine is silently failing to collect or parse ~60 files that WASM handles fine. Consequences:
- Per-file metrics (build ms/file, nodes/file, edges/file) are computed from different denominators — engine-vs-engine comparisons in the benchmark table are misleading.
- Anything the native engine silently drops is missing from downstream graph output in production use.
Evidence
From generated/benchmarks/BUILD-BENCHMARKS.md in PR #958:
| Version | Native files | WASM files |
|---|---|---|
| 3.9.2 | 727 | 727 |
| 3.9.3 | 727 | 727 |
| 3.9.4 | 668 | 728 |
Investigation hints
- File collection / ignore logic: src/domain/graph/builder/. Check whether the native path now applies a different filter than WASM.
- Native addon: crates/codegraph-core/. A silent parse failure could be swallowed rather than falling back to WASM per file.
- Recent changes touching the native engine or file collection between 3.9.3 and 3.9.4 tags.
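One way to narrow down these hints is to diff the two engines' file lists and see which files native drops. The sketch below assumes each engine can be made to emit the list of paths it processed; `diffFileSets` and its input shape are hypothetical and do not come from the codebase. Grouping the dropped files by extension may help tell a parser-level failure (one language missing) from a collection-level one (a directory or ignore pattern).

```typescript
// Hypothetical helper: the name and input shape are assumptions for illustration.
// Each argument is the list of file paths one engine reports having processed.
function diffFileSets(nativeFiles: string[], wasmFiles: string[]) {
  const nativeSet = new Set(nativeFiles);
  const wasmSet = new Set(wasmFiles);

  // Files one engine handled and the other did not (deduplicated, sorted).
  const missingFromNative = Array.from(wasmSet)
    .filter((f) => !nativeSet.has(f))
    .sort();
  const missingFromWasm = Array.from(nativeSet)
    .filter((f) => !wasmSet.has(f))
    .sort();

  // Count the files native dropped per extension.
  const byExtension = new Map<string, number>();
  for (const file of missingFromNative) {
    const base = file.slice(file.lastIndexOf("/") + 1);
    const dot = base.lastIndexOf(".");
    const ext = dot > 0 ? base.slice(dot) : "(none)";
    byExtension.set(ext, (byExtension.get(ext) ?? 0) + 1);
  }

  return { missingFromNative, missingFromWasm, byExtension };
}
```

If most of the ~60 missing files share one extension, the native parser for that language is the likelier suspect; if they share a directory prefix instead, the collection or ignore logic is.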
Acceptance
- Native and WASM process the same file set (modulo parsers not available in one engine, which must be explicit, not silent).
- A CI check or benchmark assertion fails if the file counts diverge by more than a known, documented delta.
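The second criterion could look roughly like the sketch below. It assumes the benchmark exposes per-engine file counts; the `EngineFileCounts` shape and `assertFileCountParity` name are hypothetical, not the benchmark's actual schema. The allowed delta defaults to zero, so any tolerated gap has to be passed in explicitly.

```typescript
// Hypothetical shape: field names are illustrative, not the benchmark's real schema.
interface EngineFileCounts {
  version: string;
  nativeFiles: number;
  wasmFiles: number;
}

// Throws when the two engines' file counts differ by more than allowedDelta.
function assertFileCountParity(
  counts: EngineFileCounts,
  allowedDelta: number = 0
): void {
  const delta = Math.abs(counts.nativeFiles - counts.wasmFiles);
  if (delta > allowedDelta) {
    throw new Error(
      `Engine file-count divergence in ${counts.version}: ` +
        `native=${counts.nativeFiles}, wasm=${counts.wasmFiles}, ` +
        `delta=${delta} exceeds allowed ${allowedDelta}`
    );
  }
}

// Counts taken from the benchmark table in PR #958; this call passes.
assertFileCountParity({ version: "3.9.3", nativeFiles: 727, wasmFiles: 727 });
```

With the 3.9.4 counts (668 native, 728 WASM) the same call would throw, which is the behavior the acceptance criterion asks for. Comparing counts alone would miss the case where both engines process the same number of different files, so a stricter variant would compare the path sets themselves.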