feat(parser): Python / Rust / Go structural extractors (#883 phase 2) #928
Merged
Extends the language-aware fast path to three more languages, mirroring the TS extractor from phase 1 (#927). All four languages now share a single rendering / bucketing scaffold, so adding the next language is two files plus a switch case.

Implementation:

- New `structuralDiff.ts`: shared scaffolding. Walks a unified diff, calls a per-language `parseLine` callback, buckets symbols into added / removed / signature-change, and formats the templated summary. Knows how to render method / impl / module / trait kinds in addition to the function / class / type / etc. set the TS phase needed.
- `tsStructuralDiff.ts` refactored onto the shared scaffold. Preserved as the export for existing callers; existing tests continue to pass.
- `pythonStructuralDiff.ts`: recognizes module-level `def` / `async def` / `class` / PEP 695 `type` aliases / ALL_CAPS const assignments. The exported flag tracks the underscore-prefix convention (leading `_` → not exported). Decorators are skipped; the following `def` carries the structural signal.
- `rustStructuralDiff.ts`: recognizes pub-qualified fn / struct / enum / trait / impl (both `impl T` and `impl Trait for T`) / type aliases / pub const / pub static / mod declarations. Accepts up to 4 spaces of leading indent so rustfmt'd impl-block method declarations get caught. `pub(crate)` / `pub(super)` / `pub(in path)` all count as exported.
- `goStructuralDiff.ts`: recognizes top-level func (incl. receivers, rendered as `Receiver.method`), type X struct / interface / aliases, and single-line var / const. The exported flag tracks Go's capital-first-letter convention.
- `summarizeLargeFiles.ts`: gains a `detectStructuralLanguageId` + `dispatchStructuralSummary` pair so the runtime hot path stays flat. The `languageAware.languages` allowlist expands from `'ts' | 'js'` to `'ts' | 'js' | 'py' | 'rs' | 'go'`; the type is mirrored through `LLMService.fastPath.languageAware`, `lib/types.ts`, the parser-options factory, and the regenerated JSON schema.
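The scaffold-plus-callback split described above can be sketched roughly as follows. This is a minimal illustration, not the code in `structuralDiff.ts`: the type names (`ParsedSymbol`, `Buckets`), the function `bucketSymbols`, and the regex in `parsePythonLine` are all hypothetical, and the Python callback only covers `def` / `async def` / `class`.

```typescript
// Hypothetical shapes; the real structuralDiff.ts types are not shown in this PR.
type SymbolKind = 'function' | 'class';

interface ParsedSymbol {
  kind: SymbolKind;
  name: string;
  exported: boolean;
  signature: string; // trimmed declaration line, used to detect signature changes
}

type ParseLine = (line: string) => ParsedSymbol | null;

interface Buckets {
  added: ParsedSymbol[];
  removed: ParsedSymbol[];
  signatureChanged: ParsedSymbol[];
}

// Walk a unified diff and bucket the symbols the per-language callback recognizes.
function bucketSymbols(diff: string, parseLine: ParseLine): Buckets {
  const added = new Map<string, ParsedSymbol>();
  const removed = new Map<string, ParsedSymbol>();
  for (const raw of diff.split('\n')) {
    // Skip file headers (simplification: a removed line starting with "-- " would also be skipped).
    if (raw.startsWith('+++ ') || raw.startsWith('--- ')) continue;
    const sign = raw.charAt(0);
    if (sign !== '+' && sign !== '-') continue; // context lines and hunk headers
    const sym = parseLine(raw.slice(1));
    if (sym === null) continue;
    (sign === '+' ? added : removed).set(`${sym.kind}:${sym.name}`, sym);
  }
  const buckets: Buckets = { added: [], removed: [], signatureChanged: [] };
  added.forEach((sym, key) => {
    const old = removed.get(key);
    if (old === undefined) buckets.added.push(sym);
    else if (old.signature !== sym.signature) buckets.signatureChanged.push(sym);
    // Same signature on both sides: the line only moved, so nothing is reported.
  });
  removed.forEach((sym, key) => {
    if (!added.has(key)) buckets.removed.push(sym);
  });
  return buckets;
}

// Assumed Python callback: module-level def / async def / class only.
const PY_DECL = /^(async\s+def|def|class)\s+([A-Za-z_]\w*)/;

const parsePythonLine: ParseLine = (line) => {
  if (/^\s/.test(line)) return null; // indent gate: nested declarations are ignored
  const m = PY_DECL.exec(line);
  if (m === null) return null; // decorators and body lines fall through here
  const name = m[2];
  return {
    kind: m[1] === 'class' ? 'class' : 'function',
    name,
    exported: !name.startsWith('_'), // underscore prefix means "not exported"
    signature: line.trim(),
  };
};
```

Under this layout a new language contributes only its own `parseLine` callback; the bucketing logic stays shared, which is the property the "two files plus a switch case" claim relies on.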
Architectural note: still regex-first. Tree-sitter integration (scopes, receiver types, signature deltas) and the quality-eval scaffolding listed under #883 stay in the backlog because both need design discussion (parser packaging strategy, golden-set sourcing) before they ship.

Tests (49 new):

- structuralDiff shared rendering exercised via each language's summarizer.
- pythonStructuralDiff: def / async def / class / type alias / ALL_CAPS const, underscore-prefixed = not exported, decorator skip, indent gate, body-only-edit fallthrough, signature change.
- rustStructuralDiff: pub fn / pub async fn / pub const fn, visibility modifiers, struct / enum / trait, impl (plain + Trait-for-Type), type alias, ALL_CAPS const + static, mod declarations, indent gate.
- goStructuralDiff: top-level func, method receivers rendered as `Receiver.method`, struct / interface / type alias, single-line var / const, capitalization-based exported flag, indent gate.

Validation:

- `npx tsc --noEmit` → 0 errors
- `npx jest` → 1568/1568 pass (49 new)
- `npx eslint` on touched files → clean
- `npm run build:schema` regenerated: 'py' / 'rs' / 'go' now appear in the `languageAware.languages` enum.

Refs #883.
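To make the Go test cases above concrete (receivers rendered as `Receiver.method`, capitalization-based exported flag, indent gate), here is a minimal regex sketch. The names `GoSymbol`, `GO_FUNC`, and `parseGoFuncLine` are hypothetical and the pattern is an assumption, not the one used in `goStructuralDiff.ts`; it handles only `func` declarations and checks ASCII capitals only, whereas Go's rule covers any Unicode upper-case letter.

```typescript
// Hypothetical result shape for a parsed Go func declaration.
interface GoSymbol {
  kind: 'function' | 'method';
  name: string; // "Receiver.method" for methods, plain name otherwise
  exported: boolean;
}

// Matches `func Name(` and `func (r *Receiver) Name(`.
// Group 1: receiver type (optional). Group 2: function or method name.
const GO_FUNC =
  /^func\s+(?:\(\s*(?:\w+\s+)?\*?(\w+)(?:\[[^\]]*\])?\s*\)\s*)?(\w+)\s*[(\[]/;

function parseGoFuncLine(line: string): GoSymbol | null {
  // The pattern is anchored at column 0, so indented lines never match (indent gate).
  const m = GO_FUNC.exec(line);
  if (m === null) return null;
  const receiver: string | undefined = m[1];
  const method = m[2];
  const first = method.charAt(0);
  // Simplification: ASCII capitals only.
  const exported = first >= 'A' && first <= 'Z';
  return receiver
    ? { kind: 'method', name: `${receiver}.${method}`, exported }
    : { kind: 'function', name: method, exported };
}
```

For example, `parseGoFuncLine('func (s *Server) Start(ctx context.Context) error {')` yields a method named `Server.Start` with `exported: true`, while `func helper() {` yields an unexported plain function.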
This was referenced May 13, 2026
gfargo added a commit that referenced this pull request May 13, 2026
A/B harness for the language-aware fast path. Runs the parser pipeline twice against a fixed input set, once with `fastPath.languageAware.enabled: false` (LLM baseline) and once with it on, and reports LLM calls saved, fast-path hit count, and token deltas per file.

Why now: phase 1 + 2 of #883 (#927, #928) shipped regex extractors behind an opt-in flag. The unit tests verify the parser; nothing yet verifies the *resulting commit-message pipeline behavior* when the fast path fires. Without a mechanical eval we can't flip the flag on by default with any confidence, and we can't compare tree-sitter vs. regex when #933 lands.

The user's insight: the scenario library (#908) is already a deterministic git-state factory. Reusing it as the golden-set provider means every scenario's commits become an eval input, fresh and byte-identical each run, with zero new infrastructure for "where do the test commits come from".

Implementation:

- `src/lib/parsers/default/__evals__/structuralExtractEval.ts`: the harness. Mocks the LLM via dependency injection (fake `chain.invoke` passed through `summarizeLargeFiles`'s existing options), counts the calls, and classifies each file's outcome (`unchanged` / `trivial` / `markdown` / `languageAware` / `llm`) by inspecting the rewritten diff. Returns a typed `EvalReport` with per-run totals + pairwise deltas. Public API: `runStructuralExtractEval(diffs, configs)` and `renderEvalReportMarkdown(report, title)`.
- `src/lib/parsers/default/__evals__/scenarioInputs.ts`: the golden-set adapter. `buildScenarioFixtures(scenarioName)` spins up the scenario, walks its commit log, and calls `git show --numstat` + per-file `git show` to build `FileDiff[]` per commit. Returns the temp repo handle so the caller can clean up.
- `src/lib/parsers/default/__evals__/fixtures.ts`: hand-crafted modification diffs that target the language-aware path specifically. Scenarios mostly trigger the lossless trivial-shape path (pure additions), so they don't exercise the fast path; the fixtures fill that gap. One fixture per language, so a regression in any single extractor surfaces in its own outcome row.
- `bin/structuralExtractEval.ts`: CLI driver. Writes per-input JSON + Markdown to `.bench/structural-extract-eval/<timestamp>/` and prints an aggregate summary table. Flags: `--scenario NAME`, `--fixtures-only`, `--no-fixtures`, `--languages ts,js`, `--out DIR`.
- `package.json`: new `eval:structural-extract` script.
- `.gitignore`: `.bench/structural-extract-eval/` excluded (output is local; regression-detection baseline is a follow-up).
- `src/lib/testUtils/README.md`: section describing the eval as a consumer of the scenarios + the extraction-boundary reaffirmation.
- `__evals__/README.md`: documents harness layout, input sources, CLI usage, and what's intentionally not here yet (committed baseline / regression check, real-LLM live mode, per-language pivot).

Sample run (full scenarios + fixtures): 64 input files, 9 LLM calls in baseline → 3 with languageAware on. 6 calls saved, 6 fast-path hits. All 5 language fixtures (TS×2, Python, Rust, Go) correctly trigger the fast path; body-only TS and markdown-only fixtures correctly fall through to the LLM.

Out of scope (each tracked separately):

- Committed baseline + CI regression check.
- Real-LLM live-mode harness (for "is the resulting message better" comparison, not just "did the LLM get called").
- Per-language breakdown pivot in the report.

Tests (9 new):

- runStructuralExtractEval: single-config no-deltas, baseline vs fast-paths-on with TS + markdown, languages allowlist honored, empty-config rejection.
- renderEvalReportMarkdown: includes / omits delta table based on run count.
- buildScenarioFixtures: rejects unknown scenarios, produces a fixture per commit with non-empty diffs + valid shas, output is byte-identical across runs (determinism check).

Validation:

- `npx tsc --noEmit` → 0 errors
- `npx jest` → 1597/1597 pass (9 new)
- `npx eslint` on touched code → clean
- Manual: `npm run eval:structural-extract` produces meaningful per-input reports + aggregate summary.

Closes the harness scaffolding portion of #934.
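The A/B idea behind the harness (inject a counting fake in place of the LLM, run the same inputs under each config, then diff the totals) can be sketched as below. This is a simplified stand-in, not `runStructuralExtractEval`: every name here (`RunConfig`, `runOnce`, `runEval`, `hasStructuralChange`) is hypothetical, the fast-path check is a toy regex rather than the real extractors, deltas are computed against the first config only, and the fake is synchronous for brevity where a real LLM call would be asynchronous.

```typescript
interface FileDiff { file: string; diff: string }
interface RunConfig { label: string; languageAware: boolean }
interface RunTotals { label: string; llmCalls: number; fastPathHits: number }
interface Delta { from: string; to: string; llmCallsSaved: number }
interface Report { runs: RunTotals[]; deltas: Delta[] }

// Toy stand-in for the extractors: a diff is fast-path eligible
// if it adds or removes a `function` declaration line.
function hasStructuralChange(diff: string): boolean {
  return diff.split('\n').some((l) => /^[+-](export\s+)?function\s+\w+/.test(l));
}

// One pass over the inputs with a counting fake in place of the LLM.
function runOnce(diffs: FileDiff[], config: RunConfig): RunTotals {
  let llmCalls = 0;
  let fastPathHits = 0;
  const fakeInvoke = (_prompt: string): string => {
    llmCalls += 1; // the only thing the fake does is count
    return 'stub summary';
  };
  for (const d of diffs) {
    if (config.languageAware && hasStructuralChange(d.diff)) fastPathHits += 1;
    else fakeInvoke(d.diff);
  }
  return { label: config.label, llmCalls, fastPathHits };
}

// Run every config on the same inputs; report deltas relative to the first run.
function runEval(diffs: FileDiff[], configs: RunConfig[]): Report {
  if (configs.length === 0) throw new Error('at least one config is required');
  const runs = configs.map((c) => runOnce(diffs, c));
  const baseline = runs[0];
  const deltas = runs.slice(1).map((r) => ({
    from: baseline.label,
    to: r.label,
    llmCallsSaved: baseline.llmCalls - r.llmCalls,
  }));
  return { runs, deltas };
}
```

Because the fake never reaches a model, both runs are deterministic and cheap, which is what lets the same comparison be repeated later against a different extractor backend.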
Summary
Phase 2 of #883. Tree-sitter integration and quality-eval scaffolding stay in the backlog — both need design discussion before they ship.
Test plan
Refs #883.