test(parsers): gate LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity by carlos-alm · Pull Request #1154 · optave/ops-codegraph-tool

carlos-alm · 2026-05-18T04:40:21Z

Summary

Closes #1121.

The acceptance criterion from #1071 called for a CI gate that fires when a JS LANGUAGE_REGISTRY entry has no corresponding Rust extractor (or sits on an explicit allowlist). The existing drift guard in tests/parsers/native-drop-classification.test.ts covers NATIVE_SUPPORTED_EXTENSIONS ↔ parser_registry.rs (link 2 ↔ 3), but the link from LANGUAGE_REGISTRY to NATIVE_SUPPORTED_EXTENSIONS (link 1 ↔ 2) had no test. A WASM-only language added to the registry would silently be classified as unsupported-by-native and only the JS backfill would parse it — exactly the regression class #1071 was meant to prevent.

Adds a new describe block (LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity) with three tests:

Every LANGUAGE_REGISTRY extension has a Rust extractor (or appears on the documented WASM_ONLY_ALLOWLIST).
Allowlist entries can't reference extensions that already have a Rust extractor (catches stale entries left behind after a Rust port).
Allowlist entries must reference extensions actually declared in LANGUAGE_REGISTRY (catches typos and dead config).

WASM_ONLY_ALLOWLIST is an empty Set<string> today, documented in-code so future contributors know when to add to it (intentional WASM-only languages with a tracked Rust port) vs. when to port directly.

Test plan

npx vitest run tests/parsers/native-drop-classification.test.ts — 16/16 passing (13 existing + 3 new).
Verified the parity logic catches simulated drift (added .toy to a registry copy and confirmed the detector flagged it) before removing the throwaway test.
npx biome check tests/parsers/native-drop-classification.test.ts — clean.
CI green.

…arity Closes #1121. The drift guard between NATIVE_SUPPORTED_EXTENSIONS and parser_registry.rs covered link 2 ↔ 3, but link 1 ↔ 2 (LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS) had no test. A WASM-only language added to the registry would silently degrade the native engine without flagging. Adds three tests with an explicit WASM_ONLY_ALLOWLIST (currently empty) so the allowlist itself can't rot — entries must reference a real registry extension and must not duplicate a language that's already been ported.

claude · 2026-05-18T04:40:34Z

Claude finished @carlos-alm's task —— View job

PR Review - Final Assessment

Reviewing PR #1154: test(parsers): gate LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity

Todo List

Read the modified test file to understand implementation
Analyze against Phase 1: Root Cause Analysis
Analyze against Phase 2: Configuration Change Detection
Analyze against Phase 3: Test Integrity
Analyze against Phase 4: Code Quality
Analyze against Phase 5: Backlog Compliance
Provide final recommendation and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Real issue from #1121 - there was a gap in drift detection between the JS LANGUAGE_REGISTRY (single source of truth for supported languages) and NATIVE_SUPPORTED_EXTENSIONS (hand-maintained mirror of Rust enum coverage). A WASM-only language could be silently added to the registry without a corresponding Rust extractor, causing the native engine to degrade silently without triggering CI warnings.

Why This Approach: Justified architectural fix. The existing drift guard only covered NATIVE_SUPPORTED_EXTENSIONS ↔ parser_registry.rs (link 2↔3), but the critical link 1↔2 (LANGUAGE_REGISTRY → NATIVE_SUPPORTED_EXTENSIONS) was unguarded. This would allow regressions exactly as described in #1071.

Risk Assessment: Low risk - pure additive testing with well-defined allowlist mechanism for intentional exceptions.

Phase 2: Configuration Change Detection ✅

HIGH-RISK Changes: None detected. This PR only adds tests - no linting rules disabled, no timeouts increased, no build configuration modified, no TypeScript strict mode changes.

Assessment: Clean - purely additive test coverage with no configuration relaxation.

Phase 3: Test Integrity ✅

RED FLAGS: None detected. The tests are properly structured and strict:

✅ Strong assertions: Tests use .toEqual([]) which fails on any unexpected entries
✅ Comprehensive coverage: Three complementary tests cover all drift scenarios
✅ No weakened expectations: Tests maintain strict parity requirements
✅ Clear documentation: Extensive comments explain the architectural purpose

Test Quality Assessment:

Test 1 (lines 239-256): Ensures every LANGUAGE_REGISTRY extension either has native support OR is explicitly allowlisted
Test 2 (lines 258-266): Prevents stale allowlist entries after Rust ports are completed
Test 3 (lines 268-282): Catches typos and orphaned allowlist entries

All tests are properly isolated and test exactly what they claim to test.

Phase 4: Code Quality ✅

Implementation Quality: Excellent

✅ Clean TypeScript: Proper typing with ReadonlySet<string>, correct imports
✅ Logical structure: Tests follow natural dependency order (parity → staleness → orphans)
✅ Error messages: Comprehensive error messages with actionable guidance (lines 250-254, 263-264, 279-280)
✅ Code organization: Well-integrated into existing test structure
✅ Documentation: Extensive inline documentation explaining architectural context

Minor Code Quality Notes:

Line 237: Empty allowlist is intentional and well-documented
Line 243: Proper case normalization with .toLowerCase()
Lines 246-249: Efficient filtering chains that are readable

Security/Performance: No concerns - pure test code with no external dependencies or side effects.

Phase 5: Backlog Compliance ✅

Zero-dep: ✓ — No new runtime dependencies added. Uses existing test infrastructure (vitest, existing imports)

Foundation-aligned: ✓ — Fully aligned. Strengthens the dual-engine architecture (P1: always-current) by ensuring native engine coverage matches the JS registry. No cloud APIs, external DBs, or agent behavior.

Problem-fit: 5/5 — Directly prevents AI agent confusion from inconsistent engine results. When the native engine silently degrades due to missing extractors, agents receive inconsistent dependency graphs depending on engine selection, leading to hallucinated dependencies and incorrect blast radius analysis. This gate prevents exactly that scenario.

Breaking: No — Purely additive test coverage with no behavior changes to public APIs, CLI output, or database schema.

Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

None. This is a high-quality defensive improvement that strengthens the core architecture.

Final Recommendation

Root Cause Analysis

Problem Being Solved: CI gap allowing silent native engine degradation via WASM-only language additions
Why This Approach: Closes architectural drift detection gap with proper allowlist mechanism
Risk Assessment: Very low - additive testing only

Backlog Compliance

Zero-dep: ✓ — No new runtime dependencies
Foundation-aligned: ✓ — Strengthens dual-engine architecture
Problem-fit: 5 — Directly prevents AI agent confusion from engine inconsistency
Breaking: No — Purely additive
Tier: 1

Critical Concerns

None

Final Recommendation

Rating: ⭐⭐⭐⭐⭐ (5/5)
Action: APPROVE
Reasoning: Exemplary defensive programming. Addresses real architectural vulnerability with comprehensive test coverage, clear documentation, and proper escape hatch design. The allowlist mechanism is well-designed for the rare case where WASM-only support is temporarily needed. This directly improves AI agent reliability by preventing silent engine parity drift.

greptile-apps · 2026-05-18T04:42:54Z

Greptile Summary

Adds a new describe block to native-drop-classification.test.ts that gates the LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS link — the previously untested segment that would allow a WASM-only language to silently degrade native parsing. An empty WASM_ONLY_ALLOWLIST with clear contribution guidance is included for future intentional exceptions.

Test 1 verifies every registry extension has a Rust extractor or is on the allowlist; it applies toLowerCase() to registry extensions but does not normalise the allowlist before the Set.has() lookup, unlike the parallel fix already applied in tests 2 and 3.
Tests 2 & 3 correctly normalise allowlist entries before comparison (addressing feedback from the prior review round), catching stale and orphaned allowlist entries respectively.
registryExts is computed identically in both test 1 and test 3; hoisting it to the describe scope would remove the duplication.

Confidence Score: 5/5

Test-only change adding three new CI gates; no production code is modified and all three tests pass today with an empty allowlist.

The change is purely additive test code that closes a documented gap. The new gates correctly catch forward drift and the normalisation fixes from the prior round are in place for tests 2 and 3. One minor asymmetry remains in test 1's allowlist lookup, but it would only produce a false positive rather than silently missing a regression.

No files require special attention; the single changed file is low-risk test code.

Important Files Changed

Filename	Overview
tests/parsers/native-drop-classification.test.ts	Adds three new tests gating LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity; logic is sound with one minor asymmetry in the allowlist normalization for test 1.

_{Reviews (4): Last reviewed commit: "Merge branch 'main' into feat/1121-langu..." | Re-trigger Greptile}

greptile-apps · 2026-05-18T04:42:57Z

+  it('WASM_ONLY_ALLOWLIST does not list extensions that already have a Rust extractor', () => {
+    // Catches stale allowlist entries: once a language is ported to Rust the
+    // allowlist line should be deleted, not left behind as dead config.
+    const stale = [...WASM_ONLY_ALLOWLIST].filter((ext) => NATIVE_SUPPORTED_EXTENSIONS.has(ext));


Allowlist entries in tests 2 and 3 are compared as-is, while registryExts (used in test 1 and test 3) and NATIVE_SUPPORTED_EXTENSIONS are all lowercase. If a future contributor adds a mixed-case entry such as .TS to the allowlist, test 2 would silently miss it as stale (NATIVE_SUPPORTED_EXTENSIONS.has('.TS') → false) and test 3 would wrongly report it as an orphan (registryExts.has('.TS') → false because registryExts was built with .toLowerCase()). Normalising allowlist entries to lowercase in these two filters keeps all three tests consistent.

Suggested change

const stale = [...WASM_ONLY_ALLOWLIST].filter((ext) => NATIVE_SUPPORTED_EXTENSIONS.has(ext));

const stale = [...WASM_ONLY_ALLOWLIST]

.map((ext) => ext.toLowerCase())

.filter((ext) => NATIVE_SUPPORTED_EXTENSIONS.has(ext));

Fixed in 57eb769 — added .map((ext) => ext.toLowerCase()) before the NATIVE_SUPPORTED_EXTENSIONS filter so a mixed-case allowlist entry can't slip past the staleness check.

greptile-apps · 2026-05-18T04:42:58Z

+        registryExts.add(ext.toLowerCase());
+      }
+    }
+    const orphans = [...WASM_ONLY_ALLOWLIST].filter((ext) => !registryExts.has(ext));


Same normalisation gap as in test 2 — without .toLowerCase() here, a capitalised allowlist entry would be falsely reported as an orphan even though a matching (lowercase) extension exists in registryExts.

Suggested change

const orphans = [...WASM_ONLY_ALLOWLIST].filter((ext) => !registryExts.has(ext));

const orphans = [...WASM_ONLY_ALLOWLIST]

.map((ext) => ext.toLowerCase())

.filter((ext) => !registryExts.has(ext));

Fixed in 57eb769 — added .map((ext) => ext.toLowerCase()) before the registryExts filter so the orphan check matches the lowercase form actually stored in registryExts.

Lowercase allowlist entries before comparing against `registryExts` and `NATIVE_SUPPORTED_EXTENSIONS` so a mixed-case future entry can't slip past the staleness and orphan checks. Both reference sets are already lowercased, so without this normalization a capitalized allowlist entry would be silently skipped by test 2 and falsely flagged by test 3.

carlos-alm · 2026-05-18T06:30:18Z

@greptileai

greptile-apps Bot reviewed May 18, 2026

View reviewed changes

Merge branch 'main' into feat/1121-language-registry-native-parity

3a5c700

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(parsers): gate LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity#1154

test(parsers): gate LANGUAGE_REGISTRY ↔ NATIVE_SUPPORTED_EXTENSIONS parity#1154
carlos-alm wants to merge 3 commits into
mainfrom
feat/1121-language-registry-native-parity

carlos-alm commented May 18, 2026

Uh oh!

claude Bot commented May 18, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 18, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot May 18, 2026

Uh oh!

carlos-alm May 18, 2026

Uh oh!

greptile-apps Bot May 18, 2026

Uh oh!

carlos-alm May 18, 2026

Uh oh!

carlos-alm commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented May 18, 2026

Summary

Test plan

Uh oh!

claude Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review - Final Assessment

Todo List

Phase 1: Root Cause Analysis ✅

Phase 2: Configuration Change Detection ✅

Phase 3: Test Integrity ✅

Phase 4: Code Quality ✅

Phase 5: Backlog Compliance ✅

Critical Concerns

Final Recommendation

Root Cause Analysis

Backlog Compliance

Critical Concerns

Final Recommendation

Uh oh!

greptile-apps Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Uh oh!

greptile-apps Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm May 18, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm May 18, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented May 18, 2026 •

edited

Loading

greptile-apps Bot commented May 18, 2026 •

edited

Loading