DFA: AST matcher, first-token index, parity tests, benchmarks#1976
Merged
Add matchDFAToAST, a DFA matcher that produces a structural MatchAST instead of using slot-based value computation. It uses minimal munch with backtracking: wildcards consume as few tokens as possible, and decision points are recorded when literals are preferred over wildcards.

Key changes:
- dfa.ts: MatchAST types (TokenMatchNode, WildcardMatchNode, etc.)
- dfaMatcher.ts: matchDFAToAST, matchDFAToASTWithSplitting, evaluateMatchAST (bottom-up value computation from grammar ValueNodes), isRuntimeChecked (fixes wildcard type checked/unchecked at match time)
- nfaDfaParity.spec.ts: AST parity tests covering unchecked wildcards, two-wildcard rules, number/Ordinal/Cardinal entities, priority
- dfaBenchmark.spec.ts: Added AST matcher timing + 5 real-world agent grammars (list, desktop, calendar, weather, browser)

Benchmark results across 6 grammars:
- DFA state ratio: 0.17x–0.54x (fewer states than NFA)
- DFA memory: 1.5x–3.2x larger per state (transitions + captureInfo)
- AST matcher: ~88x avg speedup (matched), ~158x (unmatched) vs NFA

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
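To illustrate the "minimal munch with backtracking" behaviour described in this commit, here is a small sketch. It is not the code from dfaMatcher.ts: it matches a flat pattern recursively rather than walking DFA states, and `PatternPart`, `MatchNode`, and `matchMinimalMunch` are hypothetical names invented for the example.

```typescript
// Hypothetical, simplified types. These are NOT the MatchAST types from dfa.ts.
type PatternPart =
    | { kind: "token"; text: string }
    | { kind: "wildcard"; name: string };

type MatchNode =
    | { kind: "token"; text: string }
    | { kind: "wildcard"; name: string; tokens: string[] };

// Match `tokens` against `pattern`, starting at pattern index p and token index t.
// Wildcards consume at least one token and try the shortest span first.
function matchMinimalMunch(
    pattern: PatternPart[],
    tokens: string[],
    p = 0,
    t = 0,
): MatchNode[] | undefined {
    const part = pattern[p];
    if (part === undefined) {
        // Pattern exhausted: succeed only if every token was consumed.
        return t === tokens.length ? [] : undefined;
    }
    if (part.kind === "token") {
        if (tokens[t] !== part.text) {
            return undefined;
        }
        const rest = matchMinimalMunch(pattern, tokens, p + 1, t + 1);
        if (rest === undefined) {
            return undefined;
        }
        return [{ kind: "token", text: part.text }, ...rest];
    }
    // Wildcard: grow the span one token at a time. Extending the span after
    // a failed continuation is the backtracking step.
    for (let end = t + 1; end <= tokens.length; end++) {
        const rest = matchMinimalMunch(pattern, tokens, p + 1, end);
        if (rest !== undefined) {
            const node: MatchNode = {
                kind: "wildcard",
                name: part.name,
                tokens: tokens.slice(t, end),
            };
            return [node, ...rest];
        }
    }
    return undefined;
}
```

For a pattern like `play $(track) by $(artist)`, the `track` wildcard stops at the first `by` that still lets the rest of the pattern match, and only grows when the shorter choice fails.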
In AGR, "string" and "wildcard" are synonyms for untyped wildcards. The NFA interpreter and completion code already checked for both, but the DFA compiler and slot-based matcher only excluded "string". This caused $(track:wildcard) to be incorrectly marked as a checked entity.

Fixed in:
- dfaCompiler.ts (compile-time isChecked)
- dfaMatcher.ts (runtime entryIsChecked + entity conversion guard)
- nfaCompiler.ts (deprecated compileWildcardPart path)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
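The synonym rule can be captured in a single predicate so that every call site treats both names the same way. The sketch below is only an illustration of that idea; `isCheckedWildcardType` and `UNTYPED_WILDCARD_TYPES` are hypothetical names, and treating a missing type annotation as untyped is an assumption, not something this PR states.

```typescript
// Hypothetical helper, not the actual isChecked / entryIsChecked code.
// Both names denote an untyped wildcard in AGR.
const UNTYPED_WILDCARD_TYPES: ReadonlySet<string> = new Set([
    "string",
    "wildcard",
]);

// Returns true only for wildcard types that name an entity, e.g. Ordinal.
function isCheckedWildcardType(typeName: string | undefined): boolean {
    if (typeName === undefined) {
        // Assumption: a wildcard with no type annotation is untyped.
        return false;
    }
    return !UNTYPED_WILDCARD_TYPES.has(typeName);
}
```

With a shared predicate like this, `$(track:wildcard)` and `$(track:string)` are both classified as unchecked, while `$(n:Ordinal)` stays checked.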
…tor tests

- Add O(1) first-token rejection to DFA/AST matchers (WeakMap-cached index built from DFA start state transitions). Unmatched requests now reject as fast as NFA+index (~0.02μs).
- Fix weather grammar: remove double-quoted phrases (AGR parser treats quotes as literal chars), switch $(location:string) to wildcard.
- Update grammarGenerator tests to match current output: wildcard not string, Cardinal not number, bare variable names in value expressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
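A first-token index of the kind described above might be built along these lines. This is a sketch under assumptions: the `SketchDFA` and `SketchState` shapes, the `hasWildcardTransition` flag, and the function names are all hypothetical and do not come from dfa.ts.

```typescript
// Hypothetical, reduced DFA shape used only for this example.
interface SketchState {
    transitions: Map<string, number>; // literal token -> next state id
    hasWildcardTransition: boolean; // true if a wildcard can start a match
}
interface SketchDFA {
    startState: number;
    states: SketchState[];
}
interface FirstTokenIndex {
    tokens: Set<string>;
    acceptsAnyFirstToken: boolean;
}

// Keyed by DFA object, so the index is dropped when the DFA is collected.
const firstTokenIndexCache = new WeakMap<SketchDFA, FirstTokenIndex>();

function getFirstTokenIndex(dfa: SketchDFA): FirstTokenIndex {
    let index = firstTokenIndexCache.get(dfa);
    if (index === undefined) {
        const start = dfa.states[dfa.startState];
        index = {
            tokens: new Set<string>(start ? start.transitions.keys() : []),
            acceptsAnyFirstToken: start ? start.hasWildcardTransition : false,
        };
        firstTokenIndexCache.set(dfa, index);
    }
    return index;
}

// Cheap pre-check: false means the full matcher cannot succeed.
function canStartMatch(dfa: SketchDFA, tokens: string[]): boolean {
    const first = tokens[0];
    if (first === undefined) {
        // Assumption: leave empty requests to the full matcher.
        return true;
    }
    const index = getFirstTokenIndex(dfa);
    return index.acceptsAnyFirstToken || index.tokens.has(first);
}
```

The index is built once per DFA, after which each rejection costs one cache lookup and one set membership test. A grammar whose start state has a wildcard transition cannot be filtered this way, which is why the sketch tracks that case separately.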
…r nested RulesPart

The merge overwrote evaluateMatchAST with a version that assumed rule.value exists directly on the matched rule. But for grammars with alternatives like <Start> = <play> | <stop>, the DFA AST matcher inlines alternatives (producing token/wildcard nodes directly, not ruleRef nodes), so the value expression lives on the nested alternative inside the RulesPart, not on the wrapper rule.

Restores findValueExpression + matchesRuleStructure to search through nested RulesPart structures and find the correct value expression by structural comparison. Also removes duplicate exports in index.ts from the merge.

All 1102 tests pass (167 parity, 28 suites).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
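The lookup described here can be sketched roughly as follows. The function names are borrowed from the commit message, but the signatures, the `SketchRule` / `SketchPart` / `MatchedNode` types, and the use of a plain string for the value expression are assumptions made for the example. The sketch also handles only one level of inlined alternatives per rule.

```typescript
// Hypothetical grammar and match shapes, not the real AGR types.
type SketchPart =
    | { type: "string"; value: string }
    | { type: "wildcard"; variable: string }
    | { type: "rules"; rules: SketchRule[] };

interface SketchRule {
    parts: SketchPart[];
    value?: string; // placeholder for a ValueNode expression
}

type MatchedNode =
    | { kind: "token"; text: string }
    | { kind: "wildcard"; variable: string };

// True if the rule's parts line up one-to-one with the matched nodes.
function matchesRuleStructure(rule: SketchRule, nodes: MatchedNode[]): boolean {
    if (rule.parts.length !== nodes.length) {
        return false;
    }
    return rule.parts.every((part, i) => {
        const node = nodes[i];
        if (node === undefined) {
            return false;
        }
        if (part.type === "string") {
            return node.kind === "token" && node.text === part.value;
        }
        if (part.type === "wildcard") {
            return node.kind === "wildcard" && node.variable === part.variable;
        }
        return false; // nested rules inside a candidate are out of scope here
    });
}

// Search the rule itself, then any nested alternatives, for a value expression.
function findValueExpression(
    rule: SketchRule,
    nodes: MatchedNode[],
): string | undefined {
    if (rule.value !== undefined && matchesRuleStructure(rule, nodes)) {
        return rule.value;
    }
    for (const part of rule.parts) {
        if (part.type !== "rules") {
            continue;
        }
        for (const alternative of part.rules) {
            const found = findValueExpression(alternative, nodes);
            if (found !== undefined) {
                return found;
            }
        }
    }
    return undefined;
}
```

For a wrapper such as `<Start> = <play> | <stop>`, the wrapper rule has no value of its own, so the search falls through to the alternative whose structure matches the inlined token and wildcard nodes.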
github-merge-queue bot pushed a commit that referenced this pull request on Mar 4, 2026
## Summary

Execution contexts (`DFAExecutionContext[]`) were **48-77% of all DFA memory** but only needed during compilation, not at match time. This PR extracts the needed fields (`ruleIndex`, `activeRuleIndices`) directly onto `DFAState`, then frees the contexts arrays via `DFABuilder.compact()`.

Depends on #1976 (currently in merge queue).

## Space: Before vs After

| Grammar | DFA States | Before (KB) | After (KB) | Reduction | DFA / NFA |
|---------|-----------|-------------|------------|-----------|-----------|
| player | 62 | 80.6 | **14.8** | **82%** | 0.28x |
| list | 120 | 88.8 | **28.3** | **68%** | 0.70x |
| desktop | 724 | 677.1 | **135.4** | **80%** | 0.51x |
| calendar | 99 | 41.8 | **19.2** | **54%** | 0.80x |
| weather | 84 | 131.1 | **19.9** | **85%** | 0.44x |
| browser | 76 | 46.8 | **14.7** | **69%** | 0.51x |

DFA now uses **0.28–0.80x** the NFA's memory (previously 1.5–2.9x).

## Speed: DFA/AST vs NFA (μs/call, 1000 iterations)

| Grammar | Request | NFA | NFA+idx | DFA/AST | AST speedup |
|---------|---------|-----|---------|---------|-------------|
| player | pause | 41.2 | 1.0 | **1.5** | 27x |
| player | play Shake It Off by Taylor... | 238.5 | 190.2 | **3.5** | 68x |
| desktop | open chrome | 120.2 | 22.9 | **0.6** | 189x |
| desktop | tile notepad and calculator | 263.3 | 139.5 | **1.0** | 278x |
| weather | forecast for Chicago... | 1397.6 | 1183.3 | **2.0** | 694x |
| browser | open google.com | 60.6 | 7.6 | **0.6** | 99x |
| *unmatched* | install visual studio | 80.6 | 0.1 | **0.02** | 3222x |

**Avg matched speedup: ~96x. Avg unmatched speedup: ~1161x.**

## Changes

- **`dfa.ts`**: Add `ruleIndex`, `activeRuleIndices` to `DFAState`; remove `contextIndex` from `bestPriority`; add `DFABuilder.compact()` static method
- **`dfaCompiler.ts`**: Call `DFABuilder.compact(dfa)` after build
- **`dfaMatcher.ts`**: Replace all `contexts[bestPriority.contextIndex]` lookups with direct `state.ruleIndex`; use `state.activeRuleIndices` for completions

## Test plan

- [x] All 1102 local tests pass (28 suites, 167 parity tests)
- [x] Policy check passes (2659 checks)
- [x] Benchmark confirms space reduction and no speed regression

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
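The compaction step listed under Changes could look roughly like the sketch below. The field names `ruleIndex`, `activeRuleIndices`, `contextIndex`, and `bestPriority` come from the description above; the `CompactContext` and `CompactState` shapes, the `priority` and `slots` fields, and the free function `compactStates` are assumptions standing in for `DFABuilder.compact()`.

```typescript
// Hypothetical shapes; the real DFAState / DFAExecutionContext differ.
interface CompactContext {
    ruleIndex: number;
    slots: unknown[]; // stands in for data only needed while compiling
}
interface CompactState {
    contexts?: CompactContext[];
    bestPriority?: { contextIndex?: number; priority: number };
    ruleIndex?: number;
    activeRuleIndices?: number[];
}

// Copy the match-time fields onto each state, then drop the contexts array.
function compactStates(states: CompactState[]): void {
    for (const state of states) {
        const contexts = state.contexts;
        if (contexts === undefined) {
            continue; // already compacted
        }
        // Distinct rule indices still live in this state (used for completions).
        state.activeRuleIndices = Array.from(
            new Set(contexts.map((context) => context.ruleIndex)),
        );
        const best = state.bestPriority;
        if (best !== undefined && best.contextIndex !== undefined) {
            const winner = contexts[best.contextIndex];
            if (winner !== undefined) {
                state.ruleIndex = winner.ruleIndex;
            }
            delete best.contextIndex; // no longer meaningful without contexts
        }
        delete state.contexts; // release the compile-time data
    }
}
```

After this pass the matcher reads `state.ruleIndex` directly instead of indexing into `contexts`, which is what allows the contexts arrays to be freed.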
## Summary

- `matchDFAToAST` / `matchDFAToASTWithSplitting` produce a `MatchAST` parse tree from DFA matching, with `evaluateMatchAST` computing action values via bottom-up traversal using grammar ValueNode expressions
- … `matchDFAWithSplitting` and `matchDFAToASTWithSplitting`
- `isRuntimeChecked` determines at match time whether wildcards are truly "checked" (have a registered entity validator), with per-token wildcard counting matching NFA semantics
- … `weatherSchema.agr` to bare-word syntax (removing double-quoted phrases that embedded quotes into tokens)

## Test plan
🤖 Generated with Claude Code