
DFA: AST matcher, first-token index, parity tests, benchmarks #1976

Merged
steveluc merged 5 commits into main from dfa-ast-parity-benchmark
Mar 3, 2026



@steveluc steveluc commented Mar 3, 2026

Summary

  • AST-based DFA matcher: matchDFAToAST / matchDFAToASTWithSplitting produce a MatchAST parse tree from DFA matching, with evaluateMatchAST computing action values via bottom-up traversal using grammar ValueNode expressions
  • DFA first-token index: O(1) pre-filter (analogous to NFA first-token index) that rejects tokens whose first word can't start any DFA path — applied to both matchDFAWithSplitting and matchDFAToASTWithSplitting
  • Runtime entity validation: isRuntimeChecked determines at match time whether wildcards are truly "checked" (have a registered entity validator), with per-token wildcard counting matching NFA semantics
  • Multi-grammar benchmarks: Comprehensive timing/state-count benchmarks across player, calendar, desktop, list, weather, and browser grammars comparing NFA, NFA+index, DFA hybrid, and DFA/AST matchers
  • Weather grammar fix: Rewrote weatherSchema.agr to bare-word syntax (removing double-quoted phrases that embedded quotes into tokens)
  • 167 NFA/DFA parity tests ensuring the DFA matcher produces identical results to NFA for all grammars

Test plan

  • All 1102 local tests pass (28 suites)
  • 167 NFA/DFA parity tests (value equality, priority counters, completion parity)
  • Benchmark suite covers 6 agent grammars with matched + unmatched requests
  • Policy check passes (2659 checks, 3002 files)
  • Weather grammar matches correctly after bare-word rewrite

🤖 Generated with Claude Code

steveluc and others added 5 commits March 3, 2026 12:55
Add matchDFAToAST, a DFA matcher that produces a structural MatchAST
instead of using slot-based value computation. It uses minimal munch with
backtracking: wildcards consume as few tokens as possible, and a decision
point is recorded whenever a literal is preferred over a wildcard.
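
To illustrate the minimal-munch idea in isolation, here is a small sketch over a flat token pattern rather than a DFA. The `Part` type, `matchMinimal`, and the sample pattern are hypothetical and are not taken from dfaMatcher.ts; the real matcher walks DFA states and records decision points instead of recursing.

```typescript
// Hypothetical pattern parts; not the MatchAST or DFA types from dfa.ts.
type Part =
    | { kind: "literal"; word: string }
    | { kind: "wildcard"; name: string };

type Captures = Record<string, string>;

// Match tokens[ti..] against parts[pi..]. A wildcard first tries to consume
// one token, then two, and so on (minimal munch). A longer capture is only
// attempted after the rest of the pattern failed with the shorter one.
function matchMinimal(
    parts: Part[],
    tokens: string[],
    pi = 0,
    ti = 0,
): Captures | undefined {
    const part = parts[pi];
    if (part === undefined) {
        // Pattern exhausted: succeed only if every token was consumed.
        return ti === tokens.length ? {} : undefined;
    }
    if (part.kind === "literal") {
        if (ti < tokens.length && tokens[ti] === part.word) {
            return matchMinimal(parts, tokens, pi + 1, ti + 1);
        }
        return undefined;
    }
    for (let end = ti + 1; end <= tokens.length; end++) {
        const rest = matchMinimal(parts, tokens, pi + 1, end);
        if (rest !== undefined) {
            return { ...rest, [part.name]: tokens.slice(ti, end).join(" ") };
        }
    }
    return undefined; // every capture length failed: backtrack to the caller
}

const samplePattern: Part[] = [
    { kind: "literal", word: "play" },
    { kind: "wildcard", name: "track" },
    { kind: "literal", word: "by" },
    { kind: "wildcard", name: "artist" },
];
const sampleTokens = "play stand by me by ben e king".split(" ");
const sampleCaptures = matchMinimal(samplePattern, sampleTokens);
```

With this input the `track` wildcard stops at the first `by`, so the sketch captures `stand` as the track and the remaining tokens as the artist. That is the expected consequence of consuming as few tokens as possible, not a statement about how the real grammars resolve this request.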

Key changes:
- dfa.ts: MatchAST types (TokenMatchNode, WildcardMatchNode, etc.)
- dfaMatcher.ts: matchDFAToAST, matchDFAToASTWithSplitting,
  evaluateMatchAST (bottom-up value computation from grammar ValueNodes),
  isRuntimeChecked (determines at match time whether a wildcard type is
  checked or unchecked)
- nfaDfaParity.spec.ts: AST parity tests covering unchecked wildcards,
  two-wildcard rules, number/Ordinal/Cardinal entities, priority
- dfaBenchmark.spec.ts: Added AST matcher timing + 5 real-world agent
  grammars (list, desktop, calendar, weather, browser)
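
The bottom-up value computation can be sketched roughly as follows. The `ValueExpr` and `AstNode` shapes below are simplified stand-ins invented for this example; the actual MatchAST and ValueNode definitions are not shown in this description and are likely richer.

```typescript
// Hypothetical, simplified shapes (assumptions, not the types in dfa.ts).
type ValueExpr =
    | { type: "literal"; value: string | number }
    | { type: "variable"; name: string }
    | { type: "object"; fields: { key: string; expr: ValueExpr }[] };

type AstNode =
    | { kind: "token"; text: string }
    | { kind: "wildcard"; variable: string; text: string }
    | { kind: "rule"; variable?: string; value: ValueExpr; children: AstNode[] };

type Env = Record<string, unknown>;

// Evaluate a value expression against the variables bound by child nodes.
function evalExpr(expr: ValueExpr, env: Env): unknown {
    switch (expr.type) {
        case "literal":
            return expr.value;
        case "variable":
            return env[expr.name];
        case "object": {
            const out: Record<string, unknown> = {};
            for (const field of expr.fields) {
                out[field.key] = evalExpr(field.expr, env);
            }
            return out;
        }
    }
}

// Children are evaluated first; their results are bound to variable names
// and only then is the parent rule's value expression evaluated.
function evaluateAst(node: AstNode): unknown {
    if (node.kind !== "rule") return node.text;
    const env: Env = {};
    for (const child of node.children) {
        if (child.kind === "token") continue; // literals bind nothing
        if (child.variable !== undefined) {
            env[child.variable] = evaluateAst(child);
        }
    }
    return evalExpr(node.value, env);
}

// "play the second track", with a nested rule that maps "second" to 2.
const sampleAst: AstNode = {
    kind: "rule",
    value: {
        type: "object",
        fields: [
            { key: "actionName", expr: { type: "literal", value: "playTrack" } },
            { key: "trackNumber", expr: { type: "variable", name: "n" } },
        ],
    },
    children: [
        { kind: "token", text: "play" },
        { kind: "token", text: "the" },
        {
            kind: "rule",
            variable: "n",
            value: { type: "literal", value: 2 },
            children: [{ kind: "token", text: "second" }],
        },
        { kind: "token", text: "track" },
    ],
};
const sampleAction = evaluateAst(sampleAst);
```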

Benchmark results across 6 grammars:
- DFA state ratio: 0.17x–0.54x (fewer states than NFA)
- DFA memory: 1.5x–3.2x larger per state (transitions + captureInfo)
- AST matcher: ~88x avg speedup (matched), ~158x (unmatched) vs NFA

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In AGR, "string" and "wildcard" are synonyms for untyped wildcards.
The NFA interpreter and completion code already checked for both, but
the DFA compiler and slot-based matcher only excluded "string". This
caused $(track:wildcard) to be incorrectly marked as a checked entity.

Fixed in: dfaCompiler.ts (compile-time isChecked), dfaMatcher.ts
(runtime entryIsChecked + entity conversion guard), nfaCompiler.ts
(deprecated compileWildcardPart path).
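
A minimal sketch of the check described above, assuming "string" and "wildcard" are the only untyped names. The function names and the list of registered validators are hypothetical; they are not the signatures used in dfaCompiler.ts or dfaMatcher.ts.

```typescript
// Assumption: these are the only two untyped wildcard type names in AGR.
const UNTYPED_WILDCARD_TYPES: readonly string[] = ["string", "wildcard"];

// The behavior described as buggy: only "string" is excluded.
function isCheckedBuggy(typeName: string): boolean {
    return typeName !== "string";
}

// Compile-time view: any type other than the two synonyms counts as checked.
function isCheckedFixed(typeName: string): boolean {
    return UNTYPED_WILDCARD_TYPES.indexOf(typeName) === -1;
}

// Runtime view (hypothetical helper): a typed wildcard is only truly checked
// when an entity validator is registered for that type.
function isCheckedAtRuntime(
    typeName: string,
    registeredValidators: readonly string[],
): boolean {
    return (
        isCheckedFixed(typeName) &&
        registeredValidators.indexOf(typeName) !== -1
    );
}

const buggyResult = isCheckedBuggy("wildcard"); // true: the reported bug
const fixedResult = isCheckedFixed("wildcard"); // false: treated as untyped
```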

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tor tests

- Add O(1) first-token rejection to DFA/AST matchers (WeakMap-cached index
  built from DFA start state transitions). Unmatched requests now reject
  as fast as NFA+index (~0.02μs).
- Fix weather grammar: remove double-quoted phrases (AGR parser treats
  quotes as literal chars), switch $(location:string) to wildcard.
- Update grammarGenerator tests to match current output: wildcard not
  string, Cardinal not number, bare variable names in value expressions.
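
The first-token rejection could look roughly like the sketch below. `MiniDfa` is a hypothetical stand-in for the real DFA type, and the `startAcceptsWildcard` flag is an assumption about how a wildcard-initial grammar would have to bypass the filter.

```typescript
// Hypothetical minimal DFA shape; the real type in dfa.ts is richer.
interface MiniDfa {
    startTokens: string[]; // literal tokens on transitions out of the start state
    startAcceptsWildcard: boolean; // assumption: wildcard start disables the filter
}

// The index is built once per DFA object and dropped when the DFA is collected.
const firstTokenCache = new WeakMap<MiniDfa, Set<string>>();

function getFirstTokenIndex(dfa: MiniDfa): Set<string> {
    let index = firstTokenCache.get(dfa);
    if (index === undefined) {
        index = new Set<string>(dfa.startTokens);
        firstTokenCache.set(dfa, index);
    }
    return index;
}

// Constant-time pre-filter: one set lookup before any DFA traversal.
function canStartMatch(dfa: MiniDfa, tokens: string[]): boolean {
    if (dfa.startAcceptsWildcard) return true;
    const first = tokens[0];
    return first !== undefined && getFirstTokenIndex(dfa).has(first);
}

const playerLikeDfa: MiniDfa = {
    startTokens: ["play", "pause"],
    startAcceptsWildcard: false,
};
const unmatchedAllowed = canStartMatch(playerLikeDfa, ["install", "visual", "studio"]);
const matchedAllowed = canStartMatch(playerLikeDfa, ["pause"]);
```

Keying the cache on the DFA object with a `WeakMap` means callers do not have to manage the index lifetime themselves.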

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r nested RulesPart

The merge overwrote evaluateMatchAST with a version that assumed rule.value
exists directly on the matched rule. But for grammars with alternatives like
<Start> = <play> | <stop>, the DFA AST matcher inlines alternatives (producing
token/wildcard nodes directly, not ruleRef nodes), so the value expression
lives on the nested alternative inside the RulesPart, not on the wrapper rule.

Restores findValueExpression + matchesRuleStructure to search through nested
RulesPart structures and find the correct value expression by structural
comparison. Also removes duplicate exports in index.ts from the merge.
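
The structural search can be sketched as below, under heavy simplification. The `GrammarRule`, `GrammarPart`, and `MatchedNode` shapes are hypothetical, a string stands in for the value expression, and `findValue` / `matchesStructure` are illustrative names, not the restored findValueExpression and matchesRuleStructure.

```typescript
// Hypothetical grammar and match shapes (assumptions for this sketch).
type GrammarPart =
    | { type: "token"; word: string }
    | { type: "wildcard"; variable: string }
    | { type: "rules"; alternatives: GrammarRule[] };

interface GrammarRule {
    parts: GrammarPart[];
    value?: string; // stand-in for a ValueNode expression
}

type MatchedNode =
    | { kind: "token"; text: string }
    | { kind: "wildcard"; variable: string };

// True when the rule's parts line up one-to-one with the inlined AST nodes.
function matchesStructure(rule: GrammarRule, nodes: MatchedNode[]): boolean {
    if (rule.parts.length !== nodes.length) return false;
    return rule.parts.every((part, i) => {
        const node = nodes[i];
        if (node === undefined) return false;
        if (part.type === "token") {
            return node.kind === "token" && node.text === part.word;
        }
        if (part.type === "wildcard") {
            return node.kind === "wildcard" && node.variable === part.variable;
        }
        return false; // nested rules inside an alternative are out of scope here
    });
}

// Search the wrapper rule and its nested alternatives for the value
// expression that belongs to the structure actually matched.
function findValue(rule: GrammarRule, nodes: MatchedNode[]): string | undefined {
    if (rule.value !== undefined && matchesStructure(rule, nodes)) {
        return rule.value;
    }
    for (const part of rule.parts) {
        if (part.type !== "rules") continue;
        for (const alternative of part.alternatives) {
            const found = findValue(alternative, nodes);
            if (found !== undefined) return found;
        }
    }
    return undefined;
}

// <Start> = <play> | <stop>: the wrapper has no value of its own.
const startRule: GrammarRule = {
    parts: [
        {
            type: "rules",
            alternatives: [
                {
                    parts: [
                        { type: "token", word: "play" },
                        { type: "wildcard", variable: "track" },
                    ],
                    value: "playValue",
                },
                { parts: [{ type: "token", word: "stop" }], value: "stopValue" },
            ],
        },
    ],
};
const stopValue = findValue(startRule, [{ kind: "token", text: "stop" }]);
```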

All 1102 tests pass (167 parity, 28 suites).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@steveluc steveluc temporarily deployed to development-fork March 3, 2026 22:33 — with GitHub Actions Inactive
@steveluc steveluc added this pull request to the merge queue Mar 3, 2026
Merged via the queue into main with commit 3104aa0 Mar 3, 2026
21 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Mar 4, 2026
## Summary

Execution contexts (`DFAExecutionContext[]`) were **48-77% of all DFA
memory** but only needed during compilation — not at match time. This PR
extracts the needed fields (`ruleIndex`, `activeRuleIndices`) directly
onto `DFAState`, then frees the contexts arrays via
`DFABuilder.compact()`.
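
A rough sketch of the compaction idea, assuming a flattened state shape. `CompileContext`, `SketchState`, `bestContextIndex`, and `compactStates` are hypothetical names, and deriving `activeRuleIndices` from the distinct rule indices of a state's contexts is an assumption, not something this description confirms.

```typescript
// Hypothetical compile-time context: only ruleIndex is needed at match time.
interface CompileContext {
    ruleIndex: number;
    scratch: number[]; // stands in for the bulky compile-only data
}

// Hypothetical state shape, flattened for the sketch.
interface SketchState {
    contexts?: CompileContext[];
    bestContextIndex?: number;
    ruleIndex?: number;
    activeRuleIndices?: number[];
}

// Copy the fields the matcher needs onto each state, then drop the contexts
// so their memory can be reclaimed.
function compactStates(states: SketchState[]): void {
    for (const state of states) {
        const contexts = state.contexts;
        if (contexts === undefined) continue; // already compacted
        const best =
            state.bestContextIndex !== undefined
                ? contexts[state.bestContextIndex]
                : undefined;
        if (best !== undefined) state.ruleIndex = best.ruleIndex;

        const active: number[] = [];
        for (const context of contexts) {
            if (active.indexOf(context.ruleIndex) === -1) {
                active.push(context.ruleIndex);
            }
        }
        state.activeRuleIndices = active;

        delete state.contexts;
        delete state.bestContextIndex;
    }
}

const sketchStates: SketchState[] = [
    {
        contexts: [
            { ruleIndex: 3, scratch: [1, 2, 3] },
            { ruleIndex: 1, scratch: [] },
            { ruleIndex: 3, scratch: [4] },
        ],
        bestContextIndex: 1,
    },
];
compactStates(sketchStates);
```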

Depends on #1976 (currently in merge queue).

## Space: Before vs After

| Grammar | DFA States | Before (KB) | After (KB) | Reduction | DFA / NFA |
|---------|-----------|-------------|------------|-----------|-----------|
| player | 62 | 80.6 | **14.8** | **82%** | 0.28x |
| list | 120 | 88.8 | **28.3** | **68%** | 0.70x |
| desktop | 724 | 677.1 | **135.4** | **80%** | 0.51x |
| calendar | 99 | 41.8 | **19.2** | **54%** | 0.80x |
| weather | 84 | 131.1 | **19.9** | **85%** | 0.44x |
| browser | 76 | 46.8 | **14.7** | **69%** | 0.51x |

DFA now uses **0.28–0.80x** the NFA's memory (previously 1.5–2.9x).

## Speed: DFA/AST vs NFA (μs/call, 1000 iterations)

| Grammar | Request | NFA | NFA+idx | DFA/AST | AST speedup |
|---------|---------|-----|---------|---------|-------------|
| player | pause | 41.2 | 1.0 | **1.5** | 27x |
| player | play Shake It Off by Taylor... | 238.5 | 190.2 | **3.5** | 68x |
| desktop | open chrome | 120.2 | 22.9 | **0.6** | 189x |
| desktop | tile notepad and calculator | 263.3 | 139.5 | **1.0** | 278x |
| weather | forecast for Chicago... | 1397.6 | 1183.3 | **2.0** | 694x |
| browser | open google.com | 60.6 | 7.6 | **0.6** | 99x |
| *unmatched* | install visual studio | 80.6 | 0.1 | **0.02** | 3222x |

**Avg matched speedup: ~96x.  Avg unmatched speedup: ~1161x.**

## Changes

- **`dfa.ts`**: Add `ruleIndex`, `activeRuleIndices` to `DFAState`;
remove `contextIndex` from `bestPriority`; add `DFABuilder.compact()`
static method
- **`dfaCompiler.ts`**: Call `DFABuilder.compact(dfa)` after build
- **`dfaMatcher.ts`**: Replace all `contexts[bestPriority.contextIndex]`
lookups with direct `state.ruleIndex`; use `state.activeRuleIndices` for
completions

## Test plan
- [x] All 1102 local tests pass (28 suites, 167 parity tests)
- [x] Policy check passes (2659 checks)
- [x] Benchmark confirms space reduction and no speed regression

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>