v1.13.0

yhirose released this 01 Jul 17:40

✨ New Features

Grammar Serialization (GrammarBlob) — fast startup

A compiled Grammar can now be serialized to a portable byte blob and restored later, letting an application embed a prebuilt blob and skip the meta-parse entirely at startup. Deserialization is ~40× faster than load_grammar().

  • New APIs: GrammarBlob::serialize / deserialize, and parser::serialize_grammar() / load_blob().
  • peglint --blob option emits a blob from a grammar file.
  • Blobs capture grammar structure only: semantic callbacks are not serialized and must be re-applied after loading. References are resolved by name, and first-sets are recomputed in O(N) on load.
  • The precedence instruction is supported in blobs. Grammars using capture/back-reference or a custom User operator are rejected as non-serializable.
  • Covered by test/test_serialize.cc.
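
To illustrate what "structure only, references resolve by name" can mean in practice, here is a minimal self-contained sketch. It is a toy model, not peglib's blob format or API: `ToyGrammar`, `toy_serialize`, and `toy_deserialize` are hypothetical names invented for this example. Each rule is stored with the names of the rules it references, so forward and recursive references survive a round trip, and anything resembling a callback would have to be attached again after loading.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only (not peglib's format): rule name -> names of referenced rules.
using ToyGrammar = std::map<std::string, std::vector<std::string>>;

// Serialize as one line per rule: "name ref1 ref2 ...".
std::string toy_serialize(const ToyGrammar& g) {
    std::ostringstream out;
    for (const auto& [name, refs] : g) {
        out << name;
        for (const auto& r : refs) out << ' ' << r;
        out << '\n';
    }
    return out.str();
}

// Restore in two passes: read every rule first, then check that each
// reference names a rule that exists. Resolving by name is what lets
// forward and recursive references work without pointers in the blob.
ToyGrammar toy_deserialize(const std::string& blob) {
    ToyGrammar g;
    std::istringstream in(blob);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name, ref;
        if (!(fields >> name)) continue;  // skip blank lines
        auto& refs = g[name];
        while (fields >> ref) refs.push_back(ref);
    }
    for (const auto& [name, refs] : g)
        for (const auto& r : refs)
            if (!g.count(r))
                throw std::runtime_error("unresolved reference: " + r);
    return g;
}
```

In this toy model a recursive rule such as `Expr` referencing itself round-trips unchanged, while a blob that mentions an undefined rule is rejected at load time.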

🐛 Bug Fixes

  • Character-class / literal escapes: fixed handling of escaped \-, \^, \f, and \v inside character classes and literals.
  • Default start-rule selection: %-directives (e.g. %whitespace, %word) are no longer eligible to be picked as the default start rule.
  • Packrat safety: prevent infinite recursion over Holder cycles during packrat caching.
  • load_blob(): restore the parser-level packrat flag when loading a blob.
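
The packrat fix above concerns recursion over cyclic rule references. As a general illustration of how such recursion is commonly bounded, the sketch below walks a rule graph with a visited set. This is a generic technique shown under assumptions, not the actual patch: `RuleGraph` and `collect_reachable` are hypothetical names, and peglib's Holder type is not modeled here.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical rule graph: rule name -> names of rules it references.
using RuleGraph = std::map<std::string, std::vector<std::string>>;

// Collect every rule reachable from `name`, visiting each rule at most once.
// The `seen` set is what stops the recursion on cycles such as A -> B -> A.
void collect_reachable(const RuleGraph& g, const std::string& name,
                       std::set<std::string>& seen) {
    if (!seen.insert(name).second) return;  // already visited: stop here
    auto it = g.find(name);
    if (it == g.end()) return;              // unknown rule: nothing to follow
    for (const auto& ref : it->second) collect_reachable(g, ref, seen);
}
```

Without the `seen` check, a pair of mutually recursive rules would make this traversal recurse until the stack overflows.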

⚡ Performance

  • First-sets are now computed in O(N) by sharing the SetupFirstSets visitor across rules.
  • The bootstrap meta-grammar now uses first-set filtering and left-factored Definition/Primary rules, cutting grammar-load time.
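
One way to picture the benefit of sharing a single visitor across rules is a shared cache: each rule's first-set is computed once and reused by every rule that references it. The sketch below is a simplified illustration, not peglib's SetupFirstSets implementation. `Leading`, `ToyRules`, and `FirstSetCache` are hypothetical names, and the toy grammar assumes no empty alternatives and no left recursion.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy grammar: each alternative of a rule begins with either a literal
// character or a reference to another rule.
struct Leading {
    bool is_rule;      // true: `text` names a rule; false: text[0] is a literal
    std::string text;
};
using ToyRules = std::map<std::string, std::vector<Leading>>;

// One instance is shared across all rules, so each rule's first-set is
// computed once and then served from the cache.
class FirstSetCache {
public:
    explicit FirstSetCache(const ToyRules& rules) : rules_(rules) {}

    const std::set<char>& first_of(const std::string& name) {
        auto [it, inserted] = cache_.try_emplace(name);
        if (!inserted) return it->second;  // cached: no recomputation
        ++computed_;
        auto rule = rules_.find(name);
        if (rule == rules_.end()) return it->second;  // unknown rule: empty set
        for (const auto& lead : rule->second) {
            if (lead.is_rule) {
                // Copy first, so a set is never inserted into itself.
                const std::set<char> sub = first_of(lead.text);
                it->second.insert(sub.begin(), sub.end());
            } else if (!lead.text.empty()) {
                it->second.insert(lead.text[0]);
            }
        }
        return it->second;
    }

    // Number of rules whose first-set was actually computed (not cache hits).
    int computed() const { return computed_; }

private:
    const ToyRules& rules_;
    std::map<std::string, std::set<char>> cache_;
    int computed_ = 0;
};
```

With a fresh cache per rule, a chain of N rules would recompute the tail of the chain for every rule. With the shared cache, each rule is processed once no matter how many other rules reference it.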

🧹 Internal / Cleanup

  • Removed dead Symbol Table support remnants.
  • Extensive spec / conformance test additions (Unicode identifiers, AST conformance, macro + left-recursion interactions, semantic-value edge cases, serialization round-trip oracle).