Target-agnostic emitter: emitParser(grammar, target) for Go/Rust/native #6

@johnsoncodehk

Goal

emitParser(grammar) currently emits JS only. To make the emitted parser target-agnostic (Go / Rust / native, per the original vision that "the generated parser need not be JS"), add a second parameter, emitParser(grammar, target), and move all JS-specific emission behind a Target config.

What's already agnostic vs not

  • analyze(grammar) (precedence / FIRST / NUD-LED / nullability) — target-independent, reused as-is.
  • JS-specific: the per-arm matcher emission (matchInto emits JS statements), the runtime (peek / matchLiteral / Pratt + left-rec + memo cores, copied as JS), data baking (JSON literals), module wiring (imports/exports).

Design fork (decide first)

  • (a) primitive-method Target interface (~30 methods: declMatch / matchLiteral / push / star / makeNode / matcherFn / …). The JS implementation ≈ today's emitter strings reorganized into methods; fastest path to a working jsTarget that proves the API.
  • (b) IR: matchInto builds a statement/expression IR and each target implements one render(node). Cleaner for adding targets, but a bigger upfront rewrite.
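As a rough illustration of option (b), the sketch below builds a tiny statement/expression IR and renders it for two targets. The node shapes (`assign`, `bailIfFailed`, `push`), the `renderJs` / `renderGo` functions, and the emitted Go text are all assumptions made for this example, not the project's actual design.

```typescript
// Hypothetical IR: every node shape and name below is an assumption.
type Expr =
  | { tag: "var"; name: string }
  | { tag: "str"; value: string }
  | { tag: "call"; fn: string; args: Expr[] };

type Stmt =
  | { tag: "assign"; name: string; value: Expr }
  | { tag: "bailIfFailed"; name: string }
  | { tag: "push"; list: string; item: Expr };

type Render = (node: Stmt) => string;

// callPrefix is "" for JS (module-level helpers) and "p." for Go (Parser methods).
function renderExpr(e: Expr, callPrefix: string): string {
  switch (e.tag) {
    case "var":
      return e.name;
    case "str":
      return JSON.stringify(e.value); // good enough for simple literals in both targets
    case "call":
      return `${callPrefix}${e.fn}(${e.args.map((a) => renderExpr(a, callPrefix)).join(", ")})`;
  }
}

// JS rendering: failure is null.
const renderJs: Render = (s) => {
  switch (s.tag) {
    case "assign":
      return `${s.name} = ${renderExpr(s.value, "")};`;
    case "bailIfFailed":
      return `if (${s.name} === null) return null;`;
    case "push":
      return `${s.list}.push(${renderExpr(s.item, "")});`;
  }
};

// Go rendering (illustrative text only): failure is the (val, ok) pair.
const renderGo: Render = (s) => {
  switch (s.tag) {
    case "assign":
      return `${s.name}, ok = ${renderExpr(s.value, "p.")}`;
    case "bailIfFailed":
      return `if !ok { return nil, false }`;
    case "push":
      return `${s.list} = append(${s.list}, ${renderExpr(s.item, "p.")})`;
  }
};

// One arm, built once, rendered per target.
const arm: Stmt[] = [
  {
    tag: "assign",
    name: "m",
    value: { tag: "call", fn: "matchLiteral", args: [{ tag: "str", value: "(" }] },
  },
  { tag: "bailIfFailed", name: "m" },
  { tag: "push", list: "children", item: { tag: "var", name: "m" } },
];

const jsLines = arm.map(renderJs);
const goLines = arm.map(renderGo);
```

In this shape, adding a target means writing one more render function; the code that builds `arm` does not change.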

Recommendation: (a) first; move to (b) if the method set grows unwieldy.
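A minimal sketch of what option (a) might look like, assuming a small subset of the method set. The names declMatch, matchLiteral, push, makeNode, and matcherFn come from the list above, but every signature here is a guess, and `declChildren`, `bailIfFailed`, `returnValue`, and `emitLiteralSeq` are hypothetical helpers added only to make the example complete.

```typescript
// Hypothetical subset of the Target interface; all signatures are assumptions.
interface Target {
  declMatch(name: string): string; // declare the local that holds a match result
  declChildren(name: string): string; // hypothetical: declare the children list
  matchLiteral(dest: string, text: string): string;
  bailIfFailed(name: string): string; // hypothetical: return failure if the match failed
  push(list: string, item: string): string;
  makeNode(kind: string, children: string): string; // an expression, not a statement
  returnValue(expr: string): string; // hypothetical
  matcherFn(name: string, body: string[]): string;
}

// The JS target is just the emitter's string templates, reorganized into methods.
const jsTarget: Target = {
  declMatch: (name) => `let ${name};`,
  declChildren: (name) => `const ${name} = [];`,
  matchLiteral: (dest, text) => `${dest} = matchLiteral(${JSON.stringify(text)});`,
  bailIfFailed: (name) => `if (${name} === null) return null;`,
  push: (list, item) => `${list}.push(${item});`,
  makeNode: (kind, children) => `{ kind: ${JSON.stringify(kind)}, children: ${children} }`,
  returnValue: (expr) => `return ${expr};`,
  matcherFn: (name, body) =>
    [`function ${name}() {`, ...body.map((line) => `  ${line}`), `}`].join("\n"),
};

// Target-independent emission for a toy arm: a fixed sequence of literals.
// No JS syntax appears here; everything goes through the Target.
function emitLiteralSeq(t: Target, fnName: string, kind: string, literals: string[]): string {
  const body: string[] = [t.declChildren("children"), t.declMatch("m")];
  for (const lit of literals) {
    body.push(t.matchLiteral("m", lit), t.bailIfFailed("m"), t.push("children", "m"));
  }
  body.push(t.returnValue(t.makeNode(kind, "children")));
  return t.matcherFn(fnName, body);
}

const emitted = emitLiteralSeq(jsTarget, "matchParens", "Parens", ["(", ")"]);
console.log(emitted);
```

The emitted text assumes a runtime function named `matchLiteral` that returns null on failure, which follows the JS failure convention described in this issue.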

Hard parts

  • Types: Go/Rust need explicit types, but the matcher contract is monomorphic (OptChildren = null | Child[] in JS, Option<Vec<Child>> in Rust, ([]Child, bool) in Go), so the types are fixed and known per target.
  • Mutable parse state: JS uses module-level let; Go/Rust need a Parser struct / &mut self context that the matchers thread through. This is the biggest structural difference.
  • Failure convention: null in JS, Option in Rust, (val, ok) in Go; kept behind Target methods.
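The three differences above can be captured as per-target data. The sketch below is one possible shape, assuming a hypothetical `MatcherConvention` record; the return types are the ones listed above, while the function signatures, the `Parser` receiver name, and `emitFailingStub` are illustrative assumptions.

```typescript
// Hypothetical per-target record of the matcher contract.
interface MatcherConvention {
  optChildrenType: string; // the monomorphic result type for this target
  signature(name: string): string; // how parse state reaches the matcher
  fail: string; // statement that reports failure
  succeed(children: string): string; // statement that reports success
}

const conventions: Record<"js" | "rust" | "go", MatcherConvention> = {
  js: {
    optChildrenType: "null | Child[]",
    // State lives in module-level `let`, so the matcher takes no context.
    signature: (name) => `function ${name}()`,
    fail: "return null;",
    succeed: (children) => `return ${children};`,
  },
  rust: {
    optChildrenType: "Option<Vec<Child>>",
    // State is threaded through &mut self.
    signature: (name) => `fn ${name}(&mut self) -> Option<Vec<Child>>`,
    fail: "return None;",
    succeed: (children) => `return Some(${children});`,
  },
  go: {
    optChildrenType: "([]Child, bool)",
    // State is threaded through a *Parser receiver.
    signature: (name) => `func (p *Parser) ${name}() ([]Child, bool)`,
    fail: "return nil, false",
    succeed: (children) => `return ${children}, true`,
  },
};

// Emits a matcher that always fails, just to show the three shapes side by side.
function emitFailingStub(c: MatcherConvention, name: string): string {
  return [`${c.signature(name)} {`, `  ${c.fail}`, `}`].join("\n");
}

for (const key of ["js", "rust", "go"] as const) {
  console.log(emitFailingStub(conventions[key], "match_expr"));
}
```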

First step

Extract a Target interface + jsTarget. emitParser(grammar, target = jsTarget) must produce byte-identical output to today's emitter (re-verify that the 100% result still holds and the benchmark is unchanged), which establishes the API. Go/Rust then implement the interface.
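One way to gate this refactor is a direct comparison of old and new output. The sketch below uses a toy `Grammar`, a stand-in `emitParserLegacy`, and a two-method `Target`; all of these are hypothetical and far simpler than the real emitter. Only the default parameter `target = jsTarget` and the byte-identical requirement come from the text above.

```typescript
// Toy stand-ins; the real Grammar and Target are much richer.
interface Grammar {
  rules: string[];
}

interface Target {
  matcherFn(name: string): string;
  moduleFile(fns: string[]): string;
}

// Stand-in for today's monolithic JS emitter.
function emitParserLegacy(grammar: Grammar): string {
  return (
    grammar.rules.map((r) => `export function parse_${r}() { return null; }`).join("\n") + "\n"
  );
}

// The same strings, moved behind Target methods.
const jsTarget: Target = {
  matcherFn: (name) => `export function parse_${name}() { return null; }`,
  moduleFile: (fns) => fns.join("\n") + "\n",
};

// Existing one-argument callers keep working because of the default.
function emitParser(grammar: Grammar, target: Target = jsTarget): string {
  return target.moduleFile(grammar.rules.map((r) => target.matcherFn(r)));
}

// Returns -1 when the strings are identical, else the index of the first mismatch.
// Identical strings encode to identical bytes, so -1 implies byte-identical output.
function firstDifference(a: string, b: string): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : n;
}

const grammar: Grammar = { rules: ["expr", "term"] };
const diffAt = firstDifference(emitParserLegacy(grammar), emitParser(grammar));
console.log(diffAt === -1 ? "byte-identical" : `first difference at index ${diffAt}`);
```

Reporting the first differing index, rather than a plain boolean, makes it easier to find which template drifted during the move.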

Depends on

Pairs with the token-IR / emitted-lexer issue: a Go/Rust parser also needs an emitted lexer (it can't import the JS createLexer), so the lexer must be emitted per-target from the same token IR.
