
Wip/v2 grammar 2026 05 16 #109

Closed
hyperpolymath wants to merge 10 commits into main from wip/v2-grammar-2026-05-16

Conversation

@hyperpolymath
Owner

No description provided.

hyperpolymath and others added 10 commits May 14, 2026 22:50
Phase A of #43 (v2 grammar). Lands the two pieces
that hyperpolymath/hypatia's build-gossamer-gui workflow blocks on
first:

- **Slash-segmented module paths.** `qualified_name` now accepts both
  `.` and `/` separators. `module hypatia/ui/bridge` and
  `import hypatia/ui/gui` parse; existing dot-form (`module GSA.Core.Types`)
  is unchanged.

- **`extern "abi" { ... }` blocks.** New grammar rule `extern_block` with
  `extern_type` (`type Foo`) and `extern_fn` (`fn name(..): R`) items.
  Surface AST gains `SurfaceDecl::Extern(ExternBlock)` carrying the ABI
  string + items. `extern` is now a reserved keyword.

Surface parser materialises the new declaration. Desugar drops extern
items from the core module for this phase — ambient-binding registration
into the typechecker env and wasm import-directive emit in codegen are
the next phase's work (still tracked in #43).
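
As an illustration of the slash-segmented path rule above, here is a minimal sketch in plain Rust. `split_qualified_name` is a hypothetical helper, not the parser's actual code (the real rule lives in the grammar's `qualified_name`), and it assumes that mixing `.` and `/` in one path is tolerated, which the commit message does not state.

```rust
/// Split a module path into segments, accepting either `.` or `/` as the
/// separator. Returns `None` if any segment is empty or is not an
/// identifier (ASCII letter or `_`, then letters, digits, or `_`).
/// Hypothetical helper for illustration only.
fn split_qualified_name(path: &str) -> Option<Vec<&str>> {
    let segments: Vec<&str> = path.split(|c: char| c == '.' || c == '/').collect();
    let all_valid = segments.iter().all(|segment| {
        let mut chars = segment.chars();
        match chars.next() {
            Some(first) if first.is_ascii_alphabetic() || first == '_' => {
                chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
            }
            // Empty segment (e.g. `a//b`) or a bad leading character.
            _ => false,
        }
    });
    if all_valid {
        Some(segments)
    } else {
        None
    }
}
```

Both `hypatia/ui/bridge` and `GSA.Core.Types` split into three segments under this sketch, while `a//b` is rejected.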

Adds `tests/v2-grammar/fixtures/{minimal-module,minimal-extern}.eph`
plus integration tests at `src/ephapax-parser/tests/v2_grammar.rs`. The
full hypatia `bridge.eph` fixture is deferred — it also needs
`@tail_recursive` annotations and tuple destructuring in `let!`/`let`
bindings, which are separate grammar additions.

No regressions: 70 existing tests across ephapax-parser /
ephapax-surface / ephapax-desugar still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase B of #43, on top of Phase A in the same PR.
Extends the core AST and the typechecker so an `extern "abi" { ... }`
block doesn't just parse but is reachable from the function body via
the normal name-resolution path.

Changes
-------

- **`ephapax-syntax`**: new `Decl::Extern { name, abi, params, ret_ty }`
  variant. Carries the same shape as a `Fn` signature, with no body and
  no polymorphism (extern items declare at concrete types).

- **`ephapax-desugar`**: `SurfaceDecl::Extern(block)` lowers to one
  `Decl::Extern` per `ExternItem::Fn`. `ExternItem::Type` is still
  dropped — abstract-type registration lands later (tracked in #43).

- **`ephapax-typing`**: the first pass of `type_check_module_inner`
  registers `Decl::Extern` names in the typechecker env with their
  declared function type; the body-check pass skips them (no body).
  `ModuleRegistry::register` exports extern items as public so importers
  can call them.

- **`ephapax-wasm`**: extern items are skipped in `collect_user_fns` and
  `append_user_funcs` for now; wasm import-directive emit is Phase C.

- **`ephapax-ir`** / **`ephapax-lsp`** / **`ephapax-linear`**: round-trip
  the new variant through SExpr encoding/decoding, LSP DeclInfo, and
  the affine/linear discipline walkers. The walkers treat externs as
  no-ops — they have no body to enforce discipline against.
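
A rough sketch of the two-pass shape described above, using simplified stand-in types. The `Ty` and `Decl` enums here are trimmed-down assumptions (only the `Decl::Extern` field names come from this commit), and `register_signatures` / `decls_with_bodies` are hypothetical names, not functions in `ephapax-typing`.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the core type language.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    I32,
    Fun(Vec<Ty>, Box<Ty>),
}

/// Simplified stand-in for core declarations; fn bodies are elided.
enum Decl {
    Fn { name: String, params: Vec<(String, Ty)>, ret_ty: Ty },
    Extern { name: String, abi: String, params: Vec<(String, Ty)>, ret_ty: Ty },
}

/// First pass: bind every declared name to its function type, so bodies
/// can refer to user fns and externs through the same lookup.
fn register_signatures(decls: &[Decl]) -> HashMap<String, Ty> {
    let mut env = HashMap::new();
    for decl in decls {
        let (name, params, ret_ty) = match decl {
            Decl::Fn { name, params, ret_ty } => (name, params, ret_ty),
            Decl::Extern { name, params, ret_ty, .. } => (name, params, ret_ty),
        };
        let param_tys: Vec<Ty> = params.iter().map(|(_, ty)| ty.clone()).collect();
        env.insert(name.clone(), Ty::Fun(param_tys, Box::new(ret_ty.clone())));
    }
    env
}

/// Second pass input: only declarations with a body get body-checked,
/// so externs are skipped.
fn decls_with_bodies(decls: &[Decl]) -> Vec<&str> {
    decls
        .iter()
        .filter_map(|decl| match decl {
            Decl::Fn { name, .. } => Some(name.as_str()),
            Decl::Extern { .. } => None,
        })
        .collect()
}
```

With an extern `host_identity(x: I32): I32` and a user fn `entry`, both names end up in the environment, but only `entry` is returned for body checking.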

Tests
-----

- New fixture `tests/v2-grammar/fixtures/extern-callsite.eph` declaring
  `extern "wasm" { fn host_identity(x: I32): I32 }` and calling it from
  a user fn `entry`. Without Phase B, this fails with
  `UnboundVariable(host_identity)`.

- New integration test `src/ephapax-cli/tests/v2_grammar_phase_b.rs`:
  parse → desugar → type-check, asserts both `Decl::Extern` and
  `Decl::Fn` are present in the core module and the whole thing
  type-checks.

- Full `cargo test --workspace` passes (no regressions in any of the
  ~38 test binaries across the workspace).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase C of #43, completing the parser → typecheck
→ codegen pipeline for `extern "abi" { fn name(..): R }` blocks. After
this commit, a program declaring and calling an extern fn compiles all
the way to a valid wasm module whose import section carries the
declaration; the call site emits a wasm `Call(import_idx)`.

The wasm function index space
-----------------------------

Wasm requires all imports to come before any function bodies in the
function-index space. To make room for K user extern imports between
the existing host imports (print_i32, print_string) and the runtime
helpers (bump_alloc, string_new, …), the runtime helper indices and
user-fn indices now shift by K.

That meant turning the previously-static `FN_BUMP_ALLOC = 2` family
and `FIRST_USER_FN = NUM_IMPORTS + NUM_RUNTIME_FNS` constants into
methods on `Codegen` (`self.fn_bump_alloc()`, `self.first_user_fn()`,
etc.) that consult `self.extern_imports.len()` for the offset. ~30
call sites swapped from `Call(FN_X)` to `Call(self.fn_x())`. The two
unconditionally-static constants left are `NUM_IMPORTS` and
`NUM_RUNTIME_FNS`.
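
The index arithmetic can be sketched as follows. This is an illustration rather than the actual `ephapax-wasm` code: `NUM_RUNTIME_FNS = 4` is a placeholder value, `extern_imports` holds plain names instead of `ExternImportInfo`, and `extern_import_idx` is a hypothetical accessor.

```rust
/// Host imports that always come first: print_i32, print_string.
const NUM_IMPORTS: u32 = 2;
/// ASSUMPTION: placeholder count; the real number of runtime helpers
/// is not stated in this commit message.
const NUM_RUNTIME_FNS: u32 = 4;

struct Codegen {
    /// One entry per user `extern` fn (simplified to its name here).
    extern_imports: Vec<String>,
}

impl Codegen {
    /// Number of user extern imports, i.e. the shift K.
    fn extern_count(&self) -> u32 {
        self.extern_imports.len() as u32
    }

    /// Index of the k-th user extern import, right after the host imports.
    fn extern_import_idx(&self, k: u32) -> u32 {
        NUM_IMPORTS + k
    }

    /// First runtime helper; was the static `FN_BUMP_ALLOC = 2`.
    fn fn_bump_alloc(&self) -> u32 {
        NUM_IMPORTS + self.extern_count()
    }

    /// First user-defined function; was `NUM_IMPORTS + NUM_RUNTIME_FNS`.
    fn first_user_fn(&self) -> u32 {
        NUM_IMPORTS + self.extern_count() + NUM_RUNTIME_FNS
    }
}
```

With no externs the methods return the old static values; each extern added shifts the helper and user-fn indices up by one.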

Mechanics
---------

- New `Codegen::extern_imports: Vec<ExternImportInfo>` populated by a
  first pass over `module.decls` in `collect_user_fns`. Each entry
  carries the ABI string, name, and a wasm-type-section index for
  the signature.
- `emit_imports` appends one `imports.import(abi, name, EntityType::Function(type_idx))`
  per extern, right after the two host imports.
- `user_fns` gets an entry for each extern as well, with `wasm_fn_idx`
  set to the import index. This lets the existing `compile_app` path
  (which already does `user_fns.get(name)` for direct calls) resolve
  an extern callsite to `Call(import_idx)` with zero new code.

Tests
-----

- New `src/ephapax-cli/tests/v2_grammar_phase_c.rs`:
  - `extern_callsite_emits_valid_wasm` — parses extern-callsite.eph,
    desugars, typechecks, codegens, and runs the bytes through
    `wasmparser::validate`. (Pre-Phase-C, the bytes would reference
    a non-existent function index → invalid wasm.)
  - `extern_callsite_emits_import_directive` — walks the wasm
    ImportSection and asserts there's exactly one user-extern entry
    `(import "wasm" "host_identity" (func ...))` plus the 2 host
    imports, total 3.

- `wasmparser` added as a dev-dep on ephapax-cli at the same version
  as ephapax-wasm (0.221).

- `cargo test --workspace` — all ~38 test binaries pass, including
  the wasm-encoder tests whose function indices implicitly shift
  through the runtime-helper renumbering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds clap aliases on the existing `compile` subcommand for the two
names hypatia's `build-gossamer-gui` workflow probes for. All three
spellings route to the same surface-parse → desugar → typecheck → wasm
pipeline that Phases A-C wired up:

    ephapax compile          input.eph
    ephapax compile-eph      input.eph     # alias
    ephapax compile-affine   input.eph     # alias

Why this is safe now (and was not before)
-----------------------------------------

The 2026-05-13 comment on #36 argued the aliases shouldn't land until
end-to-end compilation worked, otherwise they'd short-circuit hypatia's
fallback to `compile` with the same parse failure and replace the
existing structured `::warning::` with a silent error.

After Phases A-C in this PR, that argument is satisfied for the
`extern "abi"` portion of the v2 grammar — extern blocks parse,
typecheck, and codegen to valid wasm. `bridge.eph` itself still
needs the remaining grammar pieces (annotations + tuple destructuring
in `let!`), which is the next phase of work on #43. Once that lands,
the aliases pick up `bridge.eph` automatically — no further CLI
changes needed.

Tests
-----

- `src/ephapax-cli/tests/v2_grammar_phase_d_aliases.rs`:
  - `compile_eph_alias_routes_to_compile`
  - `compile_affine_alias_routes_to_compile`

  Both spawn the built `ephapax` binary and assert it compiles
  `extern-callsite.eph` to a wasm file with the `\0asm` magic bytes.

- New dev-deps: `tempfile = "3"`.

- `cargo test --workspace` — all binaries pass.
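
For reference, the magic-byte assertion amounts to something like the check below. `has_wasm_magic` is a hypothetical helper name; the actual tests may compare the bytes inline.

```rust
/// Returns true if `bytes` starts with the wasm binary magic `\0asm`
/// (the four bytes 0x00 0x61 0x73 0x6D).
fn has_wasm_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(b"\0asm")
}
```

This only inspects the first four bytes; it says nothing about whether the rest of the module is valid, which is what `wasmparser::validate` covers in the Phase C tests.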

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Next chunk of the v2-grammar work for #43, on top
of #45's Phase A→D. This PR drops three more bridge.eph blockers:

1. **`@identifier` decorator annotations** on top-level declarations.
   `@tail_recursive`, `@inline`, `@no_mangle`, etc. now parse without
   semantic effect — they're stored at the parser layer so the surface
   AST round-trips. Both the surface parser (`parse_surface_declaration`)
   and the core parser (`parse_declaration`) skip past annotation pairs
   before dispatching to the actual decl handler.

2. **Tuple destructuring in `let` / `let!` binders.** A new `let_binder`
   grammar rule accepts either a single identifier (current behaviour)
   or a `tuple_binder` of the form `(a, b, c, ...)`. Tuple binders lower
   at parse time to a 1-arm `match e of | (a, b, ...) => body end`,
   reusing the existing `Pattern::Pair` infrastructure.

   The desugar pass gains a fast-path: a 1-arm match whose pattern is
   not a `Constructor` delegates straight to `bind_single_pattern`,
   bypassing the sum-type case-tree builder that previously required
   constructor patterns.

3. **`(T1, T2)` resolves to `SurfaceTy::Prod`** (binary product) when
   there are exactly two element types, matching the value-side
   `paren_or_pair` convention (`(1, 2)` parses as `Pair{left, right}`). Three-or-more
   element types still become `SurfaceTy::Tuple`. Without this fix,
   `let p: (I32, I32) = (1, 2)` failed type-checking because the
   annotation type and the value's inferred type were structurally
   different.
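
The two-versus-many rule from item 3 can be sketched like this. The `SurfaceTy` enum is a cut-down stand-in and `paren_type` is a hypothetical function name; the sketch also assumes the caller only passes two or more element types, since the single-element case is not described here.

```rust
/// Cut-down stand-in for the surface type AST.
#[derive(Debug, PartialEq)]
enum SurfaceTy {
    I32,
    Prod(Box<SurfaceTy>, Box<SurfaceTy>),
    Tuple(Vec<SurfaceTy>),
}

/// Build the type for a parenthesised, comma-separated list of types:
/// exactly two elements become a binary product, anything else a tuple.
/// ASSUMPTION: `elems` has at least two entries.
fn paren_type(mut elems: Vec<SurfaceTy>) -> SurfaceTy {
    if elems.len() == 2 {
        // Pop in reverse order so `left` is the first written element.
        let right = elems.pop().unwrap();
        let left = elems.pop().unwrap();
        SurfaceTy::Prod(Box::new(left), Box::new(right))
    } else {
        SurfaceTy::Tuple(elems)
    }
}
```

Under this rule `(I32, I32)` and the value `(1, 2)` agree on a binary product, which is the mismatch the fix removes.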

What's *still* deferred (out of scope for this PR):

- **Implicit `in`** between sequential `let` bindings inside a fn body
  (bridge.eph chains `let! ... let ... let! ...` without `in` keywords).
  This is a deeper grammar change — needs a `block_expr` form and a
  rewire of how `fn_body` parses.
- **Abstract extern type registration** (`extern "abi" { type Window }`
  → typechecker-visible opaque type). Phase B still drops `ExternItem::Type`.
- **`Unit` as a type-name alias** for `()`.

Tests
-----

- New fixture `tests/v2-grammar/fixtures/let-pair-explicit-in.eph`
- New integration test file `src/ephapax-cli/tests/v2_grammar_phase_e.rs`:
  - `annotation_on_fn_decl_parses`
  - `multiple_annotations_on_fn_decl_parse`
  - `let_pair_binder_compiles_end_to_end` (parse → desugar → typecheck → wasm validate)
  - `let_pair_lin_binder_parses` (covers the `let!` form bridge.eph uses)

- `cargo test --workspace` clean — no regressions in the existing
  match / pair / desugar tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three more v2-grammar pieces for #43, on top of the
phase E PR. After this commit, hypatia's bridge.eph parses end-to-end
and desugars cleanly through every `extern "gossamer"` signature; the
remaining gate is cross-module type resolution for `Model`/`Msg`/etc.
imported from `hypatia/ui/gui` (#43 follow-up).

Phase F — implicit `in` between sequential lets
-----------------------------------------------

New grammar rules:

  block_expr      = { sequential_let+ ~ expression }
  sequential_let  = { ("let!" | "let") ~ let_binder ~ (":" ~ ty)? ~ "=" ~ block_rhs }
  block_rhs       = { lambda_expr | if_expr | region_expr | match_expr | case_expr | handle_expr | or_expr }

`block_expr` is added at the top of the `single_expr` choice so it's tried before the
existing `let_expr` / `let_lin_expr`. PEG ordering means the legacy
`let x = e in body` form still parses — if the parser sees `in` after
the rhs, `block_expr` rolls back and `let_expr` matches.

The block is folded at parse time into nested `Let` / `LetLin` AST nodes
(`Let{a, .., body: Let{b, .., body: <trailing>}}`). Tuple binders reuse the Phase E
`match_arm_from_tuple_binder` lowering, so

  let! (ch2, msg) = ipc_recv(ch)
  let new_model = update(msg, model)
  run(ch2, new_model)

works with no `in` keywords and a destructured first binding — exactly
the shape bridge.eph uses on its TEA loop.
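
The parse-time fold can be pictured as a right fold over the bindings. The sketch below uses a tiny stand-in `Expr` type and a hypothetical `SequentialLet` struct; it covers single-identifier binders only and leaves out the tuple-binder lowering.

```rust
/// Tiny stand-in for the expression AST.
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Let { name: String, value: Box<Expr>, body: Box<Expr> },
    LetLin { name: String, value: Box<Expr>, body: Box<Expr> },
}

/// One `let x = e` or `let! x = e` line without an `in` keyword.
/// Hypothetical struct for illustration.
struct SequentialLet {
    linear: bool,
    name: String,
    value: Expr,
}

/// Fold a run of sequential lets plus the trailing expression into nested
/// `Let` / `LetLin` nodes, so the first binding ends up outermost.
fn fold_block(lets: Vec<SequentialLet>, trailing: Expr) -> Expr {
    lets.into_iter().rev().fold(trailing, |body, binding| {
        let value = Box::new(binding.value);
        let body = Box::new(body);
        if binding.linear {
            Expr::LetLin { name: binding.name, value, body }
        } else {
            Expr::Let { name: binding.name, value, body }
        }
    })
}
```

Folding from the last binding backwards is what puts the first `let` outermost, so each later binding sits in the scope of the earlier ones.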

Phase G — abstract extern types
-------------------------------

`ExternItem::Type` items in `extern "abi" { type Foo }` blocks now
register in the `DataRegistry`'s new `extern_types` map. The
`desugar_named_type` path checks that map first; opaque extern types
resolve to `Ty::Base(BaseTy::I32)` (host handle / pointer
representation, matching the existing wasm import convention).

Type arguments on opaque types are rejected (`Window(T)` is an error)
since extern types are monomorphic by construction.

Phase H — Unit / Bytes built-in type aliases
--------------------------------------------

`Unit` resolves to `Ty::Base(BaseTy::Unit)` (the type-position spelling
of the literal `()`); `Bytes` resolves to `Ty::Base(BaseTy::I32)` as
the conventional host-managed buffer handle.

These should eventually migrate to a stdlib prelude — for now they're
hard-coded fast-paths in `desugar_named_type`, sitting before the data
registry / extern-type lookups.
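
Putting Phases G and H together, the lookup order might look roughly like the sketch below. The `Ty` enum and `DataRegistry` fields are simplified assumptions (sets instead of the real maps, a bare argument count instead of real type arguments), and the error strings are illustrative apart from the `unknown type` wording quoted under "bridge.eph status" below.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the core type language.
#[derive(Debug, PartialEq)]
enum Ty {
    I32,
    Unit,
    Data(String),
}

/// Simplified registry: just the names of known extern and data types.
struct DataRegistry {
    extern_types: HashSet<String>,
    data_types: HashSet<String>,
}

/// Resolve a named type: built-in aliases first, then opaque extern
/// types, then user data types.
fn desugar_named_type(reg: &DataRegistry, name: &str, type_arg_count: usize) -> Result<Ty, String> {
    // Phase H fast paths.
    match name {
        "Unit" => return Ok(Ty::Unit),
        "Bytes" => return Ok(Ty::I32),
        _ => {}
    }
    // Phase G: opaque extern types are i32 handles and take no arguments.
    if reg.extern_types.contains(name) {
        if type_arg_count > 0 {
            return Err(format!("extern type `{}` takes no type arguments", name));
        }
        return Ok(Ty::I32);
    }
    if reg.data_types.contains(name) {
        return Ok(Ty::Data(name.to_string()));
    }
    Err(format!("unknown type `{}`", name))
}
```

In this sketch `Window` declared in an extern block resolves to `Ty::I32`, `Window(T)` is an error, and a name found nowhere falls through to the `unknown type` error.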

Tests
-----

- `tests/v2-grammar/fixtures/implicit-in.eph`
- `tests/v2-grammar/fixtures/implicit-in-tuple.eph`
- `tests/v2-grammar/fixtures/extern-abstract-types.eph`
- `src/ephapax-cli/tests/v2_grammar_phase_f.rs` — 4 tests:
  - `implicit_in_chain_compiles`
  - `implicit_in_with_tuple_binders_compiles`
  - `legacy_explicit_in_still_compiles` (regression for the `in` form)
  - `implicit_in_let_lin_chain_parses`
- `src/ephapax-cli/tests/v2_grammar_phase_gh.rs` — 3 tests:
  - `extern_abstract_types_desugar_to_i32_handles` (parse → desugar →
    typecheck → wasm validate on a fixture using `Window`, `Channel`,
    `Bytes`, `Unit` opaquely)
  - `unit_alias_resolves_to_base_unit`
  - `bytes_alias_resolves_to_i32`

- `cargo test --workspace` clean — no regressions in existing
  match/pair/let/desugar/typing tests.

bridge.eph status
-----------------

After this commit, `cargo run -- compile-eph bridge.eph` fails with

  Desugar error: unknown type `Model`

— i.e. the parser, surface AST, and `extern "gossamer"` signature
desugar are all working. `Model` / `Msg` / `Department` etc. come from
`import hypatia/ui/gui`, which requires cross-module type resolution.
That's a separate piece of work tracked on #43.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the last v2-grammar gap for #43.
`compile-eph foo.eph` now resolves `import a/b/c` declarations against
the file system, parses + desugars + typechecks each module in
topological order, and emits a single wasm output that links them.

After this commit, hypatia's bridge.eph stops being a *language* problem
— `cargo run -- compile-eph bridge.eph` fails with a clean

  cannot read hypatia/ui/gui.eph: No such file or directory

i.e. the only remaining ingredient is the imported module's source file.

What changed
------------

- **`ephapax-surface`**: `SurfaceModule` gains an `imports: Vec<SurfaceImport>`
  field (the surface parser had been silently dropping `Rule::import_decl`
  pairs). `SurfaceDecl::Fn` and `SurfaceDecl::Type` gain a `visibility`
  field; new `SurfaceVisibility { Public, Private }` enum.

- **`ephapax-parser`**: `parse_surface_module` now collects `import_decl`
  and `module_decl` pairs in addition to declarations. `parse_fn_decl`
  and `parse_type_decl` detect the optional `Rule::visibility` (`pub`)
  pair and propagate it.

- **`ephapax-desugar`**: visibility flows from surface to core via a new
  `lower_visibility` helper. Surface imports propagate to
  `Module.imports`. `Desugarer` exposes `registry()` and
  `take_registry()` so callers can chain a single `DataRegistry` across
  multiple modules.

- **`ephapax-cli`**: new `import_resolver` module — walks the import
  graph DFS-style, detects cycles, returns modules in topological order
  (dependencies first, root last). `compile_file` is rewritten to:
    1. load the program via the resolver
    2. desugar each module with a chained `DataRegistry` so `data` and
       `extern type` declarations from imported modules resolve in the
       importer
    3. type-check each module against a chained `ModuleRegistry` so the
       importer's `Var(name)` lookups find public items from imports
    4. **merge** all imported modules' `Fn` / `Extern` decls into the
       root module before codegen — wasm produces a single binary, so
       each imported function becomes a regular function body in the
       output. Duplicate names across modules are dropped (first-seen
       wins).
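
The resolver's ordering step can be sketched as a depth-first post-order walk. This is an illustration under assumptions, not the `import_resolver` module itself: the import graph is given as an in-memory map instead of being read from disk, and `topo_order` and the error strings are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Return modules reachable from `root` with dependencies first and the
/// root last. `imports` maps each module name to the modules it imports.
/// Errors on an import cycle or on a module missing from the map.
fn topo_order(root: &str, imports: &HashMap<String, Vec<String>>) -> Result<Vec<String>, String> {
    fn visit(
        name: &str,
        imports: &HashMap<String, Vec<String>>,
        in_progress: &mut Vec<String>,
        done: &mut HashSet<String>,
        order: &mut Vec<String>,
    ) -> Result<(), String> {
        if done.contains(name) {
            return Ok(());
        }
        // A module that is still on the DFS stack means we looped back.
        if in_progress.iter().any(|m| m == name) {
            return Err(format!("import cycle: {} -> {}", in_progress.join(" -> "), name));
        }
        let deps = imports
            .get(name)
            .ok_or_else(|| format!("cannot read {}.eph", name))?;
        in_progress.push(name.to_string());
        for dep in deps {
            visit(dep, imports, in_progress, done, order)?;
        }
        in_progress.pop();
        done.insert(name.to_string());
        // Post-order push: all dependencies are already in `order`.
        order.push(name.to_string());
        Ok(())
    }

    let mut order = Vec::new();
    visit(root, imports, &mut Vec::new(), &mut HashSet::new(), &mut order)?;
    Ok(order)
}
```

For the two-module fixture, where `app` imports `lib/math`, this yields `lib/math` before `app`, which is the order needed to chain the registries.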

Tests
-----

- `tests/v2-grammar/fixtures/multi-module/app.eph` + `multi-module/lib/math.eph`
  — minimal 2-module program: `app.eph` imports `lib/math` and calls
  its public `double` fn.
- `src/ephapax-cli/tests/v2_grammar_phase_i.rs:cross_module_imports_compile_end_to_end`
  — spawns `ephapax compile-eph`, asserts the binary reports
  `Resolved 2 module(s) in import graph`, the output starts with the
  `\0asm` magic, and `wasmparser::validate` accepts it.

- `cargo test --workspace` — clean, no regressions across the
  ~40 test binaries (including the existing core-parser tests that
  declare functions without the new `visibility` field on
  `SurfaceDecl::Fn`/`Type`, which now defaults to `Private` via serde).

bridge.eph status
-----------------

End-to-end, the bridge.eph blocker chain is now:

  ✅ slash module paths (Phase A)
  ✅ extern blocks (Phase A-C)
  ✅ @decorator annotations (Phase E)
  ✅ tuple destructuring in let / let! (Phase E)
  ✅ implicit `in` between sequential lets (Phase F)
  ✅ abstract extern types (Phase G)
  ✅ Unit / Bytes type aliases (Phase H)
  ✅ cross-module type resolution (this commit)
  ⏳ hypatia/ui/gui.eph source file (out of ephapax's hands)

Future work (not in scope here):

  - Per-module name-mangling in codegen so duplicate fn names across
    modules don't require dedup-by-first-seen (today's behaviour).
  - `--include-path` flag for multiple search roots.
  - Package-manifest awareness (ephapax-package crate exists but isn't
    wired into the resolver yet).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The bridge.eph integration target for #43.
After this commit, `cargo run -- compile-eph bridge.eph` produces
~2.2KB of wasm output for the vendored hypatia bridge fixture
(`tests/v2-grammar/fixtures/hypatia-port/bridge.eph`).

Resolver
--------

- **Module-declaration index.** The import resolver now scans `base_dir`
  for every `.eph` file at startup, reads the first `module a/b/c` line
  of each, and builds a name → file-path map. `import a/b/c` first
  tries the literal `<base>/a/b/c.eph` path; on miss, falls back to the
  index. Lets corpora like hypatia's `src/ui/gossamer/` keep flat
  filenames (`hypatia_gui.eph`) while their declared module name
  (`hypatia/ui/gui`) drives import resolution.
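
A minimal sketch of the literal-path-then-index lookup, under simplifying assumptions: files are modelled as an in-memory map from path (relative to `base_dir`) to source text, and `declared_module`, `build_index`, and `resolve_import` are hypothetical names, not the resolver's real API.

```rust
use std::collections::HashMap;

/// Find the first `module a/b/c` line in a source file, if any.
fn declared_module(source: &str) -> Option<&str> {
    source
        .lines()
        .find_map(|line| line.trim().strip_prefix("module "))
        .map(str::trim)
}

/// Build the declared-module-name -> file-path index over all files.
fn build_index(files: &HashMap<String, String>) -> HashMap<String, String> {
    files
        .iter()
        .filter_map(|(path, source)| {
            declared_module(source).map(|module| (module.to_string(), path.clone()))
        })
        .collect()
}

/// Resolve `import a/b/c`: try the literal `a/b/c.eph` path first, then
/// fall back to the module-declaration index.
fn resolve_import(
    import: &str,
    files: &HashMap<String, String>,
    index: &HashMap<String, String>,
) -> Option<String> {
    let literal = format!("{}.eph", import);
    if files.contains_key(&literal) {
        return Some(literal);
    }
    index.get(import).cloned()
}
```

With a flat `hypatia_gui.eph` that declares `module hypatia/ui/gui`, the literal path misses and the index supplies the file.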

Language additions
------------------

- **`pub data Foo = ...`.** Grammar now accepts `visibility?` before
  `data` declarations.
- **Record/sum type aliases.** `type Foo = { f1: T1, f2: T2 }` and
  `type Bar = | A | B(I32)` previously failed at parse. Records lower
  to right-nested binary products; sums lower to right-nested binary
  sums.
- **Record literal field separators.** `record_field_assign` accepts
  three surface forms — `f: ty = v`, `f = v`, and `f: v` (ML-style
  shorthand). Records lower to positional pairs / tuples.
- **`type Foo = T` alias resolution in desugar.** New `type_aliases`
  map on `DataRegistry` captures alias bodies in surface form;
  `desugar_named_type` looks them up and recursively expands.
- **`pub` keyword in `parse_data_decl`.** The parser now skips a leading
  `pub`, which was previously being consumed as the data type name.
- **Match-on-literal lowering.** `match n of | 0 => a | 1 => b | _ => c`
  desugars to nested `if scrutinee == lit then arm else next` ending
  in the default branch. Required by bridge.eph's `int_to_department`
  and `decode_msg`.
- **Bare string literals.** Desugar wraps `Literal::String(s)` as
  `StringNew { region: "_", value: s }`. The typechecker's region-
  activation gate exempts `_` (the wildcard region for inferred String
  types).
- **Nullary fn signatures are exposed as `() -> T`.** `fn foo(): T = ...` was
  previously registered as having type `T` directly, making `foo()` at
  a call site fail to unify. Three pre-pass / registry call sites
  updated.
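
To illustrate the match-on-literal lowering from the list above, here is a small sketch over a stand-in `Expr` type. `lower_literal_match` is a hypothetical name, and the sketch assumes the scrutinee is cheap to duplicate (for example a variable), since it is cloned into every comparison.

```rust
/// Stand-in expression AST, just enough for the lowering.
#[derive(Debug, PartialEq, Clone)]
enum Expr {
    Var(String),
    Int(i64),
    Eq(Box<Expr>, Box<Expr>),
    If { cond: Box<Expr>, then_branch: Box<Expr>, else_branch: Box<Expr> },
}

/// Lower `match s of | l1 => a1 | l2 => a2 | _ => d` into
/// `if s == l1 then a1 else if s == l2 then a2 else d`.
fn lower_literal_match(scrutinee: &Expr, arms: Vec<(i64, Expr)>, default: Expr) -> Expr {
    // Build from the last arm backwards so the first arm is tested first.
    arms.into_iter().rev().fold(default, |next, (lit, arm)| Expr::If {
        cond: Box::new(Expr::Eq(
            Box::new(scrutinee.clone()),
            Box::new(Expr::Int(lit)),
        )),
        then_branch: Box::new(arm),
        else_branch: Box::new(next),
    })
}
```

For `match n of | 0 => a | 1 => b | _ => c` this produces an `If` on `n == 0` whose else branch is an `If` on `n == 1`, ending in `c`.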

Vendored fixture
----------------

`tests/v2-grammar/fixtures/hypatia-port/{bridge,hypatia_gui}.eph` are
local adaptations of hypatia's upstream files. Four changes versus
upstream (all documented in the file headers):

  1. `module hypatia/ui/gui` header on hypatia_gui.eph
  2. `pub` keywords on items bridge.eph imports
  3. `model.field_name` rewritten as positional `.0` / `.1` (named
     field access remains future work)
  4. `decode_msg` reparses bytes per use so each linear `String` is
     consumed exactly once

Tests
-----

- `src/ephapax-cli/tests/v2_grammar_phase_j.rs::bridge_eph_compiles_end_to_end`
  — spawns `ephapax compile-eph`, asserts return code 0, output ≥1KB
  starting with wasm magic bytes.

  Note: full `wasmparser::validate` does not yet pass on the bridge
  output — ADT-constructor / match-arm codegen produces a stack
  mismatch ("expected i32, nothing on stack") which is left for a
  follow-up. Parse + typecheck + binary emission are all covered.

- `cargo test --workspace` clean, no regressions on the existing
  ~40 test binaries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hyperpolymath
Owner Author

Conflict triage 2026-05-20 evening

This branch is 10 commits ahead of main. `git merge origin/main` produces 8 conflicts across 3 critical Rust files (1 in ephapax-surface, 5 in ephapax-typing, 2 in ephapax-wasm), plus 59 additive estate-wide changes that auto-merged cleanly (CI/governance/SPDX, etc.).

The 8 surface/typing/wasm conflicts are in code I haven't read closely enough to resolve safely. They need the v2-grammar author's judgement about which side wins where main has overlapping changes to AST/typing/codegen. Resolving with "accept theirs" would silently overwrite estate-wide refactors that landed on main; resolving with "accept ours" would silently overwrite the v2-grammar Phases B–J work that this branch was built for.

Recommended next step: a focused session that (a) rebases this branch on main with manual resolution of the 8 conflicts, (b) verifies `cargo test` + `just golden` post-rebase, (c) pushes a clean branch. Without that, the PR will keep accumulating drift with each merge to main.

Closing as triage_needed is also an option if the v2-grammar work is being superseded by a different approach.

@hyperpolymath
Owner Author

Closing as superseded.

Audit of this branch (wip/v2-grammar-2026-05-16) against current main:

Trying to resolve conflicts here would have meant either dragging the pre-re-port phase commits back over the cleaner versions on main, or doing a massive -Xtheirs main resolution that effectively reverts to main anyway. Closing in favour of #112.

Refs #80.

hyperpolymath added a commit that referenced this pull request May 20, 2026
Cherry-pick of `a99dc21` from #109 — the only commit on that branch
worth keeping after the v2 grammar work was re-landed on main via PRs
#46/#54/#57/#58/#62-#65/#69-#74/#76/#78/#80.

Adds `examples/v2/hello.eph` (13 lines) demonstrating the v2 grammar
surface: `module std/io`, `extern "wasm" { fn ... }`, `pub fn main`,
`let!` linear bindings.

Closes #109 (the rest was superseded — see close comment on #109).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath deleted the wip/v2-grammar-2026-05-16 branch May 21, 2026 06:54