Wip/v2 grammar 2026 05 16#109
Phase A of #43 (v2 grammar). Lands the two pieces that hyperpolymath/hypatia's build-gossamer-gui workflow blocks on first:

- **Slash-segmented module paths.** `qualified_name` now accepts both `.` and `/` separators. `module hypatia/ui/bridge` and `import hypatia/ui/gui` parse; existing dot-form (`module GSA.Core.Types`) is unchanged.
- **`extern "abi" { ... }` blocks.** New grammar rule `extern_block` with `extern_type` (`type Foo`) and `extern_fn` (`fn name(..): R`) items. Surface AST gains `SurfaceDecl::Extern(ExternBlock)` carrying the ABI string + items. `extern` is now a reserved keyword.

Surface parser materialises the new declaration. Desugar drops extern items from the core module for this phase; ambient-binding registration into the typechecker env and wasm import-directive emit in codegen are the next phase's work (still tracked in #43).

Adds `tests/v2-grammar/fixtures/{minimal-module,minimal-extern}.eph` plus integration tests at `src/ephapax-parser/tests/v2_grammar.rs`. The full hypatia `bridge.eph` fixture is deferred: it also needs `@tail_recursive` annotations and tuple destructuring in `let!`/`let` bindings, which are separate grammar additions.

No regressions: 70 existing tests across ephapax-parser / ephapax-surface / ephapax-desugar still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase B of #43, on top of Phase A in the same PR. Extends the core AST and the typechecker so an `extern "abi" { ... }` block doesn't just parse but is reachable from the function body via the normal name-resolution path.

Changes
-------
- **`ephapax-syntax`**: new `Decl::Extern { name, abi, params, ret_ty }` variant. Carries the same shape as a `Fn` signature, with no body and no polymorphism (extern items declare at concrete types).
- **`ephapax-desugar`**: `SurfaceDecl::Extern(block)` lowers to one `Decl::Extern` per `ExternItem::Fn`. `ExternItem::Type` is still dropped; abstract-type registration lands later (tracked in #43).
- **`ephapax-typing`**: the first pass of `type_check_module_inner` registers `Decl::Extern` names in the typechecker env with their declared function type; the body-check pass skips them (no body). `ModuleRegistry::register` exports extern items as public so importers can call them.
- **`ephapax-wasm`**: extern items are skipped in `collect_user_fns` and `append_user_funcs` for now; wasm import-directive emit is Phase C.
- **`ephapax-ir`** / **`ephapax-lsp`** / **`ephapax-linear`**: round-trip the new variant through SExpr encoding/decoding, LSP DeclInfo, and the affine/linear discipline walkers. The walkers treat externs as no-ops, since they have no body to enforce discipline against.

Tests
-----
- New fixture `tests/v2-grammar/fixtures/extern-callsite.eph` declaring `extern "wasm" { fn host_identity(x: I32): I32 }` and calling it from a user fn `entry`. Without Phase B, this fails with `UnboundVariable(host_identity)`.
- New integration test `src/ephapax-cli/tests/v2_grammar_phase_b.rs`: parse → desugar → type-check, asserts both `Decl::Extern` and `Decl::Fn` are present in the core module and the whole thing type-checks.
- Full `cargo test --workspace` passes (no regressions in any of the ~38 test binaries across the workspace).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
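
The two-pass registration described for `ephapax-typing` in the Phase B commit can be sketched as follows. This is a minimal illustration, not the actual `type_check_module_inner`: the `Decl` shape is simplified, the `calls` field and `check_module` function are hypothetical stand-ins for real body checking, and only name resolution is modelled.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    I32,
}

// Simplified stand-in for the core AST. `calls` is a hypothetical field that
// lists the names a function body refers to, in place of a real expression tree.
#[allow(dead_code)]
enum Decl {
    Extern { name: String, abi: String, params: Vec<Ty>, ret_ty: Ty },
    Fn { name: String, params: Vec<Ty>, ret_ty: Ty, calls: Vec<String> },
}

// Pass 1 registers every signature (extern or not) in the environment.
// Pass 2 checks bodies only; externs are skipped because they have none.
fn check_module(decls: &[Decl]) -> Result<HashMap<String, (Vec<Ty>, Ty)>, String> {
    let mut env = HashMap::new();
    for decl in decls {
        match decl {
            Decl::Extern { name, params, ret_ty, .. }
            | Decl::Fn { name, params, ret_ty, .. } => {
                env.insert(name.clone(), (params.clone(), ret_ty.clone()));
            }
        }
    }
    for decl in decls {
        if let Decl::Fn { calls, .. } = decl {
            for callee in calls {
                if !env.contains_key(callee) {
                    return Err(format!("UnboundVariable({})", callee));
                }
            }
        }
    }
    Ok(env)
}
```

With the extern registered in the first pass, a call to `host_identity` from `entry` resolves; without it, the sketch reports the same kind of unbound-variable error the commit mentions.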
Phase C of #43, completing the parser → typecheck → codegen pipeline for `extern "abi" { fn name(..): R }` blocks. After this commit, a program declaring and calling an extern fn compiles all the way to a valid wasm module whose import section carries the declaration; the call site emits a wasm `Call(import_idx)`.

The wasm function index space
-----------------------------
Wasm requires all imports to come before any function bodies in the function-index space. To make room for K user extern imports between the existing host imports (print_i32, print_string) and the runtime helpers (bump_alloc, string_new, …), the runtime helper indices and user-fn indices now shift by K.

That meant turning the previously-static `FN_BUMP_ALLOC = 2` family and `FIRST_USER_FN = NUM_IMPORTS + NUM_RUNTIME_FNS` constants into methods on `Codegen` (`self.fn_bump_alloc()`, `self.first_user_fn()`, etc.) that consult `self.extern_imports.len()` for the offset. ~30 call sites swapped from `Call(FN_X)` to `Call(self.fn_x())`. The two unconditionally-static constants left are `NUM_IMPORTS` and `NUM_RUNTIME_FNS`.

Mechanics
---------
- New `Codegen::extern_imports: Vec<ExternImportInfo>` populated by a first pass over `module.decls` in `collect_user_fns`. Each entry carries the ABI string, name, and a wasm-type-section index for the signature.
- `emit_imports` appends one `imports.import(abi, name, EntityType::Function(type_idx))` per extern, right after the two host imports.
- `user_fns` gets an entry for each extern as well, with `wasm_fn_idx` set to the import index. This lets the existing `compile_app` path (which already does `user_fns.get(name)` for direct calls) resolve an extern callsite to `Call(import_idx)` with zero new code.

Tests
-----
- New `src/ephapax-cli/tests/v2_grammar_phase_c.rs`:
  - `extern_callsite_emits_valid_wasm`: parses extern-callsite.eph, desugars, typechecks, codegens, and runs the bytes through `wasmparser::validate`. (Pre-Phase-C, the bytes would reference a non-existent function index → invalid wasm.)
  - `extern_callsite_emits_import_directive`: walks the wasm ImportSection and asserts there's exactly one user-extern entry `(import "wasm" "host_identity" (func ...))` plus the 2 host imports, total 3.
- `wasmparser` added as a dev-dep on ephapax-cli at the same version as ephapax-wasm (0.221).
- `cargo test --workspace`: all ~38 test binaries pass, including the wasm-encoder tests whose function indices implicitly shift through the runtime-helper renumbering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
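
The index shifting described in the Phase C commit can be sketched as below. This is an illustration under assumptions rather than the real `Codegen`: the value of `NUM_RUNTIME_FNS` is made up for the example, `extern_import_idx` is a hypothetical helper, and only `NUM_IMPORTS = 2` and the `2 + K` position of `bump_alloc` follow from the commit text.

```rust
// Layout assumed from the commit message:
// [host imports][K user extern imports][runtime helpers][user fns]
const NUM_IMPORTS: u32 = 2; // print_i32, print_string
const NUM_RUNTIME_FNS: u32 = 4; // ASSUMED value, for illustration only

#[allow(dead_code)]
struct ExternImportInfo {
    abi: String,
    name: String,
    type_idx: u32,
}

struct Codegen {
    extern_imports: Vec<ExternImportInfo>,
}

impl Codegen {
    // K: number of user extern imports that precede the runtime helpers.
    fn extern_count(&self) -> u32 {
        self.extern_imports.len() as u32
    }

    // Hypothetical helper: import index of a user extern, placed right
    // after the two host imports.
    fn extern_import_idx(&self, name: &str) -> Option<u32> {
        self.extern_imports
            .iter()
            .position(|e| e.name == name)
            .map(|i| NUM_IMPORTS + i as u32)
    }

    // Formerly the static `FN_BUMP_ALLOC = 2`; now shifted by K.
    fn fn_bump_alloc(&self) -> u32 {
        NUM_IMPORTS + self.extern_count()
    }

    // Formerly `FIRST_USER_FN = NUM_IMPORTS + NUM_RUNTIME_FNS`; now shifted by K.
    fn first_user_fn(&self) -> u32 {
        NUM_IMPORTS + self.extern_count() + NUM_RUNTIME_FNS
    }
}
```

With a single extern `host_identity`, the extern takes index 2 and `bump_alloc` moves from 2 to 3, which is why call sites have to go through methods instead of constants.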
Adds clap aliases on the existing `compile` subcommand for the two
names hypatia's `build-gossamer-gui` workflow probes for. All three
spellings route to the same surface-parse → desugar → typecheck → wasm
pipeline that Phases A-C wired up:

    ephapax compile input.eph
    ephapax compile-eph input.eph      # alias
    ephapax compile-affine input.eph   # alias

Why this is safe now (and was not before)
-----------------------------------------
The 2026-05-13 comment on #36 argued the aliases shouldn't land until
end-to-end compilation worked, otherwise they'd short-circuit hypatia's
fallback to `compile` with the same parse failure and replace the
existing structured `::warning::` with a silent error.
After Phases A-C in this PR, that argument is satisfied for the
`extern "abi"` portion of the v2 grammar — extern blocks parse,
typecheck, and codegen to valid wasm. `bridge.eph` itself still
needs the remaining grammar pieces (annotations + tuple destructuring
in `let!`), which is the next phase of work on #43. Once that lands,
the aliases pick up `bridge.eph` automatically — no further CLI
changes needed.
Tests
-----
- `src/ephapax-cli/tests/v2_grammar_phase_d_aliases.rs`:
- `compile_eph_alias_routes_to_compile`
- `compile_affine_alias_routes_to_compile`
Both spawn the built `ephapax` binary and assert it compiles
`extern-callsite.eph` to a wasm file with the `\0asm` magic bytes.
- New dev-deps: `tempfile = "3"`.
- `cargo test --workspace` — all binaries pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
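
The alias routing above can be pictured with a dependency-free stand-in. The PR itself uses clap aliases on the `compile` subcommand; the `Subcommand` enum and `resolve_subcommand` function here are hypothetical and only show that all three spellings land on the same pipeline entry point.

```rust
#[derive(Debug, PartialEq)]
enum Subcommand {
    Compile,
}

// Hand-rolled equivalent of "one subcommand, two aliases". This is NOT the
// clap configuration from the PR, just the same routing expressed as a match.
fn resolve_subcommand(name: &str) -> Option<Subcommand> {
    match name {
        "compile" | "compile-eph" | "compile-affine" => Some(Subcommand::Compile),
        _ => None,
    }
}
```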
Next chunk of the v2-grammar work for #43, on top of #45's Phase A→D. This PR drops three more bridge.eph blockers:

1. **`@identifier` decorator annotations** on top-level declarations. `@tail_recursive`, `@inline`, `@no_mangle`, etc. now parse without semantic effect; they're stored at the parser layer so the surface AST round-trips. Both the surface parser (`parse_surface_declaration`) and the core parser (`parse_declaration`) skip past annotation pairs before dispatching to the actual decl handler.
2. **Tuple destructuring in `let` / `let!` binders.** A new `let_binder` grammar rule accepts either a single identifier (current behaviour) or a `tuple_binder` of the form `(a, b, c, ...)`. Tuple binders lower at parse time to a 1-arm `match e of | (a, b, ...) => body end`, reusing the existing `Pattern::Pair` infrastructure. The desugar pass gains a fast-path: a 1-arm match whose pattern is not a `Constructor` delegates straight to `bind_single_pattern`, bypassing the sum-type case-tree builder that previously required constructor patterns.
3. **`(T1, T2)` resolves to `SurfaceTy::Prod`** (binary product) when there are exactly two element types, matching the value-side `paren_or_pair` convention (`(1, 2)` parses as `Pair{left, right}`). Three-or-more element types still become `SurfaceTy::Tuple`. Without this fix, `let p: (I32, I32) = (1, 2)` failed type-checking because the annotation type and the value's inferred type were structurally different.

What's *still* deferred (out of scope for this PR):

- **Implicit `in`** between sequential `let` bindings inside a fn body (bridge.eph chains `let! ... let ... let! ...` without `in` keywords). This is a deeper grammar change: it needs a `block_expr` form and a rewire of how `fn_body` parses.
- **Abstract extern type registration** (`extern "abi" { type Window }` → typechecker-visible opaque type). Phase B still drops `ExternItem::Type`.
- **`Unit` as a type-name alias** for `()`.

Tests
-----
- New fixture `tests/v2-grammar/fixtures/let-pair-explicit-in.eph`
- New integration test file `src/ephapax-cli/tests/v2_grammar_phase_e.rs`:
  - `annotation_on_fn_decl_parses`
  - `multiple_annotations_on_fn_decl_parse`
  - `let_pair_binder_compiles_end_to_end` (parse → desugar → typecheck → wasm validate)
  - `let_pair_lin_binder_parses` (covers the `let!` form bridge.eph uses)
- `cargo test --workspace` clean; no regressions in the existing match / pair / desugar tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
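
The `(T1, T2)` rule from item 3 of the Phase E commit can be sketched as a small arity dispatch. The `SurfaceTy` enum below is a cut-down stand-in and `resolve_paren_type` is a hypothetical name; the commit only specifies the two-element and three-or-more cases, so the handling of zero or one element here is an assumption made to keep the function total.

```rust
#[derive(Debug, Clone, PartialEq)]
enum SurfaceTy {
    Named(String),
    Prod(Box<SurfaceTy>, Box<SurfaceTy>),
    Tuple(Vec<SurfaceTy>),
}

// Exactly two element types become a binary product, mirroring how the value
// `(1, 2)` parses as a pair. Three or more stay a tuple.
fn resolve_paren_type(mut elems: Vec<SurfaceTy>) -> SurfaceTy {
    match elems.len() {
        2 => {
            let right = elems.pop().unwrap();
            let left = elems.pop().unwrap();
            SurfaceTy::Prod(Box::new(left), Box::new(right))
        }
        // ASSUMPTION: arities 0 and 1 are not described in the commit; they
        // fall through to `Tuple` here purely for illustration.
        _ => SurfaceTy::Tuple(elems),
    }
}
```

Keeping the type side and the value side on the same binary shape is what lets an annotation like `(I32, I32)` unify with the inferred type of `(1, 2)`.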
Three more v2-grammar pieces for #43, on top of the phase E PR. After this commit, hypatia's bridge.eph parses end-to-end and desugars cleanly through every `extern "gossamer"` signature; the remaining gate is cross-module type resolution for `Model`/`Msg`/etc. imported from `hypatia/ui/gui` (#43 follow-up).

Phase F: implicit `in` between sequential lets
----------------------------------------------
New grammar rule:

    block_expr     = { sequential_let+ ~ expression }
    sequential_let = { ("let!" | "let") ~ let_binder ~ (":" ~ ty)? ~ "=" ~ block_rhs }
    block_rhs      = { lambda_expr | if_expr | region_expr | match_expr | case_expr | handle_expr | or_expr }

added at the top of the `single_expr` choice so it's tried before the existing `let_expr` / `let_lin_expr`. PEG ordering means the legacy `let x = e in body` form still parses: if the parser sees `in` after the rhs, `block_expr` rolls back and `let_expr` matches.

Folded at parse time into nested `Let` / `LetLin` AST nodes (`Let{a, .., body: Let{b, .., body: <trailing>}}`). Tuple binders reuse the Phase E `match_arm_from_tuple_binder` lowering, so

    let! (ch2, msg) = ipc_recv(ch)
    let new_model = update(msg, model)
    run(ch2, new_model)

works with no `in` keywords and a destructured first binding, which is exactly the shape bridge.eph uses on its TEA loop.

Phase G: abstract extern types
------------------------------
`ExternItem::Type` items in `extern "abi" { type Foo }` blocks now register in the `DataRegistry`'s new `extern_types` map. The `desugar_named_type` path checks that map first; opaque extern types resolve to `Ty::Base(BaseTy::I32)` (host handle / pointer representation, matching the existing wasm import convention). Type arguments on opaque types are rejected (`Window(T)` is an error) since extern types are monomorphic by construction.

Phase H: Unit / Bytes built-in type aliases
-------------------------------------------
`Unit` resolves to `Ty::Base(BaseTy::Unit)` (the type-position spelling of the literal `()`); `Bytes` resolves to `Ty::Base(BaseTy::I32)` as the conventional host-managed buffer handle. These should eventually migrate to a stdlib prelude; for now they're hard-coded fast-paths in `desugar_named_type`, sitting before the data registry / extern-type lookups.

Tests
-----
- `tests/v2-grammar/fixtures/implicit-in.eph`
- `tests/v2-grammar/fixtures/implicit-in-tuple.eph`
- `tests/v2-grammar/fixtures/extern-abstract-types.eph`
- `src/ephapax-cli/tests/v2_grammar_phase_f.rs`, 4 tests:
  - `implicit_in_chain_compiles`
  - `implicit_in_with_tuple_binders_compiles`
  - `legacy_explicit_in_still_compiles` (regression for the `in` form)
  - `implicit_in_let_lin_chain_parses`
- `src/ephapax-cli/tests/v2_grammar_phase_gh.rs`, 3 tests:
  - `extern_abstract_types_desugar_to_i32_handles` (parse → desugar → typecheck → wasm validate on a fixture using `Window`, `Channel`, `Bytes`, `Unit` opaquely)
  - `unit_alias_resolves_to_base_unit`
  - `bytes_alias_resolves_to_i32`
- `cargo test --workspace` clean; no regressions in existing match/pair/let/desugar/typing tests.

bridge.eph status
-----------------
After this commit, `cargo run -- compile-eph bridge.eph` fails with

    Desugar error: unknown type `Model`

i.e. the parser, surface AST, and `extern "gossamer"` signature desugar are all working. `Model` / `Msg` / `Department` etc. come from `import hypatia/ui/gui`, which requires cross-module type resolution. That's a separate piece of work tracked on #43.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
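
The Phase F fold of sequential lets into nested `Let` / `LetLin` nodes can be sketched as a right fold over the bindings. The `Expr` enum, the `SequentialLet` struct, and `fold_block` are simplified, hypothetical stand-ins; tuple binders and type annotations are left out.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Let { name: String, value: Box<Expr>, body: Box<Expr> },
    LetLin { name: String, value: Box<Expr>, body: Box<Expr> },
}

// One `let x = e` or `let! x = e` line of a block, before folding.
struct SequentialLet {
    linear: bool,
    name: String,
    value: Expr,
}

// Fold from the last binding outwards so that each earlier binding wraps the
// later ones: Let{a, .., body: Let{b, .., body: <trailing>}}.
fn fold_block(lets: Vec<SequentialLet>, trailing: Expr) -> Expr {
    lets.into_iter().rev().fold(trailing, |body, binding| {
        let value = Box::new(binding.value);
        let body = Box::new(body);
        if binding.linear {
            Expr::LetLin { name: binding.name, value, body }
        } else {
            Expr::Let { name: binding.name, value, body }
        }
    })
}
```

Folding at parse time means later passes only ever see the nested form, so the legacy `let x = e in body` syntax and the implicit-`in` syntax share one AST shape.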
Closes the last v2-grammar gap for #43. `compile-eph foo.eph` now resolves `import a/b/c` declarations against the file system, parses + desugars + typechecks each module in topological order, and emits a single wasm output that links them.

After this commit, hypatia's bridge.eph stops being a *language* problem. `cargo run -- compile-eph bridge.eph` fails with a clean

    cannot read hypatia/ui/gui.eph: No such file or directory

i.e. the only remaining ingredient is the imported module's source file.

What changed
------------
- **`ephapax-surface`**: `SurfaceModule` gains an `imports: Vec<SurfaceImport>` field (the surface parser had been silently dropping `Rule::import_decl` pairs). `SurfaceDecl::Fn` and `SurfaceDecl::Type` gain a `visibility` field; new `SurfaceVisibility { Public, Private }` enum.
- **`ephapax-parser`**: `parse_surface_module` now collects `import_decl` and `module_decl` pairs in addition to declarations. `parse_fn_decl` and `parse_type_decl` detect the optional `Rule::visibility` (`pub`) pair and propagate it.
- **`ephapax-desugar`**: visibility flows from surface to core via a new `lower_visibility` helper. Surface imports propagate to `Module.imports`. `Desugarer` exposes `registry()` and `take_registry()` so callers can chain a single `DataRegistry` across multiple modules.
- **`ephapax-cli`**: new `import_resolver` module that walks the import graph DFS-style, detects cycles, and returns modules in topological order (dependencies first, root last). `compile_file` is rewritten to:
  1. load the program via the resolver
  2. desugar each module with a chained `DataRegistry` so `data` and `extern type` declarations from imported modules resolve in the importer
  3. type-check each module against a chained `ModuleRegistry` so the importer's `Var(name)` lookups find public items from imports
  4. **merge** all imported modules' `Fn` / `Extern` decls into the root module before codegen. Wasm produces a single binary, so each imported function becomes a regular function body in the output. Duplicate names across modules are dropped (first-seen wins).

Tests
-----
- `tests/v2-grammar/fixtures/multi-module/app.eph` + `multi-module/lib/math.eph`: minimal 2-module program in which `app.eph` imports `lib/math` and calls its public `double` fn.
- `src/ephapax-cli/tests/v2_grammar_phase_i.rs:cross_module_imports_compile_end_to_end`: spawns `ephapax compile-eph`, asserts the binary reports `Resolved 2 module(s) in import graph`, the output starts with the `\0asm` magic, and `wasmparser::validate` accepts it.
- `cargo test --workspace`: clean, no regressions across the ~40 test binaries (including the existing core-parser tests that declare functions without the new `visibility` field on `SurfaceDecl::Fn`/`Type`, which now defaults to `Private` via serde).

bridge.eph status
-----------------
End-to-end, the bridge.eph blocker chain is now:

- ✅ slash module paths (Phase A)
- ✅ extern blocks (Phase A-C)
- ✅ @decorator annotations (Phase E)
- ✅ tuple destructuring in let / let! (Phase E)
- ✅ implicit `in` between sequential lets (Phase F)
- ✅ abstract extern types (Phase G)
- ✅ Unit / Bytes type aliases (Phase H)
- ✅ cross-module type resolution (this commit)
- ⏳ hypatia/ui/gui.eph source file (out of ephapax's hands)

Future work (not in scope here):

- Per-module name-mangling in codegen so duplicate fn names across modules don't require dedup-by-first-seen (today's behaviour).
- `--include-path` flag for multiple search roots.
- Package-manifest awareness (ephapax-package crate exists but isn't wired into the resolver yet).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
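
The resolver behaviour described above (DFS walk, cycle detection, dependencies first and root last) can be sketched over an in-memory graph. This is a minimal illustration, not the actual `import_resolver`: the `visit` and `resolve_order` names are hypothetical, and the real module reads `.eph` files from disk rather than taking a prebuilt map.

```rust
use std::collections::{HashMap, HashSet};

// Depth-first post-order visit: a module is pushed to `order` only after all
// of its imports, which yields "dependencies first, root last".
fn visit(
    name: &str,
    graph: &HashMap<String, Vec<String>>,
    in_progress: &mut HashSet<String>,
    done: &mut HashSet<String>,
    order: &mut Vec<String>,
) -> Result<(), String> {
    if done.contains(name) {
        return Ok(());
    }
    // Re-entering a module that is still being visited means a cycle.
    if !in_progress.insert(name.to_string()) {
        return Err(format!("import cycle through `{}`", name));
    }
    if let Some(deps) = graph.get(name) {
        for dep in deps {
            visit(dep, graph, in_progress, done, order)?;
        }
    }
    in_progress.remove(name);
    done.insert(name.to_string());
    order.push(name.to_string());
    Ok(())
}

// Hypothetical entry point: module name -> imported module names.
fn resolve_order(
    root: &str,
    graph: &HashMap<String, Vec<String>>,
) -> Result<Vec<String>, String> {
    let mut order = Vec::new();
    visit(root, graph, &mut HashSet::new(), &mut HashSet::new(), &mut order)?;
    Ok(order)
}
```

For the two-module fixture, where `app` imports `lib/math`, this ordering puts `lib/math` before `app`, so registries chained in that order already contain the imported declarations when the importer is processed.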
The bridge.eph integration target for #43. After this commit, `cargo run -- compile-eph bridge.eph` produces ~2.2KB of wasm output for the vendored hypatia bridge fixture (`tests/v2-grammar/fixtures/hypatia-port/bridge.eph`).

Resolver
--------
- **Module-declaration index.** The import resolver now scans `base_dir` for every `.eph` file at startup, reads the first `module a/b/c` line of each, and builds a name → file-path map. `import a/b/c` first tries the literal `<base>/a/b/c.eph` path; on miss, falls back to the index. Lets corpora like hypatia's `src/ui/gossamer/` keep flat filenames (`hypatia_gui.eph`) while their declared module name (`hypatia/ui/gui`) drives import resolution.

Language additions
------------------
- **`pub data Foo = ...`.** Grammar now accepts `visibility?` before `data` declarations.
- **Record/sum type aliases.** `type Foo = { f1: T1, f2: T2 }` and `type Bar = | A | B(I32)` previously failed at parse. Records lower to right-nested binary products; sums lower to right-nested binary sums.
- **Record literal field separators.** `record_field_assign` accepts three surface forms: `f: ty = v`, `f = v`, and `f: v` (ML-style shorthand). Records lower to positional pairs / tuples.
- **`type Foo = T` alias resolution in desugar.** New `type_aliases` map on `DataRegistry` captures alias bodies in surface form; `desugar_named_type` looks them up and recursively expands.
- **`pub` keyword in `parse_data_decl`**: was being eaten as the data type name.
- **Match-on-literal lowering.** `match n of | 0 => a | 1 => b | _ => c` desugars to nested `if scrutinee == lit then arm else next` ending in the default branch. Required by bridge.eph's `int_to_department` and `decode_msg`.
- **Bare string literals.** Desugar wraps `Literal::String(s)` as `StringNew { region: "_", value: s }`. The typechecker's region-activation gate exempts `_` (the wildcard region for inferred String types).
- **Nullary fn signatures expose as `() -> T`.** `fn foo(): T = ...` was previously registered as having type `T` directly, making `foo()` at a call site fail to unify. Three pre-pass / registry call sites updated.

Vendored fixture
----------------
`tests/v2-grammar/fixtures/hypatia-port/{bridge,hypatia_gui}.eph` are local adaptations of hypatia's upstream files. Four changes versus upstream (all documented in the file headers):

1. `module hypatia/ui/gui` header on hypatia_gui.eph
2. `pub` keywords on items bridge.eph imports
3. `model.field_name` rewritten as positional `.0` / `.1` (named field access remains future work)
4. `decode_msg` reparses bytes per use so each linear `String` is consumed exactly once

Tests
-----
- `src/ephapax-cli/tests/v2_grammar_phase_j.rs::bridge_eph_compiles_end_to_end`: spawns `ephapax compile-eph`, asserts return code 0 and output ≥1KB starting with wasm magic bytes. Note: full `wasmparser::validate` does not yet pass on the bridge output. ADT-constructor / match-arm codegen produces a stack mismatch ("expected i32, nothing on stack"), which is left for a follow-up. Parse + typecheck + binary emission are all covered.
- `cargo test --workspace` clean, no regressions on the existing ~40 test binaries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
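
The match-on-literal lowering from this commit can be sketched as a right fold that ends in the default branch. The `Expr` enum and `lower_literal_match` are hypothetical, cut-down stand-ins; in particular, cloning the scrutinee into every comparison is a simplification of this sketch, and the commit does not say how the real desugar avoids re-evaluating it.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Int(i64),
    Eq(Box<Expr>, Box<Expr>),
    If { cond: Box<Expr>, then_branch: Box<Expr>, else_branch: Box<Expr> },
}

// `match n of | 0 => a | 1 => b | _ => c` becomes
// `if n == 0 then a else (if n == 1 then b else c)`.
fn lower_literal_match(scrutinee: &Expr, arms: Vec<(i64, Expr)>, default: Expr) -> Expr {
    // Start from the default branch and wrap it with the arms, last arm first,
    // so the first arm ends up as the outermost test.
    arms.into_iter().rev().fold(default, |next, (lit, arm)| Expr::If {
        cond: Box::new(Expr::Eq(
            Box::new(scrutinee.clone()),
            Box::new(Expr::Int(lit)),
        )),
        then_branch: Box::new(arm),
        else_branch: Box::new(next),
    })
}
```

Arms are tested in source order, so an earlier literal wins if the same literal appears twice.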
Conflict triage 2026-05-20 evening This branch has 10 commits ahead of main. The 8 typing/parser/wasm conflicts are in code I haven't read closely enough to resolve safely — they need the v2-grammar author's judgement about which side wins where main has overlapping changes to AST/typing/codegen. Trying Recommended next step: a focused session that (a) rebases this branch on main with manual resolution of the 8 conflicts, (b) verifies Closing as |
Closing as superseded. Audit of this branch (
Trying to resolve conflicts here would have meant either dragging the pre-re-port phase commits back over the cleaner versions on main, or doing a massive Refs #80. |
Cherry-pick of `a99dc21` from #109 — the only commit on that branch worth keeping after the v2 grammar work was re-landed on main via PRs #46/#54/#57/#58/#62-#65/#69-#74/#76/#78/#80. Adds `examples/v2/hello.eph` (13 lines) demonstrating the v2 grammar surface: `module std/io`, `extern "wasm" { fn ... }`, `pub fn main`, `let!` linear bindings. Closes #109 (the rest was superseded — see close comment on #109). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>