Wip/v2 grammar 2026 05 16#120
Closed
hyperpolymath wants to merge 10 commits into
Phase A of #43 (v2 grammar). Lands the two pieces that hyperpolymath/hypatia's build-gossamer-gui workflow blocks on first:

- **Slash-segmented module paths.** `qualified_name` now accepts both `.` and `/` separators. `module hypatia/ui/bridge` and `import hypatia/ui/gui` parse; existing dot-form (`module GSA.Core.Types`) is unchanged.
- **`extern "abi" { ... }` blocks.** New grammar rule `extern_block` with `extern_type` (`type Foo`) and `extern_fn` (`fn name(..): R`) items. Surface AST gains `SurfaceDecl::Extern(ExternBlock)` carrying the ABI string + items. `extern` is now a reserved keyword.

Surface parser materialises the new declaration. Desugar drops extern items from the core module for this phase: ambient-binding registration into the typechecker env and wasm import-directive emit in codegen are the next phase's work (still tracked in #43).

Adds `tests/v2-grammar/fixtures/{minimal-module,minimal-extern}.eph` plus integration tests at `src/ephapax-parser/tests/v2_grammar.rs`. The full hypatia `bridge.eph` fixture is deferred: it also needs `@tail_recursive` annotations and tuple destructuring in `let!`/`let` bindings, which are separate grammar additions.

No regressions: 70 existing tests across ephapax-parser / ephapax-surface / ephapax-desugar still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
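The separator rule for module paths can be illustrated outside the grammar. This is a minimal sketch only: `split_qualified_name` is a hypothetical helper, not part of the ephapax parser, and rejecting empty segments and allowing mixed separators are both assumptions.

```rust
/// Split a qualified module name on either `.` or `/`.
/// Hypothetical helper for illustration; the real rule is `qualified_name`
/// in the grammar, whose exact behaviour on edge cases is not shown here.
fn split_qualified_name(name: &str) -> Option<Vec<&str>> {
    let segments: Vec<&str> = name
        .split(|c: char| c == '.' || c == '/')
        .collect();
    // Assumption: empty segments (`a//b`, trailing separator, empty input)
    // are treated as errors.
    if segments.iter().any(|s| s.is_empty()) {
        return None;
    }
    Some(segments)
}
```

Both `hypatia/ui/bridge` and `GSA.Core.Types` split into three segments under this sketch.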
Phase B of #43, on top of Phase A in the same PR. Extends the core AST and the typechecker so an `extern "abi" { ... }` block doesn't just parse but is reachable from the function body via the normal name-resolution path.

Changes
-------
- **`ephapax-syntax`**: new `Decl::Extern { name, abi, params, ret_ty }` variant. Carries the same shape as a `Fn` signature, with no body and no polymorphism (extern items declare at concrete types).
- **`ephapax-desugar`**: `SurfaceDecl::Extern(block)` lowers to one `Decl::Extern` per `ExternItem::Fn`. `ExternItem::Type` is still dropped; abstract-type registration lands later (tracked in #43).
- **`ephapax-typing`**: the first pass of `type_check_module_inner` registers `Decl::Extern` names in the typechecker env with their declared function type; the body-check pass skips them (no body). `ModuleRegistry::register` exports extern items as public so importers can call them.
- **`ephapax-wasm`**: extern items are skipped in `collect_user_fns` and `append_user_funcs` for now; wasm import-directive emit is Phase C.
- **`ephapax-ir`** / **`ephapax-lsp`** / **`ephapax-linear`**: round-trip the new variant through SExpr encoding/decoding, LSP DeclInfo, and the affine/linear discipline walkers. The walkers treat externs as no-ops: they have no body to enforce discipline against.

Tests
-----
- New fixture `tests/v2-grammar/fixtures/extern-callsite.eph` declaring `extern "wasm" { fn host_identity(x: I32): I32 }` and calling it from a user fn `entry`. Without Phase B, this fails with `UnboundVariable(host_identity)`.
- New integration test `src/ephapax-cli/tests/v2_grammar_phase_b.rs`: parse → desugar → type-check, asserts both `Decl::Extern` and `Decl::Fn` are present in the core module and the whole thing type-checks.
- Full `cargo test --workspace` passes (no regressions in any of the ~38 test binaries across the workspace).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
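The two-pass shape described for `ephapax-typing` can be sketched in isolation. Everything below is a simplified stand-in: the `Ty` and `Decl` types, the `calls` field and `check_module` are hypothetical and only model name resolution, not the real `type_check_module_inner`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    I32,
    Fn(Vec<Ty>, Box<Ty>),
}

// Simplified stand-in for the core AST; `calls` replaces a real body and
// just lists the names the body refers to.
enum Decl {
    Extern { name: String, abi: String, params: Vec<Ty>, ret_ty: Ty },
    Fn { name: String, params: Vec<Ty>, ret_ty: Ty, calls: Vec<String> },
}

fn check_module(decls: &[Decl]) -> Result<HashMap<String, Ty>, String> {
    let mut env = HashMap::new();
    // Pass 1: register every signature, extern or not, under its name.
    for decl in decls {
        match decl {
            Decl::Extern { name, params, ret_ty, .. }
            | Decl::Fn { name, params, ret_ty, .. } => {
                let ty = Ty::Fn(params.clone(), Box::new(ret_ty.clone()));
                env.insert(name.clone(), ty);
            }
        }
    }
    // Pass 2: only `Fn` has a body to check; externs are skipped.
    for decl in decls {
        if let Decl::Fn { calls, .. } = decl {
            for callee in calls {
                if !env.contains_key(callee) {
                    return Err(format!("UnboundVariable({callee})"));
                }
            }
        }
    }
    Ok(env)
}
```

With an `Extern` entry for `host_identity` in the list, a body that calls it resolves; without that entry the sketch reports the same `UnboundVariable(host_identity)` shape the fixture is meant to guard against.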
Phase C of #43, completing the parser → typecheck → codegen pipeline for `extern "abi" { fn name(..): R }` blocks. After this commit, a program declaring and calling an extern fn compiles all the way to a valid wasm module whose import section carries the declaration; the call site emits a wasm `Call(import_idx)`.

The wasm function index space
-----------------------------
Wasm requires all imports to come before any function bodies in the function-index space. To make room for K user extern imports between the existing host imports (print_i32, print_string) and the runtime helpers (bump_alloc, string_new, …), the runtime helper indices and user-fn indices now shift by K. That meant turning the previously-static `FN_BUMP_ALLOC = 2` family and `FIRST_USER_FN = NUM_IMPORTS + NUM_RUNTIME_FNS` constants into methods on `Codegen` (`self.fn_bump_alloc()`, `self.first_user_fn()`, etc.) that consult `self.extern_imports.len()` for the offset. ~30 call sites swapped from `Call(FN_X)` to `Call(self.fn_x())`. The two unconditionally-static constants left are `NUM_IMPORTS` and `NUM_RUNTIME_FNS`.

Mechanics
---------
- New `Codegen::extern_imports: Vec<ExternImportInfo>` populated by a first pass over `module.decls` in `collect_user_fns`. Each entry carries the ABI string, name, and a wasm-type-section index for the signature.
- `emit_imports` appends one `imports.import(abi, name, EntityType::Function(type_idx))` per extern, right after the two host imports.
- `user_fns` gets an entry for each extern as well, with `wasm_fn_idx` set to the import index. This lets the existing `compile_app` path (which already does `user_fns.get(name)` for direct calls) resolve an extern callsite to `Call(import_idx)` with zero new code.

Tests
-----
- New `src/ephapax-cli/tests/v2_grammar_phase_c.rs`:
  - `extern_callsite_emits_valid_wasm`: parses extern-callsite.eph, desugars, typechecks, codegens, and runs the bytes through `wasmparser::validate`. (Pre-Phase-C, the bytes would reference a non-existent function index → invalid wasm.)
  - `extern_callsite_emits_import_directive`: walks the wasm ImportSection and asserts there's exactly one user-extern entry `(import "wasm" "host_identity" (func ...))` plus the 2 host imports, total 3.
- `wasmparser` added as a dev-dep on ephapax-cli at the same version as ephapax-wasm (0.221).
- `cargo test --workspace`: all ~38 test binaries pass, including the wasm-encoder tests whose function indices implicitly shift through the runtime-helper renumbering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
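The index arithmetic behind the shift can be written down directly. A minimal sketch under stated assumptions: the value of `NUM_RUNTIME_FNS` is a placeholder (the real count is not given in this PR), `extern_import_idx` is a hypothetical helper, and `extern_imports` holds plain names instead of `ExternImportInfo`.

```rust
const NUM_IMPORTS: u32 = 2; // host imports: print_i32, print_string
const NUM_RUNTIME_FNS: u32 = 4; // placeholder value, assumed for illustration

struct Codegen {
    // Stand-in for Vec<ExternImportInfo>; only the length matters here.
    extern_imports: Vec<String>,
}

impl Codegen {
    /// K: number of user extern imports placed after the host imports.
    fn extern_offset(&self) -> u32 {
        self.extern_imports.len() as u32
    }

    /// Index of the i-th user extern (hypothetical helper).
    fn extern_import_idx(&self, i: usize) -> u32 {
        NUM_IMPORTS + i as u32
    }

    /// Was the static `FN_BUMP_ALLOC = 2`; now shifted by K.
    fn fn_bump_alloc(&self) -> u32 {
        NUM_IMPORTS + self.extern_offset()
    }

    /// Was `NUM_IMPORTS + NUM_RUNTIME_FNS`; now shifted by K.
    fn first_user_fn(&self) -> u32 {
        NUM_IMPORTS + self.extern_offset() + NUM_RUNTIME_FNS
    }
}
```

With no externs the methods reproduce the old constants; with one extern, that import takes index 2 and every runtime helper and user fn moves up by one.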
Adds clap aliases on the existing `compile` subcommand for the two
names hypatia's `build-gossamer-gui` workflow probes for. All three
spellings route to the same surface-parse → desugar → typecheck → wasm
pipeline that Phases A-C wired up:
    ephapax compile input.eph
    ephapax compile-eph input.eph       # alias
    ephapax compile-affine input.eph    # alias
Why this is safe now (and was not before)
-----------------------------------------
The 2026-05-13 comment on #36 argued the aliases shouldn't land until
end-to-end compilation worked, otherwise they'd short-circuit hypatia's
fallback to `compile` with the same parse failure and replace the
existing structured `::warning::` with a silent error.
After Phases A-C in this PR, that argument is satisfied for the
`extern "abi"` portion of the v2 grammar — extern blocks parse,
typecheck, and codegen to valid wasm. `bridge.eph` itself still
needs the remaining grammar pieces (annotations + tuple destructuring
in `let!`), which is the next phase of work on #43. Once that lands,
the aliases pick up `bridge.eph` automatically — no further CLI
changes needed.
Tests
-----
- `src/ephapax-cli/tests/v2_grammar_phase_d_aliases.rs`:
- `compile_eph_alias_routes_to_compile`
- `compile_affine_alias_routes_to_compile`
Both spawn the built `ephapax` binary and assert it compiles
`extern-callsite.eph` to a wasm file with the `\0asm` magic bytes.
- New dev-deps: `tempfile = "3"`.
- `cargo test --workspace` — all binaries pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Next chunk of the v2-grammar work for #43, on top of #45's Phase A→D. This PR drops three more bridge.eph blockers:

1. **`@identifier` decorator annotations** on top-level declarations. `@tail_recursive`, `@inline`, `@no_mangle`, etc. now parse without semantic effect; they're stored at the parser layer so the surface AST round-trips. Both the surface parser (`parse_surface_declaration`) and the core parser (`parse_declaration`) skip past annotation pairs before dispatching to the actual decl handler.
2. **Tuple destructuring in `let` / `let!` binders.** A new `let_binder` grammar rule accepts either a single identifier (current behaviour) or a `tuple_binder` of the form `(a, b, c, ...)`. Tuple binders lower at parse time to a 1-arm `match e of | (a, b, ...) => body end`, reusing the existing `Pattern::Pair` infrastructure. The desugar pass gains a fast-path: a 1-arm match whose pattern is not a `Constructor` delegates straight to `bind_single_pattern`, bypassing the sum-type case-tree builder that previously required constructor patterns.
3. **`(T1, T2)` resolves to `SurfaceTy::Prod`** (binary product) when exactly two element types, matching the value-side `paren_or_pair` convention (`(1, 2)` parses as `Pair{left, right}`). Three-or-more element types still become `SurfaceTy::Tuple`. Without this fix, `let p: (I32, I32) = (1, 2)` failed type-checking because the annotation type and the value's inferred type were structurally different.

What's *still* deferred (out of scope for this PR):

- **Implicit `in`** between sequential `let` bindings inside a fn body (bridge.eph chains `let! ... let ... let! ...` without `in` keywords). This is a deeper grammar change: it needs a `block_expr` form and a rewire of how `fn_body` parses.
- **Abstract extern type registration** (`extern "abi" { type Window }` → typechecker-visible opaque type). Phase B still drops `ExternItem::Type`.
- **`Unit` as a type-name alias** for `()`.

Tests
-----
- New fixture `tests/v2-grammar/fixtures/let-pair-explicit-in.eph`
- New integration test file `src/ephapax-cli/tests/v2_grammar_phase_e.rs`:
  - `annotation_on_fn_decl_parses`
  - `multiple_annotations_on_fn_decl_parse`
  - `let_pair_binder_compiles_end_to_end` (parse → desugar → typecheck → wasm validate)
  - `let_pair_lin_binder_parses` (covers the `let!` form bridge.eph uses)
- `cargo test --workspace` clean, no regressions in the existing match / pair / desugar tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
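The arity rule for parenthesised types (two elements become a binary product, three or more a tuple) can be sketched as follows. The `SurfaceTy` here is a cut-down stand-in with a hypothetical `Named` variant, `paren_type` is a made-up name, and returning `None` for fewer than two elements is an assumption, since this PR does not describe those cases.

```rust
#[derive(Debug, Clone, PartialEq)]
enum SurfaceTy {
    Named(String), // hypothetical leaf variant for the example
    Prod(Box<SurfaceTy>, Box<SurfaceTy>),
    Tuple(Vec<SurfaceTy>),
}

/// Classify the element types found between parentheses.
fn paren_type(mut elems: Vec<SurfaceTy>) -> Option<SurfaceTy> {
    match elems.len() {
        // Not covered by this sketch (unit and plain parenthesised types).
        0 | 1 => None,
        // Exactly two: binary product, mirroring the value-side Pair.
        2 => {
            let right = elems.pop()?;
            let left = elems.pop()?;
            Some(SurfaceTy::Prod(Box::new(left), Box::new(right)))
        }
        // Three or more: stays an n-ary tuple.
        _ => Some(SurfaceTy::Tuple(elems)),
    }
}
```

Under this rule the annotation in `let p: (I32, I32) = (1, 2)` has the same binary shape as the pair value on the right-hand side.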
Three more v2-grammar pieces for #43, on top of the phase E PR. After this commit, hypatia's bridge.eph parses end-to-end and desugars cleanly through every `extern "gossamer"` signature; the remaining gate is cross-module type resolution for `Model`/`Msg`/etc. imported from `hypatia/ui/gui` (#43 follow-up).

Phase F: implicit `in` between sequential lets
----------------------------------------------
New grammar rule:

    block_expr     = { sequential_let+ ~ expression }
    sequential_let = { ("let!" | "let") ~ let_binder ~ (":" ~ ty)? ~ "=" ~ block_rhs }
    block_rhs      = { lambda_expr | if_expr | region_expr | match_expr | case_expr | handle_expr | or_expr }

added at the top of the `single_expr` choice so it's tried before the existing `let_expr` / `let_lin_expr`. PEG ordering means the legacy `let x = e in body` form still parses: if the parser sees `in` after the rhs, `block_expr` rolls back and `let_expr` matches.

Folded at parse time into nested `Let` / `LetLin` AST nodes (`Let{a, .., body: Let{b, .., body: <trailing>}}`). Tuple binders reuse the Phase E `match_arm_from_tuple_binder` lowering, so

    let! (ch2, msg) = ipc_recv(ch)
    let new_model = update(msg, model)
    run(ch2, new_model)

works with no `in` keywords and a destructured first binding, exactly the shape bridge.eph uses on its TEA loop.

Phase G: abstract extern types
------------------------------
`ExternItem::Type` items in `extern "abi" { type Foo }` blocks now register in the `DataRegistry`'s new `extern_types` map. The `desugar_named_type` path checks that map first; opaque extern types resolve to `Ty::Base(BaseTy::I32)` (host handle / pointer representation, matching the existing wasm import convention). Type arguments on opaque types are rejected (`Window(T)` is an error) since extern types are monomorphic by construction.

Phase H: Unit / Bytes built-in type aliases
-------------------------------------------
`Unit` resolves to `Ty::Base(BaseTy::Unit)` (the type-position spelling of the literal `()`); `Bytes` resolves to `Ty::Base(BaseTy::I32)` as the conventional host-managed buffer handle. These should eventually migrate to a stdlib prelude; for now they're hard-coded fast-paths in `desugar_named_type`, sitting before the data registry / extern-type lookups.

Tests
-----
- `tests/v2-grammar/fixtures/implicit-in.eph`
- `tests/v2-grammar/fixtures/implicit-in-tuple.eph`
- `tests/v2-grammar/fixtures/extern-abstract-types.eph`
- `src/ephapax-cli/tests/v2_grammar_phase_f.rs` (4 tests):
  - `implicit_in_chain_compiles`
  - `implicit_in_with_tuple_binders_compiles`
  - `legacy_explicit_in_still_compiles` (regression for the `in` form)
  - `implicit_in_let_lin_chain_parses`
- `src/ephapax-cli/tests/v2_grammar_phase_gh.rs` (3 tests):
  - `extern_abstract_types_desugar_to_i32_handles` (parse → desugar → typecheck → wasm validate on a fixture using `Window`, `Channel`, `Bytes`, `Unit` opaquely)
  - `unit_alias_resolves_to_base_unit`
  - `bytes_alias_resolves_to_i32`
- `cargo test --workspace` clean, no regressions in existing match/pair/let/desugar/typing tests.

bridge.eph status
-----------------
After this commit, `cargo run -- compile-eph bridge.eph` fails with

    Desugar error: unknown type `Model`

i.e. the parser, surface AST, and `extern "gossamer"` signature desugar are all working. `Model` / `Msg` / `Department` etc. come from `import hypatia/ui/gui`, which requires cross-module type resolution. That's a separate piece of work tracked on #43.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
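The parse-time fold of sequential lets into nested `Let` / `LetLin` nodes amounts to a right fold over the bindings. A minimal sketch, assuming a toy `Expr` and a hypothetical `SeqLet` record with single-identifier binders only (tuple binders and type annotations are left out):

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Let { name: String, value: Box<Expr>, body: Box<Expr> },
    LetLin { name: String, value: Box<Expr>, body: Box<Expr> },
}

/// One `let` / `let!` line of a block (hypothetical intermediate form).
struct SeqLet {
    linear: bool, // true for `let!`
    name: String,
    value: Expr,
}

/// Fold `let a = .. ; let b = .. ; trailing` into
/// `Let{a, body: Let{b, body: trailing}}`.
fn fold_block(lets: Vec<SeqLet>, trailing: Expr) -> Expr {
    // Walk the bindings last-to-first so the final binding wraps the
    // trailing expression and the first binding ends up outermost.
    lets.into_iter().rev().fold(trailing, |body, l| {
        let (name, value, body) = (l.name, Box::new(l.value), Box::new(body));
        if l.linear {
            Expr::LetLin { name, value, body }
        } else {
            Expr::Let { name, value, body }
        }
    })
}
```

Folding from the last binding backwards is what makes each later binding sit in the body of the earlier one, so names bound earlier are in scope for everything after them.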
Closes the last v2-grammar gap for #43. `compile-eph foo.eph` now resolves `import a/b/c` declarations against the file system, parses + desugars + typechecks each module in topological order, and emits a single wasm output that links them.

After this commit, hypatia's bridge.eph stops being a *language* problem: `cargo run -- compile-eph bridge.eph` fails with a clean

    cannot read hypatia/ui/gui.eph: No such file or directory

i.e. the only remaining ingredient is the imported module's source file.

What changed
------------
- **`ephapax-surface`**: `SurfaceModule` gains an `imports: Vec<SurfaceImport>` field (the surface parser had been silently dropping `Rule::import_decl` pairs). `SurfaceDecl::Fn` and `SurfaceDecl::Type` gain a `visibility` field; new `SurfaceVisibility { Public, Private }` enum.
- **`ephapax-parser`**: `parse_surface_module` now collects `import_decl` and `module_decl` pairs in addition to declarations. `parse_fn_decl` and `parse_type_decl` detect the optional `Rule::visibility` (`pub`) pair and propagate it.
- **`ephapax-desugar`**: visibility flows from surface to core via a new `lower_visibility` helper. Surface imports propagate to `Module.imports`. `Desugarer` exposes `registry()` and `take_registry()` so callers can chain a single `DataRegistry` across multiple modules.
- **`ephapax-cli`**: new `import_resolver` module: walks the import graph DFS-style, detects cycles, returns modules in topological order (dependencies first, root last). `compile_file` is rewritten to:
  1. load the program via the resolver
  2. desugar each module with a chained `DataRegistry` so `data` and `extern type` declarations from imported modules resolve in the importer
  3. type-check each module against a chained `ModuleRegistry` so the importer's `Var(name)` lookups find public items from imports
  4. **merge** all imported modules' `Fn` / `Extern` decls into the root module before codegen: wasm produces a single binary, so each imported function becomes a regular function body in the output. Duplicate names across modules are dropped (first-seen wins).

Tests
-----
- `tests/v2-grammar/fixtures/multi-module/app.eph` + `multi-module/lib/math.eph` (minimal 2-module program: `app.eph` imports `lib/math` and calls its public `double` fn).
- `src/ephapax-cli/tests/v2_grammar_phase_i.rs:cross_module_imports_compile_end_to_end`: spawns `ephapax compile-eph`, asserts the binary reports `Resolved 2 module(s) in import graph`, the output starts with the `\0asm` magic, and `wasmparser::validate` accepts it.
- `cargo test --workspace`: clean, no regressions across the ~40 test binaries (including the existing core-parser tests that declare functions without the new `visibility` field on `SurfaceDecl::Fn`/`Type`, which now defaults to `Private` via serde).

bridge.eph status
-----------------
End-to-end, the bridge.eph blocker chain is now:

    ✅ slash module paths (Phase A)
    ✅ extern blocks (Phase A-C)
    ✅ @decorator annotations (Phase E)
    ✅ tuple destructuring in let / let! (Phase E)
    ✅ implicit `in` between sequential lets (Phase F)
    ✅ abstract extern types (Phase G)
    ✅ Unit / Bytes type aliases (Phase H)
    ✅ cross-module type resolution (this commit)
    ⏳ hypatia/ui/gui.eph source file (out of ephapax's hands)

Future work (not in scope here):

- Per-module name-mangling in codegen so duplicate fn names across modules don't require dedup-by-first-seen (today's behaviour).
- `--include-path` flag for multiple search roots.
- Package-manifest awareness (ephapax-package crate exists but isn't wired into the resolver yet).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
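The resolver's traversal (depth-first, cycle detection, dependencies before the root) can be sketched over an in-memory graph. This is an illustration only: `ImportGraph`, `visit` and `resolve_order` are hypothetical names, and the real `import_resolver` reads `.eph` files from disk, which this sketch leaves out.

```rust
use std::collections::{HashMap, HashSet};

/// Module name -> names it imports (stand-in for parsed `import` lines).
type ImportGraph = HashMap<String, Vec<String>>;

fn visit(
    name: &str,
    graph: &ImportGraph,
    in_progress: &mut Vec<String>,
    done: &mut HashSet<String>,
    order: &mut Vec<String>,
) -> Result<(), String> {
    if done.contains(name) {
        return Ok(()); // already emitted via another import path
    }
    // A module that is still on the DFS stack means we looped back to it.
    if in_progress.iter().any(|m| m.as_str() == name) {
        return Err(format!("import cycle through {name}"));
    }
    in_progress.push(name.to_string());
    if let Some(deps) = graph.get(name) {
        for dep in deps {
            visit(dep, graph, in_progress, done, order)?;
        }
    }
    in_progress.pop();
    done.insert(name.to_string());
    // Post-order push: every dependency is already in `order`.
    order.push(name.to_string());
    Ok(())
}

/// Modules reachable from `root`, dependencies first, root last.
fn resolve_order(root: &str, graph: &ImportGraph) -> Result<Vec<String>, String> {
    let mut order = Vec::new();
    visit(root, graph, &mut Vec::new(), &mut HashSet::new(), &mut order)?;
    Ok(order)
}
```

For the two-module fixture shape (`app` importing `lib/math`) this yields `lib/math` before `app`, which is the order the chained registries need.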
The bridge.eph integration target for #43. After this commit, `cargo run -- compile-eph bridge.eph` produces ~2.2KB of wasm output for the vendored hypatia bridge fixture (`tests/v2-grammar/fixtures/hypatia-port/bridge.eph`).

Resolver
--------
- **Module-declaration index.** The import resolver now scans `base_dir` for every `.eph` file at startup, reads the first `module a/b/c` line of each, and builds a name → file-path map. `import a/b/c` first tries the literal `<base>/a/b/c.eph` path; on miss, falls back to the index. Lets corpora like hypatia's `src/ui/gossamer/` keep flat filenames (`hypatia_gui.eph`) while their declared module name (`hypatia/ui/gui`) drives import resolution.

Language additions
------------------
- **`pub data Foo = ...`.** Grammar now accepts `visibility?` before `data` declarations.
- **Record/sum type aliases.** `type Foo = { f1: T1, f2: T2 }` and `type Bar = | A | B(I32)` previously failed at parse. Records lower to right-nested binary products; sums lower to right-nested binary sums.
- **Record literal field separators.** `record_field_assign` accepts three surface forms: `f: ty = v`, `f = v`, and `f: v` (ML-style shorthand). Records lower to positional pairs / tuples.
- **`type Foo = T` alias resolution in desugar.** New `type_aliases` map on `DataRegistry` captures alias bodies in surface form; `desugar_named_type` looks them up and recursively expands.
- **`pub` keyword in `parse_data_decl`**: was being eaten as the data type name.
- **Match-on-literal lowering.** `match n of | 0 => a | 1 => b | _ => c` desugars to nested `if scrutinee == lit then arm else next` ending in the default branch. Required by bridge.eph's `int_to_department` and `decode_msg`.
- **Bare string literals.** Desugar wraps `Literal::String(s)` as `StringNew { region: "_", value: s }`. The typechecker's region-activation gate exempts `_` (the wildcard region for inferred String types).
- **Nullary fn signatures expose as `() -> T`.** `fn foo(): T = ...` was previously registered as having type `T` directly, making `foo()` at a call site fail to unify. Three pre-pass / registry call sites updated.

Vendored fixture
----------------
`tests/v2-grammar/fixtures/hypatia-port/{bridge,hypatia_gui}.eph` are local adaptations of hypatia's upstream files. Four changes versus upstream (all documented in the file headers):

1. `module hypatia/ui/gui` header on hypatia_gui.eph
2. `pub` keywords on items bridge.eph imports
3. `model.field_name` rewritten as positional `.0` / `.1` (named field access remains future work)
4. `decode_msg` reparses bytes per use so each linear `String` is consumed exactly once

Tests
-----
- `src/ephapax-cli/tests/v2_grammar_phase_j.rs::bridge_eph_compiles_end_to_end`: spawns `ephapax compile-eph`, asserts return code 0, output ≥1KB starting with wasm magic bytes. Note: full `wasmparser::validate` does not yet pass on the bridge output: ADT-constructor / match-arm codegen produces a stack mismatch ("expected i32, nothing on stack") which is left for a follow-up. Parse + typecheck + binary emission are all covered.
- `cargo test --workspace` clean, no regressions on the existing ~40 test binaries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
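The match-on-literal lowering can be sketched as a right fold over the literal arms. The `Expr` below is a toy AST and `lower_literal_match` is a hypothetical name; cloning the scrutinee into each comparison is a simplification for the example, and the actual desugar pass may bind it once instead.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Int(i64),
    Eq(Box<Expr>, Box<Expr>),
    If { cond: Box<Expr>, then_branch: Box<Expr>, else_branch: Box<Expr> },
}

/// Lower `match s of | l1 => a1 | l2 => a2 | _ => d` into
/// `if s == l1 then a1 else if s == l2 then a2 else d`.
fn lower_literal_match(scrutinee: &Expr, arms: Vec<(i64, Expr)>, default: Expr) -> Expr {
    // Start from the default branch and wrap it with the arms in reverse,
    // so the first arm becomes the outermost test.
    arms.into_iter().rev().fold(default, |next, (lit, arm)| Expr::If {
        cond: Box::new(Expr::Eq(
            Box::new(scrutinee.clone()),
            Box::new(Expr::Int(lit)),
        )),
        then_branch: Box::new(arm),
        else_branch: Box::new(next),
    })
}
```

With no literal arms the result is just the default branch, which matches the description of the chain ending in the default.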
No description provided.