engine feature-support audit: 49-row corpus + cross-engine harness #413
Merged
49 small .ilo programs each exercising one feature, plus a bash harness that runs every program through tree / VM / Cranelift JIT / Cranelift AOT and prints a markdown matrix. Used to populate the audit doc at research/engine-audit-2026-05.md (research/ is gitignored). Harness passes `main` as the AOT entry function explicitly. The default "first declared function in the file" entry is rarely correct for real programs and silently segfaults; flagged as a gap in the audit doc.
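The matrix step described above can be sketched as follows. This is a minimal illustration in Rust, not the bash harness itself: the `Cell` type, the `render_matrix` function, and the exact column layout are assumptions made for the example; only the four engine names and the OK / FAIL / skipped outcomes come from the PR text.

```rust
/// Outcome of running one program on one engine (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Cell {
    Ok,
    Fail,
    Skip,
}

impl Cell {
    fn label(self) -> &'static str {
        match self {
            Cell::Ok => "OK",
            Cell::Fail => "FAIL",
            Cell::Skip => "SKIP",
        }
    }
}

/// Engine columns, in the order the PR description lists them.
pub const ENGINES: [&str; 4] = ["tree", "VM", "Cranelift JIT", "Cranelift AOT"];

/// Render a markdown table: one row per feature, one column per engine.
pub fn render_matrix(rows: &[(&str, [Cell; 4])]) -> String {
    let mut out = String::from("| feature |");
    for engine in ENGINES.iter() {
        out.push_str(&format!(" {} |", engine));
    }
    out.push('\n');

    // Separator row: one cell for the feature column plus one per engine.
    out.push_str("|---|");
    for _ in ENGINES.iter() {
        out.push_str("---|");
    }
    out.push('\n');

    for (feature, cells) in rows {
        out.push_str(&format!("| {} |", feature));
        for cell in cells.iter() {
            out.push_str(&format!(" {} |", cell.label()));
        }
        out.push('\n');
    }
    out
}
```

Keeping the rendering separate from the engine invocations means the same table code works whether the cells come from live runs or from a saved audit.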
Codecov Report: all modified and coverable lines are covered by tests.
This was referenced May 18, 2026
danieljohnmorris
added a commit
that referenced
this pull request
May 19, 2026
Engine audit PR #413 found AOT silently returns nil for HOF / closure dispatch (map over a lambda, fld, grp, uniqby, fn-ref return). Root cause: AOT never publishes ACTIVE_PROGRAM, so jit_call_dyn and jit_call_builtin_tree hit their null-program guards and return TAG_NIL for every user-fn callback. To fix, the AOT binary needs a CompiledProgram at runtime. Add a postcard wire format (chunks + func_names + is_tool + type_registry + ast) gated by schema_version. Chunk constants encode only the variants the compiler emits today (Nil / Number / Text / Bool / List); a future variant trips the From<&Value> guard in the test suite before any binary ships. The AST is serialised as JSON inside the postcard envelope because the Program::serialize_decls custom impl uses serialize_seq(None) which postcard rejects with "The length of a sequence must be known". serde_json handles unsized seqs and the AST already serialises cleanly via JSON for --ast. Round-trip unit tests cover the empty program, a map-lambda program, an fld user-fn program, type-registry preservation, and schema-version mismatch detection.
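To illustrate the "gated by schema_version" idea from the commit message, here is a minimal std-only sketch. It is not the postcard format the commit adds: the byte layout, the `SCHEMA_VERSION` value, and the `encode` / `decode` / `DecodeError` names are all assumptions made for the example. The point is only that the reader checks the version before interpreting the payload (which could be, for instance, the JSON-serialised AST).

```rust
/// Hypothetical version number; the real value is not stated in the source.
pub const SCHEMA_VERSION: u32 = 1;

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    Truncated,
    SchemaMismatch { found: u32, expected: u32 },
}

/// Envelope layout (assumed): version (u32 LE), payload length (u32 LE), payload.
pub fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&SCHEMA_VERSION.to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Check the version first, so a binary built against an older schema
/// fails loudly instead of misreading the payload.
pub fn decode(bytes: &[u8]) -> Result<&[u8], DecodeError> {
    if bytes.len() < 8 {
        return Err(DecodeError::Truncated);
    }
    let found = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if found != SCHEMA_VERSION {
        return Err(DecodeError::SchemaMismatch {
            found,
            expected: SCHEMA_VERSION,
        });
    }
    let len = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    // checked_add avoids overflow on a corrupt length field.
    8usize
        .checked_add(len)
        .and_then(|end| bytes.get(8..end))
        .ok_or(DecodeError::Truncated)
}
```

A round-trip test and a mismatch test against this sketch mirror two of the unit-test cases the commit message lists.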
Summary
Investigation-only PR for adoption brief 5. Lands the test corpus and harness needed to run the cross-engine feature audit, plus the empirical data needed for `reference/engines.md` (engines-as-contracts). No engine source changed.

- `.ilo` programs under `tests/engine-matrix/`, one per feature
- `tests/engine-matrix/run-matrix.sh` runs every program through tree / VM / Cranelift JIT / Cranelift AOT and prints a markdown matrix
- `research/engine-audit-2026-05.md` in the worktree (local-only per `.gitignore` for `research/`, but informs the public engines-as-contracts page)

Results
Tree / VM / Cranelift JIT are at feature parity across the audited surface (46 of 47 non-skipped rows OK on each). The lone shared miss is returning a capturing closure from a function (`>F n n` body of a lambda), which the parser rejects on every engine; call it a parser surface gap, not an engine bug.

AOT diverges on 9 rows, every failure closure / HOF / sum-type related:

- `map (x:n>n;*x 2) [1,2,3]` returns `[nil, nil, nil]` from an AOT binary
- `map (x:n>n;+x k) [1,2,3]` (capture) same shape
- `map fn ctx xs` (3-arg closure-bind) returns `nil`
- `fld add [1,2,3,4] 0` returns `nil`
- `grp` and `uniqby` (surface from PR #391, "feature: srt/grp/uniqby off tree-bridge (Phase 2 PR3c)") return `nil`
- `>F n n;sq` returns `nil`
- `S a b c` sum-type variant: AOT compile error (`Duplicate definition of identifier: ilo_strconst_1`)

Every other AOT row matches its peers.
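The follow-up commit message referencing this PR attributes these nils to a null-program guard: the AOT binary never publishes `ACTIVE_PROGRAM`, so every user-function callback returns `TAG_NIL`. The sketch below models that failure shape under loud assumptions: `Value`, `CompiledProgram`, `publish`, `call_dyn`, and `map_dyn` are simplified stand-ins written for this example, not the project's actual types or the real `jit_call_dyn` signature.

```rust
use std::sync::OnceLock;

/// Simplified stand-in for a tagged runtime value.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Nil,
    Number(f64),
}

/// Simplified stand-in: just a table of callable user functions.
pub struct CompiledProgram {
    pub funcs: Vec<fn(f64) -> f64>,
}

/// Global the runtime helpers consult; empty until someone publishes a program.
static ACTIVE_PROGRAM: OnceLock<CompiledProgram> = OnceLock::new();

/// Publish the program once at startup (the step the AOT path is said to skip).
pub fn publish(program: CompiledProgram) {
    let _ = ACTIVE_PROGRAM.set(program);
}

/// Dynamic call helper with a null-program guard.
pub fn call_dyn(func: usize, arg: f64) -> Value {
    // Guard: with no published program there is nothing to dispatch to,
    // so the helper answers nil instead of failing.
    let Some(program) = ACTIVE_PROGRAM.get() else {
        return Value::Nil;
    };
    match program.funcs.get(func) {
        Some(f) => Value::Number(f(arg)),
        None => Value::Nil,
    }
}

/// `map` over a user function goes through the dynamic call helper.
pub fn map_dyn(func: usize, xs: &[f64]) -> Vec<Value> {
    xs.iter().map(|&x| call_dyn(func, x)).collect()
}
```

Before `publish` is called, `map_dyn` over three elements yields three nils, the same shape as the first row in the list above; after publishing, the callbacks dispatch normally.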
Bugs / gaps surfaced (full list in research doc)
- `map (\x. *x 2) xs` will deploy a binary that returns nils, no warning. Proposed `ILO-R014`.
- `main`: `ilo compile file.ilo -o out && ./out` silently segfaults (exit 139) for any program that declares a helper before `main`. The harness passes `main` explicitly to dodge this; recommend the dispatcher pick `main` when it exists.
- `ILO-R015`. `nil`, exits 0; tree/VM/JIT return `inf` per f64 semantics. Three engines, three answers.
- `Duplicate definition of identifier` in cranelift backend's string-constant interning when sum tags collide with other string constants.
- `ILO-R012` with auto-fallback. Empirically, `--run-vm` and `--jit` handle Phase 2 captures natively now (likely since PRs leading up to #391, "feature: srt/grp/uniqby off tree-bridge (Phase 2 PR3c)"). SPEC.md, ai.txt, skills/ilo/SKILL.md all need a pass.
- `mkadd k:n>F n n;x:n>n;+x k` is rejected by the parser on every engine.
- `--target` flag does not exist.

What's in the diff
One commit:
- `add engine-matrix test corpus + run harness for cross-engine audit`: 49 `.ilo` programs + bash harness + README under `tests/engine-matrix/`. Each program has `-- feature:` + `-- expected:` header lines so future audits can re-run via the harness.

Test plan
- `cargo build --release --features cranelift` (used by harness)
- `bash tests/engine-matrix/run-matrix.sh`: produces the matrix; 9 AOT FAIL cells reproduce the bugs above
- No `src/` touched

Follow-ups
Per the brief, this PR does not dispatch fixes. Top three to prioritise from the gap list:
- `main` resolution: cheapest fix, removes a SIGSEGV trap
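A minimal sketch of what the `main` resolution fix could look like, assuming the dispatcher has the declared function names in file order. The `resolve_entry` function and its fallback rules are hypothetical: the PR only recommends picking `main` when it exists, and the choice to return `None` for an explicit but missing entry is an assumption of this example.

```rust
/// Choose the AOT entry function.
///
/// Order of preference (assumed for this sketch):
/// 1. the explicitly requested name, if given;
/// 2. `main`, if the program declares it;
/// 3. the first declared function (the current default per the PR).
pub fn resolve_entry<'a>(func_names: &[&'a str], requested: Option<&str>) -> Option<&'a str> {
    let wanted = requested.unwrap_or("main");
    if let Some(name) = func_names.iter().copied().find(|n| *n == wanted) {
        return Some(name);
    }
    if requested.is_some() {
        // An explicit entry that does not exist is an error, not a cue to guess.
        return None;
    }
    // No `main` declared: fall back to the first function in the file.
    func_names.first().copied()
}
```

With this ordering, a file that declares a helper before `main` resolves to `main` without the caller having to name it, which is the case the harness currently works around.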