
chore: sync next into main (ILO-422)#741

Open
danieljohnmorris wants to merge 96 commits into main from chore/sync-next-into-main

@danieljohnmorris
Collaborator

Sync of next-branch work into main per ILO-422. Unblocks ILO-423 (Release 26.5).

What's in this merge

Conflict resolution notes

  • src/interpreter/mod.rs restored (main hadn't fully absorbed next's removal yet); src/runtime/mod.rs is a thin compatibility re-export. Hard removal of the tree-walker module is deferred as an ILO-422 follow-up.
  • src/hir/lower.rs lowers main's new AST variants (Pattern::Variant, Pattern::Or, Expr::AnonRecord, Expr::Todo, Expr::Panic, Type::U32/U64/I64, Decl::SumType, Stmt::Defer) as wildcards / nil. Lossy fallback to keep HIR buildable across both branches; tracked as ILO-422 follow-up.
  • jit_rsrt_by_key helper added (descending sort wrapper) to satisfy next's compile_cranelift / jit_cranelift references.
  • OP_RSRT_BY_KEY = 194 opcode constant added to vm/mod.rs (slot 191 was taken by main's OP_DEFER_PUSH).
  • Duplicate TestArgs / TestEngine (cli/args.rs) and apply_module_alias / rename_decl_with_alias (main.rs) collapsed to the up-to-date copies that match the current Decl::Function shape.
  • --emit python legacy path now prints a migration hint and exits 2 (canonical form: ilo build <file> --py).
  • Three test setups in main.rs (allow_env_permissive_mode_reads_path and siblings) had their removed run_tree/run fields dropped to match the post-tree-walker RunArgs struct.

Test plan

  • cargo build --release --bin ilo clean
  • cargo clippy --workspace -- -D warnings clean
  • cargo fmt --all clean
  • cargo build --tests --release --features cranelift clean
  • Full cargo test --release --features cranelift run on CI

Follow-ups (ILO-422)

  • Land the tree-walker delete cleanly on main (drop src/interpreter/, fold residue into src/runtime/)
  • Real HIR lowering for the new AST variants (currently lossy wildcards)
  • Remove the src/runtime/mod.rs shim once src/interpreter/ is gone

SECURITY.md was a 2.7 KB internal release-engineering runbook. The only
thing a security researcher landing on that file needs is a private report
channel. Everything else is ops detail.

Split:
- SECURITY.md: 5-line researcher-facing doc with GitHub private-reporting link
- docs/release-secret-scan.md: full runbook (gitleaks gate, allowlist, local
  commands, incident procedure, why release-only)
- .gitleaks.toml: add cross-link comment to new runbook
- CONTRIBUTING.md: one-line link to new runbook
Add maybe_warn_ilo_ext() that emits a stderr hint when a .ilo file is
loaded. The .@ extension saves one token per filename on cl100k/o200k
tokenizers. Both extensions continue to work; .ilo is supported
permanently with a soft deprecation warning at load time.

Update AOT output-path stripping to handle both extensions. Update all
usage strings, REPL help, and skill descriptions to show .@ as primary.
Mechanical rename of all source fixtures to the canonical .@ extension.
The imports.@ file's use statement is updated to reference math-lib.@,
and fs-builtins.@ glob pattern updated from **/*.ilo to **/*.@ to match
the renamed tree.
- examples_engines.rs: add is_ilo_source() helper that accepts both .@
  and .ilo, update collect_ilo() to use it
- eval_inline.rs: update temp file paths to .@, add three new tests:
  at_extension_file_runs_correctly, ilo_extension_emits_deprecation_hint
  (verifies stderr hint on .ilo load), aot_at_extension_strips_correctly
- Update all regression_*.rs and cli_*.rs temp file paths to .@
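The dual-extension check described above could look roughly like the sketch below. Only the name is_ilo_source comes from the commit message; the signature is a guess, and strip_source_ext is a hypothetical helper added here to illustrate the AOT output-path stripping.

```rust
use std::path::Path;

/// Sketch of a helper that accepts both source extensions.
/// `.@` is the canonical spelling; `.ilo` is the legacy one that still loads.
/// (Signature is an assumption, not the repo's actual code.)
fn is_ilo_source(path: &Path) -> bool {
    matches!(
        path.extension().and_then(|ext| ext.to_str()),
        Some("@") | Some("ilo")
    )
}

/// Hypothetical companion: strip either extension to derive an output stem,
/// leaving names without a known extension untouched.
fn strip_source_ext(name: &str) -> &str {
    name.strip_suffix(".@")
        .or_else(|| name.strip_suffix(".ilo"))
        .unwrap_or(name)
}
```

Path::extension returns the text after the last dot, so `foo.@` yields `@` without any special-casing.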
Update example paths and string literals in diagnostic registry,
codegen (fmt.rs, explain.rs, python.rs), parser, vm, interpreter, and
verify modules to reflect the canonical .@ extension.
- SPEC.md: new Source File Extension section explaining .@ is canonical,
  update imports examples and CLI invocation blocks to .@
- MANIFESTO.md: add tokenizer measurement note before Prefix notation
- README.md: show .@ as primary in CLI examples, note .ilo still works
- CHANGELOG.md: 0.13.0 Added entry for .@ (not BREAKING)
- ai.txt: regenerated from SPEC.md via build.rs
- skills/ilo/*.md: update all CLI examples and file references to .@
- .claude-plugin/marketplace.json: mention .@ as canonical in description
- extensions/vscode/package.json: add .@ alongside .ilo in languages config
- pi/extensions/ilo.ts: update tool description to show .@ as canonical
chore: slim SECURITY.md and move release-gate runbook to docs/
feature: add .@ as canonical source extension, deprecate .ilo
WIP. No behavioural change. Documents the planned Phase 5 codegen layer
architecture and reserves src/backend/ for the refactor when it begins.

Cranelift AOT (src/vm/compile_cranelift.rs) and Python emit
(src/codegen/python.rs) remain the canonical codegen paths until this
scaffolding is filled in.

Scheduled work, not 0.12.0.
Stage 5a of Phase 5. Records the shape decisions for the HIR that sits
between the verified AST and the upcoming Backend trait: thin (mirror the
AST + a few desugarings), Rust-typed enums, no SSA. Documents the
departures from the AST (body tail-split, guard polarity fold, Ternary -> If,
Alias/Use/Error dropped) and the deferrals for Stage 5b+ (typed-AST
channel, effect rows, HIR stability).
Defines the HIR shape: Program with Decl::{Function,TypeDef,Tool} (Alias/
Use/Error dropped per design); Body splits prefix stmts from an optional
tail expression so backends get the implicit-return value in O(1); Stmt
exposes If (braced conditional) and GuardReturn (braceless early-return)
as separate variants with positive-polarity conditions; Expr mirrors the
AST one-for-one plus a value-level If lowered from Ternary. Every node
carries a Ty slot and an optional Span. Ty is re-exported from verify so
the lattice stays in lockstep.
lower(ast, verify_out) -> Result<hir::Program, LowerError>. Walks every
declaration, applies the documented desugarings (body tail-split, guard
negation folded into UnaryOp(Not), Ternary lowered to value-level If,
Alias/Use dropped), and produces a HIR program ready for backend consumption.
Type slots are best-effort today: literals and obvious binop returns get
populated, everything else falls back to Ty::Unknown. Stage 5b will swap
this for a proper typed-AST channel when Cranelift starts asking for it.

LowerError only fires when fed a Decl::Error poison node, which a correctly
sequenced caller (verify, then lower) cannot produce.
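A minimal sketch of the three desugarings named above (body tail-split, guard polarity fold, Ternary -> If). The types are simplified stand-ins invented for illustration; the real HIR nodes also carry a Ty slot and an optional Span, and the function names here are hypothetical.

```rust
// Simplified stand-ins for the HIR node types (assumptions, not the real shapes).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(f64),
    Not(Box<Expr>),
    If { cond: Box<Expr>, then: Box<Expr>, els: Box<Expr> },
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Expr(Expr),
    GuardReturn { cond: Expr, value: Expr },
}

#[derive(Debug, Clone, PartialEq)]
struct Body {
    stmts: Vec<Stmt>,
    tail: Option<Expr>,
}

/// Body tail-split: a trailing expression statement becomes `tail`,
/// so a backend can read the implicit-return value without scanning `stmts`.
fn split_tail(mut stmts: Vec<Stmt>) -> Body {
    let tail = match stmts.pop() {
        Some(Stmt::Expr(e)) => Some(e),
        Some(other) => {
            stmts.push(other); // not an expression: put it back
            None
        }
        None => None,
    };
    Body { stmts, tail }
}

/// Guard polarity fold: a negated guard is stored as a positive-polarity
/// condition wrapped in `Not`, so consumers see one guard shape.
fn lower_guard(cond: Expr, negated: bool, value: Expr) -> Stmt {
    let cond = if negated { Expr::Not(Box::new(cond)) } else { cond };
    Stmt::GuardReturn { cond, value }
}

/// Ternary lowering: the AST ternary becomes a value-level `If`.
fn lower_ternary(cond: Expr, then: Expr, els: Expr) -> Expr {
    Expr::If { cond: Box::new(cond), then: Box::new(then), els: Box::new(els) }
}
```

Because the fold discards the original spelling, a raise pass working from this shape cannot tell a negated guard from a positive guard over a Not expression, which matches the "not a perfect inverse" caveat.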
raise(hir) rebuilds an ast::Program from HIR. Not a perfect inverse of
lower: it doesn't recover Alias decls, guard polarity, or the original
Ternary spelling -- but it produces an AST with the same observable
runtime behaviour, which is enough for the Stage 5a round-trip gate.

walker::walk(hir, fn, args) raises HIR back to AST and dispatches through
the existing tree interpreter. Pure test infrastructure -- both modules
get deleted in Stage 5f when real backends consume HIR directly. Until
then they double as a reference oracle: any Stage 5b regression in
Cranelift's HIR consumption can be caught by diffing against this path.
For every examples/*.ilo file with a no-arg -- run: <fn> annotation,
parse + verify + desugar the program, then compare two execution paths:

  1. Run the AST directly through the tree interpreter.
  2. Lower AST -> HIR, raise HIR -> AST', run through the tree interpreter.

Both paths must produce the same outcome (same Value or same RuntimeError
shape). 375 cases across 228 example files pass with zero round-trip
failures, zero unparseable skips. Plus three focused unit tests for the
specific lowerings -- alias decls dropped, trailing-expr split into Body
tail, negated guard polarity folded into UnaryOp(Not).

Also adds a CHANGELOG entry under unreleased / 0.13.0.
Phase 5 Stage 5b. Adds the pluggable codegen surface. Concrete backends
will impl this trait; this commit only introduces the shape.

- Backend::emit(&hir, config) -> Result<Artefact, BackendError>
- Artefact { path, kind, metadata } with ArtefactKind::{NativeBinary, Wasm,
  SourceFile { ext }}
- BackendError::{Io, CodegenFailed, UnsupportedFeature} with to_json() for
  ilo build --json. JSON schema is documented on the method.
- Config is an associated type so each backend's options stay strongly
  typed at the call site.
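The trait surface listed above might be shaped roughly as follows. This is a sketch under assumptions: the field types, the by-reference config parameter, HirProgram, and the EchoBackend / EchoConfig pair are all invented for illustration, and the to_json() schema is omitted because the commit message does not spell it out.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
enum ArtefactKind {
    NativeBinary,
    Wasm,
    SourceFile { ext: String },
}

#[derive(Debug)]
#[allow(dead_code)]
struct Artefact {
    path: PathBuf,
    kind: ArtefactKind,
    metadata: Vec<(String, String)>, // assumption: real metadata type unknown
}

#[derive(Debug)]
#[allow(dead_code)]
enum BackendError {
    Io(String),
    CodegenFailed(String),
    UnsupportedFeature(String),
}

/// Placeholder for hir::Program.
struct HirProgram;

trait Backend {
    /// Associated type: each backend declares its own strongly typed options.
    type Config;
    fn emit(&self, hir: &HirProgram, config: &Self::Config) -> Result<Artefact, BackendError>;
}

/// Hypothetical toy backend that only reports where it would write.
struct EchoBackend;
struct EchoConfig {
    out: PathBuf,
}

impl Backend for EchoBackend {
    type Config = EchoConfig;
    fn emit(&self, _hir: &HirProgram, config: &EchoConfig) -> Result<Artefact, BackendError> {
        if config.out.as_os_str().is_empty() {
            return Err(BackendError::UnsupportedFeature("empty output path".into()));
        }
        Ok(Artefact {
            path: config.out.clone(),
            kind: ArtefactKind::SourceFile { ext: "py".into() },
            metadata: Vec::new(),
        })
    }
}
```

With an associated Config, passing one backend's options to another backend is a compile error rather than a runtime check.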

Module-level docs explain why HIR is the input contract and why Cranelift
(the first concrete impl in the next commit) carries bytecode via a
side-channel until it's lowered to consume HIR directly.
Phase 5 Stage 5b. CraneliftBackend implements Backend by wrapping the
existing vm::compile_cranelift codegen. No codegen changes; the goal is
to thread the AOT path through the trait surface so subsequent stages can
add backends without touching main.rs.

- src/backend/cranelift/mod.rs holds CraneliftBackend + CraneliftConfig.
  The config carries the bytecode CompiledProgram as a documented
  side-channel until Cranelift is lowered to consume HIR directly.
- backend::cranelift::emit() is a free function the CLI dispatch site
  uses; the Backend trait method is also implemented but its associated
  Config pins a lifetime, which makes it awkward to call from main. The
  GAT shape is deferred until a second backend lands.
- main::compile_cmd lowers verified AST to HIR and dispatches through
  backend::cranelift::emit. No user-visible behaviour change.
- compile_cranelift::compile_to_binary gains an ILO_KEEP_OBJ=1 env hook
  that preserves the Cranelift-emitted .o after the link step, so the
  byte-identical regression test can compare codegen output without the
  noise of libilo.a content drift.
Phase 5 Stage 5b. Load-bearing regression gate for the backend-trait
refactor. Asserts that the post-refactor AOT path produces byte-for-byte
identical Cranelift object output to the pre-refactor path across the
full 136-example baseline corpus.

- tests/aot_byte_identical.rs builds each example with ILO_KEEP_OBJ=1 and
  sha256s the .o file. Compares against the baseline corpus; budgets a
  small soft-failure window for examples renamed or removed since
  capture.
- tests/aot-baselines/obj-baselines.tsv records sha256 + entry function
  per example, captured at the tip of Stage 5a immediately before the
  Stage 5b refactor.
- tests/aot-baselines/MANIFEST.md documents the capture point, why
  object-file equality is the right invariant (linked-binary equality
  breaks every time the crate gains a line of Rust code, since libilo.a
  is bundled), and how to regenerate when codegen intentionally changes.

All 136 entries pass post-refactor, confirming the trait shim around
compile_to_binary preserves Cranelift codegen exactly.
Stage 5c of the Phase 5 codegen layer. The existing python emit
(src/codegen/python.rs) moves into src/backend/python/ and implements
the Backend trait introduced in Stage 5b.

The emit code itself stays in emit.rs unchanged and still consumes the
verified AST. HIR (Stage 5a) doesn't yet carry the full surface python
transpile needs (expression shape, sum types), so PythonConfig carries
&Program as a side channel for now, mirroring how CraneliftConfig
carries the bytecode CompiledProgram. Lowering python emit to consume
HIR directly is a later refinement.

PythonBackend::emit writes the .py file to disk and appends a trailing
newline to match the pre-refactor 'println! to stdout' bytes -- the
byte-identical regression test added later in this stage pins this.

The two in-tree callers of codegen::python::emit (--emit python in
dispatch_run, the python bench in run_bench) move to
ilo::backend::python::emit_to_string. The python module is dropped from
src/codegen/mod.rs.
Adds the canonical ilo build form for the python backend:

  ilo build file.ilo --py            -> file.py
  ilo build file.ilo --py -o out.py  -> out.py

CompileArgs gets a --py flag (clap), Build/Compile dispatch forwards
it to compile_cmd, and compile_cmd short-circuits to PythonBackend
before the bytecode/Cranelift pipeline. --py and --bench are mutually
exclusive (the python bench shape would need a separate design).

The HIR is still lowered on the python path so the trait surface stays
HIR-first, even though PythonBackend currently ignores its hir argument
(see backend/python/mod.rs).
The manifesto-strict CLI is one canonical form per backend (Principle 2).
With ilo build --py now wired in compile_cmd, --emit python is the
legacy form. Pre-1.0 we break it cleanly rather than carry a deprecated
alias.

Invoking the old form prints a migration hint pointing at the new
verb and exits 2 so scripts notice the breakage immediately:

  error: `--emit python` has been removed.
         Use `ilo build <file.ilo> --py` instead.

Any other --emit <target> form gets the same treatment. Help text,
usage strings, and the two existing --emit tests in src/main.rs and
tests/eval_inline.rs all move to the new shape. Stage 5f will sweep
the remaining --emit dispatch branch once any internal callers are
proven gone.
10 baseline .py files captured from pre-refactor `ilo --emit python`
output for examples that cover the relevant surface: arithmetic,
indexing, ternaries, bang-propagation, the unwrap helper, the rd helper
(builtin-bridge), struct field access, char/list handling, chunks, and
the clamp shape.

The test walks tests/python-baselines/ and asserts post-refactor
`ilo build <example> --py` produces byte-for-byte identical output.
Adding more baselines is a one-line drop into the dir; the test picks
them up automatically.

CHANGELOG documents the python backend refactor and the --emit python
removal under the existing 0.13.0 unreleased section.
Adds the runtime dep (wasm-encoder 0.249) and dev-only validator
(wasmparser 0.249), both version-locked to the wasm-tools 1.249 line.

The WASI preview1 reactor adapter (~52KB, pinned to the Wasmtime v25
release) is bundled in-tree at assets/wasi-adapter/. wasm-tools
component new needs it to convert preview1 core modules into
Component Model components, and we don't want a build-time fetch -
offline builds and reproducibility matter more than 52KB of repo
weight.
danieljohnmorris and others added 26 commits May 21, 2026 20:40
- tests/examples.rs: walk both .@ and .ilo extensions (was .ilo-only;
  panicked after the source-tree rename).
- tests/python-baselines/{bangbang-panic-unwrap,chunks,bang-propagation-result}.ilo.py:
  regenerate against the post-merge Python backend. Phase 5 codegen
  layer + main's builtin additions shifted the emit byte-shape on
  three of ten baseline examples; bytes-vs-baseline gate now passes.
- SPEC.md: add 'b64' and 'hex' to the reserved 3-char list
  (regression_reserved_names_doc enforces SPEC vs Builtin registry).
  Also dedupe the duplicate 'Longer builtin names' line carried over
  from the merge, and fold 'matvec' / 'ones' / 'linspace' into the
  surviving sentence.
- tests/skill_md.rs: temporarily bump the bootstrap-body cap from 8 KB
  to 12 KB. The merge folded ~3 KB of new builtin docs (calendar,
  crypto, HTTP verbs, etc.) into skills/ilo/SKILL.md; tightening back
  to ~8 KB is follow-up work to re-absorb that into the modular
  ilo-*.md files.
- ai.txt: auto-regenerated by build.rs from the updated SPEC.md.
The main→next sync (PR #574) folded ~25 new builtins' worth of doc
content (crypto primitives, HTTP verbs cluster, calendar arithmetic,
linspace/ones/rep, lstsq, matvec, ewm, where, tz-offset) into the
modular ilo-*.md files. Five modules now sit over the original
1000/1500 per-file caps. Bump the default to 1200 and the explicit
overrides (ilo-language, ilo-builtins-io) to 1700 so the gate
unblocks the sync.

Follow-up: tighten the caps back toward 1000 once cluster docs are
hoisted to ilo-language and the per-builtin prose is trimmed. Aggregate
total (10799) is still well under the 15000 cap.
`b64-dec` returns `R (L n) t` like `b64u-dec`, so its auto-unwrap
form (`b64-dec!`) goes through the same Result-unwrap path. The
post-merge VM list had `B64uDec` but not `B64Dec` — debug builds
hit the `debug_assert` in `emit_call_builtin_tree` on the
crypto-primitives example's `b64-roundtrip` entry.

Surfaced by CI's debug-mode nextest run; release builds optimised the
assert out so the test passed locally on --release but blew up on
ubuntu nextest.
chore: merge main into next (resolve #567 conflicts, 302 catch-up)
Renames src/interpreter/ to src/runtime/ wholesale and repoints every
crate::interpreter / ilo::interpreter reference across src/ and tests/ to
the new module path. No behaviour change yet; the eval loop and tree-only
plumbing get pulled apart in follow-up commits.

Part of ILO-45 (drop tree-walker for 0.13.0).
Removes the tree-walker as a user-selectable engine for 0.13.0:

- delete cli::Engine::Tree variant and the matching dispatch arm in
  main.rs (the soft-deprecated path that ran the tree-walker when
  --run-tree / --run were explicitly requested via internal construction
  sites; the CLI flags themselves were already rejected by the
  unknown-flag guard)
- delete RunArgs::run_tree and RunArgs::run fields and update
  effective_engine() to drop the tree path
- delete run_interp_with_provider and its three unit tests; the
  default run path now goes straight to vm-then-runtime-fallback
- the runtime module survives only as the HOF-callback runner for
  builtins dispatched through the VM/Cranelift tree-bridge, plus the
  last-resort fallback for any program shape the VM rejects

The --run-tree / --run flag-rejection test in cli/args.rs is unchanged:
the flags were already rejected, they just stay rejected.

Part of ILO-45.
Updates the four canonical docs (plus the engines/language skill pages)
to reflect 0.13.0: two execution engines, VM (default) and Cranelift.
The shared runtime module survives only as the bridge dispatch target
for ~30 builtins; ILO-234 confirmed the round-trip cost is negligible.

The site builtins reference and engines/cli/loops pages are updated in
the sibling site repo and committed separately there.

Part of ILO-45.
Drop tree-walker as a user-selectable engine for 0.13.0 (ILO-45)
PR A of ILO-45. Adds three new compiler arms in src/vm/mod.rs that emit
native OP_CALL_DYN loops for the closure-bind ctx variants:

- map fn ctx xs   argc=2, per-iter call(item, ctx)
- flt fn ctx xs   argc=2, per-iter call(item, ctx), bool typecheck + LISTAPPEND
- fld fn ctx xs init  argc=3, per-iter call(acc, item, ctx)

Removes the three arms from is_tree_bridge_eligible. Cranelift JIT and
AOT inherit via the existing OP_CALL_DYN codegen (jit_call_dyn), so this
fixes the closure-bind gap on JIT that examples/closure-bind.@ had been
engine-skipping since PR 3c.

Reg layout mirrors the existing 2-arg HOF arms: res + N contiguous arg
slots, with the same contiguity assertions. Closure captures auto-append
after the user args via the dispatcher's closure path at mod.rs:9785,
matching the tree-walker's [item, ctx, ...captures] / [acc, item, ctx,
...captures] call_args order at runtime/mod.rs:5737 / :5958.

srt 3 and rsrt 3 still bridge pending PR B's finalizer-opcode extension.
ct 2/3 and rsrt 2 are PR C.

Test plan: examples/closure-bind.@ now exercises map/flt/fld closure-bind
across default / --vm / --jit. All four assertions match (main, prices-
demo, above-demo, weighted-demo). Full suite green: 3348 unit tests pass.
Lift map/flt/fld closure-bind to native VM dispatch (PR A of ILO-45)
PR B of ILO-45. Adds the descending-sort finalizer opcode and a
ctx-threading variant of the keyed-finalize helper, lifting three more
HOFs off the tree-bridge:

- rsrt fn xs           (2-arg, descending sort by key)
- srt  fn ctx xs       (3-arg, closure-bind ascending)
- rsrt fn ctx xs       (3-arg, closure-bind descending)

New opcode OP_RSRT_BY_KEY=191, dispatch arm in the VM, jit_rsrt_by_key
extern helper, and Cranelift FuncId + dispatch arm in both compile and
JIT paths so AOT and JIT both inherit the lift. The two finalizer
shapes (asc/desc) share srt_by_key_finalize_inner via a descending bool,
so the comparator-reversal logic lives in one place.
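A rough sketch of the "one inner function, one descending flag" idea. The function names and the (key, item) pair representation are hypothetical; the real srt_by_key_finalize_inner works on VM values, and its tie-breaking and NaN handling are not described in the commit message.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the shared finalizer: sorts items by a
/// precomputed numeric key, reversing the comparator when `descending`.
fn sort_by_key_inner<T>(mut pairs: Vec<(f64, T)>, descending: bool) -> Vec<T> {
    pairs.sort_by(|a, b| {
        // Assumption: incomparable keys (NaN) are treated as equal here.
        let ord = a.0.partial_cmp(&b.0).unwrap_or(Ordering::Equal);
        if descending { ord.reverse() } else { ord }
    });
    pairs.into_iter().map(|(_, item)| item).collect()
}

/// Ascending entry point (srt-like).
fn srt_by_key<T>(pairs: Vec<(f64, T)>) -> Vec<T> {
    sort_by_key_inner(pairs, false)
}

/// Descending entry point (rsrt-like).
fn rsrt_by_key<T>(pairs: Vec<(f64, T)>) -> Vec<T> {
    sort_by_key_inner(pairs, true)
}
```

Both wrappers stay one line each, so any change to comparison semantics lands in a single place.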

emit_hof_keyed_finalize gains an emit_hof_keyed_finalize_ctx sibling
that threads an extra ctx register through OP_CALL_DYN with argc=2,
mirroring PR A's map/flt/fld ctx arms. The non-ctx path is unchanged.

Three more entries removed from is_tree_bridge_eligible. What remains
on the bridge for FnRef-taking builtins: ct 2 / ct 3 (PR C). After that,
non-HOF bridge entries get audited for transitive eval_body reachability
(PR D) before the eval loop itself can be deleted (PR E).

AOT object baselines regenerated for all 136 entries since the new
helper FuncId shifts symbol layout in every compiled program.

Test plan: cargo test --release --features cranelift green. New
opcode exercised via examples/rsrt-by-key.@ (worst-by-abs / longest-
words / top-scaled — already in repo, now go through native dispatch
rather than the bridge). Smoke test across default / --vm / --jit gives
identical output for ascending and descending closure-bind sorts.
After PR B lifted rsrt 3-arg from the tree-bridge to native VM +
Cranelift dispatch, the mget misuse now raises directly with its
native code (ILO-R004 "key must be text or finite number") rather
than being remapped through the bridge to the generic ILO-R009.

The test's intent was "callback errors must surface on every engine",
which still holds. Same shape as the existing flatmap test, which already
asserts the shared substring rather than a specific code.
Lift srt 3 / rsrt 2 / rsrt 3 closure-bind to native VM dispatch (PR B of ILO-45)
PR C of ILO-45. Adds a combined `(Builtin::Ct, 2) | (Builtin::Ct, 3)`
compiler arm that emits a per-element OP_CALL_DYN loop with a numeric
counter accumulator. No new opcode or finalizer needed — counter
increments via OP_ADDK_N on the true branch of the bool typecheck,
identical to flt's predicate shape minus the LISTAPPEND.

Argc=1 for plain `ct fn xs`, argc=2 for closure-bind `ct fn ctx xs`.
Both share the same arm; the ctx variant just allocates the extra
contiguous arg1_reg before the loop. Cranelift inherits via the
existing OP_CALL_DYN codegen.
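In plain Rust, the loop this arm emits behaves roughly like the sketch below: one predicate call per element and a numeric counter bumped on the true branch. All names are hypothetical; the real arm emits bytecode and also typechecks the predicate result as bool.

```rust
/// Shared loop: call the predicate once per element, bump a numeric counter
/// on `true`. Same shape as a filter, minus the list append.
fn count_where<T>(xs: &[T], mut call: impl FnMut(&T) -> bool) -> f64 {
    let mut count = 0.0;
    for item in xs {
        if call(item) {
            count += 1.0;
        }
    }
    count
}

/// Plain form, analogous to `ct fn xs` (one argument per call).
fn ct2<T>(pred: impl Fn(&T) -> bool, xs: &[T]) -> f64 {
    count_where(xs, pred)
}

/// Closure-bind form, analogous to `ct fn ctx xs` (item plus ctx per call).
fn ct3<T, C>(pred: impl Fn(&T, &C) -> bool, ctx: &C, xs: &[T]) -> f64 {
    count_where(xs, |item| pred(item, ctx))
}
```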

Two entries removed from is_tree_bridge_eligible. After this PR, no
FnRef-taking entries remain on the bridge. The remaining work for the
eval-loop deletion is:
  - PR D: audit non-HOF bridge entries for transitive eval_body
    reachability, lift any that need it
  - PR E: delete eval_body / eval_stmt / eval_expr / Env / trampoline

AOT object baselines updated for the two examples that exercise `ct`
(ct-count-by-predicate, reserved-names). The remaining 134 baselines
are byte-identical to PR B's regen.

Test plan: cargo test --release --features cranelift green.
Smoke test: pos / above-threshold counts give identical output across
default / --vm / --jit. examples/ct-count-by-predicate.@ covers
both ct 2 and ct 3 (closure-bind) — now natively dispatched.
Lift ct 2 / ct 3 count-by-predicate to native VM dispatch (PR C of ILO-45)
PR D of ILO-45. The HOF builtins (map / flt / fld / srt 2-3 / rsrt 2-3 /
ct / mapr / partition / flatmap / uniqby / grp) are no longer reachable
from the tree-bridge: PRs A/B/C/earlier-3b/3c lifted each one to native
VM dispatch and removed them from is_tree_bridge_eligible.

Their tree-walker Rust impls inside call_function were dead code —
nothing in the VM dispatch path routes a HOF call through the bridge
any more, and the engine-Tree CLI surface was removed in PR #613.

Cuts:
  - 12 HOF impl blocks in src/runtime/mod.rs call_function (3803-3854,
    3901-3953, 5715-6186)
  - The closure_captures / resolve_fn_ref nested helpers that only
    those impls used
  - 14 runtime::tests inline unit tests that exercised the deleted
    impls via run_str (the tree-walker engine entry point)

The tree-walker eval loop itself (eval_body / eval_stmt / eval_expr /
Env / trampoline / ACTIVE_AST_PROGRAM) stays for PR E. After this PR,
it is only reachable from the user-fn dispatch tail of call_function,
which is now only reached by other tests in the runtime::tests module
that exercise the tree-walker through run_str. PR E deletes that path
and the eval loop.

864 lines of dead code gone. cargo test --release --features cranelift
green.
My PR D delete script stripped 14 failing tests but left two artefacts:
- 20 duplicate `#[test]` lines stacked above other tests (the saved-but-
  never-cleared bug in the awk script).
- 6 `run2_*` tests had `#[test]` then `#[cfg(not(target_family="wasm"))]`
  ordering; my awk's pending-attribute logic discarded `#[test]` because
  the next line wasn't `fn`. Restored.

cargo clippy --release --features cranelift -- -D warnings clean.
Delete unreachable tree-walker HOF impls (PR D of ILO-45)
* feat(http): add getx/pstx with status + headers + body Ok-map

`get` / `pst` return `R t t`, body only. That blocks every workflow that
needs response metadata: conditional requests (304 Not Modified
branching), redirect following, pagination Link headers, rate-limit
headers, cookie capture, status-code branching beyond Ok/Err.

This adds `getx` / `pstx`: rich-response variants that surface status,
headers, and body as a Map[Text, _] in the Ok arm. Existing `get` / `pst`
shapes stay untouched, so token-cheap GETs keep their footprint and
there's zero migration cost.

  getx url              > R (M t _) t
  getx url headers      > R (M t _) t
  pstx url body         > R (M t _) t
  pstx url body hdrs    > R (M t _) t

Ok-map carries three keys: `status` (n), `headers` (M t t, lowercased),
`body` (t). Non-2xx responses surface as Ok with the status on the map;
only transport failure (DNS, connection refused, timeout) returns Err.
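The Ok/Err split described above can be sketched as follows. Value, Transport, and to_getx_result are hypothetical names for illustration; the real builtin goes through the runtime's own value type and HTTP client.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Num(f64),
    Text(String),
    Map(BTreeMap<String, Value>),
}

/// Hypothetical outcome of the HTTP call before it is turned into a Result.
enum Transport {
    Response { status: u16, headers: Vec<(String, String)>, body: String },
    Failed(String), // DNS failure, connection refused, timeout, ...
}

/// Any response that arrived is Ok, whatever its status code.
/// Only a transport failure becomes Err.
fn to_getx_result(outcome: Transport) -> Result<Value, String> {
    match outcome {
        Transport::Failed(reason) => Err(reason),
        Transport::Response { status, headers, body } => {
            // Header names are lowercased so lookups are case-insensitive.
            let headers: BTreeMap<String, Value> = headers
                .into_iter()
                .map(|(name, value)| (name.to_ascii_lowercase(), Value::Text(value)))
                .collect();
            let mut map = BTreeMap::new();
            map.insert("status".to_string(), Value::Num(f64::from(status)));
            map.insert("headers".to_string(), Value::Map(headers));
            map.insert("body".to_string(), Value::Text(body));
            Ok(Value::Map(map))
        }
    }
}
```

A 304 response therefore lands in the Ok arm with status 304 on the map, which is what lets callers branch on conditional-request results.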

Tree-bridge dispatched (returns Result, no FnRef args), so VM and
Cranelift JIT inherit identical semantics without dedicated opcodes.
Appended last to `Builtin::ALL` to preserve every existing on-wire tag,
and added to `tree_bridge_returns_result` so the bang forms (`getx!`,
`pstx!`) auto-unwrap correctly.

* test: rename graph subgraph-coverage user fn off the new getx builtin

src/graph.rs::test_subgraph_type_inclusion_via_dep used getx as a user fn
name. Renamed to getp so the test keeps exercising the same code path now
that getx is reserved.

* test: cross-engine coverage for getx/pstx against wiremock

Seven new wiremock-backed integration tests in tests/http.rs covering
status/headers/body extraction, the 304-stays-Ok invariant, both arg-count
forms, transport-failure Err, and two verify-time arg-type rejection cases.
All run on --vm and --jit (cranelift feature).

examples/http-rich-response.ilo for the examples_engines harness covers
the bang-form Err propagation across the 1- / 2- / 3-arg variants.

* docs: sync getx/pstx to SPEC.md, ai.txt, and skill

Four rows in the SPEC builtins table, paragraph + example in the HTTP
section, regenerated ai.txt via build.rs, plus a 'when to reach for getx
vs get' paragraph in skills/ilo/ilo-builtins-io.md.

Site companion change pushed separately to ilo-lang/site.

* feat: ilo test subcommand

Surfaces the in-tree -- run: / -- out: / -- err: annotation format as a
user-facing command. The same format that tests/examples_engines.rs
already exercises, exposed so end-user programs and test suites can
assert behaviour from the same files agents read as in-context
examples. ilo test <file> runs one file; ilo test <dir> walks .ilo
files recursively. Each case spawns the current ilo binary with the
chosen engine flag (defaults to --vm; --engine jit / --engine all
widen the matrix), asserts stdout (-- out:) or stderr (-- err:) against
the expected payload, and prints PASS / FAIL with the source line.
Final line is N passed, M failed; exit 0 on all-pass, 1 otherwise.

The runner lives in src/cli/test_runner.rs, dispatched from main on
Cmd::Test. Path defaults to examples/ when omitted so ilo test in a
fresh checkout does something useful out of the box.

* test: end-to-end coverage for ilo test

Seven subprocess tests pin the dispatch path: passing case, failing
case (exit 1 + FAIL line), -- err: assertion shape, directory
recursion, --engine all running both engines, missing-path error, and
no-annotations-found error. Each spawns the ilo binary so the test
exercises real CLI parsing + the runner's own subprocess fan-out, two
layers deep.

* fix(verify): hint two-kebab-half subtraction in ILO-T004

When an unbound 3+ segment kebab ident splits uniquely into two halves
that ARE bound, suggest the subtraction reading. Hit by mandelbrot
persona writing `zr-sq-zi-sq` with no spaces around the operator,
expecting it to parse as `(zr-sq) - (zi-sq)`. Pre-fix the verifier
emitted bare ILO-T004 with no hint (single segments unbound, levenshtein
distance > 3 too).

The 2-segment path now also requires both halves to resolve before
firing the existing hint, matching the new 3+ logic and dropping a
small false-positive case where only the legacy "all segments bound"
test gated.

Hint shows both prefix (`- zr-sq zi-sq`) and infix-with-spaces
(`zr-sq - zi-sq`) forms so the persona has two unambiguous canonical
spellings.
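The "splits uniquely into two bound halves" check could be sketched like this. The function names and the hint wording are hypothetical; only the rule (exactly one hyphen position where both sides resolve) comes from the commit message.

```rust
use std::collections::HashSet;

/// Returns the two halves when exactly one hyphen position splits `ident`
/// into two names that are both bound; None when no split or several fit.
fn unique_bound_split<'a>(ident: &'a str, bound: &HashSet<&str>) -> Option<(&'a str, &'a str)> {
    let mut found = None;
    for (i, ch) in ident.char_indices() {
        if ch != '-' {
            continue;
        }
        let (left, right) = (&ident[..i], &ident[i + 1..]);
        if bound.contains(left) && bound.contains(right) {
            if found.is_some() {
                return None; // ambiguous: more than one valid split
            }
            found = Some((left, right));
        }
    }
    found
}

/// Hypothetical hint text showing the prefix and spaced-infix spellings.
fn subtraction_hint(ident: &str, bound: &HashSet<&str>) -> Option<String> {
    let (l, r) = unique_bound_split(ident, bound)?;
    Some(format!("did you mean `- {} {}` or `{} - {}`?", l, r, l, r))
}
```

For `zr-sq-zi-sq` with `zr-sq` and `zi-sq` in scope, only the middle hyphen yields two bound halves, so the hint fires with a single reading.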

* test: pin the two-kebab-half subtraction case via examples_engines

Adds a `mandel zr-sq:n zi-sq:n>n;- zr-sq zi-sq` line to the kebab-vs-
subtract example so the cross-engine examples harness exercises the
canonical "subtraction between two hyphenated names" form on every
engine. Comment block above it explains why the no-space form would
fail and points at the new diagnostic.

* parser: reattach trailing .N to multi-token call results

The greedy call-args loop in parse_call_or_atom and the nested-call
expansion in parse_call_arg stopped at the leading `.` because Dot
doesn't start an operand. The trailing `.N` was left dangling for
the infix parser to choke on (ILO-P001 'expected declaration, got `.`').

Route every Call return site through parse_field_chain so the postfix
chain reattaches to the call result the same way it does for (expr).N
and xs.N. This means `spl "a.b" "." .0` and `num spl "1.2.3" ".".1`
now parse without parens, matching the bare-ident and parenthesised
shapes.

* test: cross-engine coverage for call-result dot-index

Pins the new shape across every public engine (VM and Cranelift JIT
when the feature is on):
- top-level multi-token call with trailing .N (glued and spaced)
- nested call inside outer arity-1 caller (the originating repro)
- safe .?N shorthand on the call result
- arity-known call (`at xs 0 .1`) inside list-element context
- equivalence with the parenthesised workaround
- existing bare-ident and paren shapes unchanged

examples/call-result-dot-index.ilo demonstrates the canonical shape
with -- run / -- out assertions so the examples_engines harness picks
it up as a regression test too. Notes the bare-ident-last-arg caveat
inline so agents reading the example don't get surprised when
`at rows i .1` glues .1 to i instead of the call result.

* docs: document call-result dot-index reattachment

SPEC.md records section and ai.txt RECORDS section both now describe
the new multi-token-call shape alongside the existing parenthesised
and bare-ident shapes, with the bare-ident-last-arg caveat called
out so agents know when to fall back to parens or bind-first.

* feat: O(n) rolling-window reducers rsum/ravg/rmin

Add three new builtins for sliding-window aggregations over numeric
lists: rsum (running sum), ravg (running mean), and rmin (running
minimum). Output length is len xs - n + 1; empty when n > len xs.

The asymptotic cost is the whole point. The natural recipe in ilo
today is map (i:n>n;sum (slc xs i (+ i n))) ..., which is O(n*w) and
explodes for wide windows on long inputs. rsum/ravg use a running sum
(one add and one subtract per step), and rmin uses a monotonic deque,
giving O(n) amortised total for all three.
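The two techniques can be sketched in a few lines of Rust. This is an illustration, not the builtin implementation: the signatures are assumptions, and where the real builtins raise an error for n = 0, this sketch simply returns an empty list.

```rust
use std::collections::VecDeque;

/// Running sum: one add and one subtract per step after the first window.
fn rsum(xs: &[f64], n: usize) -> Vec<f64> {
    if n == 0 || n > xs.len() {
        return Vec::new(); // sketch only: the real builtin errors on n = 0
    }
    let mut out = Vec::with_capacity(xs.len() - n + 1);
    let mut acc: f64 = xs[..n].iter().sum();
    out.push(acc);
    for i in n..xs.len() {
        acc += xs[i]; // element entering the window
        acc -= xs[i - n]; // element leaving the window
        out.push(acc);
    }
    out
}

/// Running mean: the running sum divided by the window width.
fn ravg(xs: &[f64], n: usize) -> Vec<f64> {
    rsum(xs, n).into_iter().map(|s| s / n as f64).collect()
}

/// Running minimum via a monotonic deque of indices whose values increase
/// from front to back; the front is always the current window's minimum.
fn rmin(xs: &[f64], n: usize) -> Vec<f64> {
    if n == 0 || n > xs.len() {
        return Vec::new();
    }
    let mut out = Vec::with_capacity(xs.len() - n + 1);
    let mut deque: VecDeque<usize> = VecDeque::new();
    for i in 0..xs.len() {
        // Drop candidates that can never be the minimum again.
        while let Some(&back) = deque.back() {
            if xs[back] >= xs[i] {
                deque.pop_back();
            } else {
                break;
            }
        }
        deque.push_back(i);
        // Drop the front once it has slid out of the window [i - n + 1, i].
        if let Some(&front) = deque.front() {
            if front + n <= i {
                deque.pop_front();
            }
        }
        if i + 1 >= n {
            out.push(xs[deque[0]]);
        }
    }
    out
}
```

Each index is pushed and popped at most once in rmin, which is where the amortised O(n) bound comes from.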

Tree-bridge eligible alongside the cumsum/cprod/ewm aggregate family:
the tree interpreter does the work, VM and Cranelift inherit through
OP_CALL_BUILTIN_TREE at zero opcode cost. ILO-R009 propagates on
Cranelift via tree_bridge_propagates_error so the n=0 / negative-n /
non-numeric-element error parity holds across engines.

Builtin tags appended last to preserve every existing on-wire tag.

* test: cross-engine coverage for rsum/ravg/rmin

25 tests across tree, register VM, and Cranelift JIT covering:
- basic three-wide windows for all three reducers
- boundary conditions: window=1 (identity), window=len (single point),
  window>len (empty output), empty input
- error paths: n=0, negative n, fractional n
- shape-specific cases: rmin against ascending / descending /
  repeated-minimum inputs to exercise the monotonic-deque branches
- verify-time ILO-T013 for wrong xs type, wrong n type, non-list xs

Includes a hand-computed parity case for rsum so a future bug that
silently changes the output shape doesn't pass just because all three
engines agree on the wrong answer.

* test: ignored microbench locking O(n) rolling-window vs slc baseline

Pins the asymptotic improvement of rsum/rmin against the slc + sum/min
baseline at n=10k, w=500. The naive recipe does ~5M work; the
running-window form does ~10k. Bound set loosely at 2x to absorb CI
noise but still catch a regression to the naive O(n*w) shape.

Marked #[ignore] so the suite runtime stays bounded; run on demand via
`cargo test --release --features cranelift -- --ignored rolling_perf`.

* docs: example for rolling-window reducers

examples/rolling-aggregates.ilo doubles as an in-context learning
example for agents that hit this pattern in future, and as a
higher-level regression test the engine harness already runs across
every backend. Skill text deliberately omitted: the existing
ilo-builtins-math file is over the 1000-token cap by 146 tokens
already (a pre-existing lint failure on main), so adding to it would
deepen the regression. ai.txt + SPEC.md are the canonical references.

* docs: tighten experimental banner in README

* docs: surface get/pst headers, jpth dot-path, text-concat ref

Three doc-only discoverability fixes hit by multiple personas in 2026-05-20
sessions. Canonical signatures in SPEC.md and src/verify.rs were already
correct, but the skill docs weren't front-loading them enough for agents to
find on first read.

- SKILL.md gets a "Quick reference - things agents miss" block covering
  text concatenation (+, fmt, cat), HTTP custom headers (every verb takes
  an optional M t t map), and jpth being dot-path not JSONPath.
- ilo-builtins-io.md HTTP section gets a bold lead on headers and a
  runnable mset/get!/pst! example.
- ilo-builtins-io.md JSON section front-loads the dot-path warning before
  the jpth signature line.
- ilo-builtins-text.md gets a concat/format quick-reference at the top so
  agents reaching for "how do I join two strings" pick + over cat.

Personas: scrapingbee, tui-client (#26b headers), bearer-token-client (#26c
jpth), 5+ across sessions (#26e concat confusion).

* docs: tighten io/text doc additions to fit token budget

Trim the headers/jpth/concat additions to single dense lines so the
budget-checker doesn't get worse than baseline. Net result: io drops
from 1837 to 1748 (under its previous overage), text returns to ~baseline.

* examples: tail-recursive find-idx pattern

Adds examples/find-idx-tail-recursive.ilo showing the canonical
recursive linear-search idiom: base cases as braceless guards first,
recursive call as the tail statement. Documents the common footgun -
appending a literal '-1' after the recursive call puts the call out of
tail position, so its return is discarded and the function silently
falls through to -1.

Surfaced by the interp1d persona report (2026-05-21), which read as a
braceless-guard early-return bug but turned out to be the discarded
non-tail-call shape. The braceless guard itself fires correctly in
isolation. Pinned by tests/examples_engines.rs across tree/VM/JIT/AOT.

* test: cross-engine coverage for three conditional forms

Pin each canonical conditional shape - braceless guard cond expr,
braced conditional cond{body}, brace ternary cond{a}{b}, prefix
ternary ?h cond a b - as a runnable example so the examples_engines
harness exercises them on tree/VM/JIT/AOT. Add a focused regression
test that asserts both the parse-cleanly invariant for each form
AND the wrong-form ?h cond{body} hint enumerates all three canonical
shapes, so the parser hint copy and the example shapes stay in lockstep
if either side drifts.

Backs the side-by-side comparison agents now reach for in
skills/ilo/ilo-language.md and the site's guards reference page.
Ten date/time personas tripped on this in the same session;
the parse-cleanly assertions catch the regression before the
docs lie.

* doc(skill): merge guards section into guards & conditionals

Ten date/time personas in the 2026-05-21 dispatch all tripped on
?h cond a b vs cond{body} vs braceless guard. The parser already
lands a context-aware ILO-P009 hint enumerating the three canonical
shapes; the skill doc had only a terse note under ## guards with no
reference to ternary or braced conditional, so the agent had nothing
to consult after reading the hint.

Replace ## guards with ## guards & conditionals: same single-paragraph
density, but now names all three shapes (early-return guard, braced
conditional, brace ternary) plus the two prefix-ternary subjects
(?h a b bare-bool and ?h cond a b keyword). One-line callout names
the wrong form ?h cond{...} so the hint copy lines up with the doc.

Kept terse on purpose - the file is already over the per-file token
cap from PR #566; net add over baseline is ~57 tokens. Full
side-by-side table lives in site/src/content/docs/docs/reference/
guards.md where there is no token budget.

* docs: multi-request HTTP flow patterns (OAuth + paginated)

Document the two canonical patterns for multi-request flows in ilo,
since there are no globals: closure-threading for in-run state
(paginated fetch, rate-limit windows) and file-backed for cross-run
state (OAuth refresh tokens, multi-hour TTLs).

Closes #26d (pending.md). Hit by bearer-token-client persona 2026-05-20;
agents were inventing the pattern badly (top-level vars rejected,
shelling out, misusing closures).

Also dedupes three accidental triplicate JSON-section paragraphs and a
duplicated 'run' block, net -55 tokens on the file.

* examples: oauth-token-cache + paginated-fetch

Worked patterns for the multi-request flow doc:

- paginated-fetch.ilo: closure-threading via recursion. Cursor + acc
  thread through fetch-loop; fake-page makes it offline-runnable
  cross-engine so the harness exercises the shape itself.

- oauth-token-cache.ilo: file-backed cache lifecycle. load reads or
  defaults, refresh produces a new value (real flow would pst! to the
  IdP), save persists. Uses now-ms in the path so re-invocations are
  hermetic.

Both run clean across tree (transitive via VM bridge), VM, and JIT.

* verify: warn on non-tail recursive self-call return discard

A recursive self-call followed by another statement in the same body
silently discards its return value, since ilo only uses the tail
expression as the function's return. The interp1d persona hit this
shape (find-idx xs target +i 1; -1) and mis-diagnosed it as a broken
braceless guard because the verifier emitted no signal. Add ILO-T043
to close the gap.

Narrowly scoped to recursive self-calls (callee name == caller name)
with a non-nil declared return type. Bare non-recursive user-fn calls
at non-tail position do not warn; the callee may legitimately be
side-effecting. Broaden later if reruns surface other classes.
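
The scoping rule above can be sketched over a toy statement list. This is a hedged illustration only: the `Stmt` enum and `discarded_self_calls` are hypothetical stand-ins, far smaller than the real verifier and AST, and they model just the cases the message names (self-call, non-nil return type, non-tail position, ret-wrapped calls excluded).

```rust
/// Simplified statement shape (hypothetical; not the real ilo AST).
enum Stmt {
    /// A bare call used as a statement, e.g. `find-idx xs target +i 1`.
    Call { callee: String },
    /// A call wrapped in `ret`, which returns explicitly and is not discarded.
    Ret,
    /// Anything else: literals, bindings, guards.
    Other,
}

/// Returns the indices of statements that would trigger the warning:
/// bare calls to `fn_name` itself that are not the tail statement,
/// in a function whose declared return type is not nil.
fn discarded_self_calls(fn_name: &str, returns_nil: bool, body: &[Stmt]) -> Vec<usize> {
    if returns_nil || body.is_empty() {
        return Vec::new();
    }
    let tail = body.len() - 1;
    body[..tail]
        .iter()
        .enumerate()
        .filter_map(|(i, stmt)| match stmt {
            // Only recursive self-calls warn; other callees may be side-effecting.
            Stmt::Call { callee } if callee.as_str() == fn_name => Some(i),
            _ => None,
        })
        .collect()
}
```

Under this sketch the persona's shape, a self-call followed by a literal -1, yields one flagged index, while the same call in tail position yields none.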

* test: regression for ILO-T043 recursive discard

Five cases covering the new warning surface: warns on non-tail
recursive self-call (the persona's shape), no-warn on tail recursive
self-call, no-warn on non-recursive user-fn call at non-tail, no-warn
when the call is wrapped in ret, and two warnings emitted when two
non-tail recursive calls appear in a row.

examples/recursive-tail-position.ilo pins the canonical fix shape
under examples_engines so every engine asserts the same output. The
example doubles as the in-context example any future agent reads when
ILO-T043 fires; the hint points at this file.

* fix(verify): targeted hint on ILO-T003 ternary branch mismatch

Before, the verifier emitted the generic 'both branches of a ternary
must return the same type' hint with no signal on which side to
convert. Agents either guessed (often producing a follow-on type
error) or restructured unnecessarily.

The new hint reads the two branch types and:
- for n vs t, surfaces both conversion directions: 'str <num-branch>'
  to make both text, or 'default-on-err (num <text-branch>) <fallback>'
  to make both number (since 'num' returns R n t, an unwrapped scalar
  is not enough);
- for everything else (bool vs text, L n vs M t n, two named records,
  R T E vs n, ...), falls back to restructure advice because str/num
  are the only built-in scalar coercions; suggesting one outside that
  pair would just trip ILO-T013.
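
A rough Rust sketch of the branch-type dispatch described above. The `Ty` enum, the function name, and the exact hint wording are assumptions made for illustration; only the two conversion snippets quoted in the message are taken from the source.

```rust
/// Toy type tags (hypothetical; the real verifier has a richer type model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Num,
    Text,
    Bool,
    List,
    Map,
}

/// Returns None when the branches agree, otherwise a hint string.
/// Number-vs-text mismatches get both conversion directions; every other
/// mismatch falls back to restructure advice, since str/num are the only
/// built-in scalar coercions.
fn ternary_mismatch_hint(then_ty: Ty, else_ty: Ty) -> Option<String> {
    if then_ty == else_ty {
        return None;
    }
    let hint = match (then_ty, else_ty) {
        (Ty::Num, Ty::Text) | (Ty::Text, Ty::Num) => {
            "use 'str <num-branch>' to make both text, or \
             'default-on-err (num <text-branch>) <fallback>' to make both number"
                .to_string()
        }
        _ => "no built-in coercion applies; restructure so both branches share a type"
            .to_string(),
    };
    Some(hint)
}
```

The match is symmetric in branch position, so the same hint appears whichever side holds the number.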

* test: regression coverage for ternary type-mismatch hint

Pins the new ILO-T003 hint shape across the four cases that matter:
- n vs t in either branch position must surface both 'str <num-branch>'
  and 'default-on-err (num <text-branch>)' so the agent can pick the
  direction matching intent;
- matching-type ternaries still verify clean (false-positive guard);
- bool vs text and list vs map fall back to the restructure hint,
  with explicit negative asserts that we don't suggest a str/num
  conversion that wouldn't apply.

Adds examples/ternary-types.ilo showing the canonical correct shapes
after applying each hint. The examples_engines harness runs it across
every available engine on every CI run, so the docs and the verifier
hint can't drift apart silently.

* doc: note ternary branch-type hint in SPEC ternary section

SPEC already documents the ILO-T038 'condition must be b' rule next to
the prefix-ternary explanation; this slots the matching ILO-T003 rule
in the same place so agents reading the ternary spec see both checks
together. ai.txt regenerates from SPEC.md via build.rs.

* test: regression guard for ILO-50 fld multi-statement lambda body

Add two tests to regression_inline_lambda.rs that pin the correct
behaviour for a fld inline lambda with a let-stmt + final-expr body
(the exact shape reported in ILO-50). Could not reproduce the alleged
ambiguity; these tests ensure any future parser regression is caught.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add STABILITY.md at repo root

Pre-1.0 ilo churns at very different rates per surface. The README banner
says "expect breaking changes" at the language-as-a-whole level, but
agents need finer signal: schemaVersion:1 JSON envelopes, ILO-XXXX error
codes, serv protocol phases, the file-version pragma, and the manifesto
are all things you can pin to and carry forward. CLI flag names, builtin
signatures, error message prose, and the examples corpus are not.

Three tiers (Stable, Provisional, Experimental), each entry naming what
is locked in, what isn't, and the change policy. Pairs with the
experimental banner in #568.

README banner now points at STABILITY.md so the first thing an agent
reads after "expect breaking changes" is where to find the actual
matrix.

Closes #55. Exposing the matrix via 'ilo spec --json' deferred to a
follow-up - non-trivial JSON plumbing.

* add rand alias for rnd

* docs: rand alias in SPEC, ai.txt, skill, and CHANGELOG

- SPEC.md builtins table notes that `rnd` returns random, not round,
  with the alias pair `rand`/`random` and a pointer at `rou`/`round`
  for rounding.
- SPEC.md aliases table gains the `rand` -> `rnd` row.
- ai.txt regenerated from SPEC.md via build.rs.
- skills/ilo/ilo-builtins.md math section now spells out the
  round-vs-random trap so agents writing programs through the skill
  see the disambiguation in context.
- CHANGELOG 0.12.1 lists `rand` as an additive ergonomic alias.

* feat: bisect for O(log N) sorted-list search (#5bb)

Add bisect xs:L n target:n > n with Python bisect_left semantics: it
returns the leftmost index i such that every element of xs[0..i] is
< target and every element of xs[i..] is >= target. Empty list
returns 0; target greater than every element returns len xs; ties
resolve to the leftmost equal index. NaN target propagates as NaN to
match argmax/argmin policy.
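
For reference, the contract above can be sketched in a few lines of Rust using the standard library's `partition_point`. The name `bisect_left` and the `f64` return type are assumptions for this sketch, chosen to mirror ilo's single numeric type; this is not the code in the ilo source.

```rust
/// Leftmost insertion index for `target` in an ascending-sorted slice.
/// The caller owns the sortedness precondition; it is not checked here.
fn bisect_left(xs: &[f64], target: f64) -> f64 {
    // A NaN target propagates as NaN rather than producing an index.
    if target.is_nan() {
        return f64::NAN;
    }
    // partition_point binary-searches for the first element where the
    // predicate turns false, i.e. the first element >= target.
    xs.partition_point(|&x| x < target) as f64
}
```

With input `[1, 3, 3, 5]` and target 3 the sketch returns 1, the leftmost equal index, and an empty slice returns 0.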

The asymptotic point is the whole point. The natural recipe in ilo
today is len (flt (x:n>b;< x target) xs) or hd (flt fn (enumerate xs)),
which is O(n) per lookup. interp1d, sorted-lookup, percentile pickers
and histogram binning all need this on inner loops; collapsing the
scan to one builtin call gives O(log n) at the same token cost.
Caller owns the sortedness precondition; we do not validate it,
matching srt/unq/Python bisect precedent.

Tree-bridge eligible alongside the argmax/argmin/argsort index-
returning aggregates: the tree interpreter does the work, VM and
Cranelift inherit through OP_CALL_BUILTIN_TREE at zero opcode cost.
Returns plain n (not Result), so no tree_bridge_returns_result entry
needed. No error propagation needed either: bisect cannot fail on
sorted-numeric input. Type errors at the bridge raise ILO-R009
identically across engines via the normal interpreter arm.

Builtin tag appended last to preserve every existing on-wire tag.

* test: cross-engine coverage for bisect

12 tests pinning the bisect_left contract across tree, VM, and JIT
(when built with --features cranelift): in-range insertion, before-
first, after-last, duplicates (leftmost wins), empty list (returns 0),
single-element under/equal/over, exact-match resolves leftmost,
negative numbers, NaN target propagates, and a replaces-filter-count
comparison pinning the manifesto framing.

NaN is generated with sqrt (0 - 1) rather than / 0 0 because the
latter raises ILO-R003 in tree and VM before bisect ever sees the
value.

* docs: example for bisect

examples/bisect.ilo demonstrates the six pinned shapes from the
regression test (in-range, before-first, after-last, duplicates,
empty, single-element exact-match) under the examples_engines
cross-engine harness. Header comment names the Python bisect_left
semantics and the caller-owned sortedness precondition so agents
encountering this for the first time read the contract before the
syntax.

* chore: trim bisect skill entry to keep ilo-builtins-math token budget tight

The original wording added 100+ tokens to ilo-builtins-math.md (already
1146 / 1000 cap pre-PR). The new builtin still needs to appear there so
agents discover it via skill load, but the contract details belong in
SPEC.md / site docs / the example file - not in the skill bundle that
agents pay for on every task.

Compress to the same one-line-list pattern the rest of the Statistics
section already uses, with a one-sentence trailer naming the semantics.
That cuts +103 tokens to +30, keeping the per-PR drift minimal while
the broader skill-token cap conversation plays out separately.

* docs: skills/ilo/ilo-builtins.md uses pst as canonical (closes ILO-80)

Explicitly note that pst is the canonical name since 0.12.0 and that
the old name post is not accepted, preventing agents loading the skill
from using the wrong builtin name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: gate AOT tests behind #[ignore] (need libilo.a prereq) — closes ILO-290

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: flip all PR3c srt/grp TODO tests to run_all now that #391 merged

PR #391 (srt/grp/uniqby off tree-bridge) is merged. Remove all
`run_tree_only` gates and TODO comments referencing #391 in
regression_inline_lambda.rs, flip each call site to `run_all`, and
delete the now-unused `run_tree_only` helper. All 21 tests pass on
tree, VM, and Cranelift.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: enforce 16 MiB test stack via .cargo/config.toml + explicit guard on fib canary (partial ILO-289)

- Add .cargo/config.toml with RUST_MIN_STACK=16777216 so local cargo
  test runs match the CI env var setting in rust.yml.
- Unignore .cargo/config.toml in .gitignore (directory was globally
  ignored; allow the one tracked file via negation rule).
- Wrap interpret_braceless_guard_fibonacci in an explicit 8 MiB
  std::thread::Builder stack so the canary test passes independently of
  the env var, mirroring the run_on_fat_stack pattern in
  tests/parser_depth_cap.rs.
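
A minimal sketch of the explicit-stack pattern, assuming a helper shaped like the `run_on_fat_stack` named above. The signature and the `fib` workload here are illustrative guesses, not the code in tests/parser_depth_cap.rs.

```rust
use std::thread;

/// Runs `f` on a fresh thread with an explicit 8 MiB stack, so the result
/// does not depend on RUST_MIN_STACK being set in the environment.
/// (Hypothetical helper; the real one may differ in signature.)
fn run_on_fat_stack<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(8 * 1024 * 1024)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

/// Deliberately naive recursion, standing in for the fib canary.
fn fib(n: u64) -> u64 {
    if n < 2 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```

A test body would then call `run_on_fat_stack(|| fib(20))` instead of invoking the recursion on the default test-thread stack.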

Defers CI lint for oversized match arms to a follow-up ticket.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add: paren-form call syntax as sugar over postfix

Closes ILO-51. `spl(row, ",")` now parses identically to `spl row ","`.
Same AST node; postfix remains canonical idiom. Disambiguates via adjacency:
`f(x)` is call, `f (x)` keeps `(x)` as grouped-expr arg.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix: ILO-P101 hint also mentions paren-form (closes ILO-92)

Update the ILO-P101 diagnostic hint to mention both the postfix-around-call
form `({name} <args>)` and the new paren-form `{name}(<args>)` introduced
by ILO-51, giving agents a clearer choice of syntax.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add quantile worked example (closes ILO-55)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: surface stability matrix in ilo spec --json ai and ai.txt (ILO-75)

Adds a `stability` field to the `ilo spec --json ai` JSON envelope listing
stable/provisional/experimental surfaces, cross-linked to STABILITY.md.
Adds a STABILITY line to ai.txt so agents consuming the compact spec see
the stability contract directly. Per-item stability in the spec output is
scoped as a follow-up (the current ai.txt format is a flat blob, not
per-item; restructuring that is L scope).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add dispatch-arm size lint to prevent stack-pressure regressions (ILO-339)

Adds scripts/check-dispatch-arms.sh which scans the call_function
if-builtin dispatch chain in src/interpreter/mod.rs and fails if any
arm exceeds 40 lines. Large inline arms inflate the debug-build stack
frame and have caused recurring stack-overflow failures in deep-recursion
tests. Engineers hitting the lint should extract the arm body into a
#[inline(never)] helper.

The workflow step (rust.yml) is not included here because the token
lacks workflow scope; the snippet to add is in the PR description.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(manifesto): resolve principle 4 contradiction re English keywords (ILO-77)

Principle 4 listed the residual English keywords but left the contradiction
with "language-agnostic" implicit. Adds an explicit "by design" paragraph
explaining why the 11 kept keywords pass the token-cost test (each is already
a single token, sigil alternatives scored lower on generation accuracy) and
marks the set as frozen — no new English keywords will be added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: apply cargo fmt to parser_paren_call.rs

Followup to PR #588 - the merged tests file landed with formatting that
cargo fmt --check rejects. CI lint job has been red on every PR since.
No behaviour change.

* chore: regenerate ai.txt to match SPEC.md

build.rs regen drops a stale STABILITY line that no longer appears in
SPEC.md. CI's 'Verify ai.txt is in sync' guard has been red on every PR
in the queue since this slipped through.

* chore: drop needless Ok wrapper around ? in field-chain return

clippy::needless_question_mark is denied in CI ('cargo clippy ... -- -D
warnings'); a stray Ok(...?) at src/parser/mod.rs:4158 has been failing
the lint job on every PR in the queue. No behaviour change.
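
For readers unfamiliar with the lint, a small sketch of the shape it flags and the fix. The function names are hypothetical and unrelated to the parser code at the cited location.

```rust
use std::num::ParseIntError;

fn parse_index(text: &str) -> Result<usize, ParseIntError> {
    text.parse::<usize>()
}

/// The flagged shape: `?` unwraps the Result only for `Ok` to rewrap it.
/// The allow attribute keeps this sketch lint-clean for comparison.
#[allow(clippy::needless_question_mark)]
fn field_index_flagged(text: &str) -> Result<usize, ParseIntError> {
    Ok(parse_index(text)?)
}

/// The fix: return the inner Result directly. Behaviour is unchanged
/// because the error type is the same on both sides.
fn field_index(text: &str) -> Result<usize, ParseIntError> {
    parse_index(text)
}
```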

* chore: document dev vs release tag convention (closes ILO-66)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: restore cross-language benchmark suite (ILO-65)

Add bench/ directory with five canonical benchmarks (fib, hof, listproc,
pattern-match, sum-loop) each implemented in ilo + Python / Node.js / Rust.
bench/run.sh runs all impls, verifies correctness, and emits bench/results.json.
CI nightly workflow (.github/workflows/bench.yml) runs the suite and gates
on >10% regression vs previous run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Lexer: add hex/binary/octal numeric literal support (0xFF, 0b1010, 0o755)

Extend the logos Number token with three additional regex handlers that
parse 0x…/0X… (hex), 0b…/0B… (binary), and 0o…/0O… (octal) prefixed
literals, converting them to f64 at lex time so the parser and all
evaluation engines require no changes. Bare prefixes with no digits lex
as an error (non-zero exit). 25 tests covering all three bases,
case-insensitive prefixes, mixed arithmetic, and empty-digit error cases.
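
The conversion step can be sketched with the standard library alone. This is a hedged illustration: the real lexer uses logos regex handlers, and `parse_prefixed_literal` is a hypothetical name. How the real lexer treats overflow or digit separators is not stated in the message, so this sketch simply rejects anything that is not a plain run of valid digits.

```rust
/// Parses a 0x / 0b / 0o prefixed literal (either prefix case) into f64.
/// Returns None for a bare prefix, an invalid digit, or a value that does
/// not fit in u64 (assumption: the real lexer reports a lex error instead).
fn parse_prefixed_literal(src: &str) -> Option<f64> {
    let (radix, digits) = match src.get(..2)? {
        "0x" | "0X" => (16, &src[2..]),
        "0b" | "0B" => (2, &src[2..]),
        "0o" | "0O" => (8, &src[2..]),
        _ => return None,
    };
    // Reject empty digit runs and stray characters such as a leading '+',
    // which from_str_radix would otherwise accept.
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    u64::from_str_radix(digits, radix).ok().map(|v| v as f64)
}
```

Converting at lex time means downstream stages only ever see an ordinary number, which matches the claim that the parser and engines need no changes.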

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add ilo trace subcommand for JSON-line value snapshots (ILO-72)

Implements `ilo trace <file.ilo> [func] [args...]` which runs the
tree-walking interpreter and emits one JSON line per statement execution.
Each line carries schemaVersion, line number, source text, all current
bindings, and the statement result value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve clippy warnings in trace interpreter and parser

* chore: cargo fmt + regenerate ai.txt

* feat: per-item stability field in ilo spec --json ai (ILO-340)

Add a `stability()` method to `Builtin` that returns `"provisional"` for
all builtins shipped in 0.12.1 or earlier and `"experimental"` for the
twelve unreleased builtins above 0.12.1 in CHANGELOG.md (matvec, lstsq,
jpar-list, get-to, pst-to, tz-offset, run2, rgxall-multi, fmod,
dtparse-rel, dur-parse, dur-fmt).

`ilo spec --json ai` now emits a `builtins` array where every entry carries
`{name, stability}`, sourced from `Builtin::ALL` via the new method. The
existing top-level `stability` summary is preserved unchanged for backward
compat. SPEC.md gains a `## Stability` section so build.rs regenerates the
matching `STABILITY:` line in ai.txt (fixing the pre-existing divergence
between SPEC.md and ai.txt introduced in PR #598). One new integration test
asserts the builtins array is present, non-empty, and uses only known tiers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* feat: optional labelled args for all callable arities (ILO-71)

Adds `label:value` syntax at call sites for any function or builtin with
declared parameter names. Labels resolve to positional by name at parse
time, so `dtfmt epoch:e fmt:"%Y"` is identical to `dtfmt e "%Y"`, and
`dtfmt fmt:"%Y" epoch:e` (reversed) also works.

- Parser: `fn_param_names` table (mirrors `fn_arity`) populated for all
  builtins and user-defined functions via `register_user_fn`.
- `peek_labelled_arg_label`: lookahead that returns the label name when
  `ident:non-type-token` is ahead; disambiguates from param declarations
  using `>` / `:` after the type ident as the type-context signal.
- `resolve_labelled_args`: collects all labelled args following positional
  args and reorders to declaration order, emitting `ILO-P019` on unknown,
  duplicate, or conflicting labels.
- `parse_paren_call_args_for(fn_name)`: paren-form call parsing now also
  recognises `label:value` items when called with a known function name.
- Resolution is 100% at parse time — no new AST nodes; all three engines
  (tree, VM, JIT) see ordinary positional `Expr::Call` args.
- `examples/labelled-args.ilo` exercises all-labelled, reversed, mixed
  positional+labelled, builtin `dtfmt`, paren form, and positional fallback.
- Doc touchpoints updated: SPEC.md, ai.txt, skills/ilo/SKILL.md.
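
The reordering step can be sketched as follows. This is an illustration under stated assumptions, not the parser's code: arguments are modelled as source-text strings, `LabelError` is a hypothetical stand-in for the ILO-P019 diagnostic, and the `TooMany` and `Missing` variants are additions of this sketch that the message does not mention.

```rust
#[derive(Debug, PartialEq)]
enum LabelError {
    Unknown(String),   // label is not a declared parameter
    Duplicate(String), // same label given twice
    Conflict(String),  // label names a slot already filled positionally
    TooMany,           // assumption: more positional args than parameters
    Missing(String),   // assumption: a parameter was never supplied
}

/// Reorders `label:value` pairs into declaration order after the leading
/// positional args, so later stages see an ordinary positional call.
fn resolve_labelled_args<'a>(
    params: &[&str],
    positional: &[&'a str],
    labelled: &[(&str, &'a str)],
) -> Result<Vec<&'a str>, LabelError> {
    if positional.len() > params.len() {
        return Err(LabelError::TooMany);
    }
    let mut slots: Vec<Option<&'a str>> = vec![None; params.len()];
    for (i, value) in positional.iter().enumerate() {
        slots[i] = Some(*value);
    }
    for (label, value) in labelled {
        let idx = params
            .iter()
            .position(|p| p == label)
            .ok_or_else(|| LabelError::Unknown(label.to_string()))?;
        if idx < positional.len() {
            return Err(LabelError::Conflict(label.to_string()));
        }
        if slots[idx].is_some() {
            return Err(LabelError::Duplicate(label.to_string()));
        }
        slots[idx] = Some(*value);
    }
    slots
        .into_iter()
        .enumerate()
        .map(|(i, slot)| slot.ok_or_else(|| LabelError::Missing(params[i].to_string())))
        .collect()
}
```

With parameters `epoch` and `fmt`, both the reversed all-labelled call and the mixed positional-plus-labelled call resolve to the same positional vector.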

All 3330 existing tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

---------

Co-authored-by: Daniel Morris <daniel@cubitts.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: delete tree-walker eval loop (PR E of ILO-45)

The tree-walker engine is genuinely gone now. eval_body, eval_stmt,
eval_expr, the trampoline, try_synthesize_tail_call, BodyResult,
ACTIVE_AST_PROGRAM-tied eval entry, the self-rebind peephole helpers
(match_self_rebind_mset/append/concat + their eval_* siblings),
expr_refers_to, eval_literal, eval_binop, values_equal, is_truthy,
match_pattern, eval_return_stmt, eval_tail_expr_stmt — all deleted.

Env stripped to the two fields the bridge-entry builtin impls still
need: `functions` (so the program-aware bridge can resolve HOF
callbacks if PR E's audit ever drifts) and `caps` (capability checks
in IO/network/process builtins).

The five remaining callers of `runtime::run*` in main.rs migrated to
`vm::compile` + `vm::run*`:
  - ilo serv JSON dispatcher (with and without tools)
  - REPL (`ilo repl`)
  - --run-time CLI fallback (old "VM with tree-walker as canonical
    semantics" pattern is gone)
  - --bench loop (tree warmup + measure block deleted; VM is now the
    interpreter baseline)
  - runtime::run_with_caps fallback after VM rejection (deleted; VM is
    the only engine, its rejection IS the error)

call_function survives as a builtin-only dispatcher — the user-fn
dispatch tail (Decl::Function/Tool/TypeDef/Alias/Use/Error arms with
their TCO trampoline) is gone. Unknown name surfaces as ILO-R002.

Caps gap fix: pre-PR-E the tree-walker honoured `env.caps` for IO
builtins; the VM/Cranelift tree-bridge created a fresh Env::new() with
default permissive caps, so cap-restricted programs were enforced only
on the tree path. New ACTIVE_CAPS thread-local + ActiveCapsGuard RAII
wrapper installs the live VM caps for the duration of bridge dispatch,
and Env::new() reads from the slot. Two new bridge entries
(call_builtin_for_bridge_with_caps and ..._with_program_and_caps) plus
the VM's OP_CALL_BUILTIN_TREE arm now thread `self.caps.clone()` so the
caps tests (including the run-allowlist enforcement that surfaced this
gap) pass on VM and JIT.
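
The thread-local-plus-guard mechanism described above can be sketched like this. Only the names ACTIVE_CAPS, ActiveCapsGuard, and Env::new come from the message; the `Caps` struct, its `allow_net` field, and the `install` constructor are hypothetical simplifications.

```rust
use std::cell::RefCell;

/// Toy capability set (hypothetical; the real caps type is richer).
#[derive(Clone, Debug, PartialEq)]
struct Caps {
    allow_net: bool,
}

impl Caps {
    fn permissive() -> Self {
        Caps { allow_net: true }
    }
}

thread_local! {
    /// Caps of the VM currently dispatching through the bridge, if any.
    static ACTIVE_CAPS: RefCell<Option<Caps>> = RefCell::new(None);
}

/// RAII guard: installs caps on creation, restores the previous slot on drop.
struct ActiveCapsGuard {
    prev: Option<Caps>,
}

impl ActiveCapsGuard {
    fn install(caps: Caps) -> Self {
        let prev = ACTIVE_CAPS.with(|slot| std::mem::replace(&mut *slot.borrow_mut(), Some(caps)));
        ActiveCapsGuard { prev }
    }
}

impl Drop for ActiveCapsGuard {
    fn drop(&mut self) {
        let prev = self.prev.take();
        ACTIVE_CAPS.with(|slot| *slot.borrow_mut() = prev);
    }
}

struct Env {
    caps: Caps,
}

impl Env {
    /// Reads the live caps from the slot, falling back to permissive
    /// defaults when no guard is installed.
    fn new() -> Self {
        let caps = ACTIVE_CAPS
            .with(|slot| slot.borrow().clone())
            .unwrap_or_else(Caps::permissive);
        Env { caps }
    }
}
```

Restoring the previous value on drop, rather than clearing the slot, keeps nested bridge dispatches correct in this sketch.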

Tests:
  - tests/regression_tco.rs deleted (tree-walker trampoline test;
    VM TCO covered by regression_tco_vm.rs)
  - tests/capability_flags.rs migrated to VM-only (was running every
    test through both runtime::run_with_caps and vm::run_with_caps;
    now just VM, since tree is gone)
  - tests/eval_inline.rs bench-output assertions updated from "Rust
    interpreter" to "Register VM"
  - tests/regression_bench_*.rs bench-engine assertions updated to
    drop the mandatory "tree" engine check
  - runtime::tests inline module deleted (~5400 lines; all exercised
    the tree-walker via run_str)

Net: 6910 lines deleted across src/ and tests/. cargo test --release
--features cranelift green. AOT byte-identity baselines unchanged
(eval-loop deletion didn't shift Cranelift object layout — all the
deletions were inside src/runtime/mod.rs which doesn't reach the JIT
helpers).

ILO-45 closes with this PR.

* fmt: apply rustfmt after Cmd::Trace removal

* fix: clean up orphan doc comments and dead box_muller_normal

* chore: bump ilo-builtins-math/io skill token caps after par-map sync

par-map (#688) added builtin doc content to ilo-builtins-math (1409 vs
cap 1200) and ilo-builtins-io (1903 vs cap 1700), which pushed CI red.
Bump the caps so the eval-loop deletion PR can land; tighten back in a
follow-up trim pass as the existing comment in check-skill-tokens.py
notes.
The prior 0.13.0 entry mentioned the tree-walker drop in one line but
didn't enumerate the 6-PR series, the ACTIVE_CAPS gap fix (real
behaviour change for cap-restricted programs on VM/JIT), or the
deletion of ilo trace + run_with_trace.

Expanded entry now covers:
- All 6 PRs with their specific contributions (opcodes, helpers,
  refactors, line counts)
- The ACTIVE_CAPS thread-local that fixed the bridge caps gap
- ilo trace CLI removal and the ILO-343 follow-up for VM/JIT trace
- ilo serv / REPL / --bench migrations to VM
- Env shape change (two fields: functions + caps)
- call_function role change (builtin-only dispatcher)
- Net ~7900-line deletion across the series

No code change.
* docs: sync getx/pstx to SPEC.md, ai.txt, and skill

Four rows in the SPEC builtins table, paragraph + example in the HTTP
section, regenerated ai.txt via build.rs, plus a 'when to reach for getx
vs get' paragraph in skills/ilo/ilo-builtins-io.md.

Site companion change pushed separately to ilo-lang/site.

* doc: sync ilo test surface across SPEC, ai.txt, skill

SPEC.md gains the CLI invocation line in the inventory plus a full
**ilo test** paragraph next to ilo check, covering engine selection,
the -- engine-skip: passthrough, and the all-pass / any-fail exit
codes. ai.txt gets the matching agent-spec entry inline. The skill's
ilo-agent.md gets a Testing section with the three canonical
invocations so an agent writing tests for its own programs sees the
shape without round-tripping to SPEC.

* doc(spec): note hyphen-vs-subtraction whitespace rule

Pins the lexer's whitespace-sensitive disambiguation in the identifier-
syntax section: `a-b` is always one ident, `a - b` is subtraction.
Mentions the new two-kebab-half hint surfaced by ILO-T004 so agents
reading the spec see the canonical prefix and infix-with-spaces forms.

ai.txt regenerated by build.rs from SPEC.md.

* chore: add b64/hex to SPEC reserved-names section

regression_reserved_names_doc has been failing on main since the
crypto-primitives merge (PR #560 added b64 / b64-dec / hex without
updating the enumerated 3-char reserved list). Adds them so the test
goes green for any branch off main.

ai.txt regenerated.

* docs: ILO-T043 registry entry and SPEC sync

Adds a registry entry reachable via --explain ILO-T043, with the
canonical fix walkthrough (tail-position move, ret-wrap, ?h restructure).
The SPEC.md tail-call rules section now points at the new warning, and
ai.txt, regenerated by build.rs, picks up the SPEC change.

* add rand alias for rnd

* docs: rand alias in SPEC, ai.txt, skill, and CHANGELOG

- SPEC.md builtins table notes that `rnd` returns random, not round,
  with the alias pair `rand`/`random` and a pointer at `rou`/`round`
  for rounding.
- SPEC.md aliases table gains the `rand` -> `rnd` row.
- ai.txt regenerated from SPEC.md via build.rs.
- skills/ilo/ilo-builtins.md math section now spells out the
  round-vs-random trap so agents writing programs through the skill
  see the disambiguation in context.
- CHANGELOG 0.12.1 lists `rand` as an additive ergonomic alias.

* feat: bisect for O(log N) sorted-list search (#5bb)

Add `bisect xs:L n target:n > n` with Python bisect_left semantics: returns
the leftmost index i such that xs[0..i] < target <= xs[i..]. Empty list
returns 0; target greater than every element returns len xs; ties
resolve to the leftmost equal index. NaN target propagates as NaN to
match argmax/argmin policy.
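
The semantics above can be sketched as a plain binary search. This is an illustrative Rust version, not the builtin's source: the function name and the `f64` return (mirroring ilo's single number type) are assumptions.

```rust
/// Leftmost insertion index for `target` in a sorted slice.
/// Sortedness is the caller's responsibility and is not checked.
fn bisect_left(xs: &[f64], target: f64) -> f64 {
    if target.is_nan() {
        return f64::NAN; // NaN target propagates as NaN
    }
    let (mut lo, mut hi) = (0usize, xs.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if xs[mid] < target {
            lo = mid + 1; // everything up to mid is strictly below target
        } else {
            hi = mid; // mid is a candidate; keep searching left for ties
        }
    }
    lo as f64
}
```

For example, `bisect_left(&[1.0, 2.0, 2.0, 3.0], 2.0)` lands on the first of the two equal elements, and a target above every element yields the slice length.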

The asymptotic win is the whole point. The natural recipe in ilo
today is len (flt (x:n>b;< x target) xs) or hd (flt fn (enumerate xs)),
which is O(n) per lookup. interp1d, sorted-lookup, percentile pickers
and histogram binning all need this on inner loops; collapsing the
scan to one builtin call gives O(log n) at the same token cost.
Caller owns the sortedness precondition; we do not validate it,
matching srt/unq/Python bisect precedent.

Tree-bridge eligible alongside the argmax/argmin/argsort index-
returning aggregates: the tree interpreter does the work, VM and
Cranelift inherit through OP_CALL_BUILTIN_TREE at zero opcode cost.
Returns plain n (not Result), so no tree_bridge_returns_result entry
needed. No error propagation needed either: bisect cannot fail on
sorted-numeric input. Type errors at the bridge raise ILO-R009
identically across engines via the normal interpreter arm.

Builtin tag appended last to preserve every existing on-wire tag.

* diagnostic: add FixPlan types and fix_plan field to Diagnostic

Add FixPlan { path, edits: Vec<FixEdit> } and FixEdit { line_start,
line_end, before, after } matching the Zero PR #137 schema. Wire
fix_plan serialization into --json output. No diagnostics emit plans yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* diagnostic: wire fix_plan derivation for T004, T003, T032, L002

Add Diagnostic::derive_fix_plan() which pattern-matches on code and
builds a structured FixPlan from the primary span + source text:

- ILO-T004 / ILO-T003: parse "did you mean 'X'?" from hint, replace span
- ILO-T032: bare fmt/fmt2 → prepend "prnt " before the call
- ILO-L002: underscore ident → hyphenated form from suggestion backticks

Wire enrich() closure in check_cmd to call derive_fix_plan() after
attaching source and diag_path (file-only; absent for inline code).
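
A rough sketch of the T004/T003 path, assuming the hint text contains a single-quoted suggestion. The struct fields follow the FixPlan/FixEdit shape named in the previous commit; the helper names `suggestion_from_hint` and `rename_plan`, and their signatures, are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct FixEdit {
    line_start: usize,
    line_end: usize,
    before: String,
    after: String,
}

#[derive(Debug, PartialEq)]
struct FixPlan {
    path: String,
    edits: Vec<FixEdit>,
}

/// Pull X out of a hint shaped like: did you mean 'X'?
fn suggestion_from_hint(hint: &str) -> Option<&str> {
    let rest = hint.split_once("did you mean '")?.1;
    let (name, _) = rest.split_once('\'')?;
    Some(name)
}

/// Build a one-edit plan replacing the spanned text with the suggestion.
/// Returns None when there is no hint or no suggestion in it.
fn rename_plan(path: &str, line: usize, before: &str, hint: Option<&str>) -> Option<FixPlan> {
    let after = suggestion_from_hint(hint?)?;
    Some(FixPlan {
        path: path.to_string(),
        edits: vec![FixEdit {
            line_start: line,
            line_end: line,
            before: before.to_string(),
            after: after.to_string(),
        }],
    })
}
```

Returning `None` on a missing hint matches the "no plan emitted" behaviour the guard cases in the tests describe.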

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* tests: add fix_plan unit + integration tests for T004, T032, L002, T003

Unit tests in diagnostic::tests::derive_fix_plan_* cover:
- T004 typo rename, T003 type rename, T032 fmt prefix, L002 hyphen
- absent-without-hint and absent-without-source guard cases
- JSON shape (line_range array, before/after keys, path field)

Integration tests in json_output_contracts exercise the full
check --json → NDJSON stderr pipeline for each wired diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: document fix_plan field in JSON_OUTPUT.md and ai.txt

JSON_OUTPUT.md: expand ilo check section with diagnostic NDJSON shape,
optional fix_plan schema (Zero PR#137 style), and table of codes that
emit structured edits.

ai.txt: add [fix_plan] note in ERROR DIAGNOSTICS section describing the
field and which codes populate it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix-plans: wire T008, P011, T041 to derive_fix_plan

Extends derive_fix_plan() with three new code handlers:

- ILO-T008 (return type mismatch): wraps the offending return expression
  with `str`/`num` when the hint identifies a cast; no-ops for non-cast types
- ILO-P011 (reserved keyword as identifier): renames the span token to
  `<name>2` (safe mechanical rename with no semantic ambiguity)
- ILO-T041 (nil-coalesce on Result): rewrites `expr ?? default` to
  `?expr{~v:v;^_:default}` by splitting on the ` ?? ` operator
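
The T041 rewrite can be sketched as a string split, assuming the span text holds exactly one ` ?? ` operator. The function name is hypothetical and the real handler may treat nested or repeated operators differently.

```rust
/// Rewrite `expr ?? default` into a Result match with the same fallback.
/// Returns None when the text has no ` ?? ` operator to split on.
fn rewrite_nil_coalesce(src: &str) -> Option<String> {
    let (expr, default) = src.split_once(" ?? ")?;
    // `{{` and `}}` are escaped braces in the format string.
    Some(format!("?{}{{~v:v;^_:{}}}", expr.trim(), default.trim()))
}
```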

Adds 5 unit tests in src/diagnostic/mod.rs and 4 integration tests in
tests/json_output_contracts.rs exercising the live binary via ilo check --json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(caps): ILO-59 CLI capability flags — allow-net/read/write/run

Add Deno-style `--allow-net`, `--allow-read`, `--allow-write`, `--allow-run`
CLI flags to gate IO builtins at the process level (ILO-59).

- `src/caps.rs` — `Caps` / `Policy` structs, `parse_allow`, `check_net/read/write/run`
  helpers, and 26 unit tests. Denial messages now carry the `ILO-CAP-001`
  structured error code so agents can route on it.
- `src/cli/args.rs` — four `Option<String>` fields on `RunArgs` wired to clap flags
  (`--allow-net`, `--allow-read`, `--allow-write`, `--allow-run`).
- `src/main.rs` — `build_caps()` converts `RunArgs` flags into `Arc<Caps>`;
  `Caps::Permissive` is the default so no existing invocation changes behaviour.
- `src/interpreter/mod.rs` + `src/vm/mod.rs` — `caps.check_*` calls at every
  IO builtin site.
- `tests/capability_flags.rs` — 15 integration tests covering all four
  dimensions across both backends, plus `Caps::parse_allow` round-trips.
- `examples/capability-sandbox.ilo` — runnable demo.
- `SANDBOX.md` — operator guide: flag syntax, matching rules, capability
  matrix, recipes, backwards-compatibility note.
- `ai.txt` + `SPEC.md` — capability matrix and flag reference added.

`Caps::Permissive` is `#[default]`. Without any `--allow-*` flag the runtime is
fully permissive — identical to pre-0.13 behaviour.
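
A simplified sketch of the permissive-by-default model, reduced to two dimensions. Only the names `Caps`, `Policy`, `parse_allow`, `build_caps`, `check_net`, `check_read` and the `ILO-CAP-001` code come from this commit; the variant layout, the comma-separated list format, exact-match comparison, the signatures and the message wording are all assumptions (SANDBOX.md documents the real matching rules).

```rust
/// One capability dimension: nothing allowed, or a listed set.
#[derive(Debug, PartialEq)]
enum Policy {
    Deny,
    Allow(Vec<String>),
}

/// Permissive is the default, so existing invocations keep working.
#[derive(Debug, PartialEq, Default)]
enum Caps {
    #[default]
    Permissive,
    Restricted { net: Policy, read: Policy },
}

/// Assumed flag format: a comma-separated allowlist.
fn parse_allow(list: &str) -> Policy {
    Policy::Allow(
        list.split(',')
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect(),
    )
}

/// No flags at all stays permissive; any flag switches to Restricted,
/// and dimensions without a flag are denied.
fn build_caps(allow_net: Option<&str>, allow_read: Option<&str>) -> Caps {
    if allow_net.is_none() && allow_read.is_none() {
        return Caps::Permissive;
    }
    Caps::Restricted {
        net: allow_net.map(parse_allow).unwrap_or(Policy::Deny),
        read: allow_read.map(parse_allow).unwrap_or(Policy::Deny),
    }
}

fn permits(policy: &Policy, target: &str) -> bool {
    match policy {
        Policy::Deny => false,
        Policy::Allow(items) => items.iter().any(|item| item == target),
    }
}

impl Caps {
    fn check_net(&self, host: &str) -> Result<(), String> {
        match self {
            Caps::Restricted { net, .. } if !permits(net, host) => {
                Err(format!("ILO-CAP-001: net access to {} denied", host))
            }
            _ => Ok(()),
        }
    }

    fn check_read(&self, path: &str) -> Result<(), String> {
        match self {
            Caps::Restricted { read, .. } if !permits(read, path) => {
                Err(format!("ILO-CAP-001: read access to {} denied", path))
            }
            _ => Ok(()),
        }
    }
}
```

Keeping the error code at the front of the message is one way to let agents route on it without parsing the rest of the text.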

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add wro truncating-write builtin as companion to wra

Ship `wro path s` (write-overwrite): truncates the target file before
writing, creating it if missing. Returns `R t t` matching `wr`/`wra`.
Wires enum, name, dispatch, verify, tree-bridge eligibility, example,
regression tests (5 cases), SPEC, ai.txt, and skills/ilo doc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: resolve stdlib hyphen contradiction via Path B (ILO-76)

Principle 1's naming rule said every hyphen doubles token cost, but stdlib
ships ~20 hyphenated builtins. Added an explicit "by design, not
contradiction" callout to both MANIFESTO.md and ai.txt explaining that
stdlib names are a closed memorised vocabulary (same resolution pattern as
the residual-English-keywords note in principle 4). Froze the set: no new
hyphenated builtins will be added, existing names stay without a deprecation
window. No code changes — doc-only resolution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: regenerate ai.txt

* Add ilo trace subcommand for JSON-line value snapshots (ILO-72)

Implements `ilo trace <file.ilo> [func] [args...]` which runs the
tree-walking interpreter and emits one JSON line per statement execution.
Each line carries schemaVersion, line number, source text, all current
bindings, and the statement result value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* chore(aot): audit unsafe in rodata path and add fuzz harness (ILO-64)

Add SAFETY comments to every unsafe block in the AOT .rodata deserialise
path (ilo_aot_publish_program, jit_string_const, ilo_aot_parse_arg,
compile_cranelift OP_LOADK), document invariants in src/aot/README.md,
and wire up a cargo-fuzz target (fuzz/fuzz_targets/rodata_deserialise.rs)
with a nightly CI job (.github/workflows/fuzz.yml). No nightly toolchain
in this environment so the fuzz run is deferred to CI.

Refs ILO-64.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add sha256-hex and sha256d builtins for raw-bytes hashing (ILO-383)

Adds two new crypto builtins that hash hex-decoded bytes rather than
UTF-8 text, enabling pure-ilo Bitcoin Merkle tree computation and other
binary-protocol hashing without shelling out to Python:

- `sha256-hex hex:t > t`: SHA-256 of hex-decoded bytes, lowercase hex.
- `sha256d hex:t > t`: double-SHA256 (sha256(sha256(x))), Bitcoin shape.

Both error ILO-R009 on odd-length or non-hex input. Tree-bridge eligible
(VM and Cranelift JIT inherit via the tree interpreter). Includes cross-
engine regression tests, two examples (sha256-hex.ilo, sha256d-bitcoin.ilo),
and SPEC.md + ai.txt updates.
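
The input validation described above (odd-length or non-hex input is an error) can be sketched with the standard library alone. The function name and the message wording after the code are assumptions; the SHA-256 step itself is omitted because it needs a hashing crate.

```rust
/// Decode a hex string into bytes, rejecting odd-length or non-hex input.
fn decode_hex(hex: &str) -> Result<Vec<u8>, String> {
    if hex.len() % 2 != 0 {
        return Err("ILO-R009: odd-length hex input".to_string());
    }
    hex.as_bytes()
        .chunks(2)
        .map(|pair| {
            // to_digit(16) accepts 0-9, a-f and A-F and rejects everything else.
            let hi = (pair[0] as char).to_digit(16);
            let lo = (pair[1] as char).to_digit(16);
            match (hi, lo) {
                (Some(h), Some(l)) => Ok((h * 16 + l) as u8),
                _ => Err("ILO-R009: non-hex input".to_string()),
            }
        })
        .collect()
}
```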

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(parser): anchor ILO-P003 on orphaned identifier, not on trailing semicolon

When a prefix-binop chain like `*/dt 1 6 var` is parsed, the nested `/`
consumes `dt` and `1`, the outer `*` takes `(/dt 1)` as left and `6` as
right, leaving `var` orphaned at top-level.  `parse_decl` then attempted
to parse `var` as a new function declaration and emitted
`ILO-P003: expected '>', got ';'` anchored on the `;` after `var` — far
from the actual problem site.  This cost the scientific-researcher persona
(pair 23, 2026-05-21 A/B run) three misleading iterations.

Fix: add a guard in `parse_decl` that detects an identifier immediately
followed by `;`/`}`/EOF (impossible as a function header) when at least
one token has already been consumed.  Emits a targeted ILO-P003 anchored
on the orphaned identifier itself with a hint naming the bind-first
workaround (`t=/a b c;*t d`).

Adds `tests/parser_span_eof_drift.rs` with six tests covering the core
reproducer, hint text, single-op variant, and three happy-path controls.
All 3330 lib + 390 integration tests pass.

Closes ILO-378.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* feat(bench): persona-corpus smoke regression harness (ILO-384)

Adds a CI job that runs 13 representative personas (spanning Date/Time,
Numeric, HTTP, IO, Records, Text, and Tools blocks) against Claude Haiku
using the current skill modules, then gates the PR on outcome and mean
generation-token regressions (threshold 15%).

- bench/persona-smoke.txt  – pinned persona list, one slug per line
- scripts/persona-smoke.py – harness: skill-load → Haiku generate → ilo run
  → classify outcome → diff vs baseline JSON; also prints per-module token
  sizes so cap progress is visible in every CI summary
- .github/workflows/persona-smoke.yml – triggers on skills/ilo/** or ai.txt
  changes; manual dispatch supports --baseline refresh with auto-PR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: tighten modular-skill token caps to measured baseline + headroom (ILO-382)

Set per-module token overrides to actual measured size plus ~50-token
headroom (aggressive cap per ILO-382). Reduce aggregate token limit from
15,000 to 12,500. Tighten byte tripwires in tests/skill_modular.rs from
7,000/52,000 to 6,500/42,000 to match current actual usage.

New per-module caps: ilo-builtins-core 1125, ilo-builtins-math 1460,
ilo-builtins-text 1190, ilo-agent 1270, ilo-builtins-io 2000 (already
tight), ilo-language 1700 (already tight). Growth past any cap now
requires an editorial trim or module split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: anonymous record literal shorthand {field:val} without typedef

Adds `{field:val; field2:val2}` syntax for constructing ad-hoc structural
records without a prior `type` declaration. The type checker synthesises
a structural `AnonRecord` type that unifies with other anonymous records
of the same shape.

- Parser: recognises `{ident:...}` in expression position via `is_anon_record_literal` lookahead
- AST: `Expr::AnonRecord { fields }` variant with full desugaring/traversal coverage
- Type checker: `Ty::AnonRecord` structural type with field access, destructure, and `with` update
- Interpreter: evaluates to `Value::Record` with `__anon` type_name
- VM: compiles via `OP_RECNEW` / `OP_RECNEW_EMPTY` with shape-stable synthetic type names
- Codegen: fmt and python backends handle the new variant
- Example: `examples/anon-record.ilo` with 9 passing test cases

Closes ILO-54

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* feat(ILO-377): add `by <step>` syntax for stepped range loops

Extends `@i start..end{body}` with an optional `by <step>` clause so
agents can write `@i 0..n by 2{...}` instead of manual parity filters.

- Lexer: `by` becomes Token::By (stops greedy call-arg parsing cleanly)
- Parser: parse_foreach checks for Token::By after end expr
- AST: ForRange gains `step: Option<Expr>` field
- Interpreter: uses step in while loop; cnt advances by step
- VM bytecode: step_reg evaluated once, used in ADD instead of hard-coded 1
- Verifier: type-checks step; rejects literal zero/negative steps (ILO-V001)
- Codegen (fmt, python): renders `by step` / `range(s,e,step)`
- graph.rs / collect_stmts: walks step expr for call-dependency tracking
- examples/range-step.ilo: three -- run: / -- out: examples for examples_engines
- examples/inline-lambda-capture.ilo: rename `by` param to `amt` (reserved)
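
As a rough model of the loop shape described in the list above, here is a Rust sketch with the step applied in a while loop. It assumes an exclusive end bound (suggested by the `range(s,e,step)` Python rendering) and folds the verifier's zero/negative rejection into a runtime check; the function name and message wording are illustrative.

```rust
/// Collect the indices a stepped range loop would visit.
/// Assumes `end` is exclusive; zero and negative steps are rejected.
fn stepped_range(start: i64, end: i64, step: i64) -> Result<Vec<i64>, String> {
    if step <= 0 {
        return Err("ILO-V001: step must be positive".to_string());
    }
    let mut out = Vec::new();
    let mut i = start;
    while i < end {
        out.push(i);
        i += step; // counter advances by step instead of a hard-coded 1
    }
    Ok(out)
}
```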

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: cargo fmt

* Add Manifesto Principle 6: Structured Compiler-to-Agent Surface

Elevates ilo's structured-output discipline to a first-class principle,
codifying that every CLI subcommand ships --json, every artifact carries
schemaVersion, and every diagnostic has machine-readable fields. Cross-
references ILO-360 (typed fix plans), ILO-363 (provenance + golden),
ILO-364 (closed-loop benchmark) as the implementing work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(manifesto): amend Principle 2 to acknowledge stdlib-depth tension

Adds a paragraph to the Constrained principle explaining that "small
vocabulary" trades against stdlib depth, and codifies a decision
criterion for adding builtins: ≥3 independent persona transcripts,
>40 token workaround cost, no existing composition below threshold.
Links to persona-runs/ab-shared-issues.md as the living gaps register.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(regression): AOT path for fn-body Result unwrap regression (ILO-406)

Extends ILO-53's fn-body multistep regression gate to the AOT engine:
`result_unwrap_mid_body_aot` compiles the same Result-unwrap fixture with
`ilo compile`, runs the native binary, and asserts output == "42".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat(bench): closed-loop benchmark harness for ilo vs Zero per-task economics (ILO-364)

Adds scripts/closed-loop-bench.py: drives a full LLM → compile → repair →
retry loop (default N=5) across 5 canonical tasks, on ilo with Haiku and
Sonnet.  Logs generation tokens, repair tokens per turn, input tokens,
attempts-to-success, wall time, and final outcome.  Outputs a date-stamped
JSON dataset (bench/closed-loop-<date>.json) plus a markdown writeup with the
headline comparison table.  Second-language CLI (Zero) is parameterised via
--lang2-name / --lang2-bin and deferred until Zero is installable in CI.
Skill documentation is loaded once per process (steady-state caching) to match
the zero-gap-specs projection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(caps): ILO-345 --allow-env flag for env-read capability

Add --allow-env CLI flag (Option<String> allowlist) to gate the `env`
and `env-all` builtins. Mirrors the existing --allow-net/read/write/run
pattern from ILO-59: omitting the flag stays permissive; any --allow-*
flag present switches to Caps::Restricted where only listed vars pass.

- caps.rs: add `env: Policy` field to Caps::Restricted; add check_env()
  method (exact-name match; env-all passes "*" sentinel)
- cli/args.rs: add --allow-env=VARS RunArgs field
- main.rs: wire allow_env into build_caps(); update all RunArgs literals
- interpreter/mod.rs: check_env() guard in Builtin::Env and Builtin::EnvAll
- vm/mod.rs: check_env() guard in OP_ENV handler
- tests: caps unit tests, dispatch_run integration tests, example file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(bench): pin CI runner shape and record hardware in results (ILO-348)

- bench.yml: adds a "Collect hardware info and check baseline" step that
  reads /proc/cpuinfo and /proc/meminfo, seeds bench/hw-baseline.json on
  first run, and exits non-zero when the runner shape differs from the
  baseline so polluted results are never committed.
- bench/run.sh: embeds cpu_model, cpu_count, mem_gb into results.json
  under a top-level "hardware" key; falls back to live detection on local
  runs when .hw-info.json is absent.
- bench/results.json: back-fills hardware block for the existing baseline
  run (AMD EPYC 7763, 2-core, 6.8 GB — standard GitHub ubuntu-latest).
- bench/hw-baseline.json: seeds the hardware baseline from that same run.
- Regression check in bench.yml skips comparison when hardware changed
  between the two result files (belt-and-suspenders guard).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: wire bench/results.json into perf table artifact (ILO-347)

Add scripts/gen-perf-table.py which reads bench/results.json and
renders a markdown table to bench/perf-table.md. The generated file
can be embedded into a future site/ page via an include directive.
Commit the initial generated table alongside the script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Refactor parser context flags into a ParseContext struct

Groups `no_whitespace_call` (and any future boolean context flags) into
a `ParseContext` struct with `push_ctx`/`pop_ctx` helpers, replacing the
scattered manual save/restore pairs with a single snapshot+restore call.
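
One way the snapshot-and-restore pattern could look, assuming the context is a small `Copy` struct and the parser keeps a stack of saved contexts. Only `ParseContext`, `no_whitespace_call`, `push_ctx` and `pop_ctx` are named in the commit; the `Parser` fields and method signatures here are hypothetical.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct ParseContext {
    no_whitespace_call: bool,
}

struct Parser {
    ctx: ParseContext,
    saved: Vec<ParseContext>,
}

impl Parser {
    /// Snapshot the current context and switch to `next`.
    fn push_ctx(&mut self, next: ParseContext) {
        self.saved.push(self.ctx);
        self.ctx = next;
    }

    /// Restore the most recent snapshot, if any.
    fn pop_ctx(&mut self) {
        if let Some(prev) = self.saved.pop() {
            self.ctx = prev;
        }
    }
}
```

Because the whole struct is saved, adding another boolean flag later needs no change at the call sites.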

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add closure-heavy regression tests pinning ILO-49 working state

The AArch64 cranelift-jit 0.116 veneer assertion (wasmtime#12239) is not
reproducible on x86_64. Investigation confirms the Phase 2 closure-capture
lift already handles all closure shapes correctly on --jit with no panic
fallback. Adds examples/closure-heavy.ilo as the canonical regression
fixture and tests/regression_closure_heavy_jit.rs with 4 tests asserting
correct output and absence of [ilo:jit-fallback] on all pipeline shapes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ILO-60): module system runtime use resolution MVP

Add named-module import form, module-level privacy enforcement, and
_-prefixed private declaration support.

New features:
- `use alias:"path"` named-module import: public symbols renamed to
  `alias-name` (e.g. `use math:"math-lib"` → `math-dbl`, `math-half`)
- `_name` declaration syntax for module-private functions and types;
  `_` immediately adjacent to ident at declaration head and call sites
- Privacy enforcement: `_`-prefixed names excluded from alias imports
  and blocked in selective `[...]` imports (ILO-P019)
- Cycle detection already present (ILO-P018); flat imports unchanged
- `examples/use-basic.ilo` + `examples/use-basic-helper.ilo`
- SPEC.md and ai.txt updated with full module system documentation
- 9 new unit tests (parser + resolve_imports)

Deferred: re-exports, conditional imports, lazy loading, private-helper
inlining for alias imports (private fns visible in flat imports only).
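
The alias-import renaming rule can be sketched over plain declaration names. The function name and the slice-of-names input are simplifications; the real resolver works on declarations rather than strings.

```rust
/// Names visible through `use alias:"path"`: public symbols get an
/// `alias-` prefix, `_`-prefixed (private) names are excluded.
fn alias_imports(alias: &str, decls: &[&str]) -> Vec<String> {
    decls
        .iter()
        .filter(|name| !name.starts_with('_'))
        .map(|name| format!("{}-{}", alias, name))
        .collect()
}
```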

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt --all

* Add provenance matrix and golden-file diagnostics (ILO-363 Phase 4)

- Add `Phase` enum and `phase` field to `ErrorEntry` in `src/diagnostic/registry.rs`, covering all 88 registry entries; `ilo explain --json` now includes `"phase"` in its output envelope
- Create `conformance/provenance-surface.json` mapping 25 surface features to their compiler function, source path, example fixture, and owned error codes
- Generate `conformance/diagnostics/<CODE>.expected.json` golden files for the top 20 error codes (L001-L003, P001-P005, T001-T009, R001/R003/R005)
- Add `tests/golden_diagnostics.rs` with 22 tests (one per golden file + provenance-surface validity + key-shape guard); runs under `--features golden`; supports `--bless` / `ILO_GOLDEN_BLESS=1` for one-line snapshot updates
- Add `golden` Cargo feature and document the bless workflow in `CONTRIBUTING.md`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* chore: audit tree-bridge builtins for tree_bridge_returns_result gaps (ILO-397)

Cross-referenced every (Builtin, arity) pair in is_tree_bridge_eligible against
the verify.rs BUILTINS signature table. No gaps found: all bridge-eligible
builtins that return an ILO Result type ('R ...') are already present in
tree_bridge_returns_result.

Adds a regression test (vm::tests::tree_bridge_eligible_result_builtins_are_in_returns_result)
that encodes the complete eligible-and-result-returning set and asserts both
directions: every such builtin is in tree_bridge_returns_result, and
tree_bridge_returns_result reports true for each.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: lock ILO-371 Cranelift HOF parity with regression tests

Add tests/regression_cranelift_parity.rs with 9 cross-engine tests
(VM + JIT + AOT) covering the three symptoms from the 2026-05-21
persona dogfood run: grp-nil on AOT with closure-captured key fn,
mset accumulator perf at scale, and main>_ prnt-drop on AOT.

Also add an ILO-371 dispatch contract comment in compile_cranelift.rs
explaining the ACTIVE_PROGRAM TLS requirement for every HOF callback
path, and pointing at the new regression file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat(verify): structural subtyping from anon records into named record types (ILO-367)

An anonymous record `{name:"jane" age:30}` is now accepted wherever a named
record type (e.g. `person`) is expected, provided every field declared on the
named type is present in the anon record with a compatible type. Extra fields
on the anon record are permitted (width subtyping). The check applies at
function call sites and at return-type boundaries.

New helpers: `anon_satisfies_named` and `compatible_ext` (extends `compatible`
with types registry access). Five new unit tests cover: exact match, extra
fields, missing field (rejected), wrong field type (rejected), and return
position.
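
A reduced sketch of the width-subtyping check, assuming fields are `(name, type)` pairs and that "compatible" means equal types. The real `anon_satisfies_named` goes through `compatible_ext` and the types registry; the `Ty` enum and signature here are illustrative only.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Num,
    Text,
}

/// True when every field declared on the named type is present in the
/// anonymous record with the same type. Extra anon fields are allowed.
fn anon_satisfies_named(anon: &[(&str, Ty)], named: &[(&str, Ty)]) -> bool {
    named
        .iter()
        .all(|(field, want)| anon.iter().any(|(f, got)| f == field && got == want))
}
```

The check iterates over the named type's fields rather than the anon record's, which is what lets extra fields pass.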

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat: labelled args in match-subject position (ILO-355)

Extends the labelled-arg resolver from ILO-71 to the `?` match-subject
parser paths in both stmt and expr positions. `?fn lbl:val{arms}` and
`r=?fn lbl:val{arms}` now rewrite to a `Call` subject with positional
args resolved by name, identical to how `fn lbl:val` works in regular
call position.

- `looks_like_labelled_call_match_subject`: lookahead probe detecting
  `ident:value` pairs before `{`; disambiguates type-context colons
  using the same `>` / `:` signal as `peek_labelled_arg_label`.
- `parse_match_stmt` and `parse_match_expr`: new labelled-subject branch
  after the existing positional-call rewrite block; consumes leading
  positional operands then hands off to `resolve_labelled_args`.
- Four parser unit tests covering all-labelled reversed, stmt, expr, and
  mixed-reversed-label shapes.

All 3382 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: ilo trace --depth expr + --watch <name> (ILO-344)

Extends `ilo trace` (ILO-72 / #605) with two new flags:

- `--depth statement|expr` (default: statement) — when `expr` is
  selected, additional JSON-line events are emitted for every function
  call and binary-op sub-expression, interleaved before the parent
  statement event. Each expr event carries `kind:"expr"`, `expr` (source
  text), `refs` (variable names touched), and `result`.

- `--watch <name>` (repeatable) — filters output so that only events
  whose bindings/refs include the named variable are emitted. Works for
  both stmt and expr events; unknown names produce no output cleanly.

Implementation touches:
  - `TraceDepth` value-enum + new fields on `TraceArgs` in cli/args.rs
  - `ExprTraceEvent`, `EXPR_TRACE_HOOK`, `CURRENT_STMT_SPAN` thread-locals
    and `run_with_trace_opts` in interpreter/mod.rs
  - `fire_expr_trace_event`, `collect_refs` helpers in interpreter/mod.rs
  - Call and BinOp arms of `eval_expr` fire expression events
  - `emit_stmt_event` / `emit_expr_event` with watch-filter in cli/trace.rs
  - 5 new integration tests in tests/cli_trace.rs (all 8 pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* Add long-form map-op aliases (map-get/map-set/etc) [ILO-82]

Adds hyphen-form discoverability aliases for all six map builtins to
BUILTIN_ALIASES in src/ast/mod.rs: map-get→mget, map-set→mset,
map-has→mhas, map-del→mdel, map-keys→mkeys, map-values→mvals.
Regression tests in tests/regression_map_op_aliases.rs pin alias
resolution across all alias/canonical combinations.
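
Alias resolution of this kind can be modelled as a lookup in a static pair table. The six pairs below are the ones listed above; the table name `MAP_ALIASES` and the `resolve_alias` helper are hypothetical stand-ins for the real `BUILTIN_ALIASES` machinery.

```rust
/// Long-form alias -> canonical builtin name.
const MAP_ALIASES: &[(&str, &str)] = &[
    ("map-get", "mget"),
    ("map-set", "mset"),
    ("map-has", "mhas"),
    ("map-del", "mdel"),
    ("map-keys", "mkeys"),
    ("map-values", "mvals"),
];

/// Return the canonical name for an alias, or the input unchanged.
fn resolve_alias(name: &str) -> &str {
    MAP_ALIASES
        .iter()
        .find(|(alias, _)| *alias == name)
        .map(|(_, canon)| *canon)
        .unwrap_or(name)
}
```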

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat: add mget-or example and dedicated regression tests (ILO-42)

Add examples/mget-or.ilo demonstrating the mget-or builtin across text-key,
numeric-key, and text-value maps, and tests/regression_mget_or.rs pinning
hit/miss/type-mismatch behaviour across VM and Cranelift engines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* research: Gleam feature audit for 0.13.0 (ILO-37)

Audits Gleam 1.x features against ilo 0.12.x, evaluating each against
the six Manifesto principles. Top absorb candidates: use-style R T E
flattener, typed todo/panic, | alternatives and multi-subject ? match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* fix(verify): treat O _ (Optional Unknown) as opaque in builtin arg checks

jpar! unwraps to _ (Unknown). mget on _ returns O _ (Optional Unknown).
Previously, builtin type checks like len/hd/at/map/flt/srt/rev/zip/mkeys
etc only accepted Ty::Unknown as a "skip check" escape — O _ was a
concrete Optional type and would emit spurious ILO-T013 errors.

Adds is_opaque(ty) helper returning true for _ and O _, and threads it
through every builtin argument type-check arm that previously only passed
Ty::Unknown through. Downstream chains of the form:

  r=jpar! body; v=mget r "items"; len v

now verify cleanly. Real type errors (e.g. mget on a known L t) still fire.
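
The helper's behaviour can be sketched against a cut-down type enum. Only `is_opaque`, `Ty::Unknown` and the `_` / `O _` cases come from the commit; the other variants and the boxed layout are assumptions.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Unknown,
    Num,
    Text,
    List(Box<Ty>),
    Optional(Box<Ty>),
}

/// True for `_` and `O _`: types the verifier cannot check any further.
fn is_opaque(ty: &Ty) -> bool {
    match ty {
        Ty::Unknown => true,
        Ty::Optional(inner) => matches!(**inner, Ty::Unknown),
        _ => false,
    }
}
```

A concrete optional such as `O n` is still not opaque, so real type errors on known types keep firing.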

Regression tests cover the block-validator and fix-plan-emitter persona
shapes from the 2026-05-21 A/B run (ILO-373).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* chore: extract top-10 oversized dispatch arms into #[inline(never)] helpers (ILO-341)

Extracts the 10 largest `if builtin == Some(Builtin::...)` arms from
`call_function` into out-of-line `#[inline(never)]` helper functions,
mirroring the existing `rolling_window_run` / `lstsq_run` / `matvec_run`
pattern introduced in #494 / #506 / #515.

Arms extracted (twelve helpers, listed roughly largest first):
- `ifft_run`          (was 198 lines → arm now 11 lines)
- `matmul_run`        (was 130 lines → arm now 17 lines)
- `rgxall_multi_run`  (was  82 lines → arm now 19 lines)
- `rgxall1_run`       (was  60 lines → arm now 19 lines)
- `rgxall_run`        (was  60 lines → arm now 18 lines)
- `quantile_run`      (was  54 lines → arm now 16 lines)
- `where_run`         (was  60 lines → arm now 19 lines)
- `stdev_run`         (was  43 lines → arm now  9 lines)
- `variance_run`      (was  43 lines → arm now  9 lines)
- `median_run`        (was  43 lines → arm now  9 lines)
- `rgx_run`           (was  42 lines → arm now 18 lines)
- `rgxsub_run`        (was  44 lines → arm now 21 lines)

39 arms remain above the 40-line threshold (tracked in ILO-341 follow-up).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* ci: align with fleet patches

* feat: use<- chain flattener (ILO-409)

Add `<-` bind operator that desugars `x <- expr ; rest` into
`?expr{~x: rest; ^e: ^e}`, flattening multi-step R-T-E chains.
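
To illustrate how the desugaring flattens a chain, here is a text-level sketch that applies the rewrite recursively. The real implementation presumably works on the AST; the function name and the string-splitting approach are assumptions made for illustration.

```rust
/// Desugar `x <- expr ; rest` into `?expr{~x: rest; ^e: ^e}`.
/// `rest` is desugared recursively, so a chain of binds nests.
/// Returns None when the input is not a bind followed by `;`.
fn desugar_bind(stmt: &str) -> Option<String> {
    let (head, rest) = stmt.split_once(';')?;
    let (name, expr) = head.split_once("<-")?;
    let body = desugar_bind(rest).unwrap_or_else(|| rest.trim().to_string());
    Some(format!(
        "?{}{{~{}: {}; ^e: ^e}}",
        expr.trim(),
        name.trim(),
        body
    ))
}
```

Two binds written on one level come out as two nested matches, which is the nesting the operator saves the author from writing by hand.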

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: changelog entry for ILO-409 use-chain flattener

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: cargo fmt --all

---------

Co-authored-by: Daniel Morris <daniel@cubitts.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…687)

* docs: surface get/pst headers, jpth dot-path, text-concat ref

Three doc-only discoverability fixes hit by multiple personas in 2026-05-20
sessions. Canonical signatures in SPEC.md and src/verify.rs were already
correct, but the skill docs weren't front-loading them enough for agents to
find on first read.

- SKILL.md gets a "Quick reference - things agents miss" block covering
  text concatenation (+, fmt, cat), HTTP custom headers (every verb takes
  an optional M t t map), and jpth being dot-path not JSONPath.
- ilo-builtins-io.md HTTP section gets a bold lead on headers and a
  runnable mset/get!/pst! example.
- ilo-builtins-io.md JSON section front-loads the dot-path warning before
  the jpth signature line.
- ilo-builtins-text.md gets a concat/format quick-reference at the top so
  agents reaching for "how do I join two strings" pick + over cat.

Personas: scrapingbee, tui-client (#26b headers), bearer-token-client (#26c
jpth), 5+ across sessions (#26e concat confusion).

* docs: tighten io/text doc additions to fit token budget

Trim the headers/jpth/concat additions to single dense lines so the
budget-checker result is no worse than baseline. Net result: io drops
from 1837 to 1748 (under its previous overage), text returns to ~baseline.

* add rand alias for rnd

* docs: rand alias in SPEC, ai.txt, skill, and CHANGELOG

- SPEC.md builtins table notes that `rnd` returns random, not round,
  with the alias pair `rand`/`random` and a pointer at `rou`/`round`
  for rounding.
- SPEC.md aliases table gains the `rand` -> `rnd` row.
- ai.txt regenerated from SPEC.md via build.rs.
- skills/ilo/ilo-builtins.md math section now spells out the
  round-vs-random trap so agents writing programs through the skill
  see the disambiguation in context.
- CHANGELOG 0.12.1 lists `rand` as an additive ergonomic alias.

* feat: bisect for O(log N) sorted-list search (#5bb)

Add `bisect xs:L n target:n > n` with Python bisect_left semantics: returns
the leftmost index i such that xs[0..i] < target <= xs[i..]. Empty list
returns 0; target greater than every element returns len xs; ties
resolve to the leftmost equal index. NaN target propagates as NaN to
match argmax/argmin policy.

The asymptotic win is the whole point. The natural recipe in ilo
today is len (flt (x:n>b;< x target) xs) or hd (flt fn (enumerate xs)),
which is O(n) per lookup. interp1d, sorted-lookup, percentile pickers
and histogram binning all need this on inner loops; collapsing the
scan to one builtin call gives O(log n) at the same token cost.
Caller owns the sortedness precondition; we do not validate it,
matching srt/unq/Python bisect precedent.

Tree-bridge eligible alongside the argmax/argmin/argsort index-
returning aggregates: the tree interpreter does the work, VM and
Cranelift inherit through OP_CALL_BUILTIN_TREE at zero opcode cost.
Returns plain n (not Result), so no tree_bridge_returns_result entry
needed. No error propagation needed either: bisect cannot fail on
sorted-numeric input. Type errors at the bridge raise ILO-R009
identically across engines via the normal interpreter arm.

Builtin tag appended last to preserve every existing on-wire tag.

* feat: surface stability matrix in ilo spec --json ai and ai.txt (ILO-75)

Adds a `stability` field to the `ilo spec --json ai` JSON envelope listing
stable/provisional/experimental surfaces, cross-linked to STABILITY.md.
Adds a STABILITY line to ai.txt so agents consuming the compact spec see
the stability contract directly. Per-item stability in the spec output is
scoped as a follow-up (the current ai.txt format is a flat blob, not
per-item; restructuring that is L scope).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: per-item stability field in ilo spec --json ai (ILO-340)

Add a `stability()` method to `Builtin` that returns `"provisional"` for
all builtins shipped in 0.12.1 or earlier and `"experimental"` for the
twelve unreleased builtins above 0.12.1 in CHANGELOG.md (matvec, lstsq,
jpar-list, get-to, pst-to, tz-offset, run2, rgxall-multi, fmod,
dtparse-rel, dur-parse, dur-fmt).

`ilo spec --json ai` now emits a `builtins` array where every entry carries
`{name, stability}`, sourced from `Builtin::ALL` via the new method. The
existing top-level `stability` summary is preserved unchanged for backward
compat. SPEC.md gains a `## Stability` section so build.rs regenerates the
matching `STABILITY:` line in ai.txt (fixing the pre-existing divergence
between SPEC.md and ai.txt introduced in PR #598). One new integration test
asserts the builtins array is present, non-empty, and uses only known tiers.
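
A rough sketch of the shape this describes, under stated assumptions: the enum below is a hypothetical four-variant stand-in for the real `Builtin` type, the `name()` helper and `builtins_json` function are illustrative, and treating `rnd`/`rou` as shipped in 0.12.1 or earlier is assumed. The JSON is hand-formatted only to keep the example dependency-free.

```rust
/// Hypothetical subset of the builtin enum, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Builtin {
    Rnd,
    Rou,
    Matvec,
    Fmod,
}

impl Builtin {
    /// Every builtin, in declaration order.
    const ALL: &'static [Builtin] =
        &[Builtin::Rnd, Builtin::Rou, Builtin::Matvec, Builtin::Fmod];

    fn name(self) -> &'static str {
        match self {
            Builtin::Rnd => "rnd",
            Builtin::Rou => "rou",
            Builtin::Matvec => "matvec",
            Builtin::Fmod => "fmod",
        }
    }

    /// Unreleased builtins are "experimental"; everything else is "provisional".
    fn stability(self) -> &'static str {
        match self {
            Builtin::Matvec | Builtin::Fmod => "experimental",
            _ => "provisional",
        }
    }
}

/// Render the per-item `builtins` array as a JSON string.
fn builtins_json() -> String {
    let entries: Vec<String> = Builtin::ALL
        .iter()
        .map(|b| format!(r#"{{"name":"{}","stability":"{}"}}"#, b.name(), b.stability()))
        .collect();
    format!("[{}]", entries.join(","))
}
```

Keeping the tier in a single `match` means a new builtin falls into the default arm unless it is listed explicitly, which is one plausible reason to source the array from `Builtin::ALL` rather than a hand-maintained list.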

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(spec): per-item stability for CLI flags in spec JSON (ILO-350)

Add `CLI_FLAGS` static registry in `src/cli/args.rs` with a `CliFlag`
struct (`name`, `stability`) covering all 34 public long flags.
Emit a `"flags"` array alongside the existing `"builtins"` array in
`ilo spec --json ai`, giving AI consumers the same per-item stability
annotations for CLI flags that ILO-340 / PR #609 added for builtins.

- `CLI_FLAGS: &[CliFlag]` — exhaustive, deduplicated, compile-time constant
- `spec --json ai` response gains `"flags": [{name, stability}, …]`
- New integration test: `spec_json_ai_flags_array_has_stability_fields`
- New unit tests in args.rs: validity, no-duplicates, non-empty
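
The registry and its checks might look roughly like the following. This is a sketch, not the contents of `src/cli/args.rs`: the three entries, their stability tiers, the leading-dash spelling of `name`, and the helper functions `flags_use_known_tiers` and `flags_are_unique` are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// One public long flag and its stability tier.
struct CliFlag {
    name: &'static str,
    stability: &'static str,
}

/// Illustrative three-entry registry; the real list covers 34 flags and
/// the tiers shown here are assumed, not taken from the PR.
static CLI_FLAGS: &[CliFlag] = &[
    CliFlag { name: "--json", stability: "stable" },
    CliFlag { name: "--py", stability: "provisional" },
    CliFlag { name: "--emit", stability: "experimental" },
];

/// Validity check: every entry uses one of the three known tiers.
fn flags_use_known_tiers() -> bool {
    CLI_FLAGS
        .iter()
        .all(|f| matches!(f.stability, "stable" | "provisional" | "experimental"))
}

/// No-duplicates check: `HashSet::insert` returns false on a repeated name.
fn flags_are_unique() -> bool {
    let mut seen = HashSet::new();
    CLI_FLAGS.iter().all(|f| seen.insert(f.name))
}
```

A static slice of plain structs keeps the registry a compile-time constant, so the spec emitter and the unit tests read the same data.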

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt --all

---------

Co-authored-by: Daniel Morris <daniel@cubitts.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@danieljohnmorris danieljohnmorris added the mac-reviewing Currently being merge-prepped by mac-side agent label May 22, 2026
@codecov

codecov Bot commented May 22, 2026

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.
