feature: crypto primitives - sha256, hmac-sha256, base64, hex, ct-eq #494

Closed
danieljohnmorris wants to merge 10 commits into main from feature/crypto-primitives

@danieljohnmorris
Collaborator

Summary

  • Adds 9 crypto builtins validated by batch-1 personas (webhook-sig, jwt, hex-dump workflows)
  • All tree-bridge eligible: VM and Cranelift get them free via existing dispatch path, zero new opcodes
  • Dependencies: sha2, hmac, base64, hex, subtle crates (RustCrypto ecosystem, audited)

Builtins

| Builtin | Sig | Notes |
|---|---|---|
| sha256 | s > t | SHA-256 lowercase hex of UTF-8 bytes |
| hmac-sha256 | key body > t | HMAC-SHA256 lowercase hex |
| base64-enc | s > t | RFC 4648 standard with = padding |
| base64-dec | s > R t t | Err on invalid input |
| base64url-enc | s > t | RFC 4648 url-safe, no padding (JWT) |
| base64url-dec | s > R t t | Err on invalid input |
| hex-enc | bytes:L n > t | 0-255 integer list to lowercase hex |
| hex-dec | s > R (L n) t | Err on odd-length or non-hex |
| ct-eq | a b > b | Constant-time equality via subtle |
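
For readers who want to cross-check the encode/decode rows outside ilo, here is a minimal Python sketch of the same semantics using only the standard library. It is a reference model, not the PR's Rust implementation. The `("ok", value)` / `("err", message)` tuples are a stand-in for ilo's `R` type, and treating non-UTF-8 decoded bytes as an error is an assumption of this sketch.

```python
import base64
import string

# Hypothetical Result encoding for this sketch: ("ok", value) or ("err", message).


def base64_enc(s: str) -> str:
    """RFC 4648 standard alphabet, with '=' padding."""
    return base64.b64encode(s.encode("utf-8")).decode("ascii")


def base64_dec(s: str):
    """Strict decode; invalid input becomes an ("err", ...) pair."""
    try:
        return ("ok", base64.b64decode(s, validate=True).decode("utf-8"))
    except ValueError as exc:  # binascii.Error and UnicodeDecodeError are ValueErrors
        return ("err", str(exc))


def base64url_enc(s: str) -> str:
    """RFC 4648 url-safe alphabet, padding stripped (the JWT convention)."""
    return base64.urlsafe_b64encode(s.encode("utf-8")).decode("ascii").rstrip("=")


def base64url_dec(s: str):
    """Re-add the stripped padding, then decode strictly with the '-_' alphabet."""
    padded = s + "=" * (-len(s) % 4)
    try:
        raw = base64.b64decode(padded, altchars=b"-_", validate=True)
        return ("ok", raw.decode("utf-8"))
    except ValueError as exc:
        return ("err", str(exc))


def hex_enc(byte_values) -> str:
    """List of 0-255 integers to lowercase hex (bytes() raises outside that range)."""
    return bytes(byte_values).hex()


def hex_dec(s: str):
    """Hex text to a list of byte values; accepts upper- and lowercase digits."""
    if len(s) % 2 != 0 or any(c not in string.hexdigits for c in s):
        return ("err", "odd-length or non-hex input")
    return ("ok", list(bytes.fromhex(s)))
```

The explicit digit check in `hex_dec` is there because `bytes.fromhex` on its own tolerates embedded whitespace, which would be looser than the "Err on non-hex" row.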

Repro before/after

Before: no way to sign a webhook or verify an HMAC digest in ilo without shelling out.

After:

verify sig:t body:t>b;expected=hmac-sha256 "secret" body;ct-eq expected sig
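
As a cross-check for the one-liner above, the same verification can be sketched in Python with the standard library. This is an illustrative equivalent, not part of the PR; `hmac.compare_digest` plays the role of `ct-eq`, and the `secret` default simply mirrors the literal in the ilo example.

```python
import hashlib
import hmac


def hmac_sha256(key: str, body: str) -> str:
    """HMAC-SHA256 over UTF-8 text key and body, returned as lowercase hex."""
    return hmac.new(key.encode("utf-8"), body.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(sig: str, body: str, secret: str = "secret") -> bool:
    """Recompute the expected signature and compare without early exit on mismatch."""
    expected = hmac_sha256(secret, body)
    # Compare as bytes so a non-ASCII signature is rejected instead of raising TypeError.
    return hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8"))
```

Comparing with `==` instead would leak how many leading characters matched through timing, which is the reason the PR ships a dedicated constant-time builtin.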

What's in the diff

  1. deps: add sha2, hmac, base64, hex, subtle for crypto primitives
  2. feat: register crypto builtins - Builtin enum, arity, tree-bridge, verifier
  3. feat: implement crypto builtin evaluation in tree interpreter
  4. docs+tests: cross-engine regression tests, example, doc sync for crypto builtins
  5. chore: apply cargo fmt to interpreter and tests

Test plan

  • 36 cross-engine regression tests in tests/regression_crypto_primitives.rs
  • SHA-256 empty string and "abc" (FIPS 180-4 vectors, coreutils verified)
  • HMAC-SHA256 RFC 4231 test case 2 (Jefe/what-do-ya-want) + simple key vector
  • base64 RFC 4648 known vectors (Man/TWFu, Ma/TWE=) + round-trips
  • base64url no-padding and URL-safe alphabet assertion
  • hex round-trips + uppercase input acceptance + error cases
  • ct-eq equal/unequal/different-length/empty + HMAC comparison pattern
  • examples/crypto-primitives.ilo with run/out annotations (passes examples_engines)
  • Full integration test suite passes (36 new + existing 3000+)
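
The published vectors named in the test plan can be reproduced independently. The sketch below checks the two FIPS 180-4 SHA-256 vectors and RFC 4231 test case 2 with Python's standard library; the helper names are hypothetical and this is not the contents of tests/regression_crypto_primitives.rs.

```python
import hashlib
import hmac

# FIPS 180-4 SHA-256 vectors for the empty string and "abc".
SHA256_VECTORS = {
    "": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}

# RFC 4231 test case 2: (key, data, expected HMAC-SHA256 hex).
HMAC_TC2 = (
    "Jefe",
    "what do ya want for nothing?",
    "5bdcc146bf60754e6a042426089575c75a003f089d2739839dec58b964ec3843",
)


def sha256(text: str) -> str:
    """Lowercase hex SHA-256 of the UTF-8 bytes of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hmac_sha256(key: str, body: str) -> str:
    """Lowercase hex HMAC-SHA256 with text key and text body."""
    return hmac.new(key.encode("utf-8"), body.encode("utf-8"), hashlib.sha256).hexdigest()


def check_vectors() -> bool:
    """Return True only if every known vector matches."""
    hashes_ok = all(sha256(msg) == digest for msg, digest in SHA256_VECTORS.items())
    key, data, expected = HMAC_TC2
    return hashes_ok and hmac_sha256(key, data) == expected
```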

Follow-ups

  • hmac-sha256 takes text keys only; binary keys (e.g. derived from hex-dec) would need a hmac-sha256-bytes variant (deferred)
  • sha256-bytes taking L n input would close the "hash raw bytes" gap without a hex round-trip (deferred)

@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 66.32653% with 66 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/interpreter/mod.rs | 60.94% | 66 Missing ⚠️ |

…rifier

Add 9 variants: Sha256, HmacSha256, Base64Enc, Base64Dec, Base64UrlEnc,
Base64UrlDec, HexEnc, HexDec, CtEq. All appended to Builtin::ALL to
preserve on-wire tags. Mark each as tree-bridge eligible in vm/mod.rs;
base64-dec, base64url-dec, hex-dec added to tree_bridge_returns_result.
Add arity entries to verify.rs BUILTINS table.
sha256: SHA-256 hex digest of UTF-8 bytes. hmac-sha256: HMAC-SHA256
lowercase hex (key, body both text). base64-enc/dec: RFC 4648 standard
with padding. base64url-enc/dec: RFC 4648 url-safe no-pad. hex-enc:
list of 0-255 integers to lowercase hex. hex-dec: hex string to list
of byte values, accepts upper/lowercase. ct-eq: constant-time text
equality via subtle::ConstantTimeEq. Decode ops return R t t (base64,
base64url) or R (L n) t (hex).
…to builtins

regression_crypto_primitives.rs: 36 cross-engine tests covering sha256
(empty/abc/hello-world vectors), hmac-sha256 (RFC 4231 test case 2 +
simple vector), base64 round-trips and padding, base64url no-padding
and URL-safe alphabet, hex encode/decode round-trips, ct-eq equal/
unequal/different-length/empty, and the HMAC comparison use-case.

examples/crypto-primitives.ilo: annotated example with run/out assertions
exercising all 9 builtins across tree/VM.

Docs: SPEC.md builtin table + Crypto section, ai.txt entries after
default-on-err, ilo-builtins-text.md Crypto section, CHANGELOG.md.

Compact the Duration and Crypto sections without losing any information
agents actually use. Duration's full-sentence prose trimmed to one-line
descriptions; Crypto example block trimmed to 6 lines. Total 972 tokens
(was 1275), budget passes.

My manual edit to ai.txt placed crypto entries in the wrong position.
build.rs generates ai.txt from the SPEC.md table in document order;
the crypto rows appear after rdjl/get-many, not after default-on-err.
Committing the build.rs output so CI passes.

Compact the Duration and Crypto sections in ilo-builtins-text.md to stay
safely under the 1000-token CI limit. Removed code blocks; content is now
inline prose. Local count: 705 tokens (was 972).

sha256, hmac-sha256, base64-enc/dec, base64url-enc/dec, hex-enc/dec, ct-eq
plus get-to, pst-to, jpar-list, dur-parse, dur-fmt, get-many, env-all,
rgxall, rgxall-multi, rgxall1, rgxsub, lget-or, mget-or, now-ms, default-on-err.
…bug builds

sha2/hmac debug-build frames are larger than the default 2 MiB test
thread stack. The recursive fibonacci test (fib(10)) hits the limit.
Spawn with an 8 MiB thread so the test is robust regardless of which
crypto deps are compiled alongside it.
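
The workaround described here, running a deep recursion on a thread whose stack size is set explicitly, can be illustrated by analogy in Python. The real fix is in the Rust test harness; `run_with_stack` is a hypothetical helper and `threading.stack_size` is only the Python counterpart of the same idea.

```python
import threading


def fib(n: int) -> int:
    """Naive recursive Fibonacci, standing in for the recursive test body."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def run_with_stack(fn, *args, stack_bytes: int = 8 * 1024 * 1024):
    """Run fn(*args) on a fresh thread created with an explicit stack size."""
    result = {}

    def target():
        result["value"] = fn(*args)

    # stack_size() applies to threads created afterwards and returns the old setting.
    previous = threading.stack_size(stack_bytes)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore the process-wide default
    return result["value"]
```

Pinning the size in the test itself makes the result independent of whatever default the runner happens to use, which is the robustness argument the commit message makes.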
@danieljohnmorris danieljohnmorris force-pushed the feature/crypto-primitives branch from ddc0277 to c8fa110 Compare May 20, 2026 20:18
danieljohnmorris added a commit that referenced this pull request May 20, 2026
The first commit added the lstsq arm inline in call_function, which tipped
the giant dispatch frame over the cargo-nextest stack budget. CI hit a
SIGABRT on interpret_braceless_guard_fibonacci - the same recursion-depth
bomb #506 (sha2/hmac) and #494 (caps fields) papered over previously and
that #5av plans to fix structurally by decomposing call_function.

Move the lstsq body out to a standalone #[inline(never)] helper `lstsq_run`
next to lu_decompose / lu_solve. The arm in call_function is now a single
return call, contributing zero frame growth. Local tests pass either way
(default cargo test stack > nextest), so the fix is CI-driven.
@danieljohnmorris
Collaborator Author

Closing to unstick the merge queue. The keep-both rebase strategy can't handle this PR cleanly (it touches the same dispatch table every other PR appends to, producing broken-brace artifacts on rebase). Will reimplement against current main after the rest of the queue drains.

@danieljohnmorris
Collaborator Author

Superseded by a fresh reimplementation on feature/crypto-primitives-v2 — this branch became a keep-both conflict-magnet after main moved. The replacement is a tighter scope (6 builtins instead of 9: sha256, hmac-sha256, b64, b64-dec, hex, ct-eq) and rebuilt against current main from scratch. New PR link incoming.

danieljohnmorris pushed a commit that referenced this pull request May 22, 2026
…elpers (ILO-341)

Extracts 12 of the largest `if builtin == Some(Builtin::...)` arms from
`call_function` into out-of-line `#[inline(never)]` helper functions,
mirroring the existing `rolling_window_run` / `lstsq_run` / `matvec_run`
pattern introduced in #494 / #506 / #515.

Arms extracted (by size, largest first):
- `ifft_run`          (was 198 lines → arm now 11 lines)
- `matmul_run`        (was 130 lines → arm now 17 lines)
- `rgxall_multi_run`  (was  82 lines → arm now 19 lines)
- `rgxall1_run`       (was  60 lines → arm now 19 lines)
- `rgxall_run`        (was  60 lines → arm now 18 lines)
- `quantile_run`      (was  54 lines → arm now 16 lines)
- `where_run`         (was  60 lines → arm now 19 lines)
- `stdev_run`         (was  43 lines → arm now  9 lines)
- `variance_run`      (was  43 lines → arm now  9 lines)
- `median_run`        (was  43 lines → arm now  9 lines)
- `rgx_run`           (was  42 lines → arm now 18 lines)
- `rgxsub_run`        (was  44 lines → arm now 21 lines)

39 arms remain above the 40-line threshold (tracked in ILO-341 follow-up).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
danieljohnmorris added a commit that referenced this pull request May 22, 2026
* docs: sync getx/pstx to SPEC.md, ai.txt, and skill

Four rows in the SPEC builtins table, paragraph + example in the HTTP
section, regenerated ai.txt via build.rs, plus a 'when to reach for getx
vs get' paragraph in skills/ilo/ilo-builtins-io.md.

Site companion change pushed separately to ilo-lang/site.

* doc: sync ilo test surface across SPEC, ai.txt, skill

SPEC.md gains the CLI invocation line in the inventory plus a full
**ilo test** paragraph next to ilo check, covering engine selection,
the -- engine-skip: passthrough, and the all-pass / any-fail exit
codes. ai.txt gets the matching agent-spec entry inline. The skill's
ilo-agent.md gets a Testing section with the three canonical
invocations so an agent writing tests for its own programs sees the
shape without round-tripping to SPEC.

* doc(spec): note hyphen-vs-subtraction whitespace rule

Pins the lexer's whitespace-sensitive disambiguation in the identifier-
syntax section: `a-b` is always one ident, `a - b` is subtraction.
Mentions the new two-kebab-half hint surfaced by ILO-T004 so agents
reading the spec see the canonical prefix and infix-with-spaces forms.

ai.txt regenerated by build.rs from SPEC.md.
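
The whitespace rule can be illustrated with a toy tokenizer. This is a sketch of the disambiguation only, not ilo's lexer: the identifier grammar in the regex (lowercase segments joined by single hyphens) is an assumption, and the three token kinds are hypothetical.

```python
import re

# Assumed toy grammar: kebab-case identifiers, integers, and four operators.
TOKEN = re.compile(
    r"\s*(?:(?P<ident>[a-z][a-z0-9]*(?:-[a-z0-9]+)*)"
    r"|(?P<num>\d+)"
    r"|(?P<op>[-+*/]))"
)


def tokenize(src: str):
    """Split src into (kind, text) pairs; a hyphen glued to an ident stays inside it."""
    src = src.rstrip()
    tokens = []
    pos = 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens
```

With this grammar `tokenize("a-b")` yields a single identifier, while `tokenize("a - b")` yields an identifier, the `-` operator, and a second identifier.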

* chore: add b64/hex to SPEC reserved-names section

regression_reserved_names_doc has been failing on main since the
crypto-primitives merge (PR #560 added b64 / b64-dec / hex without
updating the enumerated 3-char reserved list). Adds them so the test
goes green for any branch off main.

ai.txt regenerated.

* feat: O(n) rolling-window reducers rsum/ravg/rmin

Add three new builtins for sliding-window aggregations over numeric
lists: rsum (running sum), ravg (running mean), and rmin (running
minimum). Output length is len xs - n + 1; empty when n > len xs.

The asymptotic point is the whole point. The natural recipe in ilo
today is map (i:n>n;sum (slc xs i (+ i n))) ..., which is O(n*w) and
explodes for fat windows on long inputs. rsum/ravg use a running-sum
(one add and one subtract per step), and rmin uses a monotonic deque,
giving O(n) amortised total for all three.

Tree-bridge eligible alongside the cumsum/cprod/ewm aggregate family:
the tree interpreter does the work, VM and Cranelift inherit through
OP_CALL_BUILTIN_TREE at zero opcode cost. ILO-R009 propagates on
Cranelift via tree_bridge_propagates_error so the n=0 / negative-n /
non-numeric-element error parity holds across engines.

Builtin tags appended last to preserve every existing on-wire tag.
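
The two O(n) techniques named above (a running sum with one add and one subtract per step, and a monotonic deque for the minimum) can be sketched in Python as follows. This models the described behaviour, not the Rust code. The `(xs, n)` argument order and the use of `ValueError` for the n=0 / negative-n case are assumptions of the sketch.

```python
from collections import deque


def rsum(xs, n):
    """Sliding-window sums; output length is len(xs) - n + 1, empty when n > len(xs)."""
    if n <= 0:
        raise ValueError("window size must be positive")
    if n > len(xs):
        return []
    window = sum(xs[:n])
    out = [window]
    for i in range(n, len(xs)):
        window += xs[i] - xs[i - n]  # one add, one subtract per step
        out.append(window)
    return out


def ravg(xs, n):
    """Sliding-window means, derived from the running sums."""
    return [total / n for total in rsum(xs, n)]


def rmin(xs, n):
    """Sliding-window minima using a deque of indices whose values are increasing."""
    if n <= 0:
        raise ValueError("window size must be positive")
    out = []
    candidates = deque()
    for i, x in enumerate(xs):
        # Drop candidates that can never be a minimum again.
        while candidates and xs[candidates[-1]] >= x:
            candidates.pop()
        candidates.append(i)
        # Drop the front once it has slid out of the window.
        if candidates[0] <= i - n:
            candidates.popleft()
        if i >= n - 1:
            out.append(xs[candidates[0]])
    return out
```

Each index enters and leaves the deque at most once, which is where the amortised O(n) bound for `rmin` comes from.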

* docs: surface get/pst headers, jpth dot-path, text-concat ref

Three doc-only discoverability fixes hit by multiple personas in 2026-05-20
sessions. Canonical signatures in SPEC.md and src/verify.rs were already
correct, but the skill docs weren't front-loading them enough for agents to
find on first read.

- SKILL.md gets a "Quick reference - things agents miss" block covering
  text concatenation (+, fmt, cat), HTTP custom headers (every verb takes
  an optional M t t map), and jpth being dot-path not JSONPath.
- ilo-builtins-io.md HTTP section gets a bold lead on headers and a
  runnable mset/get!/pst! example.
- ilo-builtins-io.md JSON section front-loads the dot-path warning before
  the jpth signature line.
- ilo-builtins-text.md gets a concat/format quick-reference at the top so
  agents reaching for "how do I join two strings" pick + over cat.

Personas: scrapingbee, tui-client (#26b headers), bearer-token-client (#26c
jpth), 5+ across sessions (#26e concat confusion).

* docs: tighten io/text doc additions to fit token budget

Trim the headers/jpth/concat additions to single dense lines so the
budget-checker doesn't get worse than baseline. Net result: io drops
from 1837 to 1748 (under its previous overage), text returns to ~baseline.

* docs: ILO-T043 registry entry and SPEC sync

Adds --explain ILO-T043 reachable entry with the canonical fix walk-
through (tail-position move, ret-wrap, ?h restructure). SPEC.md tail-
call rules section now points at the new warning. ai.txt regenerated
by build.rs picks up the SPEC change.

* add rand alias for rnd

* docs: rand alias in SPEC, ai.txt, skill, and CHANGELOG

- SPEC.md builtins table notes that `rnd` returns random, not round,
  with the alias pair `rand`/`random` and a pointer at `rou`/`round`
  for rounding.
- SPEC.md aliases table gains the `rand` -> `rnd` row.
- ai.txt regenerated from SPEC.md via build.rs.
- skills/ilo/ilo-builtins.md math section now spells out the
  round-vs-random trap so agents writing programs through the skill
  see the disambiguation in context.
- CHANGELOG 0.12.1 lists `rand` as an additive ergonomic alias.

* feat: bisect for O(log N) sorted-list search (#5bb)

Add bisect xs:L n target:n > n, Python bisect_left semantics: returns
the leftmost index i such that xs[0..i] < target <= xs[i..]. Empty list
returns 0; target greater than every element returns len xs; ties
resolve to the leftmost equal index. NaN target propagates as NaN to
match argmax/argmin policy.

The asymptotic point is the whole point. The natural recipe in ilo
today is len (flt (x:n>b;< x target) xs) or hd (flt fn (enumerate xs)),
which is O(n) per lookup. interp1d, sorted-lookup, percentile pickers
and histogram binning all need this on inner loops; collapsing the
scan to one builtin call gives O(log n) at the same token cost.
Caller owns the sortedness precondition; we do not validate it,
matching srt/unq/Python bisect precedent.

Tree-bridge eligible alongside the argmax/argmin/argsort index-
returning aggregates: the tree interpreter does the work, VM and
Cranelift inherit through OP_CALL_BUILTIN_TREE at zero opcode cost.
Returns plain n (not Result), so no tree_bridge_returns_result entry
needed. No error propagation needed either: bisect cannot fail on
sorted-numeric input. Type errors at the bridge raise ILO-R009
identically across engines via the normal interpreter arm.

Builtin tag appended last to preserve every existing on-wire tag.
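
The `bisect_left` semantics described above can be sketched as a plain binary search. This is a Python model under the stated rules (leftmost index, no sortedness validation, NaN target returned as NaN), not the interpreter's implementation.

```python
import math


def bisect(xs, target):
    """Leftmost insertion index for target in a sorted numeric list."""
    if math.isnan(target):
        return math.nan  # NaN target propagates, per the commit message
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1  # everything up to mid is strictly below target
        else:
            hi = mid  # xs[mid] >= target, so the answer is mid or earlier
    return lo
```

The loop halves the `[lo, hi)` interval each iteration, so a lookup costs O(log n) comparisons instead of a full scan.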

* diagnostic: add FixPlan types and fix_plan field to Diagnostic

Add FixPlan { path, edits: Vec<FixEdit> } and FixEdit { line_start,
line_end, before, after } matching the Zero PR #137 schema. Wire
fix_plan serialization into --json output. No diagnostics emit plans yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* diagnostic: wire fix_plan derivation for T004, T003, T032, L002

Add Diagnostic::derive_fix_plan() which pattern-matches on code and
builds a structured FixPlan from the primary span + source text:

- ILO-T004 / ILO-T003: parse "did you mean 'X'?" from hint, replace span
- ILO-T032: bare fmt/fmt2 → prepend "prnt " before the call
- ILO-L002: underscore ident → hyphenated form from suggestion backticks

Wire enrich() closure in check_cmd to call derive_fix_plan() after
attaching source and diag_path (file-only; absent for inline code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
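
To make the T004/T003 derivation concrete, here is a small Python sketch of the idea: pull the suggestion out of a "did you mean 'X'?" hint and turn it into a one-edit plan. The `FixPlan` / `FixEdit` field names come from the earlier commit message; the span representation (1-based line plus column offsets) and the choice to store whole lines in `before` / `after` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FixEdit:
    line_start: int
    line_end: int
    before: str
    after: str


@dataclass
class FixPlan:
    path: str
    edits: List[FixEdit]


HINT_PATTERN = re.compile(r"did you mean '([^']+)'\?")


def derive_rename_plan(path: str, source: str, line: int,
                       col_start: int, col_end: int,
                       hint: Optional[str]) -> Optional[FixPlan]:
    """Build a single-edit plan replacing the span with the hinted name, if any."""
    match = HINT_PATTERN.search(hint or "")
    if match is None:
        return None  # no parseable suggestion, so no plan is emitted
    before = source.splitlines()[line - 1]
    after = before[:col_start] + match.group(1) + before[col_end:]
    return FixPlan(path=path, edits=[FixEdit(line, line, before, after)])
```

Returning `None` when the hint has no suggestion mirrors the "absent-without-hint" guard case that the tests in this series cover.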

* tests: add fix_plan unit + integration tests for T004, T032, L002, T003

Unit tests in diagnostic::tests::derive_fix_plan_* cover:
- T004 typo rename, T003 type rename, T032 fmt prefix, L002 hyphen
- absent-without-hint and absent-without-source guard cases
- JSON shape (line_range array, before/after keys, path field)

Integration tests in json_output_contracts exercise the full
check --json → NDJSON stderr pipeline for each wired diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: document fix_plan field in JSON_OUTPUT.md and ai.txt

JSON_OUTPUT.md: expand ilo check section with diagnostic NDJSON shape,
optional fix_plan schema (Zero PR#137 style), and table of codes that
emit structured edits.

ai.txt: add [fix_plan] note in ERROR DIAGNOSTICS section describing the
field and which codes populate it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix-plans: wire T008, P011, T041 to derive_fix_plan

Extends derive_fix_plan() with three new code handlers:

- ILO-T008 (return type mismatch): wraps the offending return expression
  with `str`/`num` when the hint identifies a cast; no-ops for non-cast types
- ILO-P011 (reserved keyword as identifier): renames the span token to
  `<name>2` (safe mechanical rename with no semantic ambiguity)
- ILO-T041 (nil-coalesce on Result): rewrites `expr ?? default` to
  `?expr{~v:v;^_:default}` by splitting on the ` ?? ` operator

Adds 5 unit tests in src/diagnostic/mod.rs and 4 integration tests in
tests/json_output_contracts.rs exercising the live binary via ilo check --json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
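
Two of these rewrites are purely textual and can be sketched directly. The Python below models the P011 rename and the T041 rewrite as string transforms; the function names are hypothetical, and splitting on the first ` ?? ` occurrence is an assumption about how the real handler treats nested operators.

```python
from typing import Optional


def rename_reserved(name: str) -> str:
    """ILO-P011 style fix: append '2' to a reserved word used as an identifier."""
    return name + "2"


def rewrite_nil_coalesce(expr_src: str) -> Optional[str]:
    """ILO-T041 style fix: turn 'expr ?? default' into '?expr{~v:v;^_:default}'."""
    left, separator, right = expr_src.partition(" ?? ")
    if not separator:
        return None  # no ' ?? ' operator present, nothing to rewrite
    return "?" + left + "{~v:v;^_:" + right + "}"
```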

* feat(caps): ILO-59 CLI capability flags — allow-net/read/write/run

Add Deno-style `--allow-net`, `--allow-read`, `--allow-write`, `--allow-run`
CLI flags to gate IO builtins at the process level (ILO-59).

- `src/caps.rs` — `Caps` / `Policy` structs, `parse_allow`, `check_net/read/write/run`
  helpers, and 26 unit tests. Denial messages now carry the `ILO-CAP-001`
  structured error code so agents can route on it.
- `src/cli/args.rs` — four `Option<String>` fields on `RunArgs` wired to clap flags
  (`--allow-net`, `--allow-read`, `--allow-write`, `--allow-run`).
- `src/main.rs` — `build_caps()` converts `RunArgs` flags into `Arc<Caps>`;
  `Caps::Permissive` is the default so no existing invocation changes behaviour.
- `src/interpreter/mod.rs` + `src/vm/mod.rs` — `caps.check_*` calls at every
  IO builtin site.
- `tests/capability_flags.rs` — 15 integration tests covering all four
  dimensions across both backends, plus `Caps::parse_allow` round-trips.
- `examples/capability-sandbox.ilo` — runnable demo.
- `SANDBOX.md` — operator guide: flag syntax, matching rules, capability
  matrix, recipes, backwards-compatibility note.
- `ai.txt` + `SPEC.md` — capability matrix and flag reference added.

`Caps::Permissive` is `#[default]`. Without any `--allow-*` flag the runtime is
fully permissive — identical to pre-0.13 behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add wro truncating-write builtin as companion to wra

Ship `wro path s` (write-overwrite): truncates the target file before
writing, creating it if missing. Returns `R t t` matching `wr`/`wra`.
Wires enum, name, dispatch, verify, tree-bridge eligibility, example,
regression tests (5 cases), SPEC, ai.txt, and skills/ilo doc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: resolve stdlib hyphen contradiction via Path B (ILO-76)

Principle 1's naming rule said every hyphen doubles token cost, but stdlib
ships ~20 hyphenated builtins. Added an explicit "by design, not
contradiction" callout to both MANIFESTO.md and ai.txt explaining that
stdlib names are a closed memorised vocabulary (same resolution pattern as
the residual-English-keywords note in principle 4). Froze the set: no new
hyphenated builtins will be added, existing names stay without a deprecation
window. No code changes — doc-only resolution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: regenerate ai.txt

* Add ilo trace subcommand for JSON-line value snapshots (ILO-72)

Implements `ilo trace <file.ilo> [func] [args...]` which runs the
tree-walking interpreter and emits one JSON line per statement execution.
Each line carries schemaVersion, line number, source text, all current
bindings, and the statement result value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* chore(aot): audit unsafe in rodata path and add fuzz harness (ILO-64)

Add SAFETY comments to every unsafe block in the AOT .rodata deserialise
path (ilo_aot_publish_program, jit_string_const, ilo_aot_parse_arg,
compile_cranelift OP_LOADK), document invariants in src/aot/README.md,
and wire up a cargo-fuzz target (fuzz/fuzz_targets/rodata_deserialise.rs)
with a nightly CI job (.github/workflows/fuzz.yml). No nightly toolchain
in this environment so the fuzz run is deferred to CI.

Refs ILO-64.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add sha256-hex and sha256d builtins for raw-bytes hashing (ILO-383)

Adds two new crypto builtins that hash hex-decoded bytes rather than
UTF-8 text, enabling pure-ilo Bitcoin Merkle tree computation and other
binary-protocol hashing without shelling out to Python:

- `sha256-hex hex:t > t`: SHA-256 of hex-decoded bytes, lowercase hex.
- `sha256d hex:t > t`: double-SHA256 (sha256(sha256(x))), Bitcoin shape.

Both error ILO-R009 on odd-length or non-hex input. Tree-bridge eligible
(VM and Cranelift JIT inherit via the tree interpreter). Includes cross-
engine regression tests, two examples (sha256-hex.ilo, sha256d-bitcoin.ilo),
and SPEC.md + ai.txt updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
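
The difference between hashing text and hashing hex-decoded bytes is easy to get wrong, so here is a Python reference sketch of the two builtins as described. It assumes `sha256d` hashes the raw 32-byte first digest (the usual Bitcoin convention) and uses `ValueError` in place of ILO-R009.

```python
import hashlib
import string


def _decode_hex(hex_text: str) -> bytes:
    """Strictly decode hex text, rejecting odd-length or non-hex input."""
    if len(hex_text) % 2 != 0 or any(c not in string.hexdigits for c in hex_text):
        raise ValueError("odd-length or non-hex input")
    return bytes.fromhex(hex_text)


def sha256_hex(hex_text: str) -> str:
    """SHA-256 of the hex-decoded bytes, as lowercase hex."""
    return hashlib.sha256(_decode_hex(hex_text)).hexdigest()


def sha256d(hex_text: str) -> str:
    """Double SHA-256: hash the bytes, then hash the raw digest of that hash."""
    first = hashlib.sha256(_decode_hex(hex_text)).digest()
    return hashlib.sha256(first).hexdigest()
```

Under these definitions `sha256d(x)` equals `sha256_hex(sha256_hex(x))`, because the inner call's hex output decodes back to the raw digest bytes.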

* fix(parser): anchor ILO-P003 on orphaned identifier, not on trailing semicolon

When a prefix-binop chain like `*/dt 1 6 var` is parsed, the nested `/`
consumes `dt` and `1`, the outer `*` takes `(/dt 1)` as left and `6` as
right, leaving `var` orphaned at top-level.  `parse_decl` then attempted
to parse `var` as a new function declaration and emitted
`ILO-P003: expected '>', got ';'` anchored on the `;` after `var` — far
from the actual problem site.  This cost the scientific-researcher persona
(pair 23, 2026-05-21 A/B run) three misleading iterations.

Fix: add a guard in `parse_decl` that detects an identifier immediately
followed by `;`/`}`/EOF (impossible as a function header) when at least
one token has already been consumed.  Emits a targeted ILO-P003 anchored
on the orphaned identifier itself with a hint naming the bind-first
workaround (`t=/a b c;*t d`).

Adds `tests/parser_span_eof_drift.rs` with six tests covering the core
reproducer, hint text, single-op variant, and three happy-path controls.
All 3330 lib + 390 integration tests pass.

Closes ILO-378.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* feat(bench): persona-corpus smoke regression harness (ILO-384)

Adds a CI job that runs 13 representative personas (spanning Date/Time,
Numeric, HTTP, IO, Records, Text, and Tools blocks) against Claude Haiku
using the current skill modules, then gates the PR on outcome and mean
generation-token regressions (threshold 15%).

- bench/persona-smoke.txt  – pinned persona list, one slug per line
- scripts/persona-smoke.py – harness: skill-load → Haiku generate → ilo run
  → classify outcome → diff vs baseline JSON; also prints per-module token
  sizes so cap progress is visible in every CI summary
- .github/workflows/persona-smoke.yml – triggers on skills/ilo/** or ai.txt
  changes; manual dispatch supports --baseline refresh with auto-PR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
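
The gating rule can be sketched as a pure function over two result sets. This is not scripts/persona-smoke.py. The record shape (`outcome`, `gen_tokens`), the rule that a pass-to-fail flip counts as an outcome regression, and the comparison of means across all personas are assumptions; only the 15% threshold comes from the commit message.

```python
from statistics import mean


def gate(baseline: dict, current: dict, threshold: float = 0.15) -> list:
    """Return human-readable regression reasons; an empty list means the gate passes."""
    reasons = []

    # Outcome regressions: a persona that passed in the baseline no longer passes.
    for slug, base in baseline.items():
        now = current.get(slug)
        if base["outcome"] == "pass" and (now is None or now["outcome"] != "pass"):
            reasons.append(f"{slug}: outcome regressed")

    # Token regression: mean generation tokens grew by more than the threshold.
    base_mean = mean(r["gen_tokens"] for r in baseline.values())
    cur_mean = mean(r["gen_tokens"] for r in current.values())
    if cur_mean > base_mean * (1 + threshold):
        reasons.append(f"mean generation tokens {base_mean:.0f} -> {cur_mean:.0f}")
    return reasons
```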

* chore: tighten modular-skill token caps to measured baseline + headroom (ILO-382)

Set per-module token overrides to actual measured size plus ~50-token
headroom (aggressive cap per ILO-382). Reduce aggregate token limit from
15,000 to 12,500. Tighten byte tripwires in tests/skill_modular.rs from
7,000/52,000 to 6,500/42,000 to match current actual usage.

New per-module caps: ilo-builtins-core 1125, ilo-builtins-math 1460,
ilo-builtins-text 1190, ilo-agent 1270, ilo-builtins-io 2000 (already
tight), ilo-language 1700 (already tight). Growth past any cap now
requires an editorial trim or module split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: anonymous record literal shorthand {field:val} without typedef

Adds `{field:val; field2:val2}` syntax for constructing ad-hoc structural
records without a prior `type` declaration. The type checker synthesises
a structural `AnonRecord` type that unifies with other anonymous records
of the same shape.

- Parser: recognises `{ident:...}` in expression position via `is_anon_record_literal` lookahead
- AST: `Expr::AnonRecord { fields }` variant with full desugaring/traversal coverage
- Type checker: `Ty::AnonRecord` structural type with field access, destructure, and `with` update
- Interpreter: evaluates to `Value::Record` with `__anon` type_name
- VM: compiles via `OP_RECNEW` / `OP_RECNEW_EMPTY` with shape-stable synthetic type names
- Codegen: fmt and python backends handle the new variant
- Example: `examples/anon-record.ilo` with 9 passing test cases

Closes ILO-54

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt + regenerate ai.txt

* feat(ILO-377): add `by <step>` syntax for stepped range loops

Extends `@i start..end{body}` with an optional `by <step>` clause so
agents can write `@i 0..n by 2{...}` instead of manual parity filters.

- Lexer: `by` becomes Token::By (stops greedy call-arg parsing cleanly)
- Parser: parse_foreach checks for Token::By after end expr
- AST: ForRange gains `step: Option<Expr>` field
- Interpreter: uses step in while loop; cnt advances by step
- VM bytecode: step_reg evaluated once, used in ADD instead of hard-coded 1
- Verifier: type-checks step; rejects literal zero/negative steps (ILO-V001)
- Codegen (fmt, python): renders `by step` / `range(s,e,step)`
- graph.rs / collect_stmts: walks step expr for call-dependency tracking
- examples/range-step.ilo: three -- run: / -- out: examples for examples_engines
- examples/inline-lambda-capture.ilo: rename `by` param to `amt` (reserved)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
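
Since the Python codegen renders the loop as `range(s,e,step)`, the intended iteration order can be modelled as an end-exclusive while loop. The generator below is a sketch of that behaviour. Raising `ValueError` at run time for a non-positive step is an assumption standing in for the verifier's compile-time ILO-V001 rejection of literal steps.

```python
def stepped_range(start: int, end: int, step: int = 1):
    """Yield start, start+step, ... while the counter stays below end."""
    if step <= 0:
        raise ValueError("step must be positive")
    counter = start
    while counter < end:
        yield counter
        counter += step  # advance by step instead of a hard-coded 1
```

For example, `list(stepped_range(0, 7, 2))` visits the even indices below 7, the case the commit message describes as replacing manual parity filters.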

* ci: cargo fmt

* Add Manifesto Principle 6: Structured Compiler-to-Agent Surface

Elevates ilo's structured-output discipline to a first-class principle,
codifying that every CLI subcommand ships --json, every artifact carries
schemaVersion, and every diagnostic has machine-readable fields. Cross-
references ILO-360 (typed fix plans), ILO-363 (provenance + golden),
ILO-364 (closed-loop benchmark) as the implementing work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(manifesto): amend Principle 2 to acknowledge stdlib-depth tension

Adds a paragraph to the Constrained principle explaining that "small
vocabulary" trades against stdlib depth, and codifies a decision
criterion for adding builtins: ≥3 independent persona transcripts,
>40 token workaround cost, no existing composition below threshold.
Links to persona-runs/ab-shared-issues.md as the living gaps register.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(regression): AOT path for fn-body Result unwrap regression (ILO-406)

Extends ILO-53's fn-body multistep regression gate to the AOT engine:
`result_unwrap_mid_body_aot` compiles the same Result-unwrap fixture with
`ilo compile`, runs the native binary, and asserts output == "42".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat(bench): closed-loop benchmark harness for ilo vs Zero per-task economics (ILO-364)

Adds scripts/closed-loop-bench.py: drives a full LLM → compile → repair →
retry loop (default N=5) across 5 canonical tasks, on ilo with Haiku and
Sonnet.  Logs generation tokens, repair tokens per turn, input tokens,
attempts-to-success, wall time, and final outcome.  Outputs a date-stamped
JSON dataset (bench/closed-loop-<date>.json) plus a markdown writeup with the
headline comparison table.  Second-language CLI (Zero) is parameterised via
--lang2-name / --lang2-bin and deferred until Zero is installable in CI.
Skill documentation is loaded once per process (steady-state caching) to match
the zero-gap-specs projection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(caps): ILO-345 --allow-env flag for env-read capability

Add --allow-env CLI flag (Option<String> allowlist) to gate the `env`
and `env-all` builtins. Mirrors the existing --allow-net/read/write/run
pattern from ILO-59: omitting the flag stays permissive; any --allow-*
flag present switches to Caps::Restricted where only listed vars pass.

- caps.rs: add `env: Policy` field to Caps::Restricted; add check_env()
  method (exact-name match; env-all passes "*" sentinel)
- cli/args.rs: add --allow-env=VARS RunArgs field
- main.rs: wire allow_env into build_caps(); update all RunArgs literals
- interpreter/mod.rs: check_env() guard in Builtin::Env and Builtin::EnvAll
- vm/mod.rs: check_env() guard in OP_ENV handler
- tests: caps unit tests, dispatch_run integration tests, example file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(bench): pin CI runner shape and record hardware in results (ILO-348)

- bench.yml: adds a "Collect hardware info and check baseline" step that
  reads /proc/cpuinfo and /proc/meminfo, seeds bench/hw-baseline.json on
  first run, and exits non-zero when the runner shape differs from the
  baseline so polluted results are never committed.
- bench/run.sh: embeds cpu_model, cpu_count, mem_gb into results.json
  under a top-level "hardware" key; falls back to live detection on local
  runs when .hw-info.json is absent.
- bench/results.json: back-fills hardware block for the existing baseline
  run (AMD EPYC 7763, 2-core, 6.8 GB — standard GitHub ubuntu-latest).
- bench/hw-baseline.json: seeds the hardware baseline from that same run.
- Regression check in bench.yml skips comparison when hardware changed
  between the two result files (belt-and-suspenders guard).
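
The seed-or-compare behaviour of the baseline step might look like the sketch below. The real step lives in bench.yml and bench/run.sh; the `Hardware` struct, `same_shape`, `check_baseline`, and the 0.5 GB memory tolerance are all assumptions made for illustration.

```rust
/// Hypothetical mirror of the "hardware" block recorded in results.json.
#[derive(Debug, Clone, PartialEq)]
pub struct Hardware {
    pub cpu_model: String,
    pub cpu_count: u32,
    pub mem_gb: f64,
}

/// Two runners count as the same shape when model and core count match
/// exactly and memory is within a small tolerance (assumed: 0.5 GB).
pub fn same_shape(a: &Hardware, b: &Hardware) -> bool {
    a.cpu_model == b.cpu_model
        && a.cpu_count == b.cpu_count
        && (a.mem_gb - b.mem_gb).abs() < 0.5
}

/// First run (no baseline): seed the baseline from the current runner.
/// Later runs: return the baseline if the shape matches, else an error,
/// which the CI step would turn into a non-zero exit.
pub fn check_baseline(
    baseline: Option<&Hardware>,
    current: &Hardware,
) -> Result<Hardware, String> {
    match baseline {
        None => Ok(current.clone()),
        Some(base) if same_shape(base, current) => Ok(base.clone()),
        Some(base) => Err(format!(
            "runner shape changed: baseline {:?}, current {:?}",
            base, current
        )),
    }
}
```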

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: wire bench/results.json into perf table artifact (ILO-347)

Add scripts/gen-perf-table.py which reads bench/results.json and
renders a markdown table to bench/perf-table.md. The generated file
can be embedded into a future site/ page via an include directive.
Commit the initial generated table alongside the script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Refactor parser context flags into a ParseContext struct

Groups `no_whitespace_call` (and any future boolean context flags) into
a `ParseContext` struct with `push_ctx`/`pop_ctx` helpers, replacing the
scattered manual save/restore pairs with a single snapshot+restore call.
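
A rough sketch of the snapshot+restore shape, assuming `push_ctx` returns the previous context and `pop_ctx` takes it back. The `Parser` stand-in, the `with_no_whitespace_call` wrapper, and both signatures are hypothetical; only `ParseContext`, `no_whitespace_call`, `push_ctx`, and `pop_ctx` are named in the commit.

```rust
/// Context flags grouped in one struct; new boolean flags would be added here.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct ParseContext {
    pub no_whitespace_call: bool,
}

/// Minimal stand-in for the parser, holding only the context.
#[derive(Debug, Default)]
pub struct Parser {
    ctx: ParseContext,
}

impl Parser {
    /// Install a new context and return a snapshot of the previous one.
    pub fn push_ctx(&mut self, new_ctx: ParseContext) -> ParseContext {
        std::mem::replace(&mut self.ctx, new_ctx)
    }

    /// Restore a snapshot previously returned by `push_ctx`.
    pub fn pop_ctx(&mut self, saved: ParseContext) {
        self.ctx = saved;
    }

    /// Read the current context.
    pub fn ctx(&self) -> ParseContext {
        self.ctx
    }

    /// Hypothetical caller: run `f` with whitespace calls disabled,
    /// then restore whatever context was active before.
    pub fn with_no_whitespace_call<T>(&mut self, f: impl FnOnce(&mut Parser) -> T) -> T {
        let next = ParseContext { no_whitespace_call: true, ..self.ctx };
        let saved = self.push_ctx(next);
        let out = f(self);
        self.pop_ctx(saved);
        out
    }
}
```

Because the whole struct is saved and restored, adding a second flag would not require a second save/restore pair at each call site.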

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add closure-heavy regression tests pinning ILO-49 working state

The AArch64 cranelift-jit 0.116 veneer assertion (wasmtime#12239) is not
reproducible on x86_64. Investigation confirms the Phase 2 closure-capture
lift already handles all closure shapes correctly on --jit with no panic
fallback. Adds examples/closure-heavy.ilo as the canonical regression
fixture and tests/regression_closure_heavy_jit.rs with 4 tests asserting
correct output and absence of [ilo:jit-fallback] on all pipeline shapes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ILO-60): module system runtime use resolution MVP

Add named-module import form, module-level privacy enforcement, and
_-prefixed private declaration support.

New features:
- `use alias:"path"` named-module import: public symbols renamed to
  `alias-name` (e.g. `use math:"math-lib"` → `math-dbl`, `math-half`)
- `_name` declaration syntax for module-private functions and types;
  the `_` must be immediately adjacent to the identifier, both at the
  declaration head and at call sites
- Privacy enforcement: `_`-prefixed names excluded from alias imports
  and blocked in selective `[...]` imports (ILO-P019)
- Cycle detection already present (ILO-P018); flat imports unchanged
- `examples/use-basic.ilo` + `examples/use-basic-helper.ilo`
- SPEC.md and ai.txt updated with full module system documentation
- 9 new unit tests (parser + resolve_imports)

Deferred: re-exports, conditional imports, lazy loading, private-helper
inlining for alias imports (private fns visible in flat imports only).
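
The renaming and privacy rules above can be illustrated with a small sketch over plain symbol names. Both functions are hypothetical stand-ins for the real import resolver, and the error strings are placeholders; only the `alias-name` renaming, the `_` prefix rule, and the ILO-P019 code come from the commit.

```rust
/// Names exposed by `use alias:"path"`: public symbols become
/// `alias-name`, and `_`-prefixed (private) symbols are skipped.
pub fn alias_import(alias: &str, module_symbols: &[&str]) -> Vec<String> {
    module_symbols
        .iter()
        .filter(|name| !name.starts_with('_'))
        .map(|name| format!("{}-{}", alias, name))
        .collect()
}

/// Selective `[...]` import: asking for a private name is rejected
/// (reported as ILO-P019 in the commit; the message text is made up).
pub fn selective_import(
    requested: &[&str],
    module_symbols: &[&str],
) -> Result<Vec<String>, String> {
    let mut out = Vec::new();
    for name in requested {
        if name.starts_with('_') {
            return Err(format!("ILO-P019: `{}` is private", name));
        }
        if !module_symbols.contains(name) {
            return Err(format!("unknown symbol `{}`", name));
        }
        out.push(name.to_string());
    }
    Ok(out)
}
```

With a module exporting `dbl`, `half`, and `_helper`, `alias_import("math", ...)` yields `math-dbl` and `math-half`, matching the example in the commit.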

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: cargo fmt --all

* Add provenance matrix and golden-file diagnostics (ILO-363 Phase 4)

- Add `Phase` enum and `phase` field to `ErrorEntry` in `src/diagnostic/registry.rs`, covering all 88 registry entries; `ilo explain --json` now includes `"phase"` in its output envelope
- Create `conformance/provenance-surface.json` mapping 25 surface features to their compiler function, source path, example fixture, and owned error codes
- Generate `conformance/diagnostics/<CODE>.expected.json` golden files for the top 20 error codes (L001-L003, P001-P005, T001-T009, R001/R003/R005)
- Add `tests/golden_diagnostics.rs` with 22 tests (one per golden file + provenance-surface validity + key-shape guard); runs under `--features golden`; supports `--bless` / `ILO_GOLDEN_BLESS=1` for one-line snapshot updates
- Add `golden` Cargo feature and document the bless workflow in `CONTRIBUTING.md`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* chore: audit tree-bridge builtins for tree_bridge_returns_result gaps (ILO-397)

Cross-referenced every (Builtin, arity) pair in is_tree_bridge_eligible against
the verify.rs BUILTINS signature table. No gaps found: all bridge-eligible
builtins that return an ILO Result type ('R ...') are already present in
tree_bridge_returns_result.

Adds a regression test (vm::tests::tree_bridge_eligible_result_builtins_are_in_returns_result)
that encodes the complete eligible-and-result-returning set and asserts both
directions: every such builtin is in tree_bridge_returns_result, and
tree_bridge_returns_result reports true for each.
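
The cross-reference can be pictured as a filter over signature rows. This sketch assumes a simplified row type and treats a return type starting with `R` as a Result; `BuiltinSig`, `returns_result`, and `audit_gaps` are hypothetical names, not the functions in vm or verify.rs.

```rust
/// Hypothetical row: builtin name, arity, and return type as signature text.
pub struct BuiltinSig {
    pub name: &'static str,
    pub arity: usize,
    pub ret: &'static str,
}

/// A signature returns a Result when its return type is `R ...`.
pub fn returns_result(sig: &BuiltinSig) -> bool {
    sig.ret == "R" || sig.ret.starts_with("R ")
}

/// Names of bridge-eligible, Result-returning builtins that are missing
/// from the (name, arity) table. An empty vector means no gaps.
pub fn audit_gaps(eligible: &[BuiltinSig], table: &[(&str, usize)]) -> Vec<String> {
    eligible
        .iter()
        .filter(|sig| returns_result(sig))
        .filter(|sig| !table.contains(&(sig.name, sig.arity)))
        .map(|sig| sig.name.to_string())
        .collect()
}
```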

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: lock ILO-371 Cranelift HOF parity with regression tests

Add tests/regression_cranelift_parity.rs with 9 cross-engine tests
(VM + JIT + AOT) covering the three symptoms from the 2026-05-21
persona dogfood run: grp-nil on AOT with closure-captured key fn,
mset accumulator perf at scale, and main>_ prnt-drop on AOT.

Also add an ILO-371 dispatch contract comment in compile_cranelift.rs
explaining the ACTIVE_PROGRAM TLS requirement for every HOF callback
path, and pointing at the new regression file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat(verify): structural subtyping from anon records into named record types (ILO-367)

An anonymous record `{name:"jane" age:30}` is now accepted wherever a named
record type (e.g. `person`) is expected, provided every field declared on the
named type is present in the anon record with a compatible type. Extra fields
on the anon record are permitted (width subtyping). The check applies at
function call sites and at return-type boundaries.

New helpers: `anon_satisfies_named` and `compatible_ext` (extends `compatible`
with types registry access). Five new unit tests cover: exact match, extra
fields, missing field (rejected), wrong field type (rejected), and return
position.
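
A minimal sketch of the width-subtyping rule, assuming a tiny type language and treating "compatible" as plain equality. The real `anon_satisfies_named` and `compatible_ext` work on the verifier's own `Ty` and types registry; the `Fields` alias and `fields` helper here are illustrative only.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in type language; the verifier's real `Ty` is richer.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Num,
    Text,
    Bool,
}

/// Field name -> field type.
pub type Fields = BTreeMap<String, Ty>;

/// Every field declared on the named type must exist in the anonymous
/// record with a compatible type; extra anonymous fields are ignored.
/// "Compatible" is simplified to equality in this sketch.
pub fn anon_satisfies_named(anon: &Fields, named: &Fields) -> bool {
    named
        .iter()
        .all(|(field, want)| anon.get(field).map_or(false, |got| got == want))
}

/// Helper for building field maps in examples.
pub fn fields(pairs: &[(&str, Ty)]) -> Fields {
    pairs.iter().map(|(k, t)| (k.to_string(), t.clone())).collect()
}
```

The four call-site cases listed in the commit (exact match, extra fields, missing field, wrong field type) map directly onto this predicate: the first two return true, the last two false.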

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat: labelled args in match-subject position (ILO-355)

Extends the labelled-arg resolver from ILO-71 to the `?` match-subject
parser paths in both stmt and expr positions. `?fn lbl:val{arms}` and
`r=?fn lbl:val{arms}` now rewrite to a `Call` subject with positional
args resolved by name, identical to how `fn lbl:val` works in regular
call position.

- `looks_like_labelled_call_match_subject`: lookahead probe detecting
  `ident:value` pairs before `{`; disambiguates type-context colons
  using the same `>` / `:` signal as `peek_labelled_arg_label`.
- `parse_match_stmt` and `parse_match_expr`: new labelled-subject branch
  after the existing positional-call rewrite block; consumes leading
  positional operands then hands off to `resolve_labelled_args`.
- Four parser unit tests covering all-labelled reversed, stmt, expr, and
  mixed-reversed-label shapes.

All 3382 tests pass.
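
The by-name resolution step can be sketched as below. `resolve_by_label` is a hypothetical simplification over strings, not the parser's `resolve_labelled_args`: leading positional operands fill the first parameters in order, then labelled pairs fill the remaining slots by name.

```rust
/// Resolve call arguments against a parameter list and return them in
/// positional order. Errors on unknown labels, duplicates, or gaps.
pub fn resolve_by_label(
    params: &[&str],
    positional: &[&str],
    labelled: &[(&str, &str)],
) -> Result<Vec<String>, String> {
    if positional.len() > params.len() {
        return Err("too many positional arguments".to_string());
    }
    let mut slots: Vec<Option<String>> = vec![None; params.len()];
    // Leading positional operands fill the first slots in order.
    for (i, value) in positional.iter().enumerate() {
        slots[i] = Some(value.to_string());
    }
    // Labelled operands fill the slot whose parameter name matches.
    for (label, value) in labelled {
        let idx = params
            .iter()
            .position(|p| p == label)
            .ok_or_else(|| format!("unknown label `{}`", label))?;
        if slots[idx].is_some() {
            return Err(format!("argument `{}` given twice", label));
        }
        slots[idx] = Some(value.to_string());
    }
    // Every parameter must have received a value.
    slots
        .into_iter()
        .enumerate()
        .map(|(i, s)| s.ok_or_else(|| format!("missing argument `{}`", params[i])))
        .collect()
}
```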

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: ilo trace --depth expr + --watch <name> (ILO-344)

Extends `ilo trace` (ILO-72 / #605) with two new flags:

- `--depth statement|expr` (default: statement) — when `expr` is
  selected, additional JSON-line events are emitted for every function
  call and binary-op sub-expression, interleaved before the parent
  statement event. Each expr event carries `kind:"expr"`, `expr` (source
  text), `refs` (variable names touched), and `result`.

- `--watch <name>` (repeatable) — filters output so that only events
  whose bindings/refs include the named variable are emitted. Works for
  both stmt and expr events; unknown names produce no output cleanly.

Implementation touches:
  - `TraceDepth` value-enum + new fields on `TraceArgs` in cli/args.rs
  - `ExprTraceEvent`, `EXPR_TRACE_HOOK`, `CURRENT_STMT_SPAN` thread-locals
    and `run_with_trace_opts` in interpreter/mod.rs
  - `fire_expr_trace_event`, `collect_refs` helpers in interpreter/mod.rs
  - Call and BinOp arms of `eval_expr` fire expression events
  - `emit_stmt_event` / `emit_expr_event` with watch-filter in cli/trace.rs
  - 5 new integration tests in tests/cli_trace.rs (all 8 pass)
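
The `--watch` filter can be pictured as a predicate over event names. This is a sketch under assumptions: `TraceEvent` is a hypothetical flattening of the stmt and expr events (bindings for statements, refs for expressions), and an empty watch list is taken to mean "no filtering".

```rust
/// Whether an event came from a statement or a sub-expression.
#[derive(Debug, Clone, PartialEq)]
pub enum EventKind {
    Stmt,
    Expr,
}

/// Hypothetical flattened event: `names` holds bindings for statement
/// events and refs for expression events.
#[derive(Debug, Clone, PartialEq)]
pub struct TraceEvent {
    pub kind: EventKind,
    pub names: Vec<String>,
}

impl TraceEvent {
    pub fn new(kind: EventKind, names: &[&str]) -> TraceEvent {
        TraceEvent { kind, names: names.iter().map(|s| s.to_string()).collect() }
    }
}

/// With no watch names every event passes; otherwise the event must
/// mention at least one watched name.
pub fn should_emit(event: &TraceEvent, watch: &[String]) -> bool {
    watch.is_empty() || event.names.iter().any(|n| watch.contains(n))
}

/// Keep only the events that pass the watch filter, preserving order.
pub fn filter_events(events: &[TraceEvent], watch: &[String]) -> Vec<TraceEvent> {
    events.iter().filter(|e| should_emit(e, watch)).cloned().collect()
}
```

An unknown watch name simply matches nothing, which is consistent with the "no output" behaviour described above.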

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* Add long-form map-op aliases (map-get/map-set/etc) [ILO-82]

Adds hyphen-form discoverability aliases for all six map builtins to
BUILTIN_ALIASES in src/ast/mod.rs: map-get→mget, map-set→mset,
map-has→mhas, map-del→mdel, map-keys→mkeys, map-values→mvals.
Regression tests in tests/regression_map_op_aliases.rs pin alias
resolution across all alias/canonical combinations.
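
As an illustration of alias resolution, the six pairs above can be held in a lookup table. The table shape and the `canonical_name` helper are assumptions; the actual BUILTIN_ALIASES in src/ast/mod.rs may be structured differently.

```rust
/// Alias -> canonical builtin name, using the six pairs from the commit.
pub const BUILTIN_ALIASES: &[(&str, &str)] = &[
    ("map-get", "mget"),
    ("map-set", "mset"),
    ("map-has", "mhas"),
    ("map-del", "mdel"),
    ("map-keys", "mkeys"),
    ("map-values", "mvals"),
];

/// Resolve an alias to its canonical name; any other name passes through.
pub fn canonical_name(name: &str) -> &str {
    BUILTIN_ALIASES
        .iter()
        .find(|(alias, _)| *alias == name)
        .map(|(_, canon)| *canon)
        .unwrap_or(name)
}
```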

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* feat: add mget-or example and dedicated regression tests (ILO-42)

Add examples/mget-or.ilo demonstrating the mget-or builtin across text-key,
numeric-key, and text-value maps, and tests/regression_mget_or.rs pinning
hit/miss/type-mismatch behaviour across VM and Cranelift engines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* research: Gleam feature audit for 0.13.0 (ILO-37)

Audits Gleam 1.x features against ilo 0.12.x, evaluating each against
the six Manifesto principles. Top absorb candidates: use-style R T E
flattener, typed todo/panic, `|` alternatives and multi-subject `?` match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* fix(verify): treat O _ (Optional Unknown) as opaque in builtin arg checks

jpar! unwraps to _ (Unknown). mget on _ returns O _ (Optional Unknown).
Previously, builtin type checks like len/hd/at/map/flt/srt/rev/zip/mkeys
etc only accepted Ty::Unknown as a "skip check" escape — O _ was a
concrete Optional type and would emit spurious ILO-T013 errors.

Adds is_opaque(ty) helper returning true for _ and O _, and threads it
through every builtin argument type-check arm that previously only passed
Ty::Unknown through. Downstream chains of the form:

  r=jpar! body; v=mget r "items"; len v

now verify cleanly. Real type errors (e.g. mget on a known L t) still fire.

Regression tests cover the block-validator and fix-plan-emitter persona
shapes from the 2026-05-21 A/B run (ILO-373).
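
A minimal sketch of the `is_opaque` idea, assuming a reduced `Ty` enum. The `check_list_arg` function is a hypothetical stand-in for one builtin argument check, and its error text is a placeholder; only `is_opaque`, `Ty::Unknown`, and the ILO-T013 code come from the commit.

```rust
/// Reduced stand-in for the verifier's type enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Unknown,           // `_`
    Num,
    Text,
    List(Box<Ty>),     // `L t`
    Optional(Box<Ty>), // `O t`
}

/// True for `_` and for `O _`; every other type is concrete.
pub fn is_opaque(ty: &Ty) -> bool {
    match ty {
        Ty::Unknown => true,
        Ty::Optional(inner) => matches!(**inner, Ty::Unknown),
        _ => false,
    }
}

/// Hypothetical argument check for a list-taking builtin: lists pass,
/// opaque types skip the check, anything else is a type error.
pub fn check_list_arg(ty: &Ty) -> Result<(), String> {
    match ty {
        Ty::List(_) => Ok(()),
        t if is_opaque(t) => Ok(()),
        other => Err(format!("ILO-T013: expected a list, got {:?}", other)),
    }
}
```

Note that `O t` with a known inner type is still rejected, so the escape only widens by exactly the `O _` case.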

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* chore: extract top-10 oversized dispatch arms into #[inline(never)] helpers (ILO-341)

Extracts 12 of the largest `if builtin == Some(Builtin::...)` arms from
`call_function` into out-of-line `#[inline(never)]` helper functions,
mirroring the existing `rolling_window_run` / `lstsq_run` / `matvec_run`
pattern introduced in #494 / #506 / #515.

Arms extracted (by original size, largest first):
- `ifft_run`          (was 198 lines → arm now 11 lines)
- `matmul_run`        (was 130 lines → arm now 17 lines)
- `rgxall_multi_run`  (was  82 lines → arm now 19 lines)
- `rgxall1_run`       (was  60 lines → arm now 19 lines)
- `rgxall_run`        (was  60 lines → arm now 18 lines)
- `where_run`         (was  60 lines → arm now 19 lines)
- `quantile_run`      (was  54 lines → arm now 16 lines)
- `rgxsub_run`        (was  44 lines → arm now 21 lines)
- `stdev_run`         (was  43 lines → arm now  9 lines)
- `variance_run`      (was  43 lines → arm now  9 lines)
- `median_run`        (was  43 lines → arm now  9 lines)
- `rgx_run`           (was  42 lines → arm now 18 lines)

39 arms remain above the 40-line threshold (tracked in ILO-341 follow-up).
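
The extraction pattern looks roughly like this. The sketch assumes a much-reduced `call_function` over `f64` slices; the `Builtin` variants, signatures, and error strings are illustrative and not the interpreter's real ones.

```rust
/// Reduced stand-in for the interpreter's builtin enum.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Builtin {
    Median,
    Len,
}

/// Out-of-line helper: the large body lives here, so it is not inlined
/// back into the dispatch function.
#[inline(never)]
fn median_run(args: &[f64]) -> Result<f64, String> {
    if args.is_empty() {
        return Err("median of empty list".to_string());
    }
    let mut sorted = args.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Ok(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Dispatch: each extracted arm shrinks to a short call into its helper.
pub fn call_function(builtin: Option<Builtin>, args: &[f64]) -> Result<f64, String> {
    if builtin == Some(Builtin::Median) {
        return median_run(args);
    }
    if builtin == Some(Builtin::Len) {
        return Ok(args.len() as f64);
    }
    Err("unknown builtin".to_string())
}
```

Small arms such as `Len` stay inline; only bodies large enough to bloat the dispatch function are moved out.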

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: align with fleet patches

* ci: align with fleet patches

* feat: use<- chain flattener (ILO-409)

Add `<-` bind operator that desugars `x <- expr ; rest` into
`?expr{~x: rest; ^e: ^e}`, flattening multi-step R-T-E chains.
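
To make the rewrite concrete, here is a text-level sketch of the desugaring. The real transform presumably operates on the AST; `desugar_bind` is a hypothetical helper that splits on the first `;` and `<-` and does not handle those characters inside string literals.

```rust
/// Rewrite `x <- expr ; rest` into `?expr{~x: rest; ^e: ^e}`, applying
/// the same rewrite recursively to `rest`. Statements without `<-` are
/// passed through unchanged.
pub fn desugar_bind(src: &str) -> String {
    // Split off the first statement; a lone final statement is returned as-is.
    let (head, rest) = match src.split_once(';') {
        Some((h, r)) => (h, r),
        None => return src.trim().to_string(),
    };
    match head.split_once("<-") {
        // Bind statement: the remainder becomes the body of the Ok arm.
        Some((name, expr)) => format!(
            "?{}{{~{}: {}; ^e: ^e}}",
            expr.trim(),
            name.trim(),
            desugar_bind(rest)
        ),
        // Ordinary statement: keep it and continue with the remainder.
        None => format!("{}; {}", head.trim(), desugar_bind(rest)),
    }
}
```

For example, `x <- f a; y <- g x; h y` becomes `?f a{~x: ?g x{~y: h y; ^e: ^e}; ^e: ^e}`, which shows how each bind nests the remainder of the chain inside its Ok arm while errors propagate outward.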

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: changelog entry for ILO-409 use-chain flattener

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: cargo fmt --all

---------

Co-authored-by: Daniel Morris <daniel@cubitts.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>