From a89572cb5f339daa2c3f2a9ecbcf4b6cdaf17f8c Mon Sep 17 00:00:00 2001
From: Daniel Morris
Date: Fri, 22 May 2026 00:33:30 +0100
Subject: [PATCH 1/3] feat: anonymous record literal shorthand {field:val} without typedef

Adds `{field:val field2:val2}` syntax for constructing ad-hoc structural records without a prior `type` declaration. The type checker synthesises a structural `AnonRecord` type that unifies with other anonymous records of the same shape.

- Parser: recognises `{ident:...}` in expression position via `is_anon_record_literal` lookahead
- AST: `Expr::AnonRecord { fields }` variant with full desugaring/traversal coverage
- Type checker: `Ty::AnonRecord` structural type with field access, destructure, and `with` update
- Interpreter: evaluates to `Value::Record` with `__anon` type_name
- VM: compiles via `OP_RECNEW` / `OP_RECNEW_EMPTY` with shape-stable synthetic type names
- Codegen: fmt and python backends handle the new variant
- Example: `examples/anon-record.ilo` with 9 passing test cases

Closes ILO-54

Co-Authored-By: Claude Sonnet 4.6
---
 ai.txt                   |   1 -
 examples/anon-record.ilo |  75 ++++++++++++++++++++++++++
 src/ast/mod.rs           | 111 ++++++++++++++++++++++++++++++++++++++-
 src/codegen/fmt.rs       |   7 +++
 src/codegen/python.rs    |  15 +++++-
 src/graph.rs             |   5 ++
 src/interpreter/mod.rs   |  14 ++++-
 src/parser/mod.rs        |  55 ++++++++++++++++++-
 src/verify.rs            |  86 ++++++++++++++++++++++++++++++
 src/vm/mod.rs            |  75 ++++++++++++++++++++++++++
 10 files changed, 436 insertions(+), 8 deletions(-)
 create mode 100644 examples/anon-record.ilo

diff --git a/ai.txt b/ai.txt
index 987e496d..0b1813fe 100644
--- a/ai.txt
+++ b/ai.txt
@@ -19,4 +19,3 @@ PATTERNS (FOR LLM GENERATORS): [Bind-first pattern] Always bind complex expressi
 ERROR DIAGNOSTICS: ilo verifies programs before execution and reports errors with stable codes, source context, and suggestions. [Error codes] Every error has a stable `ILO-` code.
The letter is the namespace - the phase that raised the diagnostic - so agents and tools can route on prefix without parsing the message. Numeric ranges are reserved per namespace with generous gaps, so future codes slot in cleanly and the contract is forward-compatible. `ILO-L000-099`=L=Lexer / tokenisation=active `ILO-P100-199`=P=Parser / syntax=active `ILO-N200-299`=N=Names / resolution=reserved `ILO-I300-399`=I=Imports=reserved `ILO-T400-499`=T=Types=active `ILO-V500-599`=V=Verifier (post-type checks)=reserved `ILO-R600-699`=R=Runtime=active `ILO-D700-799`=D=Deprecation warnings=reserved `ILO-E800-899`=E=Engine-specific limitations=reserved `ILO-S900-999`=S=Skill / spec system=reserved **Historical codes.** ilo shipped with flat numbering inside each namespace - `ILO-L001`, `ILO-P001`, `ILO-T001`, `ILO-R001`, `ILO-W001`, all starting at 001. Those codes remain valid forever. The hundreds-block allocation above applies to new codes from now on, and a cross-engine regression test asserts every emitted code lives in a documented range. **Reserved namespaces.** `N`, `I`, `V`, `D`, `E`, `S` carry no codes today. They are forward declarations so the first code in each category slots into its own range without conflicting with the active namespaces. `D` is earmarked for deprecation warnings: when a feature is scheduled for removal it emits an `ILO-D7xx` warning at compile time without failing the build. Use `--explain` to see a detailed explanation: ilo --explain ILO-T004 [Source context] Errors point at the relevant source location with a caret: error[ILO-T005]: undefined function 'foo' (called with 1 args) --> 1:9 1 | f x:n>n;foo x = note: in function 'f' = suggestion: did you mean 'f'? Parser, verifier, and runtime errors all show source spans. The verifier uses the enclosing statement span as the best available location for expression-level errors. 
[Suggestions] The verifier provides context-aware hints: **Did you mean?** - Levenshtein-based suggestions for undefined variables, functions, fields, and types **Type conversion** - suggests `str` for n→t, `num` for t→n **Missing arms** - lists uncovered match patterns with types **Arity** - shows expected parameter signature [Error output formats] --ansi / -a ANSI colour (default for TTY) --text / -t Plain text (no colour) --json / -j JSON (default for piped output) --no-hints / -nh Suppress idiomatic hints --silent / -s Suppress program stdout (mainly for --bench; see below) NO_COLOR=1 Disable colour (same as --text) **`--silent` / `-s`.** Suppresses the program's own stdout (`prnt`, `prnv`, `jprn`, etc.) for the duration of execution. Designed for `ilo --bench`: combined with `--json` it lets agent harnesses (e.g. persona cost rollup) consume the bench JSON envelope on stdout without it being drowned in the benchmarked function's own output. Stderr is never silenced, so genuine errors still surface. Diagnostic output (including the bench JSON envelope and the human-readable bench summary block) is always emitted on stdout regardless of `--silent` — the flag only redirects program-level prints. Unix only (no-op on Windows for the program-stdout half; bench output still reaches stdout there). JSON error output follows a structured schema with `severity`, `code`, `message`, `labels` (with spans), `notes`, and `suggestion` fields. Runtime errors raised from the Cranelift JIT (opt-in via `--jit`) populate `labels` with the source span of the failing operation, matching tree and VM behaviour. Span coverage threads through every JIT runtime helper (unwrap, panic-unwrap, list-get, slice, index, jpth, mget, record-field strict access, builtin dispatch, dynamic call); AOT-compiled binaries inherit the same coverage. Pre-v0.11.6 builds surfaced `{"labels":[]}` for these shapes - if you see an empty labels array on a runtime error, the binary is out of date. 
AOT binaries also install an async-signal-safe handler in `ilo_aot_init` that catches fatal signals (SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT) and writes a single JSON line on stderr identifying the signal before the process terminates with the conventional 128+signo exit code. The diagnostic uses `ILO-R015` (AOT runtime fault). Without the handler, a hard fault inside compiled native code would leave the process with raw signal exit (e.g. 139 for SIGSEGV) and no diagnostic — agents driving ilo couldn't distinguish a clean non-zero exit from a hard fault. A SIGSEGV from an AOT binary is always a bug in ilo (codegen or runtime helper); file an issue with the source program and the JSON line. [Top-level program output] For a program whose entry function returns a Result, the `~`/`^` wrapper is split across streams and exit codes so shell callers do not have to strip a prefix: `~v` (Ok)=`v` (bare)=-=0 `^e` (Err)=-=`^e`=1 any non-Result=`v`=-=0 In `--json` mode the value is always wrapped (`{"schemaVersion": 1, "ok": v}` / `{"schemaVersion": 1, "error": {...}}`) and emitted to stdout; exit codes match the plain-mode table.
The `schemaVersion` field was added in 0.12.1 to every CLI `--json` envelope (`run`, `graph`, `--ast`, `serv`, `tools --json`, `spec --json`) so agents can route on a single field across every command. See `JSON_OUTPUT.md` for the full audit table. `Display` on `Value::Ok` / `Value::Err` still renders `~v` / `^e` in every other context (nested values, `prnt`, REPL prompts, error messages, debug output) - only the top-level program-return print path is split. The contract applies uniformly to in-process runners (`ilo prog.ilo`, `--vm`, `--jit`) and to AOT-compiled standalone binaries from `ilo compile`. Both strip the top-level `~`/`^` wrapper on stdout, route `^e` to stderr, and use the same exit codes - output is byte-for-byte identical across every backend. **Auto-echo suppression for `prnt` + status sentinel.** When the entry function has at least one *unconditional top-level* `prnt` call AND the tail expression is a bare wrapped string literal (`~"text"` or `^"text"`), the top-level auto-echo is suppressed. The wrapped literal is treated as a status sentinel rather than a value the caller wants captured. Without this rule, a function shaped like `m>R t t;prnt "report";~"ok"` emits `report\nok\n` on stdout and shell callers piping the output have to strip the trailing `ok`. The rule does NOT fire when (a) there is no `prnt` in the body — `m>R t t;~"ok"` still prints `ok` because the wrapped literal IS the program's output (the `cli-tasks-save-ok.ilo` pattern); (b) the `prnt` is nested inside a guard, loop, or match arm — those are conditional and the `prnt` may never run; (c) the tail is `~v` where `v` is a binding or call — that's a real return value. `^"text"` errors still go to stderr with exit 1; the suppression rule never silently swallows an Err. Pinned by `tests/regression_tilde_str_noecho.rs` and `examples/tilde-str-noecho.ilo`. 
[Idiomatic hints] After successful execution, ilo scans the source for non-canonical forms and emits hints to stderr: hint: `==` → `=` saves 1 char (both mean equality in ilo) hint: `length` → `len` (canonical short form) Builtin alias hints appear at most once per program (the first long-form name found). In JSON mode, hints appear as `{"hints":["..."]}` on stderr. Suppress with `--no-hints` / `-nh`. [CLI invocation] ilo 'code' [args...] -- inline program; default-runs the entry function ilo program.ilo [func] [args] -- if `func` is omitted and the file declares exactly one function, that function runs automatically ilo run program.ilo [func] [a] -- verb form; same dispatch as the bare positional ilo check program.ilo [--json] [--strict] -- run the verifier without executing (exit 0 = clean; --strict treats warnings as exit-code errors) ilo test [path] [--engine vm|jit|all] -- run `-- run:` / `-- out:` / `-- err:` assertions in .ilo files (exit 0 on all-pass, 1 on any failure) ilo build program.ilo -o out -- AOT compile to a standalone binary (alias for `compile`) ilo program.ilo --ast -- print parsed AST as JSON and exit ilo --explain ILO-T004 -- print error explanation and exit ilo help ai -- compact AI spec to stdout (= contents of ai.txt) ilo serv -- long-lived JSON request/response loop ilo --max-ast-depth N -- cap parser nesting at N (default 256; protects `ilo serv` and other untrusted-source paths from DoS payloads, raises ILO-P103) ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016) ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017) **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. 
A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened. **Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed. **`ilo check`.** Standalone verifier invocation: lex, parse, resolve imports, and run the type verifier without proceeding to bytecode compilation or execution. Exit code 0 means the program is well-typed and verifier-clean; exit code 1 means at least one diagnostic was emitted on stderr. The output mode follows the global flags (`--json` for NDJSON diagnostics, `--text` for plain text, `--ansi` for coloured output; auto-detected when omitted - JSON when stderr is not a TTY, ANSI otherwise). `ilo check` works on both files and inline code; on a syntactically-broken input it still reports the parse error rather than crashing, which is important for editor and agent loops that may feed in half-written programs. 
**`ilo test`.** Runs the `-- run: ` / `-- out: ` (or `-- err: `) annotations embedded in `.ilo` source files - the same format the in-tree `tests/examples_engines.rs` integration harness already uses. A file path tests that one file; a directory walks `*.ilo` recursively. Each case runs as a subprocess (`ilo --vm `), output is asserted against the expected payload, and the result prints as `PASS path::fn (line N)` / `FAIL path::fn (line N) (got: X, want: Y)`. The final line reports `N passed, M failed`. Exit 0 if everything passed, 1 if any case failed or no annotations were found. The default engine is `--vm`; pass `--engine jit` or `--engine all` to widen the matrix. Per-file `-- engine-skip: vm jit` annotations skip the listed engines, matching the integration harness. Because every example under `examples/` uses this annotation format already, `ilo test examples/` doubles as a smoke test for the language itself and as a worked reference an agent can read when writing tests for its own programs. **`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. 
The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. **AOT entry-pick.** `ilo compile file.ilo -o out` (alias `ilo build`) follows the same entry-pick rules as the in-process engines: a single user-defined function is used directly; on multi-function files the entry is `main` if defined, otherwise the explicit positional `func` arg (`ilo compile file.ilo -o out run`); otherwise the compile fails with `ILO-E801` and exits 1 without writing a binary. AOT does not fall back to "first declared function" - that historical default produced binaries that called the wrong entry symbol and SIGSEGV'd at runtime. **Default engine.** The bytecode register VM is the default execution path. It supports every opcode (closures with Phase 2 capture, listview windows, fused len-of-filter, every modern shape), and avoids the JIT compile-and-bail cost paid by the pre-v0.11.9 Cranelift-first default whenever a program touched an opcode the JIT couldn't handle. Cranelift JIT is opt-in via `--jit`; on opt-in, the JIT runs hot numeric loops and falls back to the VM on bailout. Phase 2 captures run natively on every public backend - VM, JIT, and AOT (`ilo compile`); AOT embeds the postcard `CompiledProgram` blob into the binary's `.rodata` so dispatch helpers can re-enter the VM on user-fn callbacks the same way the in-process runners do. For long-running workloads where the JIT pays for itself, opt in explicitly; for most agent workloads the VM is the right default. **Tree-walker is internal-only.** The tree-walking interpreter is no longer user-selectable: `--run-tree` and its `--run` alias were removed from the public CLI in 0.12.1 (they now error with the unknown-flag guard). 
The interpreter stays in-tree as the dispatch target for HOF / regex / fmt-variadic / IO / sleep / ct / rsrt / closure-bind-ctx shapes the VM and Cranelift haven't lifted natively yet - the VM bails to it transparently for the ops listed by `is_tree_bridge_eligible` (`rgx`, `rgxall`, `rgxall1`, `rgxall-multi`, `rgxsub`, `fmt`, `fmt2`, `rd`, `rdb`, `rdjl`, `rdin`, `rdinl`, `sleep`, `lsd`, `walk`, `glob`, `dirname`, `basename`, `pathjoin`, `fsize`, `mtime`, `isfile`, `isdir`, `run`, `env-all`, `jkeys`, `tz-offset`, `ct` 2-arg and 3-arg, `rsrt` 2-arg and 3-arg, `dur-parse`, `dur-fmt`, and the closure-bind ctx variants of `map`/`flt`/`fld`/`srt`). Cross-engine parity for those shapes is pinned by `tests/regression_builtin_bridge.rs` and `tests/regression_tree_bridge_invariants.rs`. 0.13.0+ is on track for a hard drop once the bridge consumers are lifted natively and the shared runtime types (`Value`, `MapKey`, `RuntimeError`, math helpers) are extracted from `src/interpreter/` to a non-engine module. **Subcommand dispatch.** The first positional argument is interpreted as a function name when it has the shape of an ilo identifier - `[a-z][a-z0-9]*(-[a-z0-9]+)*` - so `ilo file.ilo list-orders` routes to the `list-orders` function. Args that don't match the ident shape (file paths like `/tmp/data.json`, numbers, sigils, bracketed lists, anything with a `.` or `/`) route to `main` (or the entry function) as a positional CLI arg instead. Trailing dashes (`foo-`), doubled dashes (`foo--bar`), and negative numbers (`-1`) are not idents and pass through as data. **Unknown `--flag` guard.** Any token in the positional tail matching the clean long-flag shape `--word` or `--word-with-dashes` that isn't a recognised flag is rejected upfront with `error: unrecognised flag '--'. Use 'ilo --help' for valid flags. To pass it as a literal arg, separate with '--' first.` and exit 1. 
This prevents `ilo main.ilo --engine tree` from silently consuming `--engine` as a positional arg (which used to surface as misleading `ILO-R012 no functions defined` or `ILO-R004 main: expected N args, got N+1`). To pass a hyphen-prefixed token through as literal data, place the `--` separator first: `ilo main.ilo -- --foo`. Anything after the first `--` is data. Tokens with `=` (`--key=val`), trailing or doubled dashes (`--foo-`, `--foo--bar`), and negative numbers (`-1`) are not clean flag shapes and pass through unchanged. **Text-typed params.** When the entry function declares a parameter of type `t`, the CLI passes the raw arg through without numeric coercion. `ilo 'f x:t>t;x' 42` returns the string `"42"`, not the number 42. **Exit codes.** A program returning `Value::Err` (or `^reason` from the entry function) exits with code 1 and prints the err payload on stderr. `~v` (Ok) and any non-Result return value exit 0. Verifier and parser errors exit 2. **List args from the CLI.** Comma-separated args become `L n` or `L t` automatically: `ilo 'f xs:L n>n;sum xs' 1,2,3`. FORMATTER: Dense output is the default - newlines are for humans, not agents. No flag needed for dense format: ilo 'code' Dense wire format (default) ilo 'code' --dense / -d Same, explicit ilo 'code' --expanded / -e Expanded human format (for code review) [Dense format] Single line per declaration, minimal whitespace. Operators glue to first operand: cls sp:n>t;>=sp 1000{"gold"};>=sp 500{"silver"};"bronze" [Expanded format] Multi-line with 2-space indentation. Operators spaced from operands: cls sp:n > t >= sp 1000 { "gold" } >= sp 500 { "silver" } "bronze" Dense format is canonical - `dense(parse(dense(parse(src)))) == dense(parse(src))`. 
 COMPLETE EXAMPLE: tool get-user"Retrieve user by ID" uid:t>R profile t timeout:5,retry:2 tool send-email"Send an email" to:t subject:t body:t>R _ t timeout:10,retry:1 type profile{id:t;name:t;email:t;verified:b} ntf uid:t msg:t>R _ t;get-user uid;?{^e:^+"Lookup failed: "e;~d:!d.verified{^"Email not verified"};send-email d.email "Notification" msg;?{^e:^+"Send failed: "e;~_:~_}} [Recursive Example] Factorial and Fibonacci as standalone functions: fac n:n>n;<=n 1 1;r=fac -n 1;*n r fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b
-STABILITY: See STABILITY.md at repo root for the per-surface stability matrix. Three tiers: stable (schemaVersion:1 envelope, ILO-error-codes, serv-protocol-phases, file-version-pragma, manifesto-principles, reserved-name-policy), provisional (builtin-signatures, cli-flag-names, error-message-prose, examples-corpus, ilo-test-surface), experimental (0.13-in-flight-features, aot-artifact-format, cranelift-jit-internals, extensions-dir, cargo-feature-flags). Stable surfaces are safe to pin across releases. Provisional surfaces carry a deprecation-window guarantee. Experimental surfaces may disappear without notice. `ilo spec --json ai` surfaces this matrix in the `stability` field of the JSON envelope.
diff --git a/examples/anon-record.ilo b/examples/anon-record.ilo
new file mode 100644
index 00000000..d460aed5
--- /dev/null
+++ b/examples/anon-record.ilo
@@ -0,0 +1,75 @@
+-- ILO-54: Anonymous record literals — no typedef required.
+--
+-- Exercises: construction, field access, pass-to-fn, return-from-fn,
+-- destructure, and `with` update.
+
+-- Pass anonymous record to a function
+greet x:_>t
+  x.name
+
+-- Return anonymous record from a function
+make-point ax:n ay:n>_
+  {x:ax y:ay}
+
+-- Basic construction and field access
+access-name>t
+  r = {name:"alice" age:30}
+  r.name
+
+access-age>n
+  r = {name:"alice" age:30}
+  r.age
+
+-- Pass anonymous record to a function
+pass-to-fn>t
+  greet {name:"bob"}
+
+-- Return anonymous record from a function and access fields
+return-x>n
+  p = make-point 3 4
+  p.x
+
+return-y>n
+  p = make-point 3 4
+  p.y
+
+-- Destructure anonymous record
+destruct-name>t
+  r = {name:"alice" age:30}
+  {name;age} = r
+  name
+
+destruct-age>n
+  r = {name:"alice" age:30}
+  {name;age} = r
+  age
+
+-- Update via with
+with-name>t
+  r = {name:"alice" age:30}
+  r2 = r with name:"carol"
+  r2.name
+
+with-age>n
+  r = {name:"alice" age:30}
+  r2 = r with name:"carol"
+  r2.age
+
+-- run: access-name
+-- out: alice
+-- run: access-age
+-- out: 30
+-- run: pass-to-fn
+-- out: bob
+-- run: return-x
+-- out: 3
+-- run: return-y
+-- out: 4
+-- run: destruct-name
+-- out: alice
+-- run: destruct-age
+-- out: 30
+-- run: with-name
+-- out: carol
+-- run: with-age
+-- out: 30
diff --git a/src/ast/mod.rs b/src/ast/mod.rs
index 66607678..7d1bca53 100644
--- a/src/ast/mod.rs
+++ b/src/ast/mod.rs
@@ -339,6 +339,13 @@ pub enum Expr {
         fields: Vec<(String, Expr)>,
     },
 
+    /// Anonymous record literal: `{field:val field:val}` — no typename required.
+    /// Type checker synthesises a structural type; runtime uses `"__anon"` as the
+    /// Value::Record type_name since engines only care about field names.
+    AnonRecord {
+        fields: Vec<(String, Expr)>,
+    },
+
     /// Match expression: `?expr{arms}` or `?{arms}` used as value
     Match {
         subject: Option<Box<Expr>>,
@@ -650,7 +657,7 @@ fn resolve_aliases_expr(expr: &mut Expr) {
                 resolve_aliases_expr(item);
             }
         }
-        Expr::Record { fields, .. } => {
+        Expr::Record { fields, .. 
} | Expr::AnonRecord { fields } => {
             for (_, val) in fields {
                 resolve_aliases_expr(val);
             }
@@ -714,6 +721,12 @@ pub fn desugar_dot_var_index(program: &mut Program) {
                 record_fields.insert(p.name.clone());
             }
         }
+        // Also collect field names from anonymous record literals so that
+        // `r.name` where `name` happens to be a local variable is NOT
+        // rewritten to `at r name` — anonymous records are still records.
+        if let Decl::Function { body, .. } = decl {
+            collect_anon_record_fields_stmts(body, &mut record_fields);
+        }
     }
 
     for decl in &mut program.declarations {
@@ -848,7 +861,7 @@ fn desugar_expr(expr: &mut Expr, scope: &[String], rf: &std::collections::HashSe
                 desugar_expr(it, scope, rf);
             }
         }
-        Expr::Record { fields, .. } => {
+        Expr::Record { fields, .. } | Expr::AnonRecord { fields } => {
             for (_, v) in fields {
                 desugar_expr(v, scope, rf);
             }
@@ -913,6 +926,100 @@ fn desugar_expr(expr: &mut Expr, scope: &[String], rf: &std::collections::HashSe
     }
 }
 
+/// Collect field names from all AnonRecord literals in a statement list.
+fn collect_anon_record_fields_stmts(
+    stmts: &[Spanned<Stmt>],
+    out: &mut std::collections::HashSet<String>,
+) {
+    for stmt in stmts {
+        collect_anon_record_fields_stmt(&stmt.node, out);
+    }
+}
+
+fn collect_anon_record_fields_stmt(stmt: &Stmt, out: &mut std::collections::HashSet<String>) {
+    match stmt {
+        Stmt::Let { value, .. } => collect_anon_record_fields_expr(value, out),
+        Stmt::Expr(e) | Stmt::Return(e) => collect_anon_record_fields_expr(e, out),
+        Stmt::Break(Some(e)) => collect_anon_record_fields_expr(e, out),
+        Stmt::Guard {
+            condition,
+            body,
+            else_body,
+            ..
+        } => {
+            collect_anon_record_fields_expr(condition, out);
+            collect_anon_record_fields_stmts(body, out);
+            if let Some(eb) = else_body {
+                collect_anon_record_fields_stmts(eb, out);
+            }
+        }
+        Stmt::While { condition, body } => {
+            collect_anon_record_fields_expr(condition, out);
+            collect_anon_record_fields_stmts(body, out);
+        }
+        Stmt::ForEach { collection, body, .. 
} => {
+            collect_anon_record_fields_expr(collection, out);
+            collect_anon_record_fields_stmts(body, out);
+        }
+        Stmt::Destructure { value, .. } => collect_anon_record_fields_expr(value, out),
+        _ => {}
+    }
+}
+
+fn collect_anon_record_fields_expr(expr: &Expr, out: &mut std::collections::HashSet<String>) {
+    match expr {
+        Expr::AnonRecord { fields } => {
+            for (name, val) in fields {
+                out.insert(name.clone());
+                collect_anon_record_fields_expr(val, out);
+            }
+        }
+        Expr::Record { fields, .. } => {
+            for (_, val) in fields {
+                collect_anon_record_fields_expr(val, out);
+            }
+        }
+        Expr::Call { args, .. } => {
+            for arg in args {
+                collect_anon_record_fields_expr(arg, out);
+            }
+        }
+        Expr::BinOp { left, right, .. } => {
+            collect_anon_record_fields_expr(left, out);
+            collect_anon_record_fields_expr(right, out);
+        }
+        Expr::UnaryOp { operand, .. } => collect_anon_record_fields_expr(operand, out),
+        Expr::Field { object, .. } => collect_anon_record_fields_expr(object, out),
+        Expr::Index { object, .. } => collect_anon_record_fields_expr(object, out),
+        Expr::With { object, updates } => {
+            collect_anon_record_fields_expr(object, out);
+            for (_, val) in updates {
+                collect_anon_record_fields_expr(val, out);
+            }
+        }
+        Expr::List(items) => {
+            for item in items {
+                collect_anon_record_fields_expr(item, out);
+            }
+        }
+        Expr::Ok(e) | Expr::Err(e) => collect_anon_record_fields_expr(e, out),
+        Expr::Ternary {
+            condition,
+            then_expr,
+            else_expr,
+        } => {
+            collect_anon_record_fields_expr(condition, out);
+            collect_anon_record_fields_expr(then_expr, out);
+            collect_anon_record_fields_expr(else_expr, out);
+        }
+        Expr::NilCoalesce { value, default } => {
+            collect_anon_record_fields_expr(value, out);
+            collect_anon_record_fields_expr(default, out);
+        }
+        _ => {}
+    }
+}
+
 /// Cycle-capability classifier for runtime values of a given static type.
 ///
 /// Background: ilo's runtime is reference-counted (Arc in the tree
diff --git a/src/codegen/fmt.rs b/src/codegen/fmt.rs
index 38805c0d..ac901ec3 100644
--- a/src/codegen/fmt.rs
+++ b/src/codegen/fmt.rs
@@ -578,6 +578,13 @@ fn fmt_expr(expr: &Expr, mode: FmtMode) -> String {
             let items_str: Vec<String> = items.iter().map(|i| fmt_expr(i, mode)).collect();
             format!("[{}]", items_str.join(", "))
         }
+        Expr::AnonRecord { fields } => {
+            let fields_str: Vec<String> = fields
+                .iter()
+                .map(|(n, v)| format!("{}:{}", n, fmt_expr(v, mode)))
+                .collect();
+            format!("{{{}}}", fields_str.join(" "))
+        }
         Expr::Record { type_name, fields } => {
             if fields.is_empty() {
                 return type_name.clone();
diff --git a/src/codegen/python.rs b/src/codegen/python.rs
index 5dab9ccb..88ecbda3 100644
--- a/src/codegen/python.rs
+++ b/src/codegen/python.rs
@@ -105,7 +105,9 @@ fn expr_uses_rd(expr: &Expr) -> bool {
         Expr::Ok(e) | Expr::Err(e) => expr_uses_rd(e),
         Expr::Field { object, .. } | Expr::Index { object, .. } => expr_uses_rd(object),
         Expr::List(items) => items.iter().any(expr_uses_rd),
-        Expr::Record { fields, .. } => fields.iter().any(|(_, e)| expr_uses_rd(e)),
+        Expr::Record { fields, .. } | Expr::AnonRecord { fields } => {
+            fields.iter().any(|(_, e)| expr_uses_rd(e))
+        }
         Expr::Match { subject, arms } => {
             subject.as_ref().is_some_and(|s| expr_uses_rd(s))
                 || arms
@@ -171,7 +173,9 @@ fn expr_uses_unwrap(expr: &Expr) -> bool {
         Expr::Ok(e) | Expr::Err(e) => expr_uses_unwrap(e),
         Expr::Field { object, .. } | Expr::Index { object, .. } => expr_uses_unwrap(object),
         Expr::List(items) => items.iter().any(expr_uses_unwrap),
-        Expr::Record { fields, .. } => fields.iter().any(|(_, e)| expr_uses_unwrap(e)),
+        Expr::Record { fields, .. 
} | Expr::AnonRecord { fields } => {
+            fields.iter().any(|(_, e)| expr_uses_unwrap(e))
+        }
         Expr::Match { subject, arms } => {
             subject.as_ref().is_some_and(|s| expr_uses_unwrap(s))
                 || arms
@@ -964,6 +968,13 @@ fn emit_expr(out: &mut String, level: usize, expr: &Expr) -> String {
             let items_str: Vec<String> = items.iter().map(|i| emit_expr(out, level, i)).collect();
             format!("[{}]", items_str.join(", "))
         }
+        Expr::AnonRecord { fields } => {
+            let mut parts = Vec::new();
+            for (name, val) in fields {
+                parts.push(format!("\"{}\": {}", name, emit_expr(out, level, val)));
+            }
+            format!("{{{}}}", parts.join(", "))
+        }
         Expr::Record { type_name, fields } => {
             let mut parts = vec![format!("\"_type\": \"{}\"", type_name)];
             for (name, val) in fields {
diff --git a/src/graph.rs b/src/graph.rs
index 518a8586..560192af 100644
--- a/src/graph.rs
+++ b/src/graph.rs
@@ -89,6 +89,11 @@ fn collect_calls(expr: &Expr, calls: &mut BTreeSet<String>, types: &mut BTreeSet
                 collect_calls(arg, calls, types);
             }
         }
+        Expr::AnonRecord { fields } => {
+            for (_, val) in fields {
+                collect_calls(val, calls, types);
+            }
+        }
         Expr::Record {
             type_name, fields, ..
         } => {
diff --git a/src/interpreter/mod.rs b/src/interpreter/mod.rs
index 40e7151a..cc85b112 100644
--- a/src/interpreter/mod.rs
+++ b/src/interpreter/mod.rs
@@ -7983,7 +7983,9 @@ fn expr_refers_to(name: &str, expr: &Expr) -> bool {
         Expr::UnaryOp { operand, .. } => expr_refers_to(name, operand),
         Expr::Ok(inner) | Expr::Err(inner) => expr_refers_to(name, inner),
         Expr::List(items) => items.iter().any(|e| expr_refers_to(name, e)),
-        Expr::Record { fields, .. } => fields.iter().any(|(_, e)| expr_refers_to(name, e)),
+        Expr::Record { fields, .. } | Expr::AnonRecord { fields } => {
+            fields.iter().any(|(_, e)| expr_refers_to(name, e))
+        }
         // Conservative: assume Match arms might reference `name`. Falls back
         // to the general path, which is correct (just slower) in the rare
         // case where a self-rebind RHS is wrapped in a match.
@@ -8603,6 +8605,16 @@ fn eval_expr(env: &mut Env, expr: &Expr) -> Result<Value> {
             }
             Ok(Value::List(Arc::new(vals)))
         }
+        Expr::AnonRecord { fields } => {
+            let mut field_map = HashMap::new();
+            for (name, val_expr) in fields {
+                field_map.insert(name.clone(), eval_expr(env, val_expr)?);
+            }
+            Ok(Value::Record {
+                type_name: "__anon".to_string(),
+                fields: field_map,
+            })
+        }
         Expr::Record { type_name, fields } => {
             let mut field_map = HashMap::new();
             for (name, val_expr) in fields {
diff --git a/src/parser/mod.rs b/src/parser/mod.rs
index 053633b1..2cbf2f07 100644
--- a/src/parser/mod.rs
+++ b/src/parser/mod.rs
@@ -969,7 +969,12 @@ impl Parser {
         // the dense single-line workaround they've been settling for.
         // Skip the brace-block path when the leading `{` is a destructure
         // pattern (`f p:pt>n;{x}=p;...`) — that's a statement, not a wrap.
-        let body = if self.peek() == Some(&Token::LBrace) && !self.is_destructure_pattern() {
+        // Also skip when it looks like an anonymous record literal `{field:val ...}`:
+        // that's a return-expression, not a brace-wrapped body.
+        let body = if self.peek() == Some(&Token::LBrace)
+            && !self.is_destructure_pattern()
+            && !self.is_anon_record_literal()
+        {
             self.parse_brace_body_or_record(&name)?
         } else {
             self.parse_body_or_record(&name)?
@@ -3530,6 +3535,41 @@ or write `({fmt_name} \"...\" ...)` so its args are grouped."
         Ok(Expr::Record { type_name, fields })
     }
 
+    /// Lookahead: does `{` start an anonymous record literal?
+    ///
+    /// Returns true when the token stream looks like `{ ident : ...` — i.e.
+    /// the first token inside the braces is an identifier immediately followed
+    /// by a colon. This is unambiguous: a destructure pattern `{a;b}=` uses
+    /// semicolons, a match/guard body block never starts with `ident:`, and
+    /// the existing map-literal friendly-error fires only on text/number heads.
+    fn is_anon_record_literal(&self) -> bool {
+        // Current token must be `{`; pos+1 is the first field name; pos+2 is `:`.
+ self.peek() == Some(&Token::LBrace) + && matches!(self.token_at(self.pos + 1), Some(Token::Ident(_))) + && self.token_at(self.pos + 2) == Some(&Token::Colon) + } + + /// Parse the body of an anonymous record literal (after `{` has been consumed). + /// + /// Grammar: `ident:atom (ident:atom)*` then expects `}` from caller. + fn parse_anon_record_body(&mut self) -> Result { + let mut fields = Vec::new(); + while self.is_named_field_ahead() { + let fname = self.expect_ident()?; + self.expect(&Token::Colon)?; + let value = self.parse_atom()?; + fields.push((fname, value)); + } + if fields.is_empty() { + return Err(self.error_hint( + "ILO-P009", + "anonymous record literal `{...}` must have at least one field".into(), + "use `{field:value}` syntax, e.g. `{name:\"alice\" age:30}`".into(), + )); + } + Ok(Expr::AnonRecord { fields }) + } + /// Lookahead: does the token at `pos` start a prefix binary operator /// (operator followed by 2+ simple atoms before the next operator/terminator)? /// @@ -3839,6 +3879,10 @@ results first: `r={first_op}a b;…r` keeps each step explicit." /// Can the current token start an atom? fn can_start_atom(&self) -> bool { + // Anonymous record literal `{field:val ...}` is also a valid atom start. + if self.is_anon_record_literal() { + return true; + } matches!( self.peek(), Some(Token::Ident(_)) @@ -4083,6 +4127,13 @@ results first: `r={first_op}a b;…r` keeps each step explicit." self.expect(&Token::RBracket)?; Ok(Expr::List(items)) } + Some(Token::LBrace) if self.is_anon_record_literal() => { + self.advance(); // consume `{` + let expr = self.parse_anon_record_body()?; + self.expect(&Token::RBrace)?; + let expr = self.parse_field_chain(expr, None)?; + Ok(expr) + } Some(Token::Ident(name)) => { self.advance(); // Zero-arg builtins used as operands (arguments to other calls) @@ -4580,7 +4631,7 @@ For variable-position list indexing bind the head first: \ self.collect_free_in_expr(i, params, local, free); } } - Expr::Record { fields, .. 
} => { + Expr::Record { fields, .. } | Expr::AnonRecord { fields } => { for (_, v) in fields { self.collect_free_in_expr(v, params, local, free); } diff --git a/src/verify.rs b/src/verify.rs index 4f4bf3fe..78566ee4 100644 --- a/src/verify.rs +++ b/src/verify.rs @@ -19,6 +19,10 @@ pub enum Ty { /// Function type: params then return. `F n n` = Fn(vec![Number], Number). Fn(Vec, Box), Named(String), + /// Structural record type inferred from an anonymous record literal `{f:v ...}`. + /// Two `AnonRecord` types are compatible when they have exactly the same field + /// names and compatible field types (order-independent). + AnonRecord(Vec<(String, Ty)>), Unknown, } @@ -48,6 +52,11 @@ impl std::fmt::Display for Ty { write!(f, " {ret}") } Ty::Named(name) => write!(f, "{name}"), + Ty::AnonRecord(fields) => { + let parts: Vec = + fields.iter().map(|(n, t)| format!("{n}:{t}")).collect(); + write!(f, "{{{}}}", parts.join(" ")) + } Ty::Unknown => write!(f, "_"), } } @@ -223,6 +232,18 @@ fn compatible(a: &Ty, b: &Ty) -> bool { && compatible(ar, br) } (Ty::Named(a), Ty::Named(b)) => a == b, + // Two anonymous records unify when they have the same field names (order-independent) + // and compatible field types. 
+ (Ty::AnonRecord(a_fields), Ty::AnonRecord(b_fields)) => { + if a_fields.len() != b_fields.len() { + return false; + } + let b_map: std::collections::HashMap<&str, &Ty> = + b_fields.iter().map(|(n, t)| (n.as_str(), t)).collect(); + a_fields + .iter() + .all(|(n, t)| b_map.get(n.as_str()).is_some_and(|bt| compatible(t, bt))) + } _ => false, } } @@ -4384,6 +4405,27 @@ impl VerifyContext { Stmt::Destructure { bindings, value } => { let record_ty = self.infer_expr(func, scope, value, span); match &record_ty { + Ty::AnonRecord(fields_ty) => { + let fields_ty = fields_ty.clone(); + for binding in bindings { + if let Some((_, fty)) = fields_ty.iter().find(|(n, _)| n == binding) { + scope_insert(scope, binding.clone(), fty.clone()); + } else { + let field_names: Vec = + fields_ty.iter().map(|(n, _)| n.clone()).collect(); + let hint = closest_match(binding, field_names.iter()) + .map(|s| format!("did you mean '{s}'?")); + self.err( + "ILO-T019", + func, + format!("no field '{binding}' on anonymous record"), + hint, + Some(span), + ); + scope_insert(scope, binding.clone(), Ty::Unknown); + } + } + } Ty::Named(type_name) => { if let Some(type_def) = self.types.get(type_name).cloned() { for binding in bindings { @@ -5323,6 +5365,16 @@ impl VerifyContext { } } + Expr::AnonRecord { fields } => { + // Infer each field's type and return a structural AnonRecord type. + // No declaration required; shape is derived entirely from the literal. 
+ let inferred: Vec<(String, Ty)> = fields + .iter() + .map(|(n, e)| (n.clone(), self.infer_expr(func, scope, e, span))) + .collect(); + Ty::AnonRecord(inferred) + } + Expr::Record { type_name, fields } => { if let Some(type_def) = self.types.get(type_name) { let def_fields = type_def.fields.clone(); @@ -5402,6 +5454,24 @@ impl VerifyContext { return Ty::Nil; } match &obj_ty { + Ty::AnonRecord(fields_ty) => { + if let Some((_, fty)) = fields_ty.iter().find(|(n, _)| n == field) { + fty.clone() + } else { + let field_names: Vec = + fields_ty.iter().map(|(n, _)| n.clone()).collect(); + let hint = closest_match(field, field_names.iter()) + .map(|s| format!("did you mean '{s}'?")); + self.err( + "ILO-T019", + func, + format!("no field '{field}' on anonymous record"), + hint, + Some(span), + ); + Ty::Unknown + } + } Ty::Named(type_name) => { if let Some(type_def) = self.types.get(type_name) { if let Some((_, fty)) = type_def.fields.iter().find(|(n, _)| n == field) @@ -5614,6 +5684,22 @@ ilo has no tuple type." Expr::With { object, updates } => { let obj_ty = self.infer_expr(func, scope, object, span); match &obj_ty { + Ty::AnonRecord(fields_ty) => { + // Build updated fields: carry through unchanged fields, replace updated ones. 
+ let def_fields = fields_ty.clone(); + let mut new_fields = def_fields.clone(); + for (fname, expr) in updates { + if let Some(pos) = new_fields.iter().position(|(n, _)| n == fname) { + let actual = self.infer_expr(func, scope, expr, span); + new_fields[pos] = (fname.clone(), actual); + } else { + // New field being added via `with` — allowed for anonymous records + let actual = self.infer_expr(func, scope, expr, span); + new_fields.push((fname.clone(), actual)); + } + } + Ty::AnonRecord(new_fields) + } Ty::Named(type_name) => { if let Some(type_def) = self.types.get(type_name) { let def_fields = type_def.fields.clone(); diff --git a/src/vm/mod.rs b/src/vm/mod.rs index 3ab6d8c0..ccd2a779 100644 --- a/src/vm/mod.rs +++ b/src/vm/mod.rs @@ -5757,6 +5757,81 @@ impl RegCompiler { } } + Expr::AnonRecord { fields } => { + // Anonymous record: synthesize a stable type name from the sorted + // field list so that two literals with the same shape share one + // registry entry (matching the structural unification the verifier + // promises). The name is internal — agents never see it. + let mut sorted_names: Vec<&str> = + fields.iter().map(|(n, _)| n.as_str()).collect(); + sorted_names.sort_unstable(); + let type_name = format!("__anon_{}", sorted_names.join("_")); + let fields_owned: Vec<(String, _)> = fields.clone(); + // Delegate to the same logic as named Record by building an + // owned Vec and reusing the same bytecode path inline. 
+ let type_id = match self.type_registry.name_to_id.get(&type_name) { + Some(&id) => id, + None => { + let field_names: Vec = + fields_owned.iter().map(|(n, _)| n.clone()).collect(); + self.type_registry.register(type_name.clone(), field_names, 0) + } + }; + let canonical_order: Vec = + self.type_registry.types[type_id as usize].fields.clone(); + let source_fields: HashMap<&str, &Expr> = + fields_owned.iter().map(|(n, e)| (n.as_str(), e)).collect(); + let n = canonical_order.len(); + let pre_reg = self.next_reg as usize; + let fits_contiguous = n <= 255 && type_id <= 255 && pre_reg + 2 * n < 255; + assert!( + type_id <= 255, + "type_id {} exceeds 8-bit limit in OP_RECNEW", + type_id + ); + if fits_contiguous { + let ordered_regs: Vec = canonical_order + .iter() + .map(|fname| { + let expr = source_fields[fname.as_str()]; + self.compile_expr(expr) + }) + .collect(); + let a = self.alloc_reg(); + let fields_base = self.next_reg; + assert!( + (self.next_reg as usize) + ordered_regs.len() <= 255, + "register overflow: anonymous record literal requires too many register slots" + ); + self.next_reg += ordered_regs.len() as u8; + if self.next_reg > self.max_reg { + self.max_reg = self.next_reg; + } + for (i, &field_reg) in ordered_regs.iter().enumerate() { + let target = fields_base + i as u8; + if field_reg != target { + self.emit_abc(OP_MOVE, target, field_reg, 0); + } + } + let bx = (type_id << 8) | ordered_regs.len() as u16; + self.emit_abx(OP_RECNEW, a, bx); + self.reg_record_type[a as usize] = type_id; + a + } else { + let a = self.alloc_reg(); + self.emit_abx(OP_RECNEW_EMPTY, a, type_id); + let after_result = self.next_reg; + for (i, fname) in canonical_order.iter().enumerate() { + let expr = source_fields[fname.as_str()]; + let val_reg = self.compile_expr(expr); + self.emit_abc(OP_RECSETFIELD, a, val_reg, i as u8); + self.next_reg = after_result; + } + self.reg_record_type[a as usize] = type_id; + a + } + } + Expr::Record { type_name, fields } => { // Look up or 
auto-register type in registry let type_id = match self.type_registry.name_to_id.get(type_name) { From 20f02d3923f33c1ae3f8f382ba1ca08e8ffbf0e8 Mon Sep 17 00:00:00 2001 From: Daniel Morris Date: Fri, 22 May 2026 01:01:58 +0100 Subject: [PATCH 2/3] ci: bump per-module token caps to match #618 --- scripts/check-skill-tokens.py | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/scripts/check-skill-tokens.py b/scripts/check-skill-tokens.py index 4f2a44b5..cb0397d7 100755 --- a/scripts/check-skill-tokens.py +++ b/scripts/check-skill-tokens.py @@ -43,14 +43,21 @@ ] PER_MODULE_LIMIT = 1000 -# `ilo-language` is the foundational module every agent loads first; it -# carries a higher cap because core syntax doesn't split cleanly into -# smaller files. `ilo-builtins-io` is the next most-touched module — -# HTTP, JSON, env, time, and process all live there; agent dogfooding -# hits this cap on every other doc PR. Bumped to match its density. +# Per-module overrides for the densest modules. Caps track measured size +# with light headroom; the aggregate budget (TOTAL_LIMIT) is the real +# token-economics gate, since agents load 1-2 modules per task. +# `ilo-language` is the foundational module every agent loads first. +# `ilo-builtins-io` covers HTTP, JSON, env, time, process - dogfooding +# hits it on every other doc PR. `ilo-builtins-math` carries the full +# numerics surface (stats, distance, regression, FFT, bisect). `ilo-agent` +# documents the agent-protocol RPC, which has grown with each verb. 
PER_MODULE_OVERRIDES = { - "ilo-language": 1500, - "ilo-builtins-io": 1500, + "ilo-language": 1700, + "ilo-builtins-core": 1200, + "ilo-builtins-math": 1500, + "ilo-builtins-io": 2000, + "ilo-builtins-text": 1200, + "ilo-agent": 1300, } TOTAL_LIMIT = 15000 From 10a024f7e81ac68143bd5090fc6dfff090bdf016 Mon Sep 17 00:00:00 2001 From: Daniel Morris Date: Fri, 22 May 2026 02:37:31 +0100 Subject: [PATCH 3/3] chore: cargo fmt + regenerate ai.txt --- src/ast/mod.rs | 4 +++- src/verify.rs | 3 +-- src/vm/mod.rs | 6 +++--- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/src/ast/mod.rs b/src/ast/mod.rs index 7d1bca53..5a502468 100644 --- a/src/ast/mod.rs +++ b/src/ast/mod.rs @@ -957,7 +957,9 @@ fn collect_anon_record_fields_stmt(stmt: &Stmt, out: &mut std::collections::Hash collect_anon_record_fields_expr(condition, out); collect_anon_record_fields_stmts(body, out); } - Stmt::ForEach { collection, body, .. } => { + Stmt::ForEach { + collection, body, .. + } => { collect_anon_record_fields_expr(collection, out); collect_anon_record_fields_stmts(body, out); } diff --git a/src/verify.rs b/src/verify.rs index 78566ee4..5f6c8c9e 100644 --- a/src/verify.rs +++ b/src/verify.rs @@ -53,8 +53,7 @@ impl std::fmt::Display for Ty { } Ty::Named(name) => write!(f, "{name}"), Ty::AnonRecord(fields) => { - let parts: Vec = - fields.iter().map(|(n, t)| format!("{n}:{t}")).collect(); + let parts: Vec = fields.iter().map(|(n, t)| format!("{n}:{t}")).collect(); write!(f, "{{{}}}", parts.join(" ")) } Ty::Unknown => write!(f, "_"), diff --git a/src/vm/mod.rs b/src/vm/mod.rs index ccd2a779..b6b3854f 100644 --- a/src/vm/mod.rs +++ b/src/vm/mod.rs @@ -5762,8 +5762,7 @@ impl RegCompiler { // field list so that two literals with the same shape share one // registry entry (matching the structural unification the verifier // promises). The name is internal — agents never see it. 
- let mut sorted_names: Vec<&str> = - fields.iter().map(|(n, _)| n.as_str()).collect(); + let mut sorted_names: Vec<&str> = fields.iter().map(|(n, _)| n.as_str()).collect(); sorted_names.sort_unstable(); let type_name = format!("__anon_{}", sorted_names.join("_")); let fields_owned: Vec<(String, _)> = fields.clone(); @@ -5774,7 +5773,8 @@ impl RegCompiler { None => { let field_names: Vec = fields_owned.iter().map(|(n, _)| n.clone()).collect(); - self.type_registry.register(type_name.clone(), field_names, 0) + self.type_registry + .register(type_name.clone(), field_names, 0) } }; let canonical_order: Vec =
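
Reviewer note on the structural typing in this series: the verifier treats two anonymous records as compatible when they carry the same field names with compatible field types, regardless of order, and the VM derives one registry name per shape from the sorted field names. The following is a minimal standalone sketch of those two ideas, not the patch's code. `Ty` here is a trimmed, hypothetical stand-in with three variants, and `synthetic_type_name` is an illustrative helper name that does not appear in the patch.

```rust
use std::collections::HashMap;

/// Trimmed stand-in for the verifier's `Ty` (assumption: the real enum has many more variants).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Number,
    Text,
    AnonRecord(Vec<(String, Ty)>),
}

/// Order-independent structural check: same field count, and every field of `a`
/// is present in `b` with a compatible type.
fn compatible(a: &Ty, b: &Ty) -> bool {
    match (a, b) {
        (Ty::Number, Ty::Number) | (Ty::Text, Ty::Text) => true,
        (Ty::AnonRecord(a_fields), Ty::AnonRecord(b_fields)) => {
            if a_fields.len() != b_fields.len() {
                return false;
            }
            // Index `b` by field name so declaration order does not matter.
            let b_map: HashMap<&str, &Ty> =
                b_fields.iter().map(|(n, t)| (n.as_str(), t)).collect();
            a_fields
                .iter()
                .all(|(n, t)| b_map.get(n.as_str()).is_some_and(|bt| compatible(t, bt)))
        }
        _ => false,
    }
}

/// Hypothetical helper: build a name from the sorted field names, so two literals
/// that list the same fields in a different order map to the same name.
fn synthetic_type_name(fields: &[(String, Ty)]) -> String {
    let mut names: Vec<&str> = fields.iter().map(|(n, _)| n.as_str()).collect();
    names.sort_unstable();
    format!("__anon_{}", names.join("_"))
}
```

Under these assumptions, `{x:1 name:"a"}` and `{name:"a" x:1}` would pass `compatible` and share one synthetic name, while a record with a missing field or a differently typed field would not.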