diff --git a/SANDBOX.md b/SANDBOX.md
new file mode 100644
index 00000000..f64ee26b
--- /dev/null
+++ b/SANDBOX.md
@@ -0,0 +1,109 @@
+# ilo Capability Sandbox: Operator Guide
+
+> CLI capability flags for multi-tenant and sandboxed deployments (ILO-59).
+
+## Overview
+
+ilo programs can read files, write files, make network requests, and spawn
+subprocesses. In single-user / trusted contexts this is fine — the same
+footprint as any scripting language. In multi-tenant deployments (agents
+running untrusted ilo code on a shared server) this leaves SSRF,
+arbitrary-filesystem-read, and arbitrary-command-execution open.
+
+Capability flags give operators a per-process sandbox. Any `--allow-*` flag
+switches the runtime from **permissive** (legacy default, no restrictions) to
+**restricted** (only explicitly listed targets are permitted). Capabilities not
+mentioned default to unrestricted when in restricted mode, so you can add a
+single flag without breaking other IO.
+
+## Flags
+
+| Flag | Value syntax | Effect |
+|------|-------------|--------|
+| `--allow-net[=HOSTS]` | comma-separated hosts, `*`, or empty | Gate outbound HTTP/HTTPS |
+| `--allow-read[=PATHS]` | comma-separated path prefixes, `*`, or empty | Gate file reads |
+| `--allow-write[=PATHS]` | comma-separated path prefixes, `*`, or empty | Gate file writes |
+| `--allow-run[=CMDS]` | comma-separated command names, `*`, or empty | Gate subprocess spawning |
+
+**Value semantics:**
+
+- Omitted flag → that capability is unrestricted (permissive).
+- `--allow-net=*` or `--allow-net=all` → net unrestricted (explicit All).
+- `--allow-net=api.example.com,cdn.example.com` → only those two hosts.
+- `--allow-net=` (empty value) → all network blocked.
+
+Once any `--allow-*` flag is present the mode is restricted; all four
+dimensions are individually governed by their flag (or `Policy::All` if that
+flag was omitted).
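The value semantics listed for the flags can be modelled in a few lines. The Python sketch below is illustrative only and is not taken from the ilo sources: `parse_policy`, `is_restricted`, and the `ALL` sentinel are hypothetical names (the guide itself only mentions `Policy::All`), and dropping empty list items is an assumption of the sketch.

```python
from typing import Dict, FrozenSet, Optional, Union

# Hypothetical sentinel for "unrestricted"; stands in for the guide's `Policy::All`.
ALL = "<all>"

Policy = Union[str, FrozenSet[str]]


def parse_policy(value: Optional[str]) -> Policy:
    """Map a raw --allow-* flag value to a policy.

    None   (flag omitted)      -> ALL, capability unrestricted
    "*" or "all"               -> ALL, explicitly unrestricted
    ""     (empty value)       -> empty allowlist, everything blocked
    "a,b"  (comma-separated)   -> allowlist {"a", "b"}
    """
    if value is None or value in ("*", "all"):
        return ALL
    # Assumption of this sketch: blank items (as in "a,,b") are ignored.
    return frozenset(item.strip() for item in value.split(",") if item.strip())


def is_restricted(flags: Dict[str, Optional[str]]) -> bool:
    """Restricted mode starts as soon as any --allow-* flag is present."""
    return any(value is not None for value in flags.values())
```

For example, `parse_policy("")` yields an empty allowlist (everything blocked), while `parse_policy(None)` yields `ALL`, which mirrors the difference between `--allow-net=` and omitting the flag.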
+
+## Matching rules
+
+**Network (`--allow-net`):** host extracted from URL (scheme and path stripped).
+Exact match or leading `*.`-wildcard: `*.example.com` matches
+`api.example.com` and `example.com`.
+
+**Read / write (`--allow-read`, `--allow-write`):** path-prefix matching with
+separator boundary. `/tmp` permits `/tmp/foo` but not `/tmpfoo`. Trailing
+slash on the prefix is normalised.
+
+**Run (`--allow-run`):** exact command name or basename match. `/usr/bin/ls`
+is matched by `ls` in the allowlist.
+
+## Error code
+
+A denied capability emits `ILO-CAP-001` as the error value returned from the
+builtin:
+
+```
+ILO-CAP-001 blocked by --allow-net policy: host=evil.example is not in the allowlist
+```
+
+The error is a normal ilo `R` (Result) `Err` value — programs can pattern-match
+it with `?{res|er: ...}` and react programmatically. It is not a fatal abort.
+
+## Capability matrix
+
+| Builtin | Capability checked |
+|---------|-------------------|
+| `get`, `post`, `put`, `patch`, `del`, `http-get`, `http-post`, `fetch` | `--allow-net` |
+| `rd`, `rd-lines`, `ls`, `lsr` | `--allow-read` |
+| `wr`, `wr-lines`, `wr-app` | `--allow-write` |
+| `run`, `run2` | `--allow-run` |
+
+## Recipes
+
+### Block all IO
+
+```sh
+ilo run --allow-net= --allow-read= --allow-write= --allow-run= untrusted.ilo
+```
+
+### Allow only outbound calls to one API
+
+```sh
+ilo run --allow-net=api.example.com trusted.ilo
+```
+
+### Read-only data, write-only scratch space
+
+```sh
+ilo run --allow-read=/data --allow-write=/tmp agent.ilo
+```
+
+### Wildcard subdomain
+
+```sh
+ilo run --allow-net="*.internal.corp" service.ilo
+```
+
+## Backwards compatibility
+
+`Caps::Permissive` is the default. Any invocation that does not pass `--allow-*`
+runs without restriction — identical behaviour to pre-0.13.
+
+## See also
+
+- `examples/capability-sandbox.ilo` — runnable demo.
+- `SPEC.md` — capability flags section.
+- ILO-59 (Linear) — implementation ticket.
+- ILO-47 (Linear) — `World` capability parameter (the language-level long-term move).
diff --git a/SPEC.md b/SPEC.md
index 0fb8447c..ce30280b 100644
--- a/SPEC.md
+++ b/SPEC.md
@@ -2071,8 +2071,14 @@ ilo --max-ast-depth N -- cap parser nesting at N (default 256; prote
 and other untrusted-source paths from DoS payloads, raises ILO-P103)
 ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016)
 ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017)
+ilo run --allow-net[=HOSTS] -- restrict outbound net to comma-separated hosts (* = all, empty = none)
+ilo run --allow-read[=PATHS] -- restrict file reads to comma-separated path prefixes
+ilo run --allow-write[=PATHS] -- restrict file writes to comma-separated path prefixes
+ilo run --allow-run[=CMDS] -- restrict subprocess spawning to comma-separated command names
 ```

+**Capability flags (`ILO-CAP-001`).** `ilo run --allow-net=HOSTS --allow-read=PATHS --allow-write=PATHS --allow-run=CMDS` gates IO builtins at the process level. Any `--allow-*` flag present switches the runtime from **permissive** (default — no restrictions, full backwards compatibility) to **restricted** (only listed targets are permitted). Denial returns a normal `R` Err value with code `ILO-CAP-001`; programs can pattern-match it. Capability matrix: `get`/`post`/`put`/`patch`/`del`/`fetch` → `--allow-net`; `rd`/`rd-lines`/`ls`/`lsr` → `--allow-read`; `wr`/`wr-lines`/`wr-app` → `--allow-write`; `run`/`run2` → `--allow-run`. Value syntax: omit = unrestricted; `*` = all permitted; empty (`--allow-net=`) = all blocked; comma list = only those targets. Matching: net = hostname extracted from URL, exact or `*.domain` wildcard; read/write = path-prefix with separator boundary; run = basename or full-path match. See `SANDBOX.md` for the operator guide and `examples/capability-sandbox.ilo` for a runnable demo.
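The three matching rules in the paragraph above can be sketched as follows. This is a hedged Python model, not ilo's implementation: `host_allowed`, `path_allowed`, and `cmd_allowed` are hypothetical helper names, case-insensitive host comparison is an assumption, and the sketch does not resolve `..` segments or symlinks (the spec text does not say how ilo treats them). The wildcard branch follows the `SANDBOX.md` statement that `*.example.com` also matches `example.com`.

```python
import posixpath
from typing import Iterable
from urllib.parse import urlsplit


def host_allowed(url: str, patterns: Iterable[str]) -> bool:
    """Net rule: take the hostname from the URL, then exact or `*.domain` match."""
    host = (urlsplit(url).hostname or "").lower()
    for pattern in patterns:
        pattern = pattern.lower()  # assumption: hosts compare case-insensitively
        if pattern.startswith("*."):
            base = pattern[2:]
            # Per SANDBOX.md, the wildcard covers subdomains and the bare domain.
            if host == base or host.endswith("." + base):
                return True
        elif host == pattern:
            return True
    return False


def path_allowed(path: str, prefixes: Iterable[str]) -> bool:
    """Read/write rule: prefix match that must end on a path separator."""
    for prefix in prefixes:
        prefix = prefix.rstrip("/") or "/"  # normalise a trailing slash
        boundary = prefix if prefix == "/" else prefix + "/"
        if path == prefix or path.startswith(boundary):
            return True
    return False


def cmd_allowed(cmd: str, names: Iterable[str]) -> bool:
    """Run rule: the full command string or its basename is in the allowlist."""
    names = set(names)
    return cmd in names or posixpath.basename(cmd) in names
```

Under these assumptions, `path_allowed("/tmp/foo", ["/tmp"])` holds while `path_allowed("/tmpfoo", ["/tmp"])` does not, which is the separator-boundary behaviour the operator guide describes.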
+
 **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened.

 **Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed.
diff --git a/ai.txt b/ai.txt
index 0b1813fe..b494e0ce 100644
--- a/ai.txt
+++ b/ai.txt
@@ -16,6 +16,6 @@ TOOLS (EXTERNAL CALLS): tool "" > timeou
 IMPORTS: Split programs across files with `use`: use "path/to/file.ilo" -- import all declarations use "path/to/file.ilo" [name1 name2] -- import only named declarations All imported declarations merge into a flat shared namespace - no qualification, no `mod::fn` syntax. The verifier catches name collisions. -- math.ilo dbl n:n>n; *n 2 half n:n>n; /n 2 -- main.ilo use "math.ilo" run n:n>n; dbl!
half n [Rules] Path is relative to the importing file's directory Transitive: if `a.ilo` uses `b.ilo`, `b.ilo`'s declarations are visible to `main.ilo` when it uses `a.ilo` Circular imports are an error (`ILO-P018`) Scoped import with unknown name: `ILO-P019` `use` in inline code (no file context): `ILO-P017` [Error codes] `ILO-P017`=File not found or `use` in inline mode `ILO-P018`=Circular import detected `ILO-P019`=Name in `[...]` list not declared in the imported file ERROR HANDLING: `R ok err` return type. Call then match: get-user uid;?{^e:^+"Lookup failed: "e;~d:use d} Compensate/rollback inline: charge pid amt;?{^e:release rid;^+"Payment failed: "e;~cid:continue} [Auto-Unwrap `!`] `func! args` calls `func` and auto-unwraps the Result: if `~v` (Ok), returns `v`; if `^e` (Err), immediately returns `^e` from the enclosing function. inner x:n>R n t;~x outer x:n>R n t;d=inner! x;~d Equivalent to `r=inner x;?r{~v:v;^e:^e}` but in 1 token instead of 12. Rules: The called function must return `R` or `O` (else verifier error ILO-T025) The enclosing function must return `R` (or `O` for Optional callees) (else verifier error ILO-T026) `!` goes after the function name, before args: `get! url` not `get url!` Zero-arg: `fetch!()` [Panic-Unwrap `!!`] `func!! args` is symmetric in shape with `!`, but on the failure path it aborts the program with a runtime diagnostic and exit code 1 instead of propagating. There is no enclosing-return-type constraint, so persona code can use it from `main>t`, `main>n`, or any non-Result / non-Optional context. main>t;rdl!! "input.txt" -- read file, abort with diagnostic if missing main>n;v=num!! "42";v -- parse number, abort on parse error main>n;m=mset mmap "k" 7;mget!! m "k" -- get value or abort if key missing On `^e` (Err) the program writes `panic-unwrap: ` to stderr and exits 1. On `O nil` the program writes `panic-unwrap: expected value, got nil`. On `~v` (Ok) or non-nil Optional, the inner value is extracted, identical to `!`. 
Rules: The called function must return `R` or `O` (else verifier error ILO-T025) **No constraint on the enclosing function's return type** - this is the difference from `!` `!!` goes after the function name, before args: `rdl!! path` not `rdl path!!` Zero-arg: `fetch!!()` Use `!` when the caller wants to react to the Err (compensate, retry, log). Use `!!` when the failure is a programming or environmental error the caller has no way to recover from - typical in short scripts, glue code, and main entry points. PATTERNS (FOR LLM GENERATORS): [Bind-first pattern] Always bind complex expressions to variables before using them in operators. Operators only accept atoms and nested operators as operands - not function calls. -- DON'T: *n fac -n 1 (fac is an operand of *, not a call) -- DO: r=fac -n 1;*n r (bind call result, then use in operator) [Recursion template] >;;...;;combine 1. **Guard**: base case returns early - `<=n 1 1` (or `<=n 1{1}`) 2. **Bind**: bind recursive call results - `r=fac -n 1` 3. **Combine**: use bound results in final expression - `*n r` [Factorial] fac n:n>n;<=n 1 1;r=fac -n 1;*n r `<=n 1 1` - braceless guard: if n <= 1, return 1 `r=fac -n 1` - recursive call with prefix subtract as argument `*n r` - multiply n by result [Fibonacci] fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b `<=n 1 n` - braceless guard: return n for 0 and 1 `a=fib -n 1;b=fib -n 2` - two recursive calls, each with prefix arg `+a b` - add results [Tail-call optimisation] ilo guarantees that **tail calls do not consume host-stack frames**. A function that recurses only in tail position can run to arbitrary depth — the runtime trampolines the call by rebinding parameters in place rather than pushing a frame. The manifesto's "Constrained" rule (every feature must pay for itself in tokens) vetoed adding a `loop` keyword. 
Instead, tail-recursive accumulator patterns are the canonical idiom for iteration beyond what `@` foreach covers, and the TCO guarantee makes them safe at any depth. A call is in **tail position** when its return value is the function's return value: the last statement of the body, the expression of a `ret` statement, an arm of a tail-position `?` match, or the body of a braceless guard. Calls inside `@` foreach, `@` range, `wh` loops, or as operands of further computation are NOT in tail position. > **Recursive self-call discarded at non-tail position fires `ILO-T043`.** When a function calls itself before another statement runs, the recursive return is silently dropped — every call falls through to the later statements. The verifier emits `ILO-T043` with a hint pointing at the tail-position fix (move the recursive call to the body's last statement, wrap it in `ret`, or restructure via `?h cond then else`). The warning is narrowly scoped to self-calls (caller name == callee name); bare non-recursive user-fn calls at non-tail position may be side-effecting and do not warn. Surfaced 2026-05-21 by the interp1d persona: see `examples/recursive-tail-position.ilo` for the canonical fix shape. -- Tail-recursive countdown — runs to arbitrary depth. count-down n:n>n;=n 0 0;count-down -n 1 -- Tail-recursive accumulator — sums a list without growing the host stack. sum-acc xs:L n acc:n>n;empty=len xs;=empty 0 acc;sum-acc tl xs +acc hd xs Constraints on the tail-call peephole: The callee must be a direct user-defined function name (not a FnRef in scope, not a closure, not a builtin, not a tool). The call must have no auto-unwrap (`!` / `!!`) — those forms inspect the result before deciding whether to propagate. These constraints leave the common shapes (recursive accumulators, state machines, mutual recursion via direct names) covered. Other shapes still recurse the host stack as before; for deep recursion through non-tail-eligible shapes, restructure into an accumulator. 
Tree interpreter and bytecode VM (`--vm`) support shipped in 0.12.x; the VM emits `OP_TAILCALL` for tail-position user-fn calls and reuses the current call frame instead of pushing a new one, so depth is bounded only by available heap. Cranelift (`--jit`, AOT) gains matching `return_call` lowering in a subsequent PR; until then, deep tail-recursion under the JIT/AOT path recurses the host stack and is bounded by it. [Multi-statement bodies] Semicolons separate statements. Last expression is the return value. f x:n>n;a=*x 2;b=+a 1;*b b -- (x*2 + 1)^2 Bodies may also be written across multiple newline-separated lines, indented under the signature. The parser stays inside the same function body while it sees an open bracket (`[`, `(`, `{`) or a pipe operator continuation. This makes long literals and multi-line conditional pipelines readable without semicolons: f x:n>n a=*x 2 b=+a 1 *b b g>L n [10, 20, 30, 40, 50, 60, 70, 80] Statement separation reverts to standard rules once brackets close. A blank line ends the current declaration. Windows CRLF (`\r\n`) is normalised to `\n` before lexing, so files edited on Windows parse identically to Unix-line-ending files. [Multi-function files] Functions in a file are separated by **newlines**. The parser strips all newlines, so the token stream is flat. After parsing each function body, the parser uses the next newline-delimited boundary to start the next declaration. A non-last function body's **final expression must not be a bare variable reference (`Ref`) or a function call**, because the parser greedily reads following tokens as additional call arguments. 
Safe endings prevent this: Binary operator=`+n 0`, `*x 1`=✓=fixed arity - no greedy loop Index access=`xs.0`, `rec.field`=✓=returns `Expr::Index`, not `Ref` Match block=`?v{…}`=✓=ends with `}` ForEach block=`@x xs{…}`=✓=ends with `}` Parenthesised expr=`(x>>f>>g)`=✓=ends with `)` Record constructor=`point x:1 y:2`=✓=parses as `Expr::Record`, not `Ref` Text/number literal=`"ok"`, `42`=✓=literal, not `Ref` Bare variable (`Ref`)=`n`, `result`=✗=greedy loop fires Bare function call=`len xs`, `f a`=✗=greedy loop fires The **last function in a file** can end with anything - greedy parsing stops at EOF. -- Non-last functions: end with a binary expression digs n:n>n;t=str n;l=len t;+l 0 -- +l 0 = l (binary, safe) clmp n:n lo:n hi:n>n;n hi hi;+n 0 -- +n 0 = n (binary, safe; `clamp` is a builtin) -- Last function: bare call is fine sz xs:L n>n;len xs -- EOF - greedy loop stops naturally To use a pipe chain in a non-last function, wrap it in parentheses: dbl-inc x:n>n;(x>>dbl>>inc) -- parens prevent >> from consuming next function's name inc-sq x:n>n;x>>inc>>sq -- last function - no parens needed [DO / DON'T] -- DON'T: fac n:n>n;<=n 1 1;*n fac -n 1 -- ↑ *n sees fac as an atom operand, not a call -- DO: fac n:n>n;<=n 1 1;r=fac -n 1;*n r -- ↑ bind-first: call result goes into r, then *n r works -- DON'T: +fac -n 1 fac -n 2 -- ↑ + takes two operands; fac is just an atom ref -- DO: a=fac -n 1;b=fac -n 2;+a b -- ↑ bind both calls, then combine -ERROR DIAGNOSTICS: ilo verifies programs before execution and reports errors with stable codes, source context, and suggestions. [Error codes] Every error has a stable `ILO-` code. The letter is the namespace - the phase that raised the diagnostic - so agents and tools can route on prefix without parsing the message. Numeric ranges are reserved per namespace with generous gaps, so future codes slot in cleanly and the contract is forward-compatible. 
`ILO-L000-099`=L=Lexer / tokenisation=active `ILO-P100-199`=P=Parser / syntax=active `ILO-N200-299`=N=Names / resolution=reserved `ILO-I300-399`=I=Imports=reserved `ILO-T400-499`=T=Types=active `ILO-V500-599`=V=Verifier (post-type checks)=reserved `ILO-R600-699`=R=Runtime=active `ILO-D700-799`=D=Deprecation warnings=reserved `ILO-E800-899`=E=Engine-specific limitations=reserved `ILO-S900-999`=S=Skill / spec system=reserved **Historical codes.** ilo shipped with flat numbering inside each namespace - `ILO-L001`, `ILO-P001`, `ILO-T001`, `ILO-R001`, `ILO-W001`, all starting at 001. Those codes remain valid forever. The hundreds-block allocation above applies to new codes from now on, and a cross-engine regression test asserts every emitted code lives in a documented range. **Reserved namespaces.** `N`, `I`, `V`, `D`, `E`, `S` carry no codes today. They are forward declarations so the first code in each category slots into its own range without conflicting with the active namespaces. `D` is earmarked for deprecation warnings: when a feature is scheduled for removal it emits an `ILO-D7xx` warning at compile time without failing the build. Use `--explain` to see a detailed explanation: ilo --explain ILO-T004 [Source context] Errors point at the relevant source location with a caret: error[ILO-T005]: undefined function 'foo' (called with 1 args) --> 1:9 1 | f x:n>n;foo x = note: in function 'f' = suggestion: did you mean 'f'? Parser, verifier, and runtime errors all show source spans. The verifier uses the enclosing statement span as the best available location for expression-level errors. 
[Suggestions] The verifier provides context-aware hints: **Did you mean?** - Levenshtein-based suggestions for undefined variables, functions, fields, and types **Type conversion** - suggests `str` for n→t, `num` for t→n **Missing arms** - lists uncovered match patterns with types **Arity** - shows expected parameter signature [Error output formats] --ansi / -a ANSI colour (default for TTY) --text / -t Plain text (no colour) --json / -j JSON (default for piped output) --no-hints / -nh Suppress idiomatic hints --silent / -s Suppress program stdout (mainly for --bench; see below) NO_COLOR=1 Disable colour (same as --text) **`--silent` / `-s`.** Suppresses the program's own stdout (`prnt`, `prnv`, `jprn`, etc.) for the duration of execution. Designed for `ilo --bench`: combined with `--json` it lets agent harnesses (e.g. persona cost rollup) consume the bench JSON envelope on stdout without it being drowned in the benchmarked function's own output. Stderr is never silenced, so genuine errors still surface. Diagnostic output (including the bench JSON envelope and the human-readable bench summary block) is always emitted on stdout regardless of `--silent` — the flag only redirects program-level prints. Unix only (no-op on Windows for the program-stdout half; bench output still reaches stdout there). JSON error output follows a structured schema with `severity`, `code`, `message`, `labels` (with spans), `notes`, and `suggestion` fields. Runtime errors raised from the Cranelift JIT (opt-in via `--jit`) populate `labels` with the source span of the failing operation, matching tree and VM behaviour. Span coverage threads through every JIT runtime helper (unwrap, panic-unwrap, list-get, slice, index, jpth, mget, record-field strict access, builtin dispatch, dynamic call); AOT-compiled binaries inherit the same coverage. Pre-v0.11.6 builds surfaced `{"labels":[]}` for these shapes - if you see an empty labels array on a runtime error, the binary is out of date. 
AOT binaries also install an async-signal-safe handler in `ilo_aot_init` that catches fatal signals (SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT) and writes a single JSON line on stderr identifying the signal before the process terminates with the conventional 128+signo exit code. The diagnostic uses `ILO-R015` (AOT runtime fault). Without the handler, a hard fault inside compiled native code would leave the process with raw signal exit (e.g. 139 for SIGSEGV) and no diagnostic — agents driving ilo couldn't distinguish a clean non-zero exit from a hard fault. A SIGSEGV from an AOT binary is always a bug in ilo (codegen or runtime helper); file an issue with the source program and the JSON line. [Top-level program output] For a program whose entry function returns a Result, the `~`/`^` wrapper is split across streams and exit codes so shell callers do not have to strip a prefix: `~v` (Ok)=`v` (bare)=-=0 `^e` (Err)=-=`^e`=1 any non-Result=`v`=-=0 In `--json` mode the value is always wrapped (`{"schemaVersion": 1, "ok": v}` / `{"schemaVersion": 1, "error": {...}}`) and emitted to stdout; exit codes match the plain-mode table.
The `schemaVersion` field was added in 0.12.1 to every CLI `--json` envelope (`run`, `graph`, `--ast`, `serv`, `tools --json`, `spec --json`) so agents can route on a single field across every command. See `JSON_OUTPUT.md` for the full audit table. `Display` on `Value::Ok` / `Value::Err` still renders `~v` / `^e` in every other context (nested values, `prnt`, REPL prompts, error messages, debug output) - only the top-level program-return print path is split. The contract applies uniformly to in-process runners (`ilo prog.ilo`, `--vm`, `--jit`) and to AOT-compiled standalone binaries from `ilo compile`. Both strip the top-level `~`/`^` wrapper on stdout, route `^e` to stderr, and use the same exit codes - output is byte-for-byte identical across every backend. **Auto-echo suppression for `prnt` + status sentinel.** When the entry function has at least one *unconditional top-level* `prnt` call AND the tail expression is a bare wrapped string literal (`~"text"` or `^"text"`), the top-level auto-echo is suppressed. The wrapped literal is treated as a status sentinel rather than a value the caller wants captured. Without this rule, a function shaped like `m>R t t;prnt "report";~"ok"` emits `report\nok\n` on stdout and shell callers piping the output have to strip the trailing `ok`. The rule does NOT fire when (a) there is no `prnt` in the body — `m>R t t;~"ok"` still prints `ok` because the wrapped literal IS the program's output (the `cli-tasks-save-ok.ilo` pattern); (b) the `prnt` is nested inside a guard, loop, or match arm — those are conditional and the `prnt` may never run; (c) the tail is `~v` where `v` is a binding or call — that's a real return value. `^"text"` errors still go to stderr with exit 1; the suppression rule never silently swallows an Err. Pinned by `tests/regression_tilde_str_noecho.rs` and `examples/tilde-str-noecho.ilo`. 
[Idiomatic hints] After successful execution, ilo scans the source for non-canonical forms and emits hints to stderr: hint: `==` → `=` saves 1 char (both mean equality in ilo) hint: `length` → `len` (canonical short form) Builtin alias hints appear at most once per program (the first long-form name found). In JSON mode, hints appear as `{"hints":["..."]}` on stderr. Suppress with `--no-hints` / `-nh`. [CLI invocation] ilo 'code' [args...] -- inline program; default-runs the entry function ilo program.ilo [func] [args] -- if `func` is omitted and the file declares exactly one function, that function runs automatically ilo run program.ilo [func] [a] -- verb form; same dispatch as the bare positional ilo check program.ilo [--json] [--strict] -- run the verifier without executing (exit 0 = clean; --strict treats warnings as exit-code errors) ilo test [path] [--engine vm|jit|all] -- run `-- run:` / `-- out:` / `-- err:` assertions in .ilo files (exit 0 on all-pass, 1 on any failure) ilo build program.ilo -o out -- AOT compile to a standalone binary (alias for `compile`) ilo program.ilo --ast -- print parsed AST as JSON and exit ilo --explain ILO-T004 -- print error explanation and exit ilo help ai -- compact AI spec to stdout (= contents of ai.txt) ilo serv -- long-lived JSON request/response loop ilo --max-ast-depth N -- cap parser nesting at N (default 256; protects `ilo serv` and other untrusted-source paths from DoS payloads, raises ILO-P103) ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016) ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017) **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. 
A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened. **Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed. **`ilo check`.** Standalone verifier invocation: lex, parse, resolve imports, and run the type verifier without proceeding to bytecode compilation or execution. Exit code 0 means the program is well-typed and verifier-clean; exit code 1 means at least one diagnostic was emitted on stderr. The output mode follows the global flags (`--json` for NDJSON diagnostics, `--text` for plain text, `--ansi` for coloured output; auto-detected when omitted - JSON when stderr is not a TTY, ANSI otherwise). `ilo check` works on both files and inline code; on a syntactically-broken input it still reports the parse error rather than crashing, which is important for editor and agent loops that may feed in half-written programs. 
**`ilo test`.** Runs the `-- run: ` / `-- out: ` (or `-- err: `) annotations embedded in `.ilo` source files - the same format the in-tree `tests/examples_engines.rs` integration harness already uses. A file path tests that one file; a directory walks `*.ilo` recursively. Each case runs as a subprocess (`ilo --vm `), output is asserted against the expected payload, and the result prints as `PASS path::fn (line N)` / `FAIL path::fn (line N) (got: X, want: Y)`. The final line reports `N passed, M failed`. Exit 0 if everything passed, 1 if any case failed or no annotations were found. The default engine is `--vm`; pass `--engine jit` or `--engine all` to widen the matrix. Per-file `-- engine-skip: vm jit` annotations skip the listed engines, matching the integration harness. Because every example under `examples/` uses this annotation format already, `ilo test examples/` doubles as a smoke test for the language itself and as a worked reference an agent can read when writing tests for its own programs. **`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. 
The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. **AOT entry-pick.** `ilo compile file.ilo -o out` (alias `ilo build`) follows the same entry-pick rules as the in-process engines: a single user-defined function is used directly; on multi-function files the entry is `main` if defined, otherwise the explicit positional `func` arg (`ilo compile file.ilo -o out run`); otherwise the compile fails with `ILO-E801` and exits 1 without writing a binary. AOT does not fall back to "first declared function" - that historical default produced binaries that called the wrong entry symbol and SIGSEGV'd at runtime. **Default engine.** The bytecode register VM is the default execution path. It supports every opcode (closures with Phase 2 capture, listview windows, fused len-of-filter, every modern shape), and avoids the JIT compile-and-bail cost paid by the pre-v0.11.9 Cranelift-first default whenever a program touched an opcode the JIT couldn't handle. Cranelift JIT is opt-in via `--jit`; on opt-in, the JIT runs hot numeric loops and falls back to the VM on bailout. Phase 2 captures run natively on every public backend - VM, JIT, and AOT (`ilo compile`); AOT embeds the postcard `CompiledProgram` blob into the binary's `.rodata` so dispatch helpers can re-enter the VM on user-fn callbacks the same way the in-process runners do. For long-running workloads where the JIT pays for itself, opt in explicitly; for most agent workloads the VM is the right default. **Tree-walker is internal-only.** The tree-walking interpreter is no longer user-selectable: `--run-tree` and its `--run` alias were removed from the public CLI in 0.12.1 (they now error with the unknown-flag guard). 
The interpreter stays in-tree as the dispatch target for HOF / regex / fmt-variadic / IO / sleep / ct / rsrt / closure-bind-ctx shapes the VM and Cranelift haven't lifted natively yet - the VM bails to it transparently for the ops listed by `is_tree_bridge_eligible` (`rgx`, `rgxall`, `rgxall1`, `rgxall-multi`, `rgxsub`, `fmt`, `fmt2`, `rd`, `rdb`, `rdjl`, `rdin`, `rdinl`, `sleep`, `lsd`, `walk`, `glob`, `dirname`, `basename`, `pathjoin`, `fsize`, `mtime`, `isfile`, `isdir`, `run`, `env-all`, `jkeys`, `tz-offset`, `ct` 2-arg and 3-arg, `rsrt` 2-arg and 3-arg, `dur-parse`, `dur-fmt`, and the closure-bind ctx variants of `map`/`flt`/`fld`/`srt`). Cross-engine parity for those shapes is pinned by `tests/regression_builtin_bridge.rs` and `tests/regression_tree_bridge_invariants.rs`. 0.13.0+ is on track for a hard drop once the bridge consumers are lifted natively and the shared runtime types (`Value`, `MapKey`, `RuntimeError`, math helpers) are extracted from `src/interpreter/` to a non-engine module. **Subcommand dispatch.** The first positional argument is interpreted as a function name when it has the shape of an ilo identifier - `[a-z][a-z0-9]*(-[a-z0-9]+)*` - so `ilo file.ilo list-orders` routes to the `list-orders` function. Args that don't match the ident shape (file paths like `/tmp/data.json`, numbers, sigils, bracketed lists, anything with a `.` or `/`) route to `main` (or the entry function) as a positional CLI arg instead. Trailing dashes (`foo-`), doubled dashes (`foo--bar`), and negative numbers (`-1`) are not idents and pass through as data. **Unknown `--flag` guard.** Any token in the positional tail matching the clean long-flag shape `--word` or `--word-with-dashes` that isn't a recognised flag is rejected upfront with `error: unrecognised flag '--'. Use 'ilo --help' for valid flags. To pass it as a literal arg, separate with '--' first.` and exit 1. 
This prevents `ilo main.ilo --engine tree` from silently consuming `--engine` as a positional arg (which used to surface as misleading `ILO-R012 no functions defined` or `ILO-R004 main: expected N args, got N+1`). To pass a hyphen-prefixed token through as literal data, place the `--` separator first: `ilo main.ilo -- --foo`. Anything after the first `--` is data. Tokens with `=` (`--key=val`), trailing or doubled dashes (`--foo-`, `--foo--bar`), and negative numbers (`-1`) are not clean flag shapes and pass through unchanged. **Text-typed params.** When the entry function declares a parameter of type `t`, the CLI passes the raw arg through without numeric coercion. `ilo 'f x:t>t;x' 42` returns the string `"42"`, not the number 42. **Exit codes.** A program returning `Value::Err` (or `^reason` from the entry function) exits with code 1 and prints the err payload on stderr. `~v` (Ok) and any non-Result return value exit 0. Verifier and parser errors exit 2. **List args from the CLI.** Comma-separated args become `L n` or `L t` automatically: `ilo 'f xs:L n>n;sum xs' 1,2,3`. +ERROR DIAGNOSTICS: ilo verifies programs before execution and reports errors with stable codes, source context, and suggestions. [Error codes] Every error has a stable `ILO-` code. The letter is the namespace - the phase that raised the diagnostic - so agents and tools can route on prefix without parsing the message. Numeric ranges are reserved per namespace with generous gaps, so future codes slot in cleanly and the contract is forward-compatible. 
`ILO-L000-099`=L=Lexer / tokenisation=active `ILO-P100-199`=P=Parser / syntax=active `ILO-N200-299`=N=Names / resolution=reserved `ILO-I300-399`=I=Imports=reserved `ILO-T400-499`=T=Types=active `ILO-V500-599`=V=Verifier (post-type checks)=reserved `ILO-R600-699`=R=Runtime=active `ILO-D700-799`=D=Deprecation warnings=reserved `ILO-E800-899`=E=Engine-specific limitations=reserved `ILO-S900-999`=S=Skill / spec system=reserved **Historical codes.** ilo shipped with flat numbering inside each namespace - `ILO-L001`, `ILO-P001`, `ILO-T001`, `ILO-R001`, `ILO-W001`, all starting at 001. Those codes remain valid forever. The hundreds-block allocation above applies to new codes from now on, and a cross-engine regression test asserts every emitted code lives in a documented range. **Reserved namespaces.** `N`, `I`, `V`, `D`, `E`, `S` carry no codes today. They are forward declarations so the first code in each category slots into its own range without conflicting with the active namespaces. `D` is earmarked for deprecation warnings: when a feature is scheduled for removal it emits an `ILO-D7xx` warning at compile time without failing the build. Use `--explain` to see a detailed explanation: ilo --explain ILO-T004 [Source context] Errors point at the relevant source location with a caret: error[ILO-T005]: undefined function 'foo' (called with 1 args) --> 1:9 1 | f x:n>n;foo x = note: in function 'f' = suggestion: did you mean 'f'? Parser, verifier, and runtime errors all show source spans. The verifier uses the enclosing statement span as the best available location for expression-level errors. 
[Suggestions] The verifier provides context-aware hints: **Did you mean?** - Levenshtein-based suggestions for undefined variables, functions, fields, and types **Type conversion** - suggests `str` for n→t, `num` for t→n **Missing arms** - lists uncovered match patterns with types **Arity** - shows expected parameter signature [Error output formats] --ansi / -a ANSI colour (default for TTY) --text / -t Plain text (no colour) --json / -j JSON (default for piped output) --no-hints / -nh Suppress idiomatic hints --silent / -s Suppress program stdout (mainly for --bench; see below) NO_COLOR=1 Disable colour (same as --text) **`--silent` / `-s`.** Suppresses the program's own stdout (`prnt`, `prnv`, `jprn`, etc.) for the duration of execution. Designed for `ilo --bench`: combined with `--json` it lets agent harnesses (e.g. persona cost rollup) consume the bench JSON envelope on stdout without it being drowned in the benchmarked function's own output. Stderr is never silenced, so genuine errors still surface. Diagnostic output (including the bench JSON envelope and the human-readable bench summary block) is always emitted on stdout regardless of `--silent` — the flag only redirects program-level prints. Unix only (no-op on Windows for the program-stdout half; bench output still reaches stdout there). JSON error output follows a structured schema with `severity`, `code`, `message`, `labels` (with spans), `notes`, and `suggestion` fields. Runtime errors raised from the Cranelift JIT (opt-in via `--jit`) populate `labels` with the source span of the failing operation, matching tree and VM behaviour. Span coverage threads through every JIT runtime helper (unwrap, panic-unwrap, list-get, slice, index, jpth, mget, record-field strict access, builtin dispatch, dynamic call); AOT-compiled binaries inherit the same coverage. Pre-v0.11.6 builds surfaced `{"labels":[]}` for these shapes - if you see an empty labels array on a runtime error, the binary is out of date. 
AOT binaries also install an async-signal-safe handler in `ilo_aot_init` that catches fatal signals (SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT) and writes a single JSON line on stderr identifying the signal before the process terminates with the conventional 128+signo exit code. The diagnostic uses `ILO-R015` (AOT runtime fault). Without the handler, a hard fault inside compiled native code would leave the process with raw signal exit (e.g. 139 for SIGSEGV) and no diagnostic - agents driving ilo couldn't distinguish a clean non-zero exit from a hard fault. A SIGSEGV from an AOT binary is always a bug in ilo (codegen or runtime helper); file an issue with the source program and the JSON line. [Top-level program output] For a program whose entry function returns a Result, the `~`/`^` wrapper is split across streams and exit codes so shell callers do not have to strip a prefix: `~v` (Ok)=`v` (bare)=-=0 `^e` (Err)=-=`^e`=1 any non-Result=`v`=-=0 In `--json` mode the value is always wrapped (`{"schemaVersion": 1, "ok": v}` / `{"schemaVersion": 1, "error": {...}}`) and emitted to stdout; exit codes match the plain-mode table.
The `schemaVersion` field was added in 0.12.1 to every CLI `--json` envelope (`run`, `graph`, `--ast`, `serv`, `tools --json`, `spec --json`) so agents can route on a single field across every command. See `JSON_OUTPUT.md` for the full audit table. `Display` on `Value::Ok` / `Value::Err` still renders `~v` / `^e` in every other context (nested values, `prnt`, REPL prompts, error messages, debug output) - only the top-level program-return print path is split. The contract applies uniformly to in-process runners (`ilo prog.ilo`, `--vm`, `--jit`) and to AOT-compiled standalone binaries from `ilo compile`. Both strip the top-level `~`/`^` wrapper on stdout, route `^e` to stderr, and use the same exit codes - output is byte-for-byte identical across every backend. **Auto-echo suppression for `prnt` + status sentinel.** When the entry function has at least one *unconditional top-level* `prnt` call AND the tail expression is a bare wrapped string literal (`~"text"` or `^"text"`), the top-level auto-echo is suppressed. The wrapped literal is treated as a status sentinel rather than a value the caller wants captured. Without this rule, a function shaped like `m>R t t;prnt "report";~"ok"` emits `report\nok\n` on stdout and shell callers piping the output have to strip the trailing `ok`. The rule does NOT fire when (a) there is no `prnt` in the body — `m>R t t;~"ok"` still prints `ok` because the wrapped literal IS the program's output (the `cli-tasks-save-ok.ilo` pattern); (b) the `prnt` is nested inside a guard, loop, or match arm — those are conditional and the `prnt` may never run; (c) the tail is `~v` where `v` is a binding or call — that's a real return value. `^"text"` errors still go to stderr with exit 1; the suppression rule never silently swallows an Err. Pinned by `tests/regression_tilde_str_noecho.rs` and `examples/tilde-str-noecho.ilo`. 
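The plain-mode stream and exit-code table above can be modelled in a few lines of Rust. This is a minimal sketch, not ilo's implementation: the `Ret` enum and the `route` function are hypothetical names introduced here, and the stderr payload for an Err is assumed to be `^e` as the table states.

```rust
/// Hypothetical model of an entry function's return value (not an ilo type).
enum Ret {
    Ok(String),    // `~v`
    Err(String),   // `^e`
    Plain(String), // any non-Result value
}

/// Plain-mode routing of the top-level print: (stdout, stderr, exit code).
fn route(ret: &Ret) -> (Option<String>, Option<String>, i32) {
    match ret {
        // `~v` and non-Result values print the bare value on stdout, exit 0.
        Ret::Ok(v) | Ret::Plain(v) => (Some(v.clone()), None, 0),
        // `^e` prints nothing on stdout; stderr carries `^e`, exit 1.
        Ret::Err(e) => (None, Some(format!("^{e}")), 1),
    }
}

fn main() {
    assert_eq!(route(&Ret::Ok("done".into())), (Some("done".into()), None, 0));
    assert_eq!(route(&Ret::Plain("42".into())), (Some("42".into()), None, 0));
    assert_eq!(route(&Ret::Err("boom".into())), (None, Some("^boom".into()), 1));
}
```

The sketch deliberately ignores `--json` mode and the auto-echo suppression rule, both of which change what reaches stdout without changing the exit codes.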
[Idiomatic hints] After successful execution, ilo scans the source for non-canonical forms and emits hints to stderr: hint: `==` → `=` saves 1 char (both mean equality in ilo) hint: `length` → `len` (canonical short form) Builtin alias hints appear at most once per program (the first long-form name found). In JSON mode, hints appear as `{"hints":["..."]}` on stderr. Suppress with `--no-hints` / `-nh`. [CLI invocation] ilo 'code' [args...] -- inline program; default-runs the entry function ilo program.ilo [func] [args] -- if `func` is omitted and the file declares exactly one function, that function runs automatically ilo run program.ilo [func] [a] -- verb form; same dispatch as the bare positional ilo check program.ilo [--json] [--strict] -- run the verifier without executing (exit 0 = clean; --strict treats warnings as exit-code errors) ilo test [path] [--engine vm|jit|all] -- run `-- run:` / `-- out:` / `-- err:` assertions in .ilo files (exit 0 on all-pass, 1 on any failure) ilo build program.ilo -o out -- AOT compile to a standalone binary (alias for `compile`) ilo program.ilo --ast -- print parsed AST as JSON and exit ilo --explain ILO-T004 -- print error explanation and exit ilo help ai -- compact AI spec to stdout (= contents of ai.txt) ilo serv -- long-lived JSON request/response loop ilo --max-ast-depth N -- cap parser nesting at N (default 256; protects `ilo serv` and other untrusted-source paths from DoS payloads, raises ILO-P103) ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016) ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017) ilo run --allow-net[=HOSTS] -- restrict outbound net to comma-separated hosts (* = all, empty = none) ilo run --allow-read[=PATHS] -- restrict file reads to comma-separated path prefixes ilo run --allow-write[=PATHS] -- restrict file writes to comma-separated path prefixes ilo run --allow-run[=CMDS] -- restrict subprocess 
spawning to comma-separated command names **Capability flags (`ILO-CAP-001`).** `ilo run --allow-net=HOSTS --allow-read=PATHS --allow-write=PATHS --allow-run=CMDS` gates IO builtins at the process level. Any `--allow-*` flag present switches the runtime from **permissive** (default — no restrictions, full backwards compatibility) to **restricted** (only listed targets are permitted). Denial returns a normal `R` Err value with code `ILO-CAP-001`; programs can pattern-match it. Capability matrix: `get`/`post`/`put`/`patch`/`del`/`fetch` → `--allow-net`; `rd`/`rd-lines`/`ls`/`lsr` → `--allow-read`; `wr`/`wr-lines`/`wr-app` → `--allow-write`; `run`/`run2` → `--allow-run`. Value syntax: omit = unrestricted; `*` = all permitted; empty (`--allow-net=`) = all blocked; comma list = only those targets. Matching: net = hostname extracted from URL, exact or `*.domain` wildcard; read/write = path-prefix with separator boundary; run = basename or full-path match. See `SANDBOX.md` for the operator guide and `examples/capability-sandbox.ilo` for a runnable demo. **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened. 
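The capability-flag matching rules above (hostname with `*.domain` wildcard, path prefix with separator boundary, command basename) can be sketched as follows. This is an illustration under stated assumptions rather than the code in `src/caps.rs`: the function names are hypothetical, the wildcard is assumed to match the bare domain as well (as `SANDBOX.md` describes), and the `*` "all permitted" value and any path canonicalisation are left out.

```rust
/// Host match: exact, or `*.domain` wildcard.
/// Assumption (from SANDBOX.md): `*.example.com` also matches `example.com`.
fn host_allowed(allow: &[&str], host: &str) -> bool {
    allow.iter().any(|pat| match pat.strip_prefix("*.") {
        Some(domain) => host == domain || host.ends_with(&format!(".{domain}")),
        None => *pat == host,
    })
}

/// Path match: prefix with a separator boundary; a trailing slash on the
/// allowlist entry is normalised away before comparing.
fn path_allowed(allow: &[&str], path: &str) -> bool {
    allow.iter().any(|prefix| {
        let prefix = prefix.trim_end_matches('/');
        match path.strip_prefix(prefix) {
            Some(rest) => rest.is_empty() || rest.starts_with('/'),
            None => false,
        }
    })
}

/// Command match: full string or basename.
fn cmd_allowed(allow: &[&str], cmd: &str) -> bool {
    let base = cmd.rsplit('/').next().unwrap_or(cmd);
    allow.iter().any(|name| *name == cmd || *name == base)
}

fn main() {
    assert!(host_allowed(&["*.example.com"], "api.example.com"));
    assert!(!host_allowed(&["*.example.com"], "evil-example.com"));
    assert!(path_allowed(&["/tmp/"], "/tmp/foo"));
    assert!(!path_allowed(&["/tmp"], "/tmpfoo"));
    assert!(cmd_allowed(&["ls"], "/usr/bin/ls"));
    // An empty allowlist (`--allow-net=`) blocks everything.
    let none: &[&str] = &[];
    assert!(!host_allowed(none, "api.example.com"));
}
```

The separator-boundary check is what keeps `/tmp` from also admitting `/tmpfoo`; a plain `starts_with` on the raw prefix would let that path through.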
**Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed. **`ilo check`.** Standalone verifier invocation: lex, parse, resolve imports, and run the type verifier without proceeding to bytecode compilation or execution. Exit code 0 means the program is well-typed and verifier-clean; exit code 1 means at least one diagnostic was emitted on stderr. The output mode follows the global flags (`--json` for NDJSON diagnostics, `--text` for plain text, `--ansi` for coloured output; auto-detected when omitted - JSON when stderr is not a TTY, ANSI otherwise). `ilo check` works on both files and inline code; on a syntactically-broken input it still reports the parse error rather than crashing, which is important for editor and agent loops that may feed in half-written programs. **`ilo test`.** Runs the `-- run: ` / `-- out: ` (or `-- err: `) annotations embedded in `.ilo` source files - the same format the in-tree `tests/examples_engines.rs` integration harness already uses. A file path tests that one file; a directory walks `*.ilo` recursively. Each case runs as a subprocess (`ilo --vm `), output is asserted against the expected payload, and the result prints as `PASS path::fn (line N)` / `FAIL path::fn (line N) (got: X, want: Y)`. The final line reports `N passed, M failed`. Exit 0 if everything passed, 1 if any case failed or no annotations were found. The default engine is `--vm`; pass `--engine jit` or `--engine all` to widen the matrix. Per-file `-- engine-skip: vm jit` annotations skip the listed engines, matching the integration harness. 
Because every example under `examples/` uses this annotation format already, `ilo test examples/` doubles as a smoke test for the language itself and as a worked reference an agent can read when writing tests for its own programs. **`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. **AOT entry-pick.** `ilo compile file.ilo -o out` (alias `ilo build`) follows the same entry-pick rules as the in-process engines: a single user-defined function is used directly; on multi-function files the entry is `main` if defined, otherwise the explicit positional `func` arg (`ilo compile file.ilo -o out run`); otherwise the compile fails with `ILO-E801` and exits 1 without writing a binary. AOT does not fall back to "first declared function" - that historical default produced binaries that called the wrong entry symbol and SIGSEGV'd at runtime. **Default engine.** The bytecode register VM is the default execution path. 
It supports every opcode (closures with Phase 2 capture, listview windows, fused len-of-filter, every modern shape), and avoids the JIT compile-and-bail cost paid by the pre-v0.11.9 Cranelift-first default whenever a program touched an opcode the JIT couldn't handle. Cranelift JIT is opt-in via `--jit`; on opt-in, the JIT runs hot numeric loops and falls back to the VM on bailout. Phase 2 captures run natively on every public backend - VM, JIT, and AOT (`ilo compile`); AOT embeds the postcard `CompiledProgram` blob into the binary's `.rodata` so dispatch helpers can re-enter the VM on user-fn callbacks the same way the in-process runners do. For long-running workloads where the JIT pays for itself, opt in explicitly; for most agent workloads the VM is the right default. **Tree-walker is internal-only.** The tree-walking interpreter is no longer user-selectable: `--run-tree` and its `--run` alias were removed from the public CLI in 0.12.1 (they now error with the unknown-flag guard). The interpreter stays in-tree as the dispatch target for HOF / regex / fmt-variadic / IO / sleep / ct / rsrt / closure-bind-ctx shapes the VM and Cranelift haven't lifted natively yet - the VM bails to it transparently for the ops listed by `is_tree_bridge_eligible` (`rgx`, `rgxall`, `rgxall1`, `rgxall-multi`, `rgxsub`, `fmt`, `fmt2`, `rd`, `rdb`, `rdjl`, `rdin`, `rdinl`, `sleep`, `lsd`, `walk`, `glob`, `dirname`, `basename`, `pathjoin`, `fsize`, `mtime`, `isfile`, `isdir`, `run`, `env-all`, `jkeys`, `tz-offset`, `ct` 2-arg and 3-arg, `rsrt` 2-arg and 3-arg, `dur-parse`, `dur-fmt`, and the closure-bind ctx variants of `map`/`flt`/`fld`/`srt`). Cross-engine parity for those shapes is pinned by `tests/regression_builtin_bridge.rs` and `tests/regression_tree_bridge_invariants.rs`. 
0.13.0+ is on track for a hard drop once the bridge consumers are lifted natively and the shared runtime types (`Value`, `MapKey`, `RuntimeError`, math helpers) are extracted from `src/interpreter/` to a non-engine module. **Subcommand dispatch.** The first positional argument is interpreted as a function name when it has the shape of an ilo identifier - `[a-z][a-z0-9]*(-[a-z0-9]+)*` - so `ilo file.ilo list-orders` routes to the `list-orders` function. Args that don't match the ident shape (file paths like `/tmp/data.json`, numbers, sigils, bracketed lists, anything with a `.` or `/`) route to `main` (or the entry function) as a positional CLI arg instead. Trailing dashes (`foo-`), doubled dashes (`foo--bar`), and negative numbers (`-1`) are not idents and pass through as data. **Unknown `--flag` guard.** Any token in the positional tail matching the clean long-flag shape `--word` or `--word-with-dashes` that isn't a recognised flag is rejected upfront with `error: unrecognised flag '--'. Use 'ilo --help' for valid flags. To pass it as a literal arg, separate with '--' first.` and exit 1. This prevents `ilo main.ilo --engine tree` from silently consuming `--engine` as a positional arg (which used to surface as misleading `ILO-R012 no functions defined` or `ILO-R004 main: expected N args, got N+1`). To pass a hyphen-prefixed token through as literal data, place the `--` separator first: `ilo main.ilo -- --foo`. Anything after the first `--` is data. Tokens with `=` (`--key=val`), trailing or doubled dashes (`--foo-`, `--foo--bar`), and negative numbers (`-1`) are not clean flag shapes and pass through unchanged. **Text-typed params.** When the entry function declares a parameter of type `t`, the CLI passes the raw arg through without numeric coercion. `ilo 'f x:t>t;x' 42` returns the string `"42"`, not the number 42. **Exit codes.** A program returning `Value::Err` (or `^reason` from the entry function) exits with code 1 and prints the err payload on stderr. 
`~v` (Ok) and any non-Result return value exit 0. Verifier and parser errors exit 2. **List args from the CLI.** Comma-separated args become `L n` or `L t` automatically: `ilo 'f xs:L n>n;sum xs' 1,2,3`. FORMATTER: Dense output is the default - newlines are for humans, not agents. No flag needed for dense format: ilo 'code' Dense wire format (default) ilo 'code' --dense / -d Same, explicit ilo 'code' --expanded / -e Expanded human format (for code review) [Dense format] Single line per declaration, minimal whitespace. Operators glue to first operand: cls sp:n>t;>=sp 1000{"gold"};>=sp 500{"silver"};"bronze" [Expanded format] Multi-line with 2-space indentation. Operators spaced from operands: cls sp:n > t >= sp 1000 { "gold" } >= sp 500 { "silver" } "bronze" Dense format is canonical - `dense(parse(dense(parse(src)))) == dense(parse(src))`. COMPLETE EXAMPLE: tool get-user"Retrieve user by ID" uid:t>R profile t timeout:5,retry:2 tool send-email"Send an email" to:t subject:t body:t>R _ t timeout:10,retry:1 type profile{id:t;name:t;email:t;verified:b} ntf uid:t msg:t>R _ t;get-user uid;?{^e:^+"Lookup failed: "e;~d:!d.verified{^"Email not verified"};send-email d.email "Notification" msg;?{^e:^+"Send failed: "e;~_:~_}} [Recursive Example] Factorial and Fibonacci as standalone functions: fac n:n>n;<=n 1 1;r=fac -n 1;*n r fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b diff --git a/examples/capability-sandbox.ilo b/examples/capability-sandbox.ilo new file mode 100644 index 00000000..e055ace0 --- /dev/null +++ b/examples/capability-sandbox.ilo @@ -0,0 +1,36 @@ +-- capability-sandbox.ilo: demonstrates CLI capability flags (ILO-59). 
+--
+-- Run with net restricted to httpbin.org (capabilities without a flag stay unrestricted):
+--   ilo run --allow-net=httpbin.org examples/capability-sandbox.ilo
+--
+-- Run with no flags (permissive / legacy mode - all IO allowed):
+--   ilo run examples/capability-sandbox.ilo
+--
+-- Run with all IO denied:
+--   ilo run --allow-net= --allow-read= --allow-write= --allow-run= examples/capability-sandbox.ilo
+
+-- permitted-read: writes, then reads, a file that is inside the allowed prefix.
+-- Requires: --allow-read=/tmp, plus --allow-write=/tmp if writes are restricted (or permissive mode)
+permitted-read>R t t
+  wr "/tmp/ilo_sandbox_demo.txt" "sandbox ok"
+  rd "/tmp/ilo_sandbox_demo.txt"
+
+-- denied-read: reads /etc/passwd which is outside /tmp.
+-- Expected to return Err when --allow-read=/tmp is set.
+denied-read>R t t
+  rd "/etc/passwd"
+
+-- check-net: attempts a network GET.
+-- Expected to return Err when --allow-net= (empty) is set.
+check-net>R t t
+  get "https://httpbin.org/get"
+
+-- main: exercises both a permitted capability (write+read in /tmp) and a
+-- denied one (read outside the prefix), printing the outcomes.
+main>_
+  res-ok = permitted-read()
+  ?{res-ok|er: prnt +"file read denied: " er
+    ~v: prnt +"file read ok, contents: " v}
+  res-deny = denied-read()
+  ?{res-deny|er: prnt +"denied read blocked as expected: " er
+    ~v: prnt +"WARNING: denied read returned value " v}
diff --git a/src/caps.rs b/src/caps.rs
index 41ca1429..e5fb466a 100644
--- a/src/caps.rs
+++ b/src/caps.rs
@@ -90,7 +90,7 @@ impl Caps {
             Ok(())
         } else {
             Err(format!(
-                "blocked by --allow-net policy: host={host} is not in the allowlist"
+                "ILO-CAP-001 blocked by --allow-net policy: host={host} is not in the allowlist"
             ))
         }
     }
@@ -109,7 +109,7 @@ impl Caps {
             Ok(())
         } else {
             Err(format!(
-                "blocked by --allow-read policy: path={path} is not in the allowlist"
+                "ILO-CAP-001 blocked by --allow-read policy: path={path} is not in the allowlist"
             ))
         }
     }
@@ -128,7 +128,7 @@ impl Caps {
             Ok(())
         } else {
             Err(format!(
-                "blocked by --allow-write policy: path={path} is not in the allowlist"
+                "ILO-CAP-001 blocked by --allow-write policy: path={path} is not in the allowlist"
             ))
         }
     }
@@ -149,7 +149,7 @@ impl Caps {
             Ok(())
         } else {
             Err(format!(
-                "blocked by --allow-run policy: cmd={cmd} is not in the allowlist"
+                "ILO-CAP-001 blocked by --allow-run policy: cmd={cmd} is not in the allowlist"
             ))
         }
     }
diff --git a/tests/capability_flags.rs b/tests/capability_flags.rs
index a814edae..d77d7aa8 100644
--- a/tests/capability_flags.rs
+++ b/tests/capability_flags.rs
@@ -91,6 +91,10 @@ fn allow_net_empty_blocks_get() {
     assert!(is_err_value(&tree), "tree: expected Err, got {tree:?}");
     assert!(is_err_value(&vm_val), "vm: expected Err, got {vm_val:?}");
     let msg = err_text(&tree);
+    assert!(
+        msg.contains("ILO-CAP-001"),
+        "err should include ILO-CAP-001 code, got: {msg}"
+    );
     assert!(
         msg.contains("--allow-net"),
         "err should mention --allow-net, got: {msg}"
@@ -164,6 +168,10 @@ fn allow_read_blocks_outside_prefix() {
         "vm: expected Err for /etc/passwd when read limited to /tmp"
     );
     let msg = err_text(&tree);
+    assert!(
+        msg.contains("ILO-CAP-001"),
+        "err should include ILO-CAP-001 code, got: {msg}"
+    );
     assert!(
         msg.contains("--allow-read"),
         "err should mention --allow-read, got: {msg}"
@@ -214,6 +222,10 @@ fn allow_write_blocks_outside_prefix() {
         "vm: expected Err for write outside prefix"
     );
     let msg = err_text(&tree);
+    assert!(
+        msg.contains("ILO-CAP-001"),
+        "err should include ILO-CAP-001 code, got: {msg}"
+    );
     assert!(
         msg.contains("--allow-write"),
         "err should mention --allow-write, got: {msg}"
@@ -239,6 +251,10 @@ fn allow_run_empty_blocks_run() {
         "expected Err when run allowlist is empty, got {result:?}"
     );
     let msg = err_text(&result);
+    assert!(
+        msg.contains("ILO-CAP-001"),
+        "err should include ILO-CAP-001 code, got: {msg}"
+    );
     assert!(
         msg.contains("--allow-run"),
         "err should mention --allow-run, got: {msg}"
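Because every denial message now starts with the `ILO-CAP-001` code and follows one template across the four `format!` calls in `src/caps.rs`, a harness could split a message back into its parts instead of substring-matching. The following is a minimal sketch of that idea; `Denial` and `parse_denial` are hypothetical names that do not exist in the repository, and the parser assumes the message wording stays exactly as in the diff above.

```rust
/// Parsed form of a capability denial message. Field names are illustrative.
#[derive(Debug, PartialEq)]
struct Denial<'a> {
    flag: &'a str,   // e.g. "--allow-net"
    kind: &'a str,   // "host", "path", or "cmd"
    target: &'a str, // the blocked host, path, or command
}

/// Splits "ILO-CAP-001 blocked by <flag> policy: <kind>=<target> is not in
/// the allowlist" into its parts; returns None for any other message.
fn parse_denial(msg: &str) -> Option<Denial<'_>> {
    let rest = msg.strip_prefix("ILO-CAP-001 blocked by ")?;
    let (flag, rest) = rest.split_once(" policy: ")?;
    let rest = rest.strip_suffix(" is not in the allowlist")?;
    let (kind, target) = rest.split_once('=')?;
    Some(Denial { flag, kind, target })
}

fn main() {
    let msg = "ILO-CAP-001 blocked by --allow-net policy: host=evil.example is not in the allowlist";
    let d = parse_denial(msg).expect("message follows the template");
    assert_eq!(d.flag, "--allow-net");
    assert_eq!(d.kind, "host");
    assert_eq!(d.target, "evil.example");
    // Messages without the code prefix are not treated as denials.
    assert!(parse_denial("blocked by --allow-net policy: host=x is not in the allowlist").is_none());
}
```

Routing on the code prefix alone (`msg.starts_with("ILO-CAP-001")`) is the more robust choice if the wording after the code is expected to change; the full parse is only useful when the flag or target is needed.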