Skip to content

v0.0.178

Choose a tag to compare

@aallan aallan released this 25 Jun 16:15
832f1d8

Added

  • Compiler self-introspection: vera builtins/effects/errors --json (#539). Three new CLI subcommands enumerate the compiler's own registries as a machine-readable {schema, items} envelope (or an aligned text table without --json): builtins (the built-in functions), effects (the effects and abilities, kind-tagged, including the parameterised Exn<T> exception effect), and errors (the diagnostic codes E001E702 and the W001 warning, with their compiler phase). Each item carries a best-effort since version — built-ins, effects, and abilities are git-attributed and full-coverage-gated by a test; error codes report null — and the published counts are now registry differentials: a test pins len(items) == len(registry), so a hand-maintained claim like "164 built-in functions" can no longer silently drift from the compiler. New vera/introspect.py (pure payload builders) + vera/_since.py (the version map); thin cmd_* glue in cli.py. The toolchain is documented end-to-end in the new TOOLCHAIN.md cookbook. Surfaced and motivated by the doc-consolidation brief (#528).
  • Characterization harness for execute() (#734). A consolidated, mutation-validated test harness (tests/test_execute_characterization.py) pinning the observable contract of execute() (vera/codegen/api.py) — every ExecuteResult field (value int/float/str/heap-pointer/None, stdout, state, exit_code, stderr) crossed with the three completion modes (normal return; WASM trap, which raises WasmTrapError rather than returning a result; interrupt/exit), plus the positional-constructor compatibility shape and capture_stderr True-vs-default. This is the green gate for the upcoming vera/runtime/ decomposition (#421): every host-binding extraction must keep it passing. Each cell was confirmed to flip RED when its target return path in api.py is deliberately broken (9 mutations), so none is green-for-the-wrong-reason. Test-only; no behaviour change.
  • Mutation-testing infrastructure and soundness-core baseline (#387). Adds mutmut as a [mutation] extra (chosen over cosmic-ray after a Phase-0 spike — its coverage-guided test selection is ~90× faster on this Z3/WASM suite, and its copy-based model never mutates the working tree in place), a [tool.mutmut] config wiring the in-process unit-test oracle (the subprocess suites test_conformance / test_cli / test_browser import the un-mutated package, so they cannot kill mutants), and a runbook (MUTATION.md) covering the tool decision, the oracle caveat, resume-after-hard-kill, the Z3-flakiness guardrail, and the equivalent-mutant # pragma: no mutate convention. The first sweep measures the soundness core (verifier.py, smt.py, checker/, obligations/ — 10,620 mutants): 80.8% caught, 2,038 survivors, emitted by scripts/mutation_report.py as a diff-able per-module mutation-summary.csv and a README mutation-score badge (mutation.json), with the full survivor inventory and per-module chart attached to the tracking issue. The whole-vera/ sweep is deferred behind #421: codegen/api.py's ~3,500-line execute() inflates a 774 MB mutant file mutmut cannot index. Tooling and tests only — no behaviour change; the issue stays open for soundness-core triage and per-module follow-ups.
  • Soundness-core test hardening (Phase 3 of #387). Strengthens type-checker tests in the call-, control-, and expression-checking core (vera/checker/calls.py, vera/checker/control.py, vera/checker/core.py, vera/checker/expressions.py) that stayed green when behaviour was deliberately broken under mutation: each hardened or added test now asserts a specific observable — the stable diagnostic error_code (call/constructor E203E204, E210E215, E220, E230E233, E240E241; match-arm E302; exhaustiveness E310E313; pattern E320E322; handler E330E335 plus body-type E121; data-invariant E120; expression E146, E160E161, E173E175), the diagnostic severity, or a type-propagation result — rather than free-text message prose. Test-only; no behaviour change. The tracking issue stays open for the whole-corpus sweep gated on #421.
  • Soundness-core test hardening — contract-verifier discharge / report / projection layer (#387). Extends the Phase 3 campaign from the type checker to verifier.py. TestObligationRecordCompleteness387 + TestProjectionHelpers387 (38 tests) pin the obligation-record and diagnostic-content surface that the coarse _verify_err / _verify_ok helpers left unasserted — every ProofObligation field (fn_name, error_code, counterexample, expr_text) and the _report_* diagnostic content (description / rationale / fix / spec_ref, plus the "modulo"-vs-"division" branch) for div_zero (E526), nat_sub (E502), index_bounds (E527), nat_bind (E503), refine_bind (E505), and postcondition (E500) sites — plus the @Float64-divisor exemption guard and the verified / tier3 summary counter bookkeeping (tier1_verified / tier3_runtime / total) via exact-count assertions. It also pins the De Bruijn projection / term helpers (_obligate_subpattern_narrowings, _obligate_destructure_narrowings, _check_nat_binding_obligation_term, _check_refined_binding_obligation_term) via ADT sub-pattern (Some(@Nat) / Some(@PosInt) on Option<Int>) and tuple-destructure narrowings — including a projection-index soundness differential where a mutated De Bruijn accessor would mis-source the obligation — and adds counterexample-line assertions to the _report_* builders, and pins the typing-predicate soundness gates (_is_nat_typed, _narrows_into_nat, _narrows_into_refined, _has_underflow_leaf, _meet_status) that decide whether an obligation fires. TestSmtTranslation387 (26 tests) adds differential regression coverage of the smt.py Z3-translation layer (translate / check / counterexample, via the call-precondition and Tier-1 levers; kills 8 of its 152 survivors, with the deeper translate / sort layer tracked in #792). Test-only; no behaviour change. The soundness surface (obligation fields, branch guards, codes, projection indices, tier3 routing) is pinned with == / exact assertions; the documented residual is message-cosmetic string-wrap mutations in the pure _report_* builders and per-branch reporting-counter bookkeeping. With this batch, #387's soundness-core scope is complete — the core mutation score rises from the 80.8% baseline to 83.3% caught; the remaining whole-vera/ sweep, gated on making the full marathon sweep reliable (the 10,620-mutant run deadlocks on mutmut 3.6 / Python 3.14), moves to #795, and the deep smt.py translate-layer / verifier-aggregation hardening plus the verifier timeout probe to #792.

Changed

  • codegen/api.py decomposition (#421). Splitting the 4,358-line public-API module into focused units, one layer per commit behind the #734 characterization harness. So far: (1) the pure ADT memory-layout utilities (ConstructorLayout, _wasm_type_size / _wasm_type_align / _align_up) → a focused vera/codegen/memory.py; (2) runtime-trap classification and source-backtrace resolution (WasmTrapError, TrapFrame, _classify_trap, _resolve_trap_frames) → a new vera/runtime/ package (traps.py); (3) the WASM heap-marshalling layer — memory read/write, GC shadow-rooting, the Map/Set bucket codec, the #578 wrap-handle range guard (_validate_wrap_handle), and Result/Option/Array allocation, all parameterised by the wasmtime.Caller — → vera/runtime/heap.py. The public surface is unchanged — from vera.codegen import ConstructorLayout and from vera.codegen.api import compile / execute / CompileResult / ExecuteResult / WasmTrapError all still resolve. (4) the twelve optional effect families are factored into per-effect vera/runtime/<family>.py modules, each a register_<family>(linker, ...) function (random, math, md, json, regex, html, map, set, decimal, http, inference, state; the value-type dispatch table lives in vera/runtime/collections.py, the LLM provider registry + HTTP call helper in vera/runtime/inference.py, and the shared request timeout (_HTTP_TIMEOUT) in vera/runtime/http.py, and Decimal's value store is created in execute() so the shared GC decref hook can close over it), and the shared collection-marshalling helpers (_read_i32/_write_f64 plus the Option/Array allocation wrappers used by Map/Set/Decimal/JSON/HTML) joined vera/runtime/heap.py. api.py is down from 4,358 to 1,182 lines (−73%). The always-on IO family intentionally stays inline in execute() — it is execute()'s observation channel (its output_buf/stderr_buf/last_violation become ExecuteResult fields), not a pluggable adapter, so extracting it would relocate rather than reduce coupling; the boundary rationale is documented in vera/README.md ("Host-binding families"). Lifting the runtime out of the public-API module also unblocks the deferred whole-vera/ mutation sweep (#387). Internal refactor; no behaviour change.

Security

  • Http rejects non-HTTP(S) URLs (#789). Http.get and Http.post now validate the URL scheme and return Result.Err for file://, ftp://, data:, and any other non-http(s) scheme, instead of handing it to urllib.request.urlopen — which would otherwise read local files or speak arbitrary protocols on behalf of a Vera program. Surfaced by CodeRabbit's review of the #421 decomposition.

Dependencies

  • Bump actions/checkout from 6 to 7 (#780).
  • Bump ruff from 0.15.17 to 0.15.18 (#781).
  • Bump pytest from 9.1.0 to 9.1.1 (#782).