You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This commit was created on GitHub.com and signed with GitHub’s verified signature.
Added
Compiler self-introspection: vera builtins/effects/errors --json (#539). Three new CLI subcommands enumerate the compiler's own registries as a machine-readable {schema, items} envelope (or an aligned text table without --json): builtins (the built-in functions), effects (the effects and abilities, kind-tagged, including the parameterised Exn<T> exception effect), and errors (the diagnostic codes E001–E702 and the W001 warning, with their compiler phase). Each item carries a best-effort since version — built-ins, effects, and abilities are git-attributed and full-coverage-gated by a test; error codes report null — and the published counts are now registry differentials: a test pins len(items) == len(registry), so a hand-maintained claim like "164 built-in functions" can no longer silently drift from the compiler. New vera/introspect.py (pure payload builders) + vera/_since.py (the version map); thin cmd_* glue in cli.py. The toolchain is documented end-to-end in the new TOOLCHAIN.md cookbook. Surfaced and motivated by the doc-consolidation brief (#528).
Characterization harness for execute() (#734). A consolidated, mutation-validated test harness (tests/test_execute_characterization.py) pinning the observable contract of execute() (vera/codegen/api.py) — every ExecuteResult field (value int/float/str/heap-pointer/None, stdout, state, exit_code, stderr) crossed with the three completion modes (normal return; WASM trap, which raisesWasmTrapError rather than returning a result; interrupt/exit), plus the positional-constructor compatibility shape and capture_stderr True-vs-default. This is the green gate for the upcoming vera/runtime/ decomposition (#421): every host-binding extraction must keep it passing. Each cell was confirmed to flip RED when its target return path in api.py is deliberately broken (9 mutations), so none is green-for-the-wrong-reason. Test-only; no behaviour change.
Mutation-testing infrastructure and soundness-core baseline (#387). Adds mutmut as a [mutation] extra (chosen over cosmic-ray after a Phase-0 spike — its coverage-guided test selection is ~90× faster on this Z3/WASM suite, and its copy-based model never mutates the working tree in place), a [tool.mutmut] config wiring the in-process unit-test oracle (the subprocess suites test_conformance / test_cli / test_browser import the un-mutated package, so they cannot kill mutants), and a runbook (MUTATION.md) covering the tool decision, the oracle caveat, resume-after-hard-kill, the Z3-flakiness guardrail, and the equivalent-mutant # pragma: no mutate convention. The first sweep measures the soundness core (verifier.py, smt.py, checker/, obligations/ — 10,620 mutants): 80.8% caught, 2,038 survivors, emitted by scripts/mutation_report.py as a diff-able per-module mutation-summary.csv and a README mutation-score badge (mutation.json), with the full survivor inventory and per-module chart attached to the tracking issue. The whole-vera/ sweep is deferred behind #421: codegen/api.py's ~3,500-line execute() inflates a 774 MB mutant file mutmut cannot index. Tooling and tests only — no behaviour change; the issue stays open for soundness-core triage and per-module follow-ups.
Soundness-core test hardening (Phase 3 of #387). Strengthens type-checker tests in the call-, control-, and expression-checking core (vera/checker/calls.py, vera/checker/control.py, vera/checker/core.py, vera/checker/expressions.py) that stayed green when behaviour was deliberately broken under mutation: each hardened or added test now asserts a specific observable — the stable diagnostic error_code (call/constructor E203–E204, E210–E215, E220, E230–E233, E240–E241; match-arm E302; exhaustiveness E310–E313; pattern E320–E322; handler E330–E335 plus body-type E121; data-invariant E120; expression E146, E160–E161, E173–E175), the diagnostic severity, or a type-propagation result — rather than free-text message prose. Test-only; no behaviour change. The tracking issue stays open for the whole-corpus sweep gated on #421.
Soundness-core test hardening — contract-verifier discharge / report / projection layer (#387). Extends the Phase 3 campaign from the type checker to verifier.py. TestObligationRecordCompleteness387 + TestProjectionHelpers387 (38 tests) pin the obligation-record and diagnostic-content surface that the coarse _verify_err / _verify_ok helpers left unasserted — every ProofObligation field (fn_name, error_code, counterexample, expr_text) and the _report_* diagnostic content (description / rationale / fix / spec_ref, plus the "modulo"-vs-"division" branch) for div_zero (E526), nat_sub (E502), index_bounds (E527), nat_bind (E503), refine_bind (E505), and postcondition (E500) sites — plus the @Float64-divisor exemption guard and the verified / tier3 summary counter bookkeeping (tier1_verified / tier3_runtime / total) via exact-count assertions. It also pins the De Bruijn projection / term helpers (_obligate_subpattern_narrowings, _obligate_destructure_narrowings, _check_nat_binding_obligation_term, _check_refined_binding_obligation_term) via ADT sub-pattern (Some(@Nat) / Some(@PosInt) on Option<Int>) and tuple-destructure narrowings — including a projection-index soundness differential where a mutated De Bruijn accessor would mis-source the obligation — and adds counterexample-line assertions to the _report_* builders, and pins the typing-predicate soundness gates (_is_nat_typed, _narrows_into_nat, _narrows_into_refined, _has_underflow_leaf, _meet_status) that decide whether an obligation fires. TestSmtTranslation387 (26 tests) adds differential regression coverage of the smt.py Z3-translation layer (translate / check / counterexample, via the call-precondition and Tier-1 levers; kills 8 of its 152 survivors, with the deeper translate / sort layer tracked in #792). Test-only; no behaviour change. The soundness surface (obligation fields, branch guards, codes, projection indices, tier3 routing) is pinned with == / exact assertions; the documented residual is message-cosmetic string-wrap mutations in the pure _report_* builders and per-branch reporting-counter bookkeeping. With this batch, #387's soundness-core scope is complete — the core mutation score rises from the 80.8% baseline to 83.3% caught; the remaining whole-vera/ sweep, gated on making the full marathon sweep reliable (the 10,620-mutant run deadlocks on mutmut 3.6 / Python 3.14), moves to #795, and the deep smt.py translate-layer / verifier-aggregation hardening plus the verifier timeout probe to #792.
Changed
codegen/api.py decomposition (#421). Splitting the 4,358-line public-API module into focused units, one layer per commit behind the #734 characterization harness. So far: (1) the pure ADT memory-layout utilities (ConstructorLayout, _wasm_type_size / _wasm_type_align / _align_up) → a focused vera/codegen/memory.py; (2) runtime-trap classification and source-backtrace resolution (WasmTrapError, TrapFrame, _classify_trap, _resolve_trap_frames) → a new vera/runtime/ package (traps.py); (3) the WASM heap-marshalling layer — memory read/write, GC shadow-rooting, the Map/Set bucket codec, the #578 wrap-handle range guard (_validate_wrap_handle), and Result/Option/Array allocation, all parameterised by the wasmtime.Caller — → vera/runtime/heap.py. The public surface is unchanged — from vera.codegen import ConstructorLayout and from vera.codegen.api import compile / execute / CompileResult / ExecuteResult / WasmTrapError all still resolve. (4) the twelve optional effect families are factored into per-effect vera/runtime/<family>.py modules, each a register_<family>(linker, ...) function (random, math, md, json, regex, html, map, set, decimal, http, inference, state; the value-type dispatch table lives in vera/runtime/collections.py, the LLM provider registry + HTTP call helper in vera/runtime/inference.py, and the shared request timeout (_HTTP_TIMEOUT) in vera/runtime/http.py, and Decimal's value store is created in execute() so the shared GC decref hook can close over it), and the shared collection-marshalling helpers (_read_i32/_write_f64 plus the Option/Array allocation wrappers used by Map/Set/Decimal/JSON/HTML) joined vera/runtime/heap.py. api.py is down from 4,358 to 1,182 lines (−73%). The always-on IO family intentionally stays inline in execute() — it is execute()'s observation channel (its output_buf/stderr_buf/last_violation become ExecuteResult fields), not a pluggable adapter, so extracting it would relocate rather than reduce coupling; the boundary rationale is documented in vera/README.md ("Host-binding families"). Lifting the runtime out of the public-API module also unblocks the deferred whole-vera/ mutation sweep (#387). Internal refactor; no behaviour change.
Security
Http rejects non-HTTP(S) URLs (#789). Http.get and Http.post now validate the URL scheme and return Result.Err for file://, ftp://, data:, and any other non-http(s) scheme, instead of handing it to urllib.request.urlopen — which would otherwise read local files or speak arbitrary protocols on behalf of a Vera program. Surfaced by CodeRabbit's review of the #421 decomposition.