Add coherence safeguards and README narratives by aallan · Pull Request #4 · aallan/vera

aallan · 2026-02-23T08:52:18Z

Summary

Coherence-over-scale protections to prevent the project from drifting as it grows.

Spec code block validator (scripts/check_spec_examples.py) — extracts 154 code blocks from the spec, classifies them, and verifies the 21 currently-parseable blocks still parse. Catches spec rot when grammar changes break spec examples. Categorised allowlist tracks 30 spec/parser mismatches (the @T notation difference) and 4 future-syntax design proposals. Stale allowlist detection catches when spec edits shift line numbers.
Version sync check (scripts/check_version_sync.py) — verifies pyproject.toml and vera/__init__.py agree.
Dependabot — weekly PRs for pip and GitHub Actions dependency updates.
CODEOWNERS — automatic review requests on PRs.
CodeQL — security scanning on PRs and weekly schedule.
macOS CI — test matrix expanded from 3 jobs (ubuntu × 3 Python versions) to 6 jobs (ubuntu + macos × 3 Python versions).
CONTRIBUTING.md — pre-commit setup, validation scripts, branch protection rules documented.
README narratives — sub-headings and explanations for each code example in "What Vera Looks Like": Hello World (effects/contracts), absolute_value (postconditions/SMT), safe_divide (preconditions), increment (algebraic effects/state).

Test plan

All 284 tests pass
mypy clean (9 source files)
All 13 examples pass
Version sync passes
Spec validator passes (21 parsed, 34 allowlisted, 0 failures)
All pre-commit hooks pass
CI passes on GitHub (ubuntu + macOS matrix)

🤖 Generated with Claude Code

Spec rot prevention: - scripts/check_spec_examples.py extracts code blocks from spec/*.md, classifies them (parseable vs fragment), and verifies parseable blocks still parse with the current grammar. Categorised allowlist tracks FUTURE (design proposals), MISMATCH (spec/parser notation differences), and FRAGMENT (heuristic overrides). Stale allowlist entries are detected when spec edits shift line numbers. - scripts/check_version_sync.py verifies pyproject.toml and vera/__init__.py agree on the version number. - Both scripts run in CI lint job. GitHub configuration: - Dependabot for pip and GitHub Actions dependency updates (weekly). - CODEOWNERS for automatic review requests. - CodeQL security scanning workflow (weekly + on PRs). CI improvements: - Test matrix expanded to ubuntu-latest + macos-latest (6 jobs). - Coverage gated on ubuntu/3.12 only. Documentation: - CONTRIBUTING.md: pre-commit setup, validation scripts, branch protection rules documented. - README.md: sub-headings and narratives for each code example in "What Vera Looks Like" — explains contracts, slot references, preconditions, and algebraic effects for human readers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-advanced-security · 2026-02-23T08:53:15Z

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Add DE_BRUIJN.md cross-references to SKILL.md, CLAUDE.md, vera/README.md, and DESIGN.md (table row, design principle #4, prior art entry). Rewrite DESIGN.md to reflect current language state: - Fix 'no exceptions' (Exn<T> effect exists) - Add Set<T> to collections - Add Inference effect row with LLM-call rationale - Add modules row - Replace bare 'Z3 via SMT-LIB' with three-tier verification section - Add full effect system table (IO, Http, State, Exn, Async, Inference) - Expand diagnostics row (LLM-instruction format, error codes E001-E702) - Add 122-function standard library row - Add WebAssembly to prior art - Restructure: principles → decisions → verification pipeline → effects → prior art Co-Authored-By: Claude <noreply@anthropic.invalid>

Two new issues surfaced by the same agent run: - #515: $gc_collect itself faults with out-of-bounds memory access under sustained allocation pressure. 40×20 Conway's Game of Life over 200 generations reliably reproduces. The collector walks past $heap_ptr to the linear-memory bound and traps — gc_collect is the top frame of the crashing stack, not the program. This is distinct from #487 (alloc path) and #484 (sweeper 16-bit truncation). - #516: Runtime traps bubble up as raw wasmtime stack traces; the CLI mis-labels every Trap/WasmtimeError as "Runtime contract violation" even when the actual cause is out-of-bounds memory access, integer overflow, unreachable, etc. No source-line mapping, no actionable guidance — the exact opposite of the carefully crafted compile-time diagnostics. Three-stage proposed scope: categorise trap reason, source-map the Vera function that trapped, specialise help for common trap classes. KNOWN_ISSUES.md updated with rows for both. ROADMAP.md implementation-order table re-sorted: 1. #514 (nested closures + captured-scalar indirection) 2. #515 (GC collect faults) 3. #516 (runtime trap diagnostics) 4. #475 (WASM translator bug cleanup — previously #1) 5. #507 (Eq/Ord-dispatched array ops) The top three are all surfaced by the single Game of Life agent run — the empirical signal is the clearest prioritisation we've had. #475 was promoted to #1 after PR #511 merged; pushed to #4 as the new issues outrank it on agent-adoption impact. Agent-noted quote, worth capturing as the motivation for #516 specifically: "the gap between 'the type system is happy' and 'the compiled artefact actually runs' is wider than you'd expect from a language with SMT-verified contracts. The verifier can prove your termination argument is sound while the codegen silently miscompiles your closure environment out from under you." Also posted comment on #514 documenting the third shape the agent is now isolating (closure body with i32/i64 type mismatch, distinct from direct-nesting and captured-scalar-indirection). No code changes. Co-Authored-By: Claude <noreply@anthropic.invalid>

…nt-integration Reframed "What's next" to acknowledge the stabilisation work uncovered by the v0.0.119–v0.0.137 bug-killing campaign. Two patterns worth closing out before declaring runtime-correctness floor raised: 1. Scale-dependent bugs slip past the standard test suite (#515, #570, #487, now #593 — all required real-world programs at scale to surface; none would have been caught by the existing 3,747 small focused tests). 2. Walker-completeness gaps are silent failures of the same shape (#588 found _walk_free_vars missing 8 AST branches; we don't yet know if other walkers in the codebase have similar incompleteness). The "Implementation order" table is now split into two tiers: - Stabilisation tier (priority): #593, #596, #597, #595 - Agent-integration tier (resumes after): #222, #523, #370 #596 (stress-test harness) and #597 (walker-completeness audit) are new issues filed in this PR — they convert "I think we're stabilising" into evidence by building the infrastructure that prevents recurrence of the bug classes #588 and #593 surfaced. #593 retains the lead position because it's the largest open codegen unknown — bisection narrows it to 12x30 + recursive run_loop with allocating call argument, but the root cause hasn't been isolated. Until then, "WASM codegen is stable at scale" is unverified. #595 sits last in the stabilisation tier because it likely shares a root cause with #593 (both surface together at the same scale); fixing #593 may close it. The v0.0.137 KeyboardInterrupt guard already prevents the Python-traceback half. Order #5 (LSP) doesn't start until orders #1-#4 close (or are explicitly deferred with an open follow-up). Tier separator is explicit so future readers don't have to reverse-engineer the ordering rationale. Co-Authored-By: Claude <noreply@anthropic.invalid>

@Bool

1. HISTORY.md "137 tagged releases" -> 138 (line 315): the v0.0.138 row was added but the rollup footer count still said 137. 2. ROADMAP.md stabilisation-gate sentence: "order #5 starts when #1-#4 are closed" -> "order #4 starts when #1-#3 are closed". The earlier deletion of the #593 row (now closed) shifted the stabilisation tier to three items; the agent-integration tier numbering is also renumbered 5/6/7 -> 4/5/6 to match. 3. New test: test_eager_gc_array_mapi_with_non_allocating_string_closure mirrors the existing array_map String eager-GC test for the array_mapi unwind path with a fn(@Bool, @nat -> @string) closure returning pick(@Bool.0). The existing array_mapi ADT test (test_array_mapi_non_allocating_closure_emits_balanced_push) covered the i32 branch; this new test covers the i32-pair branch. 4. Tighten the structural #593 predicate. Pre-fix it accepted any "global.set $gc_sp" + absent "call $alloc"; CodeRabbit observed that a bare "global.set $gc_sp" could in principle be a stack- restore rather than a push. In practice the prologue's gc_sp_save/restore is gated on ctx.needs_alloc and the regex already excludes call $alloc, so a non-allocating body cannot contain a restore -- but the literal predicate was weaker than the intent. Adds _has_gc_shadow_push helper that anchors on the canonical "i32.const 4 / i32.add / global.set $gc_sp" SP-increment sequence emitted by gc_shadow_push (vera/wasm/helpers.py). Used by both the i32-pair and i32-ADT structural tests; a non-push artefact in the body can no longer satisfy the assertion. Plus: HISTORY.md tooling section gains a v0.0.138 row for VERA_EAGER_GC alongside the existing tooling milestones (cross- linked to ENVIRONMENT.md, which already has the full reference section for it). Suite: 3,747 passed (+1 from new array_mapi String test), 14 skipped. Co-Authored-By: Claude <noreply@anthropic.invalid>

Two Conway's Life agent experiments on --target browser surfaced five concrete blockers (issues #602/#603/#604/#609/#610) and an explicit design memo (#608) mapping each obstacle to a concrete runtime-only fix. The current stabilisation framing covers codegen reliability and walker completeness but treats browser- target reliability as a separate concern. The agent's diagnosis makes that split harder to defend: "write once, run anywhere" is currently true for pure computation and approximate-to-false for anything with timing or screen output. Two of the fixes (#609 JSPI-driven IO.sleep, #610 ANSI subset interpreter) close the timing and rendering halves of the seam without language changes -- adding them to the stabilisation tier commits to "browser-target Vera is something you'd actually use" before the agent-integration push. Changes: - Expanded the campaign-pattern list from two patterns to three; third pattern documents the browser-target seam with links to the umbrella issue (#608) and the five concrete blockers. - Added #609 (JSPI-driven IO.sleep) as item 3 in the stabilisation tier, with rationale + the WebAssembly JSPI / Asyncify mechanism. - Added #610 (ANSI subset interpreter) as item 4 in the same tier, paired with #609 -- together they close the seam and let life.vera (terminal version) run unchanged on --target browser. - Renumbered #595 from item 3 to item 5 (now last in stabilisation, since it's contingent on an upstream wasmtime-py release). - Renumbered the agent-integration tier items 4/5/6 -> 6/7/8 to match. - Updated the "What moves when" gate from "#4 starts when #1-#3 are closed" to "#6 starts when #1-#5 are closed". Co-Authored-By: Claude <noreply@anthropic.invalid>

aallan and others added 2 commits February 23, 2026 08:51

Add changelog entry for v0.0.5-2

8726158

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aallan merged commit ce37d0f into main Feb 23, 2026
10 checks passed

aallan deleted the housekeeping-coherence branch February 23, 2026 09:03

aallan mentioned this pull request May 6, 2026

Fix #589: host_print never escapes Python UnicodeDecodeError on bad bytes #590

Merged

5 tasks

coderabbitai Bot mentioned this pull request May 7, 2026

v0.0.138: fix #593 closure-return shadow-push + ENVIRONMENT.md #598

Merged

6 tasks

aallan mentioned this pull request May 7, 2026

ROADMAP: add browser-target reliability (#609 + #610) to stabilisation tier #611

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add coherence safeguards and README narratives#4

Add coherence safeguards and README narratives#4
aallan merged 2 commits intomainfrom
housekeeping-coherence

aallan commented Feb 23, 2026

Uh oh!

github-advanced-security AI commented Feb 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

aallan commented Feb 23, 2026

Summary

Test plan

Uh oh!

github-advanced-security AI commented Feb 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants