Add coherence safeguards and README narratives#4
Merged
Conversation
Spec rot prevention: - scripts/check_spec_examples.py extracts code blocks from spec/*.md, classifies them (parseable vs fragment), and verifies parseable blocks still parse with the current grammar. Categorised allowlist tracks FUTURE (design proposals), MISMATCH (spec/parser notation differences), and FRAGMENT (heuristic overrides). Stale allowlist entries are detected when spec edits shift line numbers. - scripts/check_version_sync.py verifies pyproject.toml and vera/__init__.py agree on the version number. - Both scripts run in CI lint job. GitHub configuration: - Dependabot for pip and GitHub Actions dependency updates (weekly). - CODEOWNERS for automatic review requests. - CodeQL security scanning workflow (weekly + on PRs). CI improvements: - Test matrix expanded to ubuntu-latest + macos-latest (6 jobs). - Coverage gated on ubuntu/3.12 only. Documentation: - CONTRIBUTING.md: pre-commit setup, validation scripts, branch protection rules documented. - README.md: sub-headings and narratives for each code example in "What Vera Looks Like" — explains contracts, slot references, preconditions, and algebraic effects for human readers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation. |
aallan
added a commit
that referenced
this pull request
Mar 29, 2026
Add DE_BRUIJN.md cross-references to SKILL.md, CLAUDE.md, vera/README.md, and DESIGN.md (table row, design principle #4, prior art entry). Rewrite DESIGN.md to reflect current language state: - Fix 'no exceptions' (Exn<T> effect exists) - Add Set<T> to collections - Add Inference effect row with LLM-call rationale - Add modules row - Replace bare 'Z3 via SMT-LIB' with three-tier verification section - Add full effect system table (IO, Http, State, Exn, Async, Inference) - Expand diagnostics row (LLM-instruction format, error codes E001-E702) - Add 122-function standard library row - Add WebAssembly to prior art - Restructure: principles → decisions → verification pipeline → effects → prior art Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan
added a commit
that referenced
this pull request
Apr 23, 2026
Two new issues surfaced by the same agent run: - #515: $gc_collect itself faults with out-of-bounds memory access under sustained allocation pressure. 40×20 Conway's Game of Life over 200 generations reliably reproduces. The collector walks past $heap_ptr to the linear-memory bound and traps — gc_collect is the top frame of the crashing stack, not the program. This is distinct from #487 (alloc path) and #484 (sweeper 16-bit truncation). - #516: Runtime traps bubble up as raw wasmtime stack traces; the CLI mis-labels every Trap/WasmtimeError as "Runtime contract violation" even when the actual cause is out-of-bounds memory access, integer overflow, unreachable, etc. No source-line mapping, no actionable guidance — the exact opposite of the carefully crafted compile-time diagnostics. Three-stage proposed scope: categorise trap reason, source-map the Vera function that trapped, specialise help for common trap classes. KNOWN_ISSUES.md updated with rows for both. ROADMAP.md implementation-order table re-sorted: 1. #514 (nested closures + captured-scalar indirection) 2. #515 (GC collect faults) 3. #516 (runtime trap diagnostics) 4. #475 (WASM translator bug cleanup — previously #1) 5. #507 (Eq/Ord-dispatched array ops) The top three are all surfaced by the single Game of Life agent run — the empirical signal is the clearest prioritisation we've had. #475 was promoted to #1 after PR #511 merged; pushed to #4 as the new issues outrank it on agent-adoption impact. Agent-noted quote, worth capturing as the motivation for #516 specifically: "the gap between 'the type system is happy' and 'the compiled artefact actually runs' is wider than you'd expect from a language with SMT-verified contracts. The verifier can prove your termination argument is sound while the codegen silently miscompiles your closure environment out from under you." Also posted comment on #514 documenting the third shape the agent is now isolating (closure body with i32/i64 type mismatch, distinct from direct-nesting and captured-scalar-indirection). No code changes. Co-Authored-By: Claude <noreply@anthropic.invalid>
5 tasks
aallan
added a commit
that referenced
this pull request
May 7, 2026
…nt-integration Reframed "What's next" to acknowledge the stabilisation work uncovered by the v0.0.119–v0.0.137 bug-killing campaign. Two patterns worth closing out before declaring runtime-correctness floor raised: 1. Scale-dependent bugs slip past the standard test suite (#515, #570, #487, now #593 — all required real-world programs at scale to surface; none would have been caught by the existing 3,747 small focused tests). 2. Walker-completeness gaps are silent failures of the same shape (#588 found _walk_free_vars missing 8 AST branches; we don't yet know if other walkers in the codebase have similar incompleteness). The "Implementation order" table is now split into two tiers: - Stabilisation tier (priority): #593, #596, #597, #595 - Agent-integration tier (resumes after): #222, #523, #370 #596 (stress-test harness) and #597 (walker-completeness audit) are new issues filed in this PR — they convert "I think we're stabilising" into evidence by building the infrastructure that prevents recurrence of the bug classes #588 and #593 surfaced. #593 retains the lead position because it's the largest open codegen unknown — bisection narrows it to 12x30 + recursive run_loop with allocating call argument, but the root cause hasn't been isolated. Until then, "WASM codegen is stable at scale" is unverified. #595 sits last in the stabilisation tier because it likely shares a root cause with #593 (both surface together at the same scale); fixing #593 may close it. The v0.0.137 KeyboardInterrupt guard already prevents the Python-traceback half. Order #5 (LSP) doesn't start until orders #1-#4 close (or are explicitly deferred with an open follow-up). Tier separator is explicit so future readers don't have to reverse-engineer the ordering rationale. Co-Authored-By: Claude <noreply@anthropic.invalid>
6 tasks
aallan
added a commit
that referenced
this pull request
May 7, 2026
1. HISTORY.md "137 tagged releases" -> 138 (line 315): the v0.0.138 row was added but the rollup footer count still said 137. 2. ROADMAP.md stabilisation-gate sentence: "order #5 starts when #1-#4 are closed" -> "order #4 starts when #1-#3 are closed". The earlier deletion of the #593 row (now closed) shifted the stabilisation tier to three items; the agent-integration tier numbering is also renumbered 5/6/7 -> 4/5/6 to match. 3. New test: test_eager_gc_array_mapi_with_non_allocating_string_closure mirrors the existing array_map String eager-GC test for the array_mapi unwind path with a fn(@Bool, @nat -> @string) closure returning pick(@Bool.0). The existing array_mapi ADT test (test_array_mapi_non_allocating_closure_emits_balanced_push) covered the i32 branch; this new test covers the i32-pair branch. 4. Tighten the structural #593 predicate. Pre-fix it accepted any "global.set $gc_sp" + absent "call $alloc"; CodeRabbit observed that a bare "global.set $gc_sp" could in principle be a stack- restore rather than a push. In practice the prologue's gc_sp_save/restore is gated on ctx.needs_alloc and the regex already excludes call $alloc, so a non-allocating body cannot contain a restore -- but the literal predicate was weaker than the intent. Adds _has_gc_shadow_push helper that anchors on the canonical "i32.const 4 / i32.add / global.set $gc_sp" SP-increment sequence emitted by gc_shadow_push (vera/wasm/helpers.py). Used by both the i32-pair and i32-ADT structural tests; a non-push artefact in the body can no longer satisfy the assertion. Plus: HISTORY.md tooling section gains a v0.0.138 row for VERA_EAGER_GC alongside the existing tooling milestones (cross- linked to ENVIRONMENT.md, which already has the full reference section for it). Suite: 3,747 passed (+1 from new array_mapi String test), 14 skipped. Co-Authored-By: Claude <noreply@anthropic.invalid>
3 tasks
aallan
added a commit
that referenced
this pull request
May 7, 2026
Two Conway's Life agent experiments on --target browser surfaced five concrete blockers (issues #602/#603/#604/#609/#610) and an explicit design memo (#608) mapping each obstacle to a concrete runtime-only fix. The current stabilisation framing covers codegen reliability and walker completeness but treats browser- target reliability as a separate concern. The agent's diagnosis makes that split harder to defend: "write once, run anywhere" is currently true for pure computation and approximate-to-false for anything with timing or screen output. Two of the fixes (#609 JSPI-driven IO.sleep, #610 ANSI subset interpreter) close the timing and rendering halves of the seam without language changes -- adding them to the stabilisation tier commits to "browser-target Vera is something you'd actually use" before the agent-integration push. Changes: - Expanded the campaign-pattern list from two patterns to three; third pattern documents the browser-target seam with links to the umbrella issue (#608) and the five concrete blockers. - Added #609 (JSPI-driven IO.sleep) as item 3 in the stabilisation tier, with rationale + the WebAssembly JSPI / Asyncify mechanism. - Added #610 (ANSI subset interpreter) as item 4 in the same tier, paired with #609 -- together they close the seam and let life.vera (terminal version) run unchanged on --target browser. - Renumbered #595 from item 3 to item 5 (now last in stabilisation, since it's contingent on an upstream wasmtime-py release). - Renumbered the agent-integration tier items 4/5/6 -> 6/7/8 to match. - Updated the "What moves when" gate from "#4 starts when #1-#3 are closed" to "#6 starts when #1-#5 are closed". Co-Authored-By: Claude <noreply@anthropic.invalid>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Coherence-over-scale protections to prevent the project from drifting as it grows.
scripts/check_spec_examples.py) — extracts 154 code blocks from the spec, classifies them, and verifies the 21 currently-parseable blocks still parse. Catches spec rot when grammar changes break spec examples. Categorised allowlist tracks 30 spec/parser mismatches (the@Tnotation difference) and 4 future-syntax design proposals. Stale allowlist detection catches when spec edits shift line numbers.scripts/check_version_sync.py) — verifiespyproject.tomlandvera/__init__.pyagree.Test plan
🤖 Generated with Claude Code