Skip to content

Add coherence safeguards and README narratives#4

Merged
aallan merged 2 commits intomainfrom
housekeeping-coherence
Feb 23, 2026
Merged

Add coherence safeguards and README narratives#4
aallan merged 2 commits intomainfrom
housekeeping-coherence

Conversation

@aallan
Copy link
Copy Markdown
Owner

@aallan aallan commented Feb 23, 2026

Summary

Coherence-over-scale protections to prevent the project from drifting as it grows.

  • Spec code block validator (scripts/check_spec_examples.py) — extracts 154 code blocks from the spec, classifies them, and verifies the 21 currently-parseable blocks still parse. Catches spec rot when grammar changes break spec examples. Categorised allowlist tracks 30 spec/parser mismatches (the @T notation difference) and 4 future-syntax design proposals. Stale allowlist detection catches when spec edits shift line numbers.
  • Version sync check (scripts/check_version_sync.py) — verifies pyproject.toml and vera/__init__.py agree.
  • Dependabot — weekly PRs for pip and GitHub Actions dependency updates.
  • CODEOWNERS — automatic review requests on PRs.
  • CodeQL — security scanning on PRs and weekly schedule.
  • macOS CI — test matrix expanded from 3 jobs (ubuntu × 3 Python versions) to 6 jobs (ubuntu + macos × 3 Python versions).
  • CONTRIBUTING.md — pre-commit setup, validation scripts, branch protection rules documented.
  • README narratives — sub-headings and explanations for each code example in "What Vera Looks Like": Hello World (effects/contracts), absolute_value (postconditions/SMT), safe_divide (preconditions), increment (algebraic effects/state).

Test plan

  • All 284 tests pass
  • mypy clean (9 source files)
  • All 13 examples pass
  • Version sync passes
  • Spec validator passes (21 parsed, 34 allowlisted, 0 failures)
  • All pre-commit hooks pass
  • CI passes on GitHub (ubuntu + macOS matrix)

🤖 Generated with Claude Code

aallan and others added 2 commits February 23, 2026 08:51
Spec rot prevention:
- scripts/check_spec_examples.py extracts code blocks from spec/*.md,
  classifies them (parseable vs fragment), and verifies parseable blocks
  still parse with the current grammar. Categorised allowlist tracks
  FUTURE (design proposals), MISMATCH (spec/parser notation differences),
  and FRAGMENT (heuristic overrides). Stale allowlist entries are detected
  when spec edits shift line numbers.
- scripts/check_version_sync.py verifies pyproject.toml and
  vera/__init__.py agree on the version number.
- Both scripts run in CI lint job.

GitHub configuration:
- Dependabot for pip and GitHub Actions dependency updates (weekly).
- CODEOWNERS for automatic review requests.
- CodeQL security scanning workflow (weekly + on PRs).

CI improvements:
- Test matrix expanded to ubuntu-latest + macos-latest (6 jobs).
- Coverage gated on ubuntu/3.12 only.

Documentation:
- CONTRIBUTING.md: pre-commit setup, validation scripts, branch
  protection rules documented.
- README.md: sub-headings and narratives for each code example in
  "What Vera Looks Like" — explains contracts, slot references,
  preconditions, and algebraic effects for human readers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-advanced-security
Copy link
Copy Markdown

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@aallan aallan merged commit ce37d0f into main Feb 23, 2026
10 checks passed
@aallan aallan deleted the housekeeping-coherence branch February 23, 2026 09:03
aallan added a commit that referenced this pull request Mar 29, 2026
Add DE_BRUIJN.md cross-references to SKILL.md, CLAUDE.md, vera/README.md,
and DESIGN.md (table row, design principle #4, prior art entry).

Rewrite DESIGN.md to reflect current language state:
- Fix 'no exceptions' (Exn<T> effect exists)
- Add Set<T> to collections
- Add Inference effect row with LLM-call rationale
- Add modules row
- Replace bare 'Z3 via SMT-LIB' with three-tier verification section
- Add full effect system table (IO, Http, State, Exn, Async, Inference)
- Expand diagnostics row (LLM-instruction format, error codes E001-E702)
- Add 122-function standard library row
- Add WebAssembly to prior art
- Restructure: principles → decisions → verification pipeline → effects → prior art

Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan added a commit that referenced this pull request Apr 23, 2026
Two new issues surfaced by the same agent run:

- #515: $gc_collect itself faults with out-of-bounds memory access
  under sustained allocation pressure.  40×20 Conway's Game of Life
  over 200 generations reliably reproduces.  The collector walks
  past $heap_ptr to the linear-memory bound and traps — gc_collect
  is the top frame of the crashing stack, not the program.  This
  is distinct from #487 (alloc path) and #484 (sweeper 16-bit
  truncation).
- #516: Runtime traps bubble up as raw wasmtime stack traces; the
  CLI mis-labels every Trap/WasmtimeError as "Runtime contract
  violation" even when the actual cause is out-of-bounds memory
  access, integer overflow, unreachable, etc.  No source-line
  mapping, no actionable guidance — the exact opposite of the
  carefully crafted compile-time diagnostics.  Three-stage
  proposed scope: categorise trap reason, source-map the Vera
  function that trapped, specialise help for common trap classes.

KNOWN_ISSUES.md updated with rows for both.

ROADMAP.md implementation-order table re-sorted:
  1. #514 (nested closures + captured-scalar indirection)
  2. #515 (GC collect faults)
  3. #516 (runtime trap diagnostics)
  4. #475 (WASM translator bug cleanup — previously #1)
  5. #507 (Eq/Ord-dispatched array ops)

The top three are all surfaced by the single Game of Life agent
run — the empirical signal is the clearest prioritisation we've
had.  #475 was promoted to #1 after PR #511 merged; pushed to #4
as the new issues outrank it on agent-adoption impact.

Agent-noted quote, worth capturing as the motivation for #516
specifically:

  "the gap between 'the type system is happy' and 'the compiled
   artefact actually runs' is wider than you'd expect from a
   language with SMT-verified contracts. The verifier can prove
   your termination argument is sound while the codegen silently
   miscompiles your closure environment out from under you."

Also posted comment on #514 documenting the third shape the
agent is now isolating (closure body with i32/i64 type mismatch,
distinct from direct-nesting and captured-scalar-indirection).

No code changes.

Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan added a commit that referenced this pull request May 7, 2026
…nt-integration

Reframed "What's next" to acknowledge the stabilisation work uncovered
by the v0.0.119–v0.0.137 bug-killing campaign. Two patterns worth
closing out before declaring runtime-correctness floor raised:

1. Scale-dependent bugs slip past the standard test suite (#515,
   #570, #487, now #593 — all required real-world programs at scale
   to surface; none would have been caught by the existing 3,747
   small focused tests).

2. Walker-completeness gaps are silent failures of the same shape
   (#588 found _walk_free_vars missing 8 AST branches; we don't yet
   know if other walkers in the codebase have similar incompleteness).

The "Implementation order" table is now split into two tiers:

  - Stabilisation tier (priority): #593, #596, #597, #595
  - Agent-integration tier (resumes after): #222, #523, #370

#596 (stress-test harness) and #597 (walker-completeness audit) are
new issues filed in this PR — they convert "I think we're stabilising"
into evidence by building the infrastructure that prevents recurrence
of the bug classes #588 and #593 surfaced.

#593 retains the lead position because it's the largest open codegen
unknown — bisection narrows it to 12x30 + recursive run_loop with
allocating call argument, but the root cause hasn't been isolated.
Until then, "WASM codegen is stable at scale" is unverified.

#595 sits last in the stabilisation tier because it likely shares a
root cause with #593 (both surface together at the same scale); fixing
#593 may close it. The v0.0.137 KeyboardInterrupt guard already
prevents the Python-traceback half.

Order #5 (LSP) doesn't start until orders #1-#4 close (or are
explicitly deferred with an open follow-up). Tier separator is
explicit so future readers don't have to reverse-engineer the
ordering rationale.

Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan added a commit that referenced this pull request May 7, 2026
1. HISTORY.md "137 tagged releases" -> 138 (line 315): the v0.0.138
   row was added but the rollup footer count still said 137.

2. ROADMAP.md stabilisation-gate sentence: "order #5 starts when
   #1-#4 are closed" -> "order #4 starts when #1-#3 are closed".
   The earlier deletion of the #593 row (now closed) shifted the
   stabilisation tier to three items; the agent-integration tier
   numbering is also renumbered 5/6/7 -> 4/5/6 to match.

3. New test: test_eager_gc_array_mapi_with_non_allocating_string_closure
   mirrors the existing array_map String eager-GC test for the
   array_mapi unwind path with a fn(@Bool, @nat -> @string) closure
   returning pick(@Bool.0).  The existing array_mapi ADT test
   (test_array_mapi_non_allocating_closure_emits_balanced_push)
   covered the i32 branch; this new test covers the i32-pair branch.

4. Tighten the structural #593 predicate.  Pre-fix it accepted any
   "global.set $gc_sp" + absent "call $alloc"; CodeRabbit observed
   that a bare "global.set $gc_sp" could in principle be a stack-
   restore rather than a push.  In practice the prologue's
   gc_sp_save/restore is gated on ctx.needs_alloc and the regex
   already excludes call $alloc, so a non-allocating body cannot
   contain a restore -- but the literal predicate was weaker than
   the intent.  Adds _has_gc_shadow_push helper that anchors on the
   canonical "i32.const 4 / i32.add / global.set $gc_sp" SP-increment
   sequence emitted by gc_shadow_push (vera/wasm/helpers.py).  Used
   by both the i32-pair and i32-ADT structural tests; a non-push
   artefact in the body can no longer satisfy the assertion.

Plus: HISTORY.md tooling section gains a v0.0.138 row for
VERA_EAGER_GC alongside the existing tooling milestones (cross-
linked to ENVIRONMENT.md, which already has the full reference
section for it).

Suite: 3,747 passed (+1 from new array_mapi String test), 14 skipped.

Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan added a commit that referenced this pull request May 7, 2026
Two Conway's Life agent experiments on --target browser surfaced
five concrete blockers (issues #602/#603/#604/#609/#610) and an
explicit design memo (#608) mapping each obstacle to a concrete
runtime-only fix.  The current stabilisation framing covers
codegen reliability and walker completeness but treats browser-
target reliability as a separate concern.  The agent's diagnosis
makes that split harder to defend: "write once, run anywhere"
is currently true for pure computation and approximate-to-false
for anything with timing or screen output.  Two of the fixes
(#609 JSPI-driven IO.sleep, #610 ANSI subset interpreter) close
the timing and rendering halves of the seam without language
changes -- adding them to the stabilisation tier commits to
"browser-target Vera is something you'd actually use" before
the agent-integration push.

Changes:
- Expanded the campaign-pattern list from two patterns to three;
  third pattern documents the browser-target seam with links to
  the umbrella issue (#608) and the five concrete blockers.
- Added #609 (JSPI-driven IO.sleep) as item 3 in the stabilisation
  tier, with rationale + the WebAssembly JSPI / Asyncify mechanism.
- Added #610 (ANSI subset interpreter) as item 4 in the same tier,
  paired with #609 -- together they close the seam and let life.vera
  (terminal version) run unchanged on --target browser.
- Renumbered #595 from item 3 to item 5 (now last in stabilisation,
  since it's contingent on an upstream wasmtime-py release).
- Renumbered the agent-integration tier items 4/5/6 -> 6/7/8 to
  match.
- Updated the "What moves when" gate from "#4 starts when #1-#3
  are closed" to "#6 starts when #1-#5 are closed".

Co-Authored-By: Claude <noreply@anthropic.invalid>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants