Skip to content

ProveKit: closed-loop bug-report → verified bundle with substrate self-extension#4

Merged
TSavo merged 89 commits into
mainfrom
fix-loop
Apr 24, 2026
Merged

ProveKit: closed-loop bug-report → verified bundle with substrate self-extension#4
TSavo merged 89 commits into
mainfrom
fix-loop

Conversation

@TSavo
Copy link
Copy Markdown
Owner

@TSavo TSavo commented Apr 24, 2026

Summary

ProveKit: The Kit to Prove It's Fixed.

Implements the complete fix-loop plan (docs/plans/2026-04-23-fix-loop.md) plus the capture-the-change architectural shift, non-code artifact kinds, product rename, and documentation pass. Takes a bug report → produces a mechanically-verified bundle (patch + regression test + principle + optional new capability) gated by 18 oracles. Every LLM output feeds at least one mechanical verifier. The system extends its own substrate through the same atomic-bundle machinery that fixes bugs.

Architecture at a glance

  • 5 primitive registries (data, not code): capabilities, DSL relations, intake adapters, remediation layers, artifact kinds
  • 18 oracles gate every bundle across generation + verification
  • 2 bundle types (fix / substrate); substrate bundles self-extend the capability set
  • Capture-the-change pipeline: Claude edits files in a scratch overlay via agent(); git diff captures the structured change; oracles verify the result
  • 14 artifact kinds covering code, tests, docs, config, dependencies, prompts, lint rules, migrations — every kind of real-world fix is first-class

See README.md, ARCHITECTURE.md, RETROSPECTIVE.md.

What landed

Core plan (23 tasks + splits)

Architectural follow-ups

  • Oracle feat(self-contracts): encode 5-7 catalog-format spec rules as contract mementos #16 full execution: transpiles LLM-proposed extractor via ts.transpileModule → dynamic-imports → runs against positive/negative fixtures in a scratch SAST DB
  • Dogfood test (src/fix/dogfood.empty-catch.test.ts): proved the substrate-extension path closes the empty-catch gap end-to-end; oracle feat(py): port binaryCid + targetProofCid + sourceContractCid + EvidenceTerm to Python kit #14 rejected a DROP TABLE migration
  • CLI completion: provekit fix <ref> [--apply] [--max-sites N] runs the whole loop end-to-end with prDraft or autoApply modes
  • LLMProvider.agent(): Claude edits files in the overlay directly; C3–C6 migrated to capture-the-change; JSON-patch paths retained as backward-compat fallbacks
  • Artifact-kind expansion: 8 → 14 kinds covering test fixes, config updates, doc updates, dependency updates, prompt updates, lint rules, migration fixes
  • Rename: neurallogprovekit across the entire codebase (108 files, no legacy fallback); tagline wired into --version and CLI help
  • Documentation: README + ARCHITECTURE + RETROSPECTIVE for new readers

Stats

  • ~45 commits on this branch
  • 560 tests passing + 3 intentionally skipped + 1 flaky (apply.test.ts test 7 under parallel-load tmpdir race; passes in isolation)
  • tsc clean on src/fix/, src/sast/, src/dsl/, src/db/, src/llm/

Test plan

Known MVP pass-throughs (documented in RETROSPECTIVE.md)

🤖 Generated with Claude Code

TSavo and others added 30 commits April 23, 2026 12:21
Accepted design for a persistent-relational rebuild of the pipeline:
SQLite via Drizzle as the data layer, shadow AST with capability tables
over ts-morph, Datalog-flavored principle DSL compiling to Drizzle
queries, multi-level structural hashes for incremental invalidation,
instrumented execution + encoding-gap detector as the thesis-distinctive
oracle layer.

Scope honestly named: 4-6 months of focused work, phased A-G with each
phase inspectable on its own. Phase D ships the encoding-gap localization
capability THESIS.md commits to.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four parallel review agents surfaced real issues; each addressed:

CAPABILITY REVIEW (showstopper): spec's flagship DSL example referenced
decides.kind=="division" but no division is in the enum. Added seven
capability tables (node_arithmetic, node_assigns, node_returns,
node_member_access, node_non_null_assertion, node_truthiness, node_narrows),
two edge tables (signal_interpolations, post_dominance), and new columns
on existing capability tables (node_calls.callee_name, node_binding.kind,
node_decides.consequent/alternate, node_iterates.executes_at_least_once +
collection_source, node_throws.is_inside_handler). Collapsed redundancies
(try_handler into node_throws; error_throw signal renamed to throw_message).
Declared data_flow.slot vocabulary as closed set.

DSL REVIEW: added reusable predicate definitions (predicate zero_guard(...))
and taint abstractions (source/sink/sanitizer); added missing primitives
(branch_reaches, always_exits, literal_value, call_arity, method_name,
compound_assignment, encloses, mutates); renamed dominates() to directional
$a before $b reading; made parser lenient (no/not/not exists synonyms;
newlines/and both accepted); added compile-time enum/column/arity checks;
added LLM-generation test (20/23 threshold) as freeze gate.

PHASING REVIEW: re-sequenced from A-B-C-D to A-thin -> D-core -> A-full ->
B -> D-validation -> C -> E/F/G-parallel. Thesis-distinctive capability
ships week 3 riding on v1's existing harness/instrumented execution
instead of month 3 on greenfield foundation. Total time unchanged; risk
retirement order reversed.

INDEX REVIEW: added eight indexes (three driver indexes: node_signal(signal_kind,node_id),
node_smt(sort,node_id), contracts(stale_flag); four FK cascade indexes to
keep file delete O(n) not O(n^2); plus node_calls(callee_name,node_id) for
method-name principles). Materialize transitive data-flow closure at SAST
build time (symmetric with dominance) to avoid recursive-CTE cost on
cyclic inputs.

FOLLOW-UP from user ("no JSON columns anywhere; we have a data model,
leverage it"): promoted all runtime values to a relational graph
(runtime_values + runtime_value_object_members + runtime_value_array_elements).
trace_values, runtime_observation_values, gap_reports.smt_value/runtime_value,
clause_witnesses.model_value, and traces.outcome all now reference the graph
instead of storing JSON. Validator evidence split from one JSON column into
three structured tables per validator kind (semantic_classifier, semantic_diff,
adversarial). Design decision updated: no JSON columns in the schema,
period.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Installs better-sqlite3, drizzle-orm, drizzle-kit, and @types/better-sqlite3.
Adds openDb() with WAL mode and foreign-key enforcement; gitignores the DB files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 3: TDD pass — traces (no FK to clauses yet) + trace_values with
composite PK (traceId, nodeId, iterationIndex) and cascade delete.
Migration 0001 generated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds tv_single_point_unique — a partial unique index on (trace_id, node_id)
WHERE iteration_index IS NULL — to catch duplicate writes for the common
single-point snapshot case that SQLite's composite PK silently allows when
iterationIndex is NULL. Column shape unchanged; test coverage added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds three tables for SMT clause storage: clauses (with verdict enum
proven/violation/unknown/vacuous), clause_bindings (composite PK on
clauseId+smtConstant, cascade delete), and clause_witnesses (composite
PK, NOT NULL modelValueId FK to runtime_values, cw_by_value index).
Migration: drizzle/0003_workable_zarek.sql. 93/93 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove `sort` from `clause_witnesses` (it duplicated `clause_bindings.sort`
  for the same (clauseId, smtConstant) pair, allowing silent drift).
- Add composite FK (clauseId, smtConstant) → clause_bindings with ON DELETE
  CASCADE, replacing the direct clauseId → clauses FK. Cascade chain is intact:
  clauses → clause_bindings (cascade) → clause_witnesses (cascade).
- Close DB handle before rmSync in afterEach to avoid file-handle leaks.
- Add tests: FK rejects witness with no binding; cascade-deletes witness on
  binding delete.
- Migration 0004 uses SQLite table-rebuild (PRAGMA foreign_keys=OFF, temp table,
  INSERT SELECT, rename) since ALTER TABLE can't add composite FKs or drop cols.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…values

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… truncation)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion + persistence

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nts + persistence

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…urce_expr, sort)

Adds a Binding Metadata section to the invariant derivation prompt. Every
declared SMT constant must come with a binding record pointing at the
source line and expression it represents, plus its SMT sort. The
downstream gap detector uses these to validate Z3's verdict against
runtime behaviour; without bindings, proofs are unactionable.

Teaches the LLM via:
- concrete field semantics (smt_constant must match declare-const name,
  source_line is 1-indexed, source_expr is a whitespace-normalised
  substring hint, sort matches the declared sort)
- an <abstract> sentinel for pure-model constants with no source line
- a worked example pairing an smt2 block with its bindings block
- the per-block pairing rule when a signal produces multiple blocks

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prompt test was placed in prompts/ and also mirrored in src/. The
vitest config only covered src/**/*.test.ts so npx vitest run
prompts/invariant_derivation.test.ts returned no-files-found. Extending
the include glob makes both the test path specified in the task and the
full-suite run find it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The snapshot hook used a fixed globalThis.__neurallog_snapshot__, which
two parallel runHarnessWithTrace calls clobbered: run A installed its
array-pushing function, run B overwrote it with its own, run A's
instrumented code then pushed into run B's array, and run A returned
with zero captures. The full test suite exposed this when
gapDetection.test.ts and harness.captureTrace.test.ts ran in parallel.

Fix:
- instrumentForSnapshot accepts snapshotFnName (default kept for tests
  of the module in isolation; harness overrides it).
- runHarnessWithTrace generates an 8-byte random suffix per call, uses
  it for both the global name and the on-disk instrumented file path
  (which also raced when two runs targeted the same source).
- New test: two Promise.all()-parallel harness calls against different
  sources must each see their own captures. Regression-tests this bug
  deterministically rather than relying on pool-ordering luck.

Does not address the Math.random / Date.now re-entrance race (concurrent
runs can leave the stubs installed beyond their intended scope). That
has no active test signal and is noted for follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds explainGaps(db, contractKey) to src/cli.ts. Joins gap_reports to
clauses by contract key and, for each row, renders a header naming the
location and SMT constant, the Z3 model value and runtime observation
looked up through the runtime_values graph, the cause explanation, and
the gap kind.

Wires a --gaps flag into the existing `neurallog explain <file>:<line>`
command. When present, after the v1 ContractStore resolves the contract
by file+line (for its key), we short-circuit and read from the new
SQLite store instead of v1's proven/violations text. Two paths because
the encoding-gap output is meaningless for v1 contracts that predate
Phase D's binding metadata.

Covers three render cases: a normal IEEE gap with both SMT and runtime
values, the empty-store case ('No encoding gaps for X'), and a gap with
no values (invalid_binding) that should still render cleanly.

135/135 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the full Phase D pipeline against a committed fixture:
examples/division-by-zero.ts runs divide(0, 0), which produces NaN
(IEEE 754) despite the naive Real-arithmetic encoding that says the
quotient is finite 0.0. The test drives validator → Z3 model parser →
instrumented harness → value serializer → IEEE agent → gap_reports
insert → explainGaps render, and asserts the rendered output contains
'encoding-gap', '— q', 'NaN', and 'ieee_specials'.

This is the deliverable promised in THESIS.md: the tool catches a
specific encoding gap and attributes it to a specific binding with a
specific cause (SMT Real does not model IEEE's NaN). Phase A-thin +
D-core is functionally complete.

136/136 tests pass across 3 consecutive full-suite runs (race fix from
f103ca3 held).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships the thesis-distinctive encoding-gap capability on top of the
existing v1 harness. The pipeline now catches IEEE 754 / Real-arithmetic
divergences and attributes them to specific bindings with specific
causes.

Added:
- SQLite data layer via Drizzle (8 tables: runtime_values graph with
  object_members and array_elements, traces, trace_values, clauses,
  clause_bindings, clause_witnesses, gap_reports; 6 migrations).
- Z3 model parser (Real/Int/Bool/String plus NaN/Infinity/div_by_zero
  specials; BigInt for Int to avoid precision loss).
- Runtime value serializer (JS values → relational graph; circular
  refs; string/array truncation).
- ts-morph snapshot instrumentation with unique-per-call globals and
  on-disk paths (race-safe for concurrent harness runs).
- runHarnessWithTrace wiring on the existing harness.
- Line-based binding validator (transitional; SAST-based validation
  lands in Phase B).
- Three comparator agents: IEEE specials, outcome mismatch,
  path-not-taken. Independent files; new sort agents drop in.
- Gap detection orchestrator tying validator → Z3 parse → harness →
  agents → gap_reports persist.
- LLM derivation prompt extended to emit smt_bindings per declared
  constant, with a worked example and abstract-sentinel rule.
- CLI explain --gaps reads gap_reports + runtime_values and renders
  THESIS-style output.
- End-to-end fixture (examples/division-by-zero.ts) driving the full
  pipeline to an IEEE NaN gap report.

Review loops caught and fixed:
- NULL-in-composite-PK silent duplicates on trace_values →
  partial unique index.
- Sort duplication and missing structural FK on clause_witnesses →
  drop duplicate column, add composite FK with cascade chain.
- Global-state race on __neurallog_snapshot__ across concurrent
  harness runs → unique per-call name + per-call instrumented file
  path, with a parallel-runs regression test.

Deferred (flagged in commits):
- Math.random / Date.now stub re-entrance race (no active test signal).
- Object-truncation marker asymmetry with array branch.
- persistWitness transaction wrapping for Phase B scale.

Spec: docs/specs/2026-04-23-provekit-v2-design.md
Plan: docs/plans/2026-04-23-phase-ad-core.md

136/136 tests pass.
8-task plan wiring Phase D gap detection into `neurallog analyze`.
Bindings are derived mechanically from template match data (no LLM per
signal). Dormant invariant_derivation.md loader gets removed. New
GapDetectionPhase runs between derivation and axiom phases.

Also lands the SmtBinding type + optional smt_bindings field on
Violation as the foundation all subsequent tasks build on.

Baseline: 136/136 tests pass.
buildPrompt, loadTemplate, and compiledTemplate had zero call sites. The
pipeline runs entirely on mechanical templates via templateEngine;
invariant_derivation.md is reference documentation, not runtime input.
… data

Each TemplateResult now carries bindings mapping the emitted SMT
constants to their source positions. Derived from AST match metadata
the engine already computes; no per-principle authoring. Binary-expr
principles covered: division-by-zero, modulo-by-zero, addition-overflow,
subtraction-underflow, multiplication-overflow, falsy-default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two concurrent runHarnessWithTrace calls racing on Math.random/Date.now:
Run A installed its stub, Run B overwrote with its own, Run A's restore
wrote B's stub back, leaving Math.random permanently stubbed as 0.5
after both exited. Post-snapshot-name fix this was silently broken —
not caught by the fixed-name race test because the snapshot fn is now
per-call but the globals stayed shared.

Fix: module-level ref count + origRandom/origNow pinned on 0→1
transition, restored on N→0. Concurrent callers cooperate; the last
to exit restores the reals.

8/8 consecutive full-suite runs green after the fix; flake observed
previously on 1-in-5 interleaved runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 185a1783f3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/harness.ts
// two concurrent runs against the same source would race on the path.
const callId = randomBytes(8).toString("hex");
const snapshotFnName = `__neurallog_snapshot_${callId}__`;
const instrumentedPath = sourcePath.replace(/\.ts$/, `.__instrumented_${callId}__.ts`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid deleting non-.ts source files in trace harness

instrumentedPath is derived with sourcePath.replace(/\.ts$/, ...), which is a no-op for .tsx, .js, or .jsx inputs. In those cases writeFileSync(instrumentedPath, instrumented) overwrites the original source file, and the finally block unconditionally unlinkSyncs that same path, deleting the user’s source file after gap detection. This is a destructive bug whenever contracts come from non-.ts files.

Useful? React with 👍 / 👎.

Comment thread src/pipeline/Pipeline.ts
Comment on lines +90 to +92
const contracts = new ContractStore(config.projectRoot).getAll();
const { data: gapDetection } = await this.gapDetectionPhase.execute(
{ derivation, projectRoot: config.projectRoot, contracts },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict gap detection input to newly derived contracts

runFull passes ContractStore.getAll() into GapDetectionPhase, so any run that has at least one stale bundle reprocesses every historical contract, not just this derivation’s output. Because GapDetectionPhase inserts fresh clauses/gap_reports rows per violation, this inflates duplicate rows and runtime on incremental runs unrelated to most contracts. Feed only the contracts derived in this run (or add deduping) to avoid repeated re-ingestion.

Useful? React with 👍 / 👎.

TSavo and others added 6 commits April 24, 2026 14:47
After plan confirmation, the CLI now invokes runFixLoop end-to-end:
- Default: prDraft mode, writes ./neurallog-fix.patch + ./neurallog-fix.md
- --apply: autoApply mode, cherry-picks onto target branch after verification
- --max-sites N: override maxComplementarySites (default 10)

Test seam: runFixArgs.runFixLoopFn is injectable; defaults to the real
orchestrator. CLI tests mock it to exercise flag parsing + stdout shape
+ exit codes without re-running the C1-D3 pipeline.

Exit codes: 0 success, 1 intake, 2 locate, 3 classify, 4 user-declined,
5 orchestrator failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add LLMProvider.agent() as an optional method to both LLMProvider
interfaces (llm/Provider.ts and fix/types.ts). ClaudeAgentProvider
implements it via the claude-agent-sdk query() with cwd/allowedTools/
maxTurns routing. StubLLMProvider gets an optional agentResponses
constructor arg that, when supplied, assigns the agent() field.

New captureChange.ts reconstructs a CodePatch from git diff after the
agent completes, excluding .neurallog/ overlay internals. The C3 stage
(generateFixCandidate.ts) dispatches to the agent path when llm.agent
is defined, falling back to the existing JSON-patch path for providers
without agent() support. One retry with oracle-#2 feedback is included.

18/18 tests pass (15 existing + 3 new captureChange + 2 new candidateGen).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…N fallback)

C4 (complementary): adds proposeChangeForSiteViaAgent in complementary.ts.
Per-site baseline snapshot captures all files changed vs HEAD before each agent
call; after the agent runs, new-since-baseline files are detected, then reverted
to HEAD or baseline so verifySiteChange can apply the patch from scratch.
generateComplementary dispatches on llm.agent?.

C5 (regression test): adds generateTestCodeViaAgent in testGen.ts. Agent writes
the test file directly; applyPatchToOverlay idempotently keeps overlay.modifiedFiles
in sync. generateRegressionTest dispatches on llm.agent?; oracle #9 unchanged.

C6 (capability substrate): adds proposeCapabilitySpecViaAgent in capabilityGen.ts.
Agent writes capability files under .neurallog/capability-proposal/<name>/
(already filtered by runAgentInOverlay). proposeCapabilitySpec accepts optional
overlay and dispatches; overlay is threaded through orchestrator →
generatePrincipleCandidate → proposeWithCapability → proposeCapabilitySpec.

Tests: agent path + JSON backward-compat path added for all three stages.
All 46 tests pass (1 slow integration test remains skipped).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…c/dep/prompt/lint/migration)

Un-stub startup_assert, documentation, and observability_hook with real isPresent logic
based on complementary change kinds and file path patterns. Add 6 new artifact kinds
(test_fix, config_update, dependency_update, prompt_update, lint_rule, migration_fix)
with a shared helpers.ts for file-pattern detection. Registry grows 8 → 14 kinds.
Tests grow 9 → 17 in artifactKindRegistry.test.ts; full suite 560 pass (apply.test.ts
worktree-cleanup flake is pre-existing, unrelated to this change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hard rename across all source, tests, config, and docs:
- CLI bin: neurallog → provekit (package.json + package-lock.json)
- Directory convention: .neurallog/ → .provekit/ (37 principle files mv'd via git mv)
- DB filename: neurallog.db → provekit.db
- Ignore file: .neurallogignore → .provekitignore
- Snapshot global: __neurallog_snapshot__ → __provekit_snapshot__
- Artifact outputs: neurallog-fix.patch/md → provekit-fix.patch/md
- Transport class: NeurallogTransform → ProvekitTransform, createNeurallogTransport → createProvekitTransport
- All path joins, error messages, comments, doc strings updated
- Root-level .md product docs (PITCH, LANDING, PRODUCT, THESIS, SIGNALS, POSTMORTEM) fully renamed
- SPEC.md and docs/plans/** archived with top-of-file note
- Version banner: "provekit v0.3.0 / The Kit to Prove It's Fixed." added to --version, printHelp(), and init command

No legacy fallback paths; no "try .neurallog/ if .provekit/ doesn't exist" logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@TSavo TSavo changed the title fix-loop: closed-loop bug-report → verified bundle with substrate self-extension ProveKit: closed-loop bug-report → verified bundle with substrate self-extension Apr 24, 2026
@TSavo TSavo merged commit 042c20a into main Apr 24, 2026
1 check passed
TSavo added a commit that referenced this pull request May 4, 2026
…closes #211)

Side A substrate-conformance work for the swift kit. Two related changes,
both required for issue #211 acceptance:

1. SwiftSyntax-based lifter (Path B): replaces regex-v0 in
   `SwiftLifter.swift` with a real AST walker using Apple's official
   swift-syntax 600.x library. The wire shape (declarations, callEdges,
   warnings) is byte-preserved; LSPTests' 11 integration tests pass
   unmodified. Adds InitializerDeclSyntax handling (regex-v0 never
   matched `init`); the addition is additive and downstream-safe.

2. swift-self-contracts lift surface + RPC mode (Path A): mirrors the
   PR #220 daemon-lifecycle pattern (ts Side A) and PR #217 (cpp Side A
   wiring). The swift mint binary now speaks the lift-plugin protocol
   over NDJSON-on-stdio: initialize -> lift (proof-envelope shape) ->
   shutdown, with stdin-EOF graceful exit per architect rule #3.

   * `MintSwiftSelfContracts/main.swift`: thin entry-point that branches
     on `--rpc` argv. Slab moved to `Slab.swift`; RPC handlers in
     `RPC.swift`. SPM compiles all three into one module.
   * `Slab.swift`: pure function `swiftSelfContracts() -> [Declaration]`
     authoring all 11 lift-plugin-protocol contracts (C1-C8, 11 facets),
     callable from both human and RPC entry points.
   * `RPC.swift`: NDJSON loop; `handleLift` walks the slab, computes
     content-meaningful CIDs, emits a proof-envelope.
   * `.provekit/lift/swift-self-contracts/manifest.toml`: declares
     surface + capabilities + binary path. Mirrors the ts/go/cpp peers.
   * `cmd_mint.rs`: KIT_TABLE swift entry routes to swift-self-contracts.

Cross-kit IR byte-equality (per dispatch directive):
- contractSetCid: byte-equivalent. Computed via the canonical formula
  blake3_512(JCS(sorted([contract_cid(c) for c in slab]))) where each
  contract_cid = blake3_512(JCS({name, outBinding, pre?, post?, inv?})),
  matching `provekit-claim-envelope::contract_cid` and
  `provekit-claim-envelope::compute_contract_set_cid` exactly.
- proof-envelope bytes: NOT byte-equivalent. The swift kit emits a
  JCS-JSON catalog with self-identifying `kind:
  "swift-self-contracts-catalog-phase3-pending"` rather than the
  cross-kit CBOR-signed catalog. This is consistent with the existing
  Phase 3 deferral noted in `MintSwiftSelfContracts/main.swift:208-211`
  (now `main.swift:46-48`); a follow-up PR can land the full CBOR +
  Ed25519 envelope (mirroring PR #221 for python). The dispatcher only
  requires `kind:"proof-envelope"`, non-empty `filename_cid`, and
  decodable `bytes_base64` — all satisfied. The filename_cid changes
  whenever the slab changes, satisfying acceptance gate #2.

Acceptance gate #4 ("provekit prove --kit=swift exits 0"): provekit
prove doesn't accept --kit (see `provekit-cli/src/main.rs:119`); the
intent is read as "mint pipeline succeeds end-to-end on macOS." `make
mint-swift` runs cleanly and `verify-self-contracts` returns OK against
the new attestation, demonstrating the full pipeline.

- [x] `swift run conformance` (10/10 PASS)
- [x] `swift run test-swift-lsp` (11/11 PASS — wire shape preserved)
- [x] `swift run mint-swift-self-contracts` human mode (CID matches)
- [x] `mint-swift-self-contracts --rpc` direct subprocess (initialize/lift/shutdown trip)
- [x] `cargo test cmd_mint` (9/9 unit tests pass)
- [x] `cargo test mint_kit_integration swift_kit_pins_expected_contract_set_cid` (PASS)
- [x] `make mint-swift` end-to-end: contractSetCid pinned, attestation OK
- [x] Three independent paths agree on contractSetCid blake3-512:272543ef...

- swift contractSetCid: blake3-512:272543efe7c47b911659e1fc6a7368431b6eaa6010d2560a5d3e6717fcd470b50b24b607b481272941764b731d890d6973ab88e6000bde96fd306163a5742c56

Pre-existing test failures unrelated to #211: rust/cpp self-contracts
binaries are not built in this worktree, causing
`rust_kit_contract_set_cid_is_pinned_to_self_contracts_canonical` and
`cpp_kit_contract_set_cid_is_pinned_to_self_contracts_canonical` to
fall back to empty-set CIDs. Not a regression introduced by #211.

Closes #211.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSavo added a commit that referenced this pull request May 5, 2026
…370, #384)

Fixes all P1/major/minor issues flagged by Copilot + CodeRabbit:

walk.rs:
- Bug #1: nested-block let-bindings now substituted at nested callsites;
  `CallsiteHit` carries `preceding_inner_stmts` (innermost-first) populated
  during the forward traversal; backward walk applies them before the outer-
  body statements.
- Bug #3: actual-arg lift now delegates to `crate::lift::lift_expr_to_term`
  (semantic, structure-stable) instead of the old token-string fallback;
  binary-expr actuals produce Ctor(op, ...) not `<expr:40 + 2>` tokens.
- Bug #4: arity mismatch now logs a warning and skips the callsite rather
  than silently zip-truncating. New test asserts the skip.
- Bug #11: stale MVP-scope comment updated (if-strengthening is live).

lift.rs:
- Bug #5: `debug_assert!` removed from the macro match arm; it is compiled
  out in release builds and must not contribute to contracts.
- Bug #6: `lift_function_postcondition` now scans for later `let` bindings
  that shadow an entry assertion's free variables and drops unsound asserts.
  Narrowed to LATER rebindings only (a preceding `let y = x; assert!(y ≥ 5)`
  is sound).
- Bug #7: explicit `return expr;` tails now derive `result = <lifted expr>`
  postcondition (previously only trailing-no-semicolon expressions matched).

canonical.rs:
- Bug #8: non-i64 JSON Numbers (floats, large u64) are now encoded as a
  tagged object `{"__provekit_non_i64_number__": "42.5"}` instead of a plain
  string, preventing numeric constants and string literals of the same text
  from sharing a CID.

walk_emit.rs / walk_demo.rs:
- Bug #9: `all_param_names` now synthesizes stable placeholder names
  (`__self`, `__arg{i}`) for non-Ident patterns so formal arity stays
  aligned with actual-argument count.
- Bug #10: NaN% division guarded with `if second_run_total > 0`.

11 new unit tests added (156 lib tests total, up from 145; 12 integration
tests all still passing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TSavo added a commit that referenced this pull request May 12, 2026
…#596/#598/#599/#600

Per-kit changes:
- rust/provekit-cli: extend KNOWN_SURFACES anchor test to assert ruby-source, csharp-source, swift-source, zig-source are all registered (were in the array, not in the test)
- Makefile: wire test-ruby and test-php into test-all so the new lift kits run under ci (boy-scout -- flagged in #598 review)
- typescript-language-signature/mint.sh: replace find with explicit enumeration to match c11/rust pattern and remove portability footgun (PR #595 nit #3)
- ruby/ruby_source.rb: add instance_variable_get, instance_variable_set, const_get, const_set to METAPROGRAMMING_CALLS (PR #598 nit #4)
- php/PhpSourceCompiler.php: simplify isUnit() to remove precedence-reliant double-condition (PR #600 nit #2)

Already swept on origin/main (codex/lift-kit-nits-cleanup, empty cherry-picks confirmed):
- effect sort-key alignment: 5:unresolved_call -> 5:unresolved: (ruby + swift)
- csharp-source backfill in KNOWN_SURFACES
- swift: emit Refusal for unparseable function signatures + ThrowStmt API fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TSavo added a commit that referenced this pull request May 12, 2026
…#596/#598/#599/#600

Per-kit changes:
- rust/provekit-cli: extend KNOWN_SURFACES anchor test to assert ruby-source, csharp-source, swift-source, zig-source are all registered (were in the array, not in the test)
- Makefile: wire test-ruby and test-php into test-all so the new lift kits run under ci (boy-scout -- flagged in #598 review)
- typescript-language-signature/mint.sh: replace find with explicit enumeration to match c11/rust pattern and remove portability footgun (PR #595 nit #3)
- ruby/ruby_source.rb: add instance_variable_get, instance_variable_set, const_get, const_set to METAPROGRAMMING_CALLS (PR #598 nit #4)
- php/PhpSourceCompiler.php: simplify isUnit() to remove precedence-reliant double-condition (PR #600 nit #2)

Already swept on origin/main (codex/lift-kit-nits-cleanup, empty cherry-picks confirmed):
- effect sort-key alignment: 5:unresolved_call -> 5:unresolved: (ruby + swift)
- csharp-source backfill in KNOWN_SURFACES
- swift: emit Refusal for unparseable function signatures + ThrowStmt API fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TSavo added a commit that referenced this pull request May 12, 2026
…599/#600 (#632)

* chore(lift-kits): sweep accumulated MERGE-WITH-NITS nits from PRs #595/#596/#598/#599/#600

Per-kit changes:
- rust/provekit-cli: extend KNOWN_SURFACES anchor test to assert ruby-source, csharp-source, swift-source, zig-source are all registered (were in the array, not in the test)
- Makefile: wire test-ruby and test-php into test-all so the new lift kits run under ci (boy-scout -- flagged in #598 review)
- typescript-language-signature/mint.sh: replace find with explicit enumeration to match c11/rust pattern and remove portability footgun (PR #595 nit #3)
- ruby/ruby_source.rb: add instance_variable_get, instance_variable_set, const_get, const_set to METAPROGRAMMING_CALLS (PR #598 nit #4)
- php/PhpSourceCompiler.php: simplify isUnit() to remove precedence-reliant double-condition (PR #600 nit #2)

Already swept on origin/main (codex/lift-kit-nits-cleanup, empty cherry-picks confirmed):
- effect sort-key alignment: 5:unresolved_call -> 5:unresolved: (ruby + swift)
- csharp-source backfill in KNOWN_SURFACES
- swift: emit Refusal for unparseable function signatures + ThrowStmt API fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ruby-language-signature): regen op_source-unit.spec.json with notes field

The generator `generate_assets.py` was updated to emit a `notes` field on
the `source-unit` op spec but the committed file was never regenerated.
CI gate at Makefile:606 (`generate_assets.py --check`) caught the staleness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add composer to conformance job apt block (fixes test-php exit 127)

test-php runs `composer install && composer test`; the conformance job
installed php-cli but never composer, so the command was not found.
Adds the `composer` Ubuntu apt package to the shared system-deps step
so make test-all (and make ci) can reach it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
TSavo pushed a commit that referenced this pull request May 13, 2026
…pperMemento, validate_against, ParametricRealizationMemento

Blocker #1: Mint RealizationPlanMemento after each kit dispatch in
realize_function / apply_canonical_rewrite.

Blocker #2: Call plan.validate_against(&realization) after construction;
propagate RealizationPlanError.

Blocker #3: Build synthetic ParametricRealizationMemento inline (one slot
per param) via mint_realization_artifacts; catalog lookup path acknowledged
as future enhancement per cmd_bind.rs gap comment.

Blocker #4: Mint ObservationWrapperMemento when mode in {witness, monitor,
dispatcher} and kit returns observation_wrapper_emission_record; call
wrapper.validate before persisting; set plan.observation_wrapper_cid.

Nit: used_sugars subset check in invoke_realize returns
ext:unauthorized-sugar error when kit returns a CID not in the cited set.

Add observation_wrapper_emission_record to RealizedSource (both
kit_dispatch and cmd_transport layers) extracted from kit JSON response.

Extend EngineResult with realization_plan_mementos and
observation_wrapper_mementos; write to .provekit/bindings/
realization-plans/ and observation-wrappers/.

Add three unit tests in cmd_transport for mint_realization_artifacts
asserting: plan IS minted, validate_against passes, and wrapper IS minted
for mode=witness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TSavo added a commit that referenced this pull request May 13, 2026
* Route contract sugar through realize kits

* Add four substrate-side mints: RealizationPlanMemento, ObservationWrapperMemento, validate_against, ParametricRealizationMemento

Blocker #1: Mint RealizationPlanMemento after each kit dispatch in
realize_function / apply_canonical_rewrite.

Blocker #2: Call plan.validate_against(&realization) after construction;
propagate RealizationPlanError.

Blocker #3: Build synthetic ParametricRealizationMemento inline (one slot
per param) via mint_realization_artifacts; catalog lookup path acknowledged
as future enhancement per cmd_bind.rs gap comment.

Blocker #4: Mint ObservationWrapperMemento when mode in {witness, monitor,
dispatcher} and kit returns observation_wrapper_emission_record; call
wrapper.validate before persisting; set plan.observation_wrapper_cid.

Nit: used_sugars subset check in invoke_realize returns
ext:unauthorized-sugar error when kit returns a CID not in the cited set.

Add observation_wrapper_emission_record to RealizedSource (both
kit_dispatch and cmd_transport layers) extracted from kit JSON response.

Extend EngineResult with realization_plan_mementos and
observation_wrapper_mementos; write to .provekit/bindings/
realization-plans/ and observation-wrappers/.

Add three unit tests in cmd_transport for mint_realization_artifacts
asserting: plan IS minted, validate_against passes, and wrapper IS minted
for mode=witness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Code <agentwopr@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants