feat(proof): PROOF9 quality memory system (#422)#447
Conversation
…ation (#422) Implement the complete PROOF9 subsystem: - Core proof package (models, ledger, obligations, scope, evidence, stubs, capture, runner) - 9 proof gates: UNIT, CONTRACT, E2E, VISUAL, A11Y, PERF, SEC, DEMO, MANUAL - SQLite ledger for requirements and evidence (raw sqlite3, follows existing patterns) - Keyword-based glitch classification → obligation mapping - Scope intersection engine using git changed files - Evidence attachment with SHA-256 checksums - Test stub generator for each gate type - Waiver system with expiry tracking - CLI commands: capture, run, list, show, waive, status - 42 tests covering all modules
WalkthroughAdds a new PROOF9 quality-memory subsystem: data models, SQLite ledger, capture/run CLI commands, obligation classification, scope intersection, evidence attachment, stub generation, and a runner that executes gates and updates requirement status. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as CLI (cf proof capture)
participant Capture as Capture Engine
participant Ledger as Ledger
participant Stubs as Stub Generator
User->>CLI: cf proof capture --title ... --description ...
CLI->>Capture: capture_requirement(workspace, title, description, where, ...)
Capture->>Capture: classify_glitch(description)
Capture->>Capture: get_obligations(glitch_type)
Capture->>Capture: build_scope_from_capture(where)
Capture->>Ledger: next_req_id(...) / save_requirement(req)
Capture->>Stubs: generate_stubs(req)
Stubs-->>Capture: stubs dict
Capture-->>CLI: (Requirement, stubs)
CLI->>User: print created REQ summary
sequenceDiagram
participant User
participant CLI as CLI (cf proof run)
participant Runner as Runner Engine
participant Gates as Core Gates
participant Ledger as Ledger
participant Evidence as Evidence API
User->>CLI: cf proof run [--full] [--gate unit]
CLI->>Runner: run_proof(workspace, full, gate_filter)
Runner->>Ledger: check_expired_waivers(workspace)
Runner->>Ledger: list_requirements(workspace, status=OPEN)
Runner->>Runner: get_changed_scope(workspace)
loop per applicable requirement
loop per obligation
Runner->>Gates: _run_gate(gate)
Gates-->>Runner: (passed, output)
Runner->>Evidence: attach_evidence(workspace, req_id, gate, artifact, passed, run_id)
Evidence->>Ledger: save_evidence(...)
end
Runner->>Ledger: save_requirement(updated_req_status)
end
Runner-->>CLI: results mapping
CLI->>User: display results table / exit code
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
PROOF9 Quality Memory System — Code ReviewGood architectural foundation here. The package structure is clean, the separation of concerns is well thought out, and the CLI commands follow the established project conventions. A few issues need attention before merge, ranging from a critical data integrity bug to some lower-priority improvements. CriticalID collision in The current implementation uses Recommend switching to ImportantObligation status not persisted on partial-gate runs (
Fix: persist requirement state after every gate execution loop, not only on full success.
The valid values ( Connection leak in ledger functions No ledger function uses E2E stub template raises The template uses Low Priority
Rich markup injection in Test Coverage GapsThe 42 tests are a solid foundation, but these scenarios are missing:
What Is Working Well
The critical ID collision issue and the obligation persistence bug should be resolved before merge. The other items can be addressed in a follow-up. |
There was a problem hiding this comment.
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@codeframe/cli/proof_commands.py`:
- Around line 178-191: The command currently computes all_pass and prints the
summary but always exits zero; change the failure path to exit non-zero by
invoking sys.exit(1) (or raise SystemExit(1)) when all_pass is False after the
summary printing. Add an import sys at the top of the module if missing and
apply the exit call in the block following the console.print that shows
"[red]Some obligations failed.[/red]" so the CLI (proof run) returns a non-zero
status for CI.
- Around line 323-358: The CLI currently hardcodes approved_by="cli-user" and
leaves manual_checklist empty; change waive to determine the real approver
(prefer workspace metadata if available, e.g., check workspace.current_user or
workspace.user and fall back to getpass.getuser()) and include that value when
constructing Waiver, and collect manual_checklist from the user (add a new
Option --manual-checklist that accepts a comma-separated string or prompt
interactively via typer.prompt, split into a list) before calling
ledger.waive_requirement; update the Waiver instantiation in waive to set
approved_by to the resolved approver and manual_checklist to the parsed list.
In `@codeframe/core/proof/evidence.py`:
- Around line 53-56: The current loop returns True if any historical evidence
(from evidence_list returned by ledger.list_evidence(workspace, req.id)) has
ev.gate == gate and ev.satisfied, which lets an old passing record permanently
mark the gate satisfied; instead, filter evidence_list to the entries for the
target gate, pick the most recent evidence (by a timestamp/created_at field or
run identifier on ev), and derive satisfaction from that latest ev.satisfied
(also prefer evidence from the current run if present), i.e., replace the
early-return loop with logic that selects the newest matching ev and returns its
satisfied value.
- Around line 20-23: Instead of hashing the literal path when the artifact is
missing, make the code reject missing artifacts: in the block that currently
does if path.exists(): h.update(path.read_bytes()) else
h.update(file_path.encode()), replace the else branch with an explicit failure
(raise an appropriate exception such as FileNotFoundError or return an error
result) so missing files cannot produce a checksum and thus cannot be marked
satisfied by attach_evidence(..., satisfied=True, ...); ensure the error
includes the file_path and context so callers can handle/report the missing
artifact.
- Around line 27-46: attach_evidence currently persists Evidence but doesn't
emit an audit event; after saving the evidence with
ledger.save_evidence(workspace, evidence) call core/events.py's event emission
function (e.g., import and call emit_transition or emit from core.events) to
record a transition like "evidence.attached" with a payload containing at least
req_id, gate, run_id, satisfied, artifact_path, artifact_checksum, and timestamp
(use the Evidence instance fields), then return the evidence; place the emit
call immediately after ledger.save_evidence and ensure the import of the emitter
is added at top of the module.
In `@codeframe/core/proof/ledger.py`:
- Around line 178-204: The current ID allocation is racy because next_req_id()
relies on COUNT(*) and save_requirement() uses INSERT OR REPLACE, allowing
concurrent captures to get the same REQ ID and overwrite each other; fix by
moving to an atomic DB-generated ID (e.g., add an INTEGER PRIMARY KEY
AUTOINCREMENT sequence or a dedicated seq table) and change next_req_id() to
obtain the ID atomically via an INSERT + lastrowid (or SELECT from
sqlite_sequence) rather than COUNT(*), and update save_requirement() to perform
a plain INSERT that uses that atomic ID-derived REQ identifier (and NOT INSERT
OR REPLACE) so uniqueness is enforced by the DB and concurrent captures cannot
overwrite each other; update the proof_requirements schema/migration and adjust
any code paths that build the REQ-#### string to derive it from the atomic
integer returned by the new sequence logic.
- Around line 178-204: save_requirement and other proof write paths (requirement
saves, evidence inserts, waiver saves, expired-waiver reopenings) currently
mutate PROOF9 state but do not emit audit events; update each mutating function
(e.g., save_requirement plus the evidence insert, save_waiver, and
reopen-expired-waiver functions referenced) to import the core/events module and
emit a descriptive event after the DB commit using a consistent event name
(e.g., "proof.requirement.saved", "proof.evidence.inserted",
"proof.waiver.saved", "proof.waiver.reopened"), including a payload with
workspace_id, affected entity id(s) (req.id, evidence id, waiver id), actor
(created_by or current user), timestamp, and before/after or changed fields
where available; ensure emissions occur only after successful commit and include
error handling so failures to emit do not corrupt DB state.
In `@codeframe/core/proof/obligations.py`:
- Around line 29-32: The SECURITY_ISSUE heuristic is too permissive because
common domain tokens ("auth", "password", "token") inflate the SECURITY_ISSUE
bucket; remove those strings from GlitchType.SECURITY_ISSUE and instead place
them in a less-weighted bucket (or create GlitchType.AUTH_ISSUE) so they don't
trigger the stricter security label; also update any scoring/selection code that
returns the first max-scoring bucket (the logic that picks the highest-scoring
GlitchType) to rely on the tightened keyword lists so routine auth terms no
longer push items into SECURITY_ISSUE.
In `@codeframe/core/proof/runner.py`:
- Around line 34-37: The code currently treats unsupported gates as satisfied by
returning True when core_gate_name is missing; change this so unsupported gates
are not marked satisfied—return a non-satisfied state (e.g., False with a clear
message like "unsupported/needs-waiver") or surface a distinct pending/waiver
result instead of True. Update the branch that uses _GATE_TO_CORE and
core_gate_name (and the analogous block around lines 117-126) to return that
non-satisfied/pending tuple so callers cannot record skipped gates as proven
without explicit manual evidence or waiver.
- Around line 98-127: The loop currently updates obl.status but only calls
ledger.save_requirement when every obligation ran and all passed, causing status
changes from failed runs or --gate-filtered runs to be lost; after setting
obl.status inside the loop (the line that assigns "satisfied" or "failed"),
persist that change immediately by saving the requirement or obligation (e.g.,
call ledger.save_requirement(workspace, req) or the appropriate
ledger.save_obligation API) so each updated obligation status is written to the
ledger; additionally, when you detect all_satisfied and set req.status =
ReqStatus.SATISFIED, also stamp req.satisfied_at before calling
ledger.save_requirement(workspace, req) so the satisfied transition is
persisted.
In `@codeframe/core/proof/scope.py`:
- Around line 52-59: The current except block around get_status() swallows
errors and leaves scope.files empty, which makes file-scoped REQs appear
out-of-scope; change the handler so failures are surfaced instead of failing
open: in the except for get_status(workspace) (the block that sets scope.files),
after logging the warning include the original exception (or log as error) and
re-raise the exception so cf proof run will fail fast (or alternatively set
scope.files = None and adjust intersects() to treat None as "unknown/all files"
so enforcement still runs); update logging to include the exception details and
ensure references to get_status, scope.files and intersects() are adjusted
accordingly.
- Around line 78-86: The current loop over req_scope.files vs
changed_scope.files treats any req_file as a directory prefix and returns True
for any changed_file under that path; restrict the directory/prefix match so it
only applies when the captured scope is explicitly a directory. Update the logic
in the loop that iterates req_scope.files and changed_scope.files (symbols:
req_scope.files, changed_scope.files, req_file, changed_file) to first check
whether req_file denotes a directory (for example by requiring a trailing "/" or
another explicit directory marker on req_file) before performing the req_dir
prefix check (req_file.rsplit("/") and changed_file.startswith(req_dir + "/")).
Keep the exact concrete-file equality check (req_file == changed_file) as-is.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 97ef5a91-54e4-44e9-915c-2a91fd4bc0e6
📒 Files selected for processing (12)
codeframe/cli/app.pycodeframe/cli/proof_commands.pycodeframe/core/proof/__init__.pycodeframe/core/proof/capture.pycodeframe/core/proof/evidence.pycodeframe/core/proof/ledger.pycodeframe/core/proof/models.pycodeframe/core/proof/obligations.pycodeframe/core/proof/runner.pycodeframe/core/proof/scope.pycodeframe/core/proof/stubs.pytests/core/test_proof9.py
| def waive( | ||
| req_id: str = typer.Argument(help="Requirement ID to waive"), | ||
| reason: str = typer.Option(..., "--reason", "-r", help="Why this is being waived"), | ||
| expires: Optional[str] = typer.Option( | ||
| None, "--expires", help="Expiry date (YYYY-MM-DD)", | ||
| ), | ||
| repo_path: Optional[Path] = typer.Option( | ||
| None, "--workspace", "-w", help="Workspace path", | ||
| ), | ||
| ) -> None: | ||
| """Waive a requirement with reason and optional expiry. | ||
|
|
||
| Example: | ||
| codeframe proof waive REQ-0001 --reason "No automated test yet" --expires 2026-04-01 | ||
| """ | ||
| from codeframe.core.workspace import get_workspace | ||
| from codeframe.core.proof import ledger | ||
| from codeframe.core.proof.models import Waiver | ||
|
|
||
| workspace_path = repo_path or Path.cwd() | ||
| try: | ||
| workspace = get_workspace(workspace_path) | ||
| except Exception as e: | ||
| console.print(f"[red]Error:[/red] {e}") | ||
| raise typer.Exit(1) | ||
|
|
||
| expiry_date = None | ||
| if expires: | ||
| try: | ||
| expiry_date = date.fromisoformat(expires) | ||
| except ValueError: | ||
| console.print(f"[red]Error:[/red] Invalid date format: {expires} (use YYYY-MM-DD)") | ||
| raise typer.Exit(1) | ||
|
|
||
| waiver_obj = Waiver(reason=reason, expires=expiry_date, approved_by="cli-user") | ||
| updated = ledger.waive_requirement(workspace, req_id, waiver_obj) |
There was a problem hiding this comment.
Don't hard-code waiver approvals to cli-user.
Every waiver gets the same approved_by value, so later audits cannot tell who actually granted the exemption. The CLI also leaves manual_checklist empty even though the model supports it.
Minimal fix direction
def waive(
req_id: str = typer.Argument(help="Requirement ID to waive"),
reason: str = typer.Option(..., "--reason", "-r", help="Why this is being waived"),
+ approved_by: str = typer.Option(..., "--approved-by", help="Who approved this waiver"),
expires: Optional[str] = typer.Option(
None, "--expires", help="Expiry date (YYYY-MM-DD)",
),
@@
- waiver_obj = Waiver(reason=reason, expires=expiry_date, approved_by="cli-user")
+ waiver_obj = Waiver(reason=reason, expires=expiry_date, approved_by=approved_by)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/cli/proof_commands.py` around lines 323 - 358, The CLI currently
hardcodes approved_by="cli-user" and leaves manual_checklist empty; change waive
to determine the real approver (prefer workspace metadata if available, e.g.,
check workspace.current_user or workspace.user and fall back to
getpass.getuser()) and include that value when constructing Waiver, and collect
manual_checklist from the user (add a new Option --manual-checklist that accepts
a comma-separated string or prompt interactively via typer.prompt, split into a
list) before calling ledger.waive_requirement; update the Waiver instantiation
in waive to set approved_by to the resolved approver and manual_checklist to the
parsed list.
| def attach_evidence( | ||
| workspace: Workspace, | ||
| req_id: str, | ||
| gate: Gate, | ||
| artifact_path: str, | ||
| satisfied: bool, | ||
| run_id: str, | ||
| ) -> Evidence: | ||
| """Create and persist an evidence record with artifact checksum.""" | ||
| evidence = Evidence( | ||
| req_id=req_id, | ||
| gate=gate, | ||
| satisfied=satisfied, | ||
| artifact_path=artifact_path, | ||
| artifact_checksum=_sha256(artifact_path), | ||
| timestamp=datetime.now(timezone.utc), | ||
| run_id=run_id, | ||
| ) | ||
| ledger.save_evidence(workspace, evidence) | ||
| return evidence |
There was a problem hiding this comment.
Emit an audit event when evidence is recorded.
attach_evidence() persists new proof state but never emits a core/events.py transition, so the audit trail misses when a requirement gains new evidence.
As per coding guidelines, codeframe/core/**/*.py: All core modules must emit events for state transitions via core/events.py for audit and observability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/proof/evidence.py` around lines 27 - 46, attach_evidence
currently persists Evidence but doesn't emit an audit event; after saving the
evidence with ledger.save_evidence(workspace, evidence) call core/events.py's
event emission function (e.g., import and call emit_transition or emit from
core.events) to record a transition like "evidence.attached" with a payload
containing at least req_id, gate, run_id, satisfied, artifact_path,
artifact_checksum, and timestamp (use the Evidence instance fields), then return
the evidence; place the emit call immediately after ledger.save_evidence and
ensure the import of the emitter is added at top of the module.
| evidence_list = ledger.list_evidence(workspace, req.id) | ||
| for ev in evidence_list: | ||
| if ev.gate == gate and ev.satisfied: | ||
| return True |
There was a problem hiding this comment.
Don't let one historical pass satisfy the gate forever.
This returns True as soon as any prior satisfied=True evidence exists for the gate. After one passing run, a newer failing run for the same gate will still look satisfied forever. Use the latest evidence for that gate, or evidence from the current run, when deriving satisfaction.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/proof/evidence.py` around lines 53 - 56, The current loop
returns True if any historical evidence (from evidence_list returned by
ledger.list_evidence(workspace, req.id)) has ev.gate == gate and ev.satisfied,
which lets an old passing record permanently mark the gate satisfied; instead,
filter evidence_list to the entries for the target gate, pick the most recent
evidence (by a timestamp/created_at field or run identifier on ev), and derive
satisfaction from that latest ev.satisfied (also prefer evidence from the
current run if present), i.e., replace the early-return loop with logic that
selects the newest matching ev and returns its satisfied value.
| GlitchType.SECURITY_ISSUE: [ | ||
| "security", "xss", "injection", "auth", "csrf", "vulnerability", | ||
| "exploit", "credential", "password", "token", "permission", | ||
| ], |
There was a problem hiding this comment.
Tighten the security heuristic before it under-enforces auth regressions.
"auth", "password", and "token" are common domain terms, not strong security signals. Because Line 77 returns the first max-scoring bucket, routine auth logic defects can land in SECURITY_ISSUE, which shrinks required proof from UNIT/CONTRACT to only SEC.
Also applies to: 69-77
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/proof/obligations.py` around lines 29 - 32, The SECURITY_ISSUE
heuristic is too permissive because common domain tokens ("auth", "password",
"token") inflate the SECURITY_ISSUE bucket; remove those strings from
GlitchType.SECURITY_ISSUE and instead place them in a less-weighted bucket (or
create GlitchType.AUTH_ISSUE) so they don't trigger the stricter security label;
also update any scoring/selection code that returns the first max-scoring bucket
(the logic that picks the highest-scoring GlitchType) to rely on the tightened
keyword lists so routine auth terms no longer push items into SECURITY_ISSUE.
codeframe/core/proof/scope.py
Outdated
| try: | ||
| from codeframe.core.git import get_status | ||
| status = get_status(workspace) | ||
| all_files = status.modified_files + status.staged_files + status.untracked_files | ||
| scope.files = list(set(all_files)) | ||
| except Exception as exc: | ||
| logger.warning("Could not detect changed files: %s", exc) | ||
|
|
There was a problem hiding this comment.
Don't fail open when changed-file detection breaks.
If get_status() raises, this silently returns an empty scope. With the current intersects() behavior, that makes file-scoped REQs look out of scope and can let cf proof run skip enforcement entirely instead of surfacing the error or falling back to a full run.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/proof/scope.py` around lines 52 - 59, The current except block
around get_status() swallows errors and leaves scope.files empty, which makes
file-scoped REQs appear out-of-scope; change the handler so failures are
surfaced instead of failing open: in the except for get_status(workspace) (the
block that sets scope.files), after logging the warning include the original
exception (or log as error) and re-raise the exception so cf proof run will fail
fast (or alternatively set scope.files = None and adjust intersects() to treat
None as "unknown/all files" so enforcement still runs); update logging to
include the exception details and ensure references to get_status, scope.files
and intersects() are adjusted accordingly.
Critical fixes: - Reject missing artifacts instead of hashing path string - Use MAX(id) for REQ ID allocation to prevent overwrites - Unsupported gates return False (not True) to fail closed - Git detection failure returns None → runner matches everything Major fixes: - Non-zero exit code when proof checks fail - Always persist obligation status after run (not just on full success) - Exact file matching only (no directory expansion)
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (5)
codeframe/cli/proof_commands.py (1)
358-358:⚠️ Potential issue | 🟠 MajorHard-coded
approved_by="cli-user"loses audit trail.Every waiver gets the same approver value, making it impossible to trace who actually granted exemptions. Consider adding an
--approved-byoption or inferring from environment (e.g.,getpass.getuser()).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/cli/proof_commands.py` at line 358, The waiver creation currently hardcodes approved_by="cli-user" in the line creating waiver_obj = Waiver(...), which loses the real approver; change the CLI that constructs the waiver (the command handler that calls Waiver) to accept an --approved-by option and/or default to the current OS user via getpass.getuser(), propagate that value into the Waiver(...) call instead of the literal "cli-user", and ensure any related help text or argument parsing for the command is updated to document the new option.codeframe/core/proof/evidence.py (2)
26-45:⚠️ Potential issue | 🟠 MajorMissing audit event emission for evidence attachment.
attach_evidence()persists evidence state but doesn't emit an event viacore/events.py. This violates the observability requirement for core modules. Based on learnings: "All core modules must emit events for state transitions viacore/events.pyfor audit and observability".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/proof/evidence.py` around lines 26 - 45, attach_evidence currently persists Evidence via ledger.save_evidence but never emits an audit event; update attach_evidence to import the event emitter from core.events and emit a descriptive event (e.g., event type "evidence.attached") after ledger.save_evidence returns, including key fields from the Evidence object (req_id, gate, run_id, artifact_path, artifact_checksum, timestamp, satisfied) as the event payload so state transitions are observable; ensure the emitter call occurs after successful save and include any returned evidence id if available.
48-56:⚠️ Potential issue | 🟠 MajorAny historical pass permanently satisfies the gate.
check_obligation_satisfiedreturnsTrueas soon as any prior evidence withsatisfied=Trueexists. After one passing run, a subsequent failing run will still appear satisfied. Consider checking only the most recent evidence for the gate, or evidence from the current run.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/proof/evidence.py` around lines 48 - 56, check_obligation_satisfied currently returns True if any historical evidence for the gate has ev.satisfied=True, which incorrectly treats a past pass as permanent; update it to consider only the most recent evidence for that requirement+gate (or evidence from the current run if a run identifier is available). Specifically, in check_obligation_satisfied replace the any-history check by filtering ledger.list_evidence(workspace, req.id) for ev.gate == gate, then select the single most recent evidence (e.g., by ev.timestamp / ev.created_at or by using a ledger.get_latest_evidence API if present) and return that evidence.satisfied (or if run-based semantics are required, filter by ev.run_id == current_run_id before selecting). Ensure you reference the existing symbols: check_obligation_satisfied, ledger.list_evidence, ev.gate, ev.satisfied, and any timestamp/run-id field on ev.codeframe/core/proof/ledger.py (2)
254-269:⚠️ Potential issue | 🔴 CriticalREQ ID allocation race condition persists.
While using
MAX(id)is better thanCOUNT(*), two concurrent captures can still both receive the same ID before either callssave_requirement(). WithINSERT OR REPLACE, the second write silently overwrites the first requirement. Consider atomic ID generation via a sequence table orINSERTwith conflict detection.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/proof/ledger.py` around lines 254 - 269, The current next_req_id function is vulnerable to races because it reads MAX(id) then returns a value that can be duplicated by concurrent callers; replace that read-modify step with an atomic increment inside the DB (e.g. create a per-workspace sequence table like proof_req_seq(workspace_id PRIMARY KEY, last INTEGER) in _ensure_tables and then in next_req_id perform a single SQL statement that atomically increments and returns the new value (for modern SQLite: INSERT INTO proof_req_seq(workspace_id,last) VALUES(?,1) ON CONFLICT(workspace_id) DO UPDATE SET last=last+1 RETURNING last; for older SQLite, wrap UPDATE/INSERT in a transaction and SELECT last after UPDATE) and format that returned integer as REQ-####; reference next_req_id, _ensure_tables, and proof_requirements/save_requirement to ensure callers use this atomic allocator instead of MAX(id).
178-204:⚠️ Potential issue | 🟠 MajorAudit event emission missing from write paths.
save_requirement(and other mutating functions) persist state without emitting events viacore/events.py. This leaves the capture/run/waive flows without audit trails. Based on learnings: "All core modules must emit events for state transitions viacore/events.pyfor audit and observability".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/proof/ledger.py` around lines 178 - 204, The save_requirement write path currently persists state but does not emit an audit event; import the event-emission API from core.events (e.g., emit_event or events.emit) and, after committing the DB change in save_requirement, call it to emit a concise audit event (e.g., event name "requirement.saved" or similar) with a payload containing requirement id (req.id), workspace id (workspace.id), actor (req.created_by), action (create/update), timestamp (use the same created_at/satisfied_at values), and the serialized requirement snapshot; apply the same pattern to all other mutating functions in this module so every state transition emits a corresponding event via core/events.py.
🧹 Nitpick comments (1)
codeframe/core/proof/scope.py (1)
61-67: Docstring mentions prefix matching but implementation uses exact match only.The docstring at lines 64-66 says "File matching supports prefix matching" but the implementation at lines 75-79 performs exact set intersection only. Update the docstring to match the actual behavior.
📝 Suggested fix
def intersects(req_scope: RequirementScope, changed_scope: RequirementScope) -> bool: """Check if a requirement scope overlaps with changed scope. - Returns True if any field has common elements. File matching - supports prefix matching (req file "src/auth/" matches changed - file "src/auth/login.py"). + Returns True if any field has common elements. File matching + uses exact equality only (no directory expansion). """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/proof/scope.py` around lines 61 - 67, The docstring for the function intersects incorrectly claims "File matching supports prefix matching" while the function (intersects(req_scope, changed_scope) in scope.py) currently only performs exact set intersections; update the docstring to remove the prefix-matching statement and clearly state that file comparisons use exact matches (i.e., intersection of file path sets), and ensure the description of the return value and behavior matches the actual implementation of intersects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@codeframe/core/proof/ledger.py`:
- Around line 353-357: The UPDATE that clears expired waivers on
proof_requirements uses only the id filter and must also restrict by
workspace_id to avoid cross-workspace collisions; update the cursor.execute call
in the same block (the code referencing req.id and req.waiver.expires) to
include req.workspace_id in the WHERE clause (e.g., add "AND workspace_id = ?"
and pass req.workspace_id as a parameter) so the UPDATE mirrors other queries'
workspace scoping.
---
Duplicate comments:
In `@codeframe/cli/proof_commands.py`:
- Line 358: The waiver creation currently hardcodes approved_by="cli-user" in
the line creating waiver_obj = Waiver(...), which loses the real approver;
change the CLI that constructs the waiver (the command handler that calls
Waiver) to accept an --approved-by option and/or default to the current OS user
via getpass.getuser(), propagate that value into the Waiver(...) call instead of
the literal "cli-user", and ensure any related help text or argument parsing for
the command is updated to document the new option.
In `@codeframe/core/proof/evidence.py`:
- Around line 26-45: attach_evidence currently persists Evidence via
ledger.save_evidence but never emits an audit event; update attach_evidence to
import the event emitter from core.events and emit a descriptive event (e.g.,
event type "evidence.attached") after ledger.save_evidence returns, including
key fields from the Evidence object (req_id, gate, run_id, artifact_path,
artifact_checksum, timestamp, satisfied) as the event payload so state
transitions are observable; ensure the emitter call occurs after successful save
and include any returned evidence id if available.
- Around line 48-56: check_obligation_satisfied currently returns True if any
historical evidence for the gate has ev.satisfied=True, which incorrectly treats
a past pass as permanent; update it to consider only the most recent evidence
for that requirement+gate (or evidence from the current run if a run identifier
is available). Specifically, in check_obligation_satisfied replace the
any-history check by filtering ledger.list_evidence(workspace, req.id) for
ev.gate == gate, then select the single most recent evidence (e.g., by
ev.timestamp / ev.created_at or by using a ledger.get_latest_evidence API if
present) and return that evidence.satisfied (or if run-based semantics are
required, filter by ev.run_id == current_run_id before selecting). Ensure you
reference the existing symbols: check_obligation_satisfied,
ledger.list_evidence, ev.gate, ev.satisfied, and any timestamp/run-id field on
ev.
In `@codeframe/core/proof/ledger.py`:
- Around line 254-269: The current next_req_id function is vulnerable to races
because it reads MAX(id) then returns a value that can be duplicated by
concurrent callers; replace that read-modify step with an atomic increment
inside the DB (e.g. create a per-workspace sequence table like
proof_req_seq(workspace_id PRIMARY KEY, last INTEGER) in _ensure_tables and then
in next_req_id perform a single SQL statement that atomically increments and
returns the new value (for modern SQLite: INSERT INTO
proof_req_seq(workspace_id,last) VALUES(?,1) ON CONFLICT(workspace_id) DO UPDATE
SET last=last+1 RETURNING last; for older SQLite, wrap UPDATE/INSERT in a
transaction and SELECT last after UPDATE) and format that returned integer as
REQ-####; reference next_req_id, _ensure_tables, and
proof_requirements/save_requirement to ensure callers use this atomic allocator
instead of MAX(id).
- Around line 178-204: The save_requirement write path currently persists state
but does not emit an audit event; import the event-emission API from core.events
(e.g., emit_event or events.emit) and, after committing the DB change in
save_requirement, call it to emit a concise audit event (e.g., event name
"requirement.saved" or similar) with a payload containing requirement id
(req.id), workspace id (workspace.id), actor (req.created_by), action
(create/update), timestamp (use the same created_at/satisfied_at values), and
the serialized requirement snapshot; apply the same pattern to all other
mutating functions in this module so every state transition emits a
corresponding event via core/events.py.
---
Nitpick comments:
In `@codeframe/core/proof/scope.py`:
- Around line 61-67: The docstring for the function intersects incorrectly
claims "File matching supports prefix matching" while the function
(intersects(req_scope, changed_scope) in scope.py) currently only performs exact
set intersections; update the docstring to remove the prefix-matching statement
and clearly state that file comparisons use exact matches (i.e., intersection of
file path sets), and ensure the description of the return value and behavior
matches the actual implementation of intersects.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 628393d2-3b23-49d5-8490-794de2b897b4
📒 Files selected for processing (6)
codeframe/cli/proof_commands.pycodeframe/core/proof/evidence.pycodeframe/core/proof/ledger.pycodeframe/core/proof/runner.pycodeframe/core/proof/scope.pytests/core/test_proof9.py
🚧 Files skipped from review as they are similar to previous changes (1)
- codeframe/core/proof/runner.py
| if req.waiver and req.waiver.expires and req.waiver.expires.isoformat() <= today: | ||
| cursor.execute( | ||
| "UPDATE proof_requirements SET status = 'open', waiver = NULL WHERE id = ?", | ||
| (req.id,), | ||
| ) |
There was a problem hiding this comment.
Missing workspace_id filter in UPDATE for expired waivers.
The UPDATE statement filters only by id but should also filter by workspace_id for consistency with other queries and to prevent cross-workspace updates if IDs ever collide.
🔧 Suggested fix
cursor.execute(
- "UPDATE proof_requirements SET status = 'open', waiver = NULL WHERE id = ?",
- (req.id,),
+ "UPDATE proof_requirements SET status = 'open', waiver = NULL WHERE id = ? AND workspace_id = ?",
+ (req.id, workspace.id),
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/proof/ledger.py` around lines 353 - 357, The UPDATE that
clears expired waivers on proof_requirements uses only the id filter and must
also restrict by workspace_id to avoid cross-workspace collisions; update the
cursor.execute call in the same block (the code referencing req.id and
req.waiver.expires) to include req.workspace_id in the WHERE clause (e.g., add
"AND workspace_id = ?" and pass req.workspace_id as a parameter) so the UPDATE
mirrors other queries' workspace scoping.
Code Review — PROOF9 Quality Memory System (#422)This is a solid architectural addition. The core/CLI separation is respected throughout — no FastAPI imports in CodeRabbit has already flagged a set of valid automated issues. This review focuses on higher-level concerns and items the automated pass did not cover. Critical1. 2. Requirement satisfaction check is stale-evidence vulnerable Important3. 4. 5. Obligation status mutations in Also in this block: when 6. Late import pattern in CLI is inconsistent with the rest of the codebase Nice to Have7. Scope docstring vs. implementation mismatch 8. Stubs are generated but never written to disk 9. No integration with Testing GapsThe 42-test suite covers the happy path well. Gaps worth tracking:
Architecture Compliance
Overall this is well-structured work. Address points 1-5 before merging — they affect the correctness of evidence recording and satisfaction tracking, which are the core promise of PROOF9. |
Summary
Implements #422: PROOF9 — Quality Memory System with Evidence-Based Verification
The complete PROOF9 subsystem that turns every failure into a permanent proof obligation:
Files Changed (12 files, ~2185 lines)
codeframe/core/proof/package (8 modules)codeframe/cli/proof_commands.py— CLI commandscodeframe/cli/app.py— Wireproof_appinto main CLItests/core/test_proof9.py— 42 testsAcceptance Criteria
cf proof capturecreates REQ with obligations from glitch descriptioncf proof runexecutes relevant obligations based on changed filescf proof statusshows satisfied/failing/waived countsTest Plan
Implementation Notes
state.db— consistent with existing modulescore/gates.pyfor UNIT gate executionCloses #422
Summary by CodeRabbit
New Features
Tests