feat(settings): PROOF9 defaults + workspace config tabs (#556) by frankbria · Pull Request #589 · frankbria/codeframe

frankbria · 2026-05-11T17:27:12Z

Summary

Implements the PROOF9 Defaults and Workspace Config tabs on the existing Settings page (issue [Phase 5.1] Settings page: PROOF9 defaults and workspace config #556 / Phase 5.1).
Adds the two missing backend config endpoints (/api/v2/proof/config, /api/v2/workspaces/config) that persist to .codeframe/proof_config.json and .codeframe/workspace_config.json.
Wires proof_config.json into the runner: gates not in enabled_gates are skipped; strictness="warn" keeps overall_passed=True on gate failures (failure evidence is still recorded).

Closes #556.

Acceptance criteria

PROOF9 gate defaults load, toggle, and persist
Strictness setting persists and is respected by gate runs
Workspace config loads and persists
npm test (813) and npm run build pass

Plan adaptation

Traycer's plan called for /api/v2/workspace/config (singular) via a new router. Implemented under the existing /api/v2/workspaces (plural) router for consistency — same behavior, one fewer router registration.

Test plan

uv run pytest -m v2 — 2949 passed, 9 skipped (e2b cloud)
uv run ruff check . — clean
cd web-ui && npm test — 813 passed
cd web-ui && npm run build — clean
Manual demo (see Demo section below)

New tests

tests/ui/test_proof_config.py (7) — defaults, persistence, validation, round-trip
tests/ui/test_workspace_config.py (5) — defaults, persistence, round-trip, validation
tests/core/test_proof_runner_config.py (4) — gate filtering, strictness behavior
web-ui/src/__tests__/components/settings/Proof9DefaultsTab.test.tsx (6)
web-ui/src/__tests__/components/settings/WorkspaceConfigTab.test.tsx (8)

Demo (run locally)

cf serve and cd web-ui && npm run dev
Open /settings → PROOF9 tab → toggle gates, change strictness, Save → reload → values persist
Settings → Workspace tab → edit branch / toggle auto-detect → Save → reload → values persist
Create a glitch REQ, write .codeframe/proof_config.json with enabled_gates: ["unit"] and strictness: "warn" → cf proof run: only unit gate runs; overall_passed=True even if it fails.

Summary by CodeRabbit

New Features
- Per-workspace PROOF9 config: selectable enabled gates and strictness (strict|warn); warn can mark runs as passed despite gate failures. GET/PUT endpoints to view/update. Vacuous-pass warning when all gates disabled.
- Per-workspace UI config: GET/PUT for workspace defaults.
- Settings page: working PROOF9 Defaults and Workspace Config tabs; client APIs and UI components.
Chores
- Atomic JSON write helper and shared PROOF9 constants.
Tests
- Backend and UI tests for persistence, validation, gate filtering, and strictness semantics.

Surfaces the last CLI-only onboarding knobs in the web UI: which of the 9 gates are enabled by default and the workspace's root/branch/tech-stack detection. Closes #556 (Phase 5.1). Backend ------- * GET/PUT /api/v2/proof/config — persists enabled_gates + strictness to .codeframe/proof_config.json; validates gates against the Gate enum * GET/PUT /api/v2/workspaces/config — persists workspace_root, default_branch, auto_detect_tech_stack, tech_stack_override to .codeframe/workspace_config.json * core/proof/runner.py reads proof_config.json, filters obligations by enabled_gates, and in 'warn' strictness keeps overall_passed=True on gate failures (still records the failure evidence) Frontend -------- * New Proof9DefaultsTab and WorkspaceConfigTab components with the same save/discard pattern as the Agent tab * Settings page wires both into the existing tabs * proofConfigApi + workspaceConfigApi added to lib/api.ts * PROOF9_GATES constant + types Diverges from Traycer's plan in one place: workspace config uses the plural prefix /api/v2/workspaces/config (consistent with the existing workspaces router) instead of a new singular /api/v2/workspace router.

coderabbitai · 2026-05-11T17:27:26Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

This PR adds per-workspace PROOF9 defaults and workspace configuration: typed schemas and constants, v2 GET/PUT endpoints with atomic JSON persistence, proof-runner integration to apply enabled_gates and strictness, SWR-backed React settings components and API clients, and tests covering backend and UI behavior.

Changes

Configuration Settings Feature

Layer / File(s)	Summary
Data Models & Type Contracts `web-ui/src/types/index.ts`, `web-ui/src/lib/proof.ts`, `codeframe/core/proof/models.py`	Adds `ProofStrictness` and request/response interfaces, `PROOF9_GATES`/`GATE_LABELS`, `PROOF_CONFIG_FILENAME`, and `PROOF9_GATE_ORDER`.
Atomic JSON Helper `codeframe/ui/routers/_helpers.py`	Adds `atomic_write_json(path, payload)` used by both routers for crash-safe writes (temp file + os.replace).
Backend PROOF9 Configuration Endpoints `codeframe/ui/routers/proof_v2.py`	Adds `GET /api/v2/proof/config` and `PUT /api/v2/proof/config` with Pydantic models, atomic JSON write, and validation of gates and strictness; run endpoint now prefers persisted strictness-aware `overall_passed` when available.
Backend Workspace Configuration Endpoints `codeframe/ui/routers/workspace_v2.py`	Adds `GET /api/v2/workspaces/config` and `PUT /api/v2/workspaces/config` with models, config path helpers, defaults, and atomic JSON persistence to `.codeframe/workspace_config.json`.
Proof Runner Configuration Integration `codeframe/core/proof/runner.py`	Adds `_load_proof_config()`, applies `enabled_gates` allowlist when iterating obligations, warns on empty `enabled_gates`, and adjusts `overall_passed` computation per `strictness` (`warn` may preserve True despite failures).
Frontend API Client Methods `web-ui/src/lib/api.ts`	Exports `proofConfigApi` and `workspaceConfigApi` with typed `getConfig()` and `updateConfig()` methods calling the v2 endpoints with `workspace_path`.
Proof9DefaultsTab React Component `web-ui/src/components/settings/Proof9DefaultsTab.tsx`	SWR-backed component that loads proof config, renders nine gate checkboxes and strictness select, computes dirty state, and saves/discards via `proofConfigApi.updateConfig` with SWR `mutate` and toasts.
WorkspaceConfigTab React Component `web-ui/src/components/settings/WorkspaceConfigTab.tsx`	SWR-backed component that loads workspace config, edits root/default_branch/tech-stack settings, computes dirty/canSave, and saves/discards via `workspaceConfigApi.updateConfig` with SWR `mutate` and toasts.
Settings Page Integration `web-ui/src/app/settings/page.tsx`	Replaces `ComingSoon` placeholders with `Proof9DefaultsTab` and `WorkspaceConfigTab`, passing `workspacePath`.
Proof Runner Config Tests `tests/core/test_proof_runner_config.py`	Integration tests verifying enabled_gates filtering and strictness effects on persisted run `overall_passed`.
Proof Config API Tests `tests/ui/test_proof_config.py`	Tests for `/api/v2/proof/config` GET/PUT: defaults, persistence, round-trip correctness, validation of gate names and strictness values, corrupted JSON fallback, and run-cache passed-semantics regression test.
Workspace Config API Tests `tests/ui/test_workspace_config.py`	Tests for `/api/v2/workspaces/config` GET/PUT: defaults, fallback when JSON is corrupted, persistence, PUT→GET round-trip, and required-field validation.
Proof9DefaultsTab Component Tests `web-ui/src/__tests__/components/settings/Proof9DefaultsTab.test.tsx`	Jest tests verifying gate checkbox rendering/toggling, Save/Discard button states, API invocation, and discard reset.
WorkspaceConfigTab Component Tests `web-ui/src/__tests__/components/settings/WorkspaceConfigTab.test.tsx`	Jest tests verifying form rendering, auto-detect override behavior, Save/Discard states, API persistence, and empty-field validation.
Docs & Roadmap `CLAUDE.md`, `docs/PRODUCT_ROADMAP.md`	Updates timestamps and roadmap items to reflect shipped Settings/PROOF9 and workspace config work.

Sequence Diagram(s)

sequenceDiagram
  participant UI as Settings UI (Proof9/Workspace)
  participant Frontend as web-ui `proofConfigApi`/`workspaceConfigApi`
  participant Server as FastAPI `/api/v2/*` routers
  participant FS as Workspace state dir (`.codeframe/*`)
  participant Runner as core.proof.runner

  UI->>Frontend: GET /config?workspace_path=...
  Frontend->>Server: GET /api/v2/.../config
  Server->>FS: read `.codeframe/*_config.json` (or return defaults)
  FS-->>Server: file contents / not found
  Server-->>Frontend: 200 ProofConfigResponse / WorkspaceConfigResponse
  Frontend-->>UI: data (SWR)
  UI->>Frontend: PUT /config (workspace_path, body)
  Frontend->>Server: PUT /api/v2/.../config with JSON
  Server->>FS: atomic write temp -> os.replace `.codeframe/*_config.json`
  FS-->>Server: write complete
  Server-->>Frontend: 200 saved config
  Note over Runner,FS: Later run_proof reads config
  Runner->>FS: read `.codeframe/proof_config.json`
  FS-->>Runner: enabled_gates, strictness
  Runner->>Runner: filter obligations by enabled_gates, execute gates
  Runner->>Runner: compute overall_passed per strictness (warn may preserve True)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

frankbria/codeframe#459: Earlier work that added the v2 proof router which this PR extends with PROOF9 config endpoints and strictness-aware run semantics.
frankbria/codeframe#574: Related changes to run persistence and GET /runs/{run_id}; interacts with how persisted passed is used.
frankbria/codeframe#575: Modifies the proof runner and run metadata; closely related to gate execution and overall_passed logic here.

Poem

🐰 I hop through settings, tidy and swift,
Nine little gates in a neat little drift.
Strict or a whisper (warn keeps the peace),
Save what you like, or discard with ease.
Toasts sing success — the workspace breathes.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.98% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the primary change: adding PROOF9 defaults and workspace config tabs to the Settings page, referencing the resolved issue `#556`.
Linked Issues check	✅ Passed	All coding-related acceptance criteria from issue `#556` are met: PROOF9 gate defaults load/toggle/persist [proof_v2.py, models.py, runner.py, frontend components]; strictness persists and is respected by runner [runner.py with strictness logic, backend/frontend tests]; workspace config loads/persists [workspace_v2.py]; tests pass.
Out of Scope Changes check	✅ Passed	All changes directly support issue `#556` objectives: backend endpoints for proof/workspace config, runner integration, frontend tabs/components, types, tests, and documentation updates. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/issue-556-settings-proof9-workspace

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

claude · 2026-05-11T17:29:25Z

Review: feat(settings): PROOF9 defaults + workspace config tabs (#556)

Clean, well-structured implementation. The layering (core → router → UI) follows project conventions, test coverage is solid, and the plan deviation (plural workspaces router) is well-reasoned. A few items worth addressing before merge:

Issues

1. _PROOF_CONFIG_FILENAME is duplicated

"proof_config.json" is hardcoded in both runner.py and proof_v2.py. If the filename ever changes, there are two places to update and they can drift silently.

Move the constant to codeframe/core/proof/models.py (or a dedicated config.py) and import it in both call sites.

2. Missing OSError in GET endpoint exception handlers

runner.py._load_proof_config correctly catches (OSError, json.JSONDecodeError). The router endpoints only catch (json.JSONDecodeError, ValueError):

# proof_v2.py – get_proof_config
except (json.JSONDecodeError, ValueError) as e:
    ...

A permission-denied or mid-write corrupt read would still raise OSError and result in a 500. Add OSError to both GET handlers to match the runner's defense.

3. Non-atomic file writes

Both PUT endpoints write directly with path.write_text(json.dumps(...)). A crash or KeyboardInterrupt mid-write leaves a truncated JSON file. The next GET will silently fall back to defaults (losing the user's config). Consider a temp-file + atomic rename pattern:

import tempfile, os
tmp = path.with_suffix('.tmp')
tmp.write_text(json.dumps(payload, indent=2))
os.replace(tmp, path)

4. workspace_root is user-editable without path validation

UpdateWorkspaceConfigRequest only validates min_length=1. A user could store an arbitrary path that diverges from the workspace the server is actually using. At minimum, document that this field is informational/display-only, or strip the field from the request model if it should not be changed by the user.

Minor

5. Default gate order: alphabetical vs. semantic

_default_proof_config() returns sorted(_VALID_GATES) (alphabetical: a11y, contract, demo, e2e, manual, perf, sec, unit, visual), but the frontend's PROOF9_GATES array has a different semantic order (unit, contract, e2e, ...). On first load (no config file), the server returns alphabetical order, but after a save the file stores whatever order the UI sent. The isDirty check uses a Set (order-insensitive), so this won't cause false dirty states, but the initial GET for a new workspace returns a differently-ordered list than what the UI would write back. Low risk, but aligning _default_proof_config to match the frontend ordering avoids subtle surprises.

6. Strictness description in Proof9DefaultsTab is slightly off

<strong>strict</strong> fails proof runs on any open non-waived REQ.

The actual behavior (from runner.py) is that strict flips overall_passed=False when any gate execution fails — it's not specifically about "open non-waived REQs". Suggest: <strong>strict</strong> — any gate failure fails the proof run. <strong>warn</strong> — gate failures are recorded but overall_passed stays True.

7. test_returns_existing_config has an order-sensitive assertion

assert data["enabled_gates"] == ["unit", "sec"]

This implicitly relies on JSON preserving insertion order. It works in CPython ≥ 3.7+, but using assert set(data["enabled_gates"]) == {"unit", "sec"} would be both more robust and more consistent with how test_returns_defaults_when_no_config asserts.

8. Duplicate test_workspace fixture

tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py have identical test_workspace fixtures. Extract to tests/ui/conftest.py to DRY this out.

What looks good

Core headless contract maintained — runner.py reads config via filesystem, no FastAPI imports
isDirty using Set in Proof9DefaultsTab is correct and order-insensitive
canSave in WorkspaceConfigTab correctly gates on non-empty workspace_root and default_branch
Pydantic field_validator for gate names gives a clear 422 error
strictness="warn" correctly records failure evidence while keeping overall_passed=True — behavior matches the design intent
Frontend tests cover loading, toggling, save, discard, and validation paths
All four test files are marked v2 — consistent with project conventions

The atomic-write (#3) and OSError (#2) items are the most important to fix before merge. The rest can be follow-ons if preferred.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

codeframe/ui/routers/proof_v2.py (1)

417-433: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align cached passed with strictness-aware overall_passed.

Line 417 recomputes passed from raw gate results, which ignores warn-mode semantics from run_proof() and can incorrectly report failure via cached run status.

💡 Proposed fix

-        passed = all(
-            satisfied
-            for gate_results in results.values()
-            for _, satisfied in gate_results
-        )
+        persisted_run = get_run(workspace, run_id)
+        if persisted_run is not None:
+            passed = persisted_run.overall_passed
+        else:
+            # Fallback only if persistence unexpectedly failed
+            passed = all(
+                satisfied
+                for gate_results in results.values()
+                for _, satisfied in gate_results
+            )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codeframe/ui/routers/proof_v2.py` around lines 417 - 433, The cached "passed"
is being recomputed from raw gate results (variable passed) which ignores
warn-mode semantics from run_proof(); instead use the strictness-aware
overall_passed produced by run_proof() when building the RunProofResponse and
when storing into _run_cache. Replace the recomputation of passed with
overall_passed (and ensure RunProofResponse.success uses overall_passed) and
store that value into _run_cache[(str(workspace.repo_path), run_id)]["passed"]
so cached status matches run_proof() semantics.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@web-ui/src/components/settings/Proof9DefaultsTab.tsx`:
- Around line 58-67: The component Proof9DefaultsTab treats absence of draft as
a load failure causing a transient "Failed to load" flash; change the error
condition so missing draft is not considered an error during hydration—keep the
loading branch as `if (isLoading && !draft)` but update the failure branch
(currently `if (error || !draft || !data)`) to `if (error || !data)` so only
real errors or missing server data show the failure message; locate and update
the conditional in Proof9DefaultsTab where isLoading, draft, error, and data are
checked.

---

Outside diff comments:
In `@codeframe/ui/routers/proof_v2.py`:
- Around line 417-433: The cached "passed" is being recomputed from raw gate
results (variable passed) which ignores warn-mode semantics from run_proof();
instead use the strictness-aware overall_passed produced by run_proof() when
building the RunProofResponse and when storing into _run_cache. Replace the
recomputation of passed with overall_passed (and ensure RunProofResponse.success
uses overall_passed) and store that value into
_run_cache[(str(workspace.repo_path), run_id)]["passed"] so cached status
matches run_proof() semantics.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2c066910-0a37-4352-bb7e-9c9a5b4729af

📥 Commits

Reviewing files that changed from the base of the PR and between c4023b4 and aa06351.

📒 Files selected for processing (14)

codeframe/core/proof/runner.py
codeframe/ui/routers/proof_v2.py
codeframe/ui/routers/workspace_v2.py
tests/core/test_proof_runner_config.py
tests/ui/test_proof_config.py
tests/ui/test_workspace_config.py
web-ui/src/__tests__/components/settings/Proof9DefaultsTab.test.tsx
web-ui/src/__tests__/components/settings/WorkspaceConfigTab.test.tsx
web-ui/src/app/settings/page.tsx
web-ui/src/components/settings/Proof9DefaultsTab.tsx
web-ui/src/components/settings/WorkspaceConfigTab.tsx
web-ui/src/lib/api.ts
web-ui/src/lib/proof.ts
web-ui/src/types/index.ts

- Move proof_config.json filename + canonical 9-gate order into codeframe/core/proof/models.py and import from both runner and router - proof/workspace config PUT endpoints now write via temp-file + os.replace so a crash mid-write cannot leave a truncated file - Add OSError to the config GET exception handlers - Default proof config returns gates in canonical PROOF9 order (matches the frontend) instead of alphabetical - Tighten the strictness description in Proof9DefaultsTab - Document that workspace_root in WorkspaceConfig is display-only - Make TestGetProofConfig.test_returns_existing_config order-insensitive

frankbria · 2026-05-11T17:34:37Z

Addressed claude review in f823f33:

Add Claude Code GitHub Workflow #1 (DRY filename): Moved PROOF_CONFIG_FILENAME and PROOF9_GATE_ORDER into codeframe/core/proof/models.py; both runner.py and proof_v2.py import from there.
Brainstorming: Integrate remaining general concepts into specification #2 (OSError): Added OSError to both config GET handlers.
feat(sprint-4): Multi-Agent Coordination System - P0 & P1 Complete #3 (atomic writes): Both PUT endpoints now write via tmp + os.replace so a crash mid-write can no longer leave a truncated file.
feat: Project schema refactoring with source types and workspace management #4 (workspace_root display-only): Added a docstring note clarifying that workspace_root is informational/display-only and does not relocate the workspace.
feat: Project schema refactoring with API endpoint integration #5 (gate ordering): _default_proof_config() now returns gates in the canonical PROOF9 order (unit, contract, e2e, visual, a11y, perf, sec, demo, manual) matching the frontend.
Project Schema Refactoring - Flexible Source Types & Deployment Modes #6 (strictness description): Tightened the description in Proof9DefaultsTab to match runner behavior (gate-failure based, not REQ-status based).
Improve workspace cleanup in project creation rollback #7 (order-sensitive test): Changed test_returns_existing_config to use set comparison.
Add missing logger import in server.py #8 (fixture duplication): Deferred — multiple files in tests/ui/ already pattern this way; extracting would touch unrelated files. Worth a follow-up cleanup PR.

All 58 backend + 30 frontend settings tests pass; ruff and npm run build clean.

claude · 2026-05-11T17:36:55Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs

Overall this is a well-structured addition. The core logic is clean, the test coverage is solid across all four layers (core runner, FastAPI routers, React components), and the separation between the config-file format and the Pydantic API contract is properly handled.

Agreeing with CodeRabbit's two flags

Both issues raised by coderabbitai are real and worth fixing before merge:

Cache inconsistency in proof_v2.py ~line 417 — The existing passed recomputation in the run endpoint reads raw gate results directly, bypassing the strictness-aware overall_passed that run_proof() already computed and persisted. This means a warn-mode run can report success: false in the API response even though overall_passed is True on the ledger. The fix is to read persisted_run.overall_passed (as suggested in CodeRabbit's proposed diff) rather than re-aggregating raw results.
Transient "Failed to load" flash in Proof9DefaultsTab.tsx — The error guard if (error || !draft || !data) fires briefly between when data arrives and when the useEffect sets draft, causing a visible error state. Changing it to if (error || !data) (letting the loading branch handle the missing-draft case) removes the flash.

Additional observations

_atomic_write_json is duplicated (proof_v2.py and workspace_v2.py both define identical functions). Move it to codeframe/ui/routers/_utils.py or a similar shared module. Also, import os inside the function body is unconventional — move it to module level with the other imports.

Frontend gate list needs a sync comment — web-ui/src/lib/proof.ts defines PROOF9_GATES as a hard-coded tuple. This silently diverges if a gate is added to or removed from the backend Gate enum. A one-line comment pointing to codeframe/core/proof/models.py as the authoritative source would make the coupling explicit for future maintainers.

Missing fallback test — tests/ui/test_proof_config.py covers happy-path GET but not the "corrupted JSON → fallback to defaults" branch that both get_proof_config and get_workspace_config implement. A single test writing invalid JSON to the config path and asserting a 200 with defaults would close that gap.

workspace_root is display-only — document it at the model level — The UpdateWorkspaceConfigRequest model accepts any workspace_root string from the client, and the docstring on update_workspace_config correctly says it's informational-only. Worth adding a short note to the Pydantic model field itself (Field(..., description="Display only …")) so it's visible in the OpenAPI schema and can't be misread as having server-side effect.

What's solid

_load_proof_config in runner.py is properly defensive (handles missing file, invalid JSON, unknown gate names) and falls back gracefully.
Gate filtering integrates cleanly alongside the existing gate_filter path without interfering with it.
Pydantic validators on UpdateProofConfigRequest correctly catch unknown gate names and invalid strictness values at the API boundary.
canSave guard in WorkspaceConfigTab.tsx blocks save on empty required fields client-side — good UX, consistent with backend validation.
All new Python tests are marked pytestmark = pytest.mark.v2 per CLAUDE.md convention.

The two CodeRabbit bugs (#1 especially) should be addressed before merge. The rest are minor polish items.

High-priority (both reviewers): - proof_v2.py run endpoint: read the strictness-aware overall_passed from the persisted run record instead of recomputing 'passed' from raw gate results. The recomputation incorrectly reported failure for warn-mode runs via the cached run status. Regression test added. Frontend UX: - Proof9DefaultsTab + WorkspaceConfigTab: stop showing a transient "Failed to load" flash between data arrival and the useEffect that initializes the draft. Error guard now only fires on actual errors. DRY: - Extract _atomic_write_json into codeframe/ui/routers/_helpers.py and use it from both proof_v2.py and workspace_v2.py. Polish: - workspace_root marked display-only at the Pydantic field level so OpenAPI consumers see the intent. - web-ui/src/lib/proof.ts gained a SYNC comment pointing to codeframe/core/proof/models.py as the source of truth. - New test: GET endpoints fall back to defaults on corrupted JSON.

frankbria · 2026-05-11T17:51:17Z

Round 2: addressed claude + CodeRabbit feedback in ae32da7.

High priority (both reviewers flagged)

proof_v2.py run endpoint now reads persisted_run.overall_passed instead of recomputing from raw results, so warn-mode runs correctly report passed: true via the cached run status. Regression test TestRunCachePassedSemantics::test_warn_mode_run_cache_passed_is_true added.

Frontend UX (claude #2)

Both tabs now only show "Failed to load" on actual errors; loading state covers the brief window between data arrival and the useEffect that initializes draft.

DRY (claude observation)

_atomic_write_json extracted into codeframe/ui/routers/_helpers.py and imported from both routers. import os moved to module level.

Polish

workspace_root annotated as display-only at the Pydantic Field level (visible in OpenAPI).
web-ui/src/lib/proof.ts gained a SYNC comment pointing to codeframe/core/proof/models.py as the source of truth.
New corrupted-JSON fallback tests for both GET /config endpoints.

All 107 backend tests + 30 frontend settings tests pass; ruff clean; build clean.

claude · 2026-05-11T17:54:45Z

Follow-up Review

All five substantive items from the prior review have been addressed in this iteration — nice work:

PROOF_CONFIG_FILENAME centralized in models.py and imported by both runner.py and proof_v2.py
OSError added to both GET handlers (proof_v2.py, workspace_v2.py)
Atomic writes implemented via _helpers.atomic_write_json() — clean approach with the temp-suffix pattern
workspace_root documented as display-only in both the request model docstring and the PUT endpoint doc
Default gate order fixed to use PROOF9_GATE_ORDER (matches enum + frontend order)
Strictness description corrected in Proof9DefaultsTab — now accurately describes gate-failure behavior
CodeRabbit's cached-passed bug fixed — run_proof_endpoint now reads persisted_run.overall_passed instead of recomputing from raw gate results

One open item from the previous review

Duplicate test_workspace fixture — tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py both define the identical fixture. Extracting to tests/ui/conftest.py would DRY this out. Not a blocker, but worth a follow-on.

One new minor observation

isLoading is destructured but unused in both Proof9DefaultsTab.tsx and WorkspaceConfigTab.tsx. The loading state is correctly inferred from !data || !draft, making isLoading dead. This is cosmetic — no behavioral impact — but a linter with no-unused-vars would flag it.

Summary

This is in good shape. The atomic-write and OSError fixes address the two most important correctness issues from the prior review. The remaining items are low-priority cleanup. Ready to merge once CI is green.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@codeframe/ui/routers/_helpers.py`:
- Around line 8-15: The atomic_write_json function currently reuses a single tmp
name (path.suffix + ".tmp") which can cause concurrent writes to clobber each
other; change atomic_write_json to create a unique per-write temp file in the
same directory (e.g., use tempfile.NamedTemporaryFile or generate a suffix with
uuid4()/pid/timestamp) write the JSON to that unique temp file and then
os.replace the unique temp path to path, and ensure the temp file is created in
path.parent so os.replace stays atomic and any temp file is removed/left only on
failure.

In `@codeframe/ui/routers/proof_v2.py`:
- Around line 648-717: The router currently contains domain logic for defaults,
validation, file layout, and corrupted-file fallback (symbols: _VALID_GATES,
_proof_config_path, _default_proof_config,
UpdateProofConfigRequest._validate_gates, get_proof_config, update_proof_config,
atomic_write_json); move that logic into a core API (e.g., core.proof_config) by
implementing functions like default_proof_config(), validate_gates(gates) or
exceptions for invalid gates, load_proof_config(workspace) which reads the file,
returns defaults on missing/corrupt files and logs or surfaces errors, and
save_proof_config(workspace, payload) which does atomic write; then simplify the
router handlers to only call core.load_proof_config and core.save_proof_config
and map request/response models (keep ProofConfigResponse and
UpdateProofConfigRequest as DTOs) so all file/validation behavior lives in the
core module instead of codeframe/ui/routers/proof_v2.py.

In `@tests/ui/test_proof_config.py`:
- Around line 38-54: The tests currently bypass the router's workspace_path
query parameter by overriding get_v2_workspace to return the workspace object
directly; instead add a small request helper on the TestClient that always
includes workspace_path (derived from test_workspace, e.g. test_workspace.path
or test_workspace.workspace_path) and use that helper across the file.
Concretely: in the test_client fixture (which references get_v2_workspace,
proof_v2.router, TestClient and test_workspace) keep the dependency override if
needed but attach a helper like client.request_with_workspace(method, path, *,
params=None, **kwargs) that merges params with {'workspace_path':
<test_workspace path>} and calls client.request(method, path, params=merged,
**kwargs); replace direct client.get/post calls in this test module with that
helper so every request includes workspace_path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4c6d3674-2cb4-4908-a150-e27f306da678

📥 Commits

Reviewing files that changed from the base of the PR and between f823f33 and ae32da7.

📒 Files selected for processing (8)

codeframe/ui/routers/_helpers.py
codeframe/ui/routers/proof_v2.py
codeframe/ui/routers/workspace_v2.py
tests/ui/test_proof_config.py
tests/ui/test_workspace_config.py
web-ui/src/components/settings/Proof9DefaultsTab.tsx
web-ui/src/components/settings/WorkspaceConfigTab.tsx
web-ui/src/lib/proof.ts

🚧 Files skipped from review as they are similar to previous changes (5)

web-ui/src/components/settings/Proof9DefaultsTab.tsx
web-ui/src/components/settings/WorkspaceConfigTab.tsx
codeframe/ui/routers/workspace_v2.py
tests/ui/test_workspace_config.py
web-ui/src/lib/proof.ts

coderabbitai · 2026-05-11T17:56:19Z

+# ============================================================================
+# PROOF9 Config (issue #556)
+#
+# Persists which gates are enabled by default and the strictness setting
+# (strict vs warn) to .codeframe/proof_config.json.
+# ============================================================================
+
+
+_VALID_GATES = {g.value for g in Gate}
+
+
+def _proof_config_path(workspace: Workspace):
+    return workspace.state_dir / PROOF_CONFIG_FILENAME
+
+
+def _default_proof_config() -> dict:
+    return {"enabled_gates": list(PROOF9_GATE_ORDER), "strictness": "strict"}
+
+
+class ProofConfigResponse(BaseModel):
+    enabled_gates: list[str]
+    strictness: Literal["strict", "warn"]
+
+
+class UpdateProofConfigRequest(BaseModel):
+    enabled_gates: list[str]
+    strictness: Literal["strict", "warn"]
+
+    @field_validator("enabled_gates")
+    @classmethod
+    def _validate_gates(cls, v: list[str]) -> list[str]:
+        unknown = [g for g in v if g not in _VALID_GATES]
+        if unknown:
+            raise ValueError(
+                f"Unknown gate(s): {unknown}. Valid: {sorted(_VALID_GATES)}"
+            )
+        return v
+
+
+@router.get("/config", response_model=ProofConfigResponse)
+@rate_limit_standard()
+async def get_proof_config(
+    request: Request,
+    workspace: Workspace = Depends(get_v2_workspace),
+) -> ProofConfigResponse:
+    """Load PROOF9 defaults for this workspace.
+
+    Returns the all-gates-enabled + strict defaults if no config file exists.
+    """
+    path = _proof_config_path(workspace)
+    if path.exists():
+        try:
+            data = json.loads(path.read_text())
+            return ProofConfigResponse(**data)
+        except (OSError, json.JSONDecodeError, ValueError) as e:
+            logger.warning("Invalid proof_config.json — falling back to defaults: %s", e)
+    return ProofConfigResponse(**_default_proof_config())
+
+
+@router.put("/config", response_model=ProofConfigResponse)
+@rate_limit_standard()
+async def update_proof_config(
+    request: Request,
+    body: UpdateProofConfigRequest,
+    workspace: Workspace = Depends(get_v2_workspace),
+) -> ProofConfigResponse:
+    """Persist PROOF9 defaults to .codeframe/proof_config.json."""
+    payload = {"enabled_gates": body.enabled_gates, "strictness": body.strictness}
+    atomic_write_json(_proof_config_path(workspace), payload)
+    return ProofConfigResponse(**payload)


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move PROOF config policy out of the router.

This block now owns default values, gate validation, file layout, and corrupted-file fallback behavior. That is domain logic, and keeping it in codeframe/ui/routers/proof_v2.py creates another place that can drift from the core runner’s interpretation of the same config. Please move load/save/default/validate behavior into a core API and keep the router to request/response translation.

As per coding guidelines, codeframe/ui/**/*.py: "FastAPI server and web UI are thin adapters over core — do NOT implement domain logic in codeframe/ui/ routers".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codeframe/ui/routers/proof_v2.py` around lines 648 - 717, The router currently contains domain logic for defaults, validation, file layout, and corrupted-file fallback (symbols: _VALID_GATES, _proof_config_path, _default_proof_config, UpdateProofConfigRequest._validate_gates, get_proof_config, update_proof_config, atomic_write_json); move that logic into a core API (e.g., core.proof_config) by implementing functions like default_proof_config(), validate_gates(gates) or exceptions for invalid gates, load_proof_config(workspace) which reads the file, returns defaults on missing/corrupt files and logs or surfaces errors, and save_proof_config(workspace, payload) which does atomic write; then simplify the router handlers to only call core.load_proof_config and core.save_proof_config and map request/response models (keep ProofConfigResponse and UpdateProofConfigRequest as DTOs) so all file/validation behavior lives in the core module instead of codeframe/ui/routers/proof_v2.py.

coderabbitai · 2026-05-11T17:56:19Z

+@pytest.fixture
+def test_client(test_workspace):
+    """Create a FastAPI TestClient with proof_v2 router and workspace override."""
+    from codeframe.ui.dependencies import get_v2_workspace
+    from codeframe.ui.routers import proof_v2
+
+    app = FastAPI()
+    app.include_router(proof_v2.router)
+
+    def get_test_workspace():
+        return test_workspace
+
+    app.dependency_overrides[get_v2_workspace] = get_test_workspace
+
+    client = TestClient(app)
+    client.workspace = test_workspace
+    return client


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make these tests send workspace_path.

Because the dependency override hands the router a workspace directly, every request in this module bypasses the required query-parameter contract. That leaves this suite unable to catch regressions in the v2 endpoint shape. Please wrap requests in a helper that always includes workspace_path and use that throughout the file.

As per coding guidelines, codeframe/ui/routers/**/*.py: "Web UI API client must require workspace_path query parameter for all endpoints".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ui/test_proof_config.py` around lines 38 - 54, The tests currently bypass the router's workspace_path query parameter by overriding get_v2_workspace to return the workspace object directly; instead add a small request helper on the TestClient that always includes workspace_path (derived from test_workspace, e.g. test_workspace.path or test_workspace.workspace_path) and use that helper across the file. Concretely: in the test_client fixture (which references get_v2_workspace, proof_v2.router, TestClient and test_workspace) keep the dependency override if needed but attach a helper like client.request_with_workspace(method, path, *, params=None, **kwargs) that merges params with {'workspace_path': <test_workspace path>} and calls client.request(method, path, params=merged, **kwargs); replace direct client.get/post calls in this test module with that helper so every request includes workspace_path.

- PRODUCT_ROADMAP.md: Phase 5.1 → ✅ Complete - CLAUDE.md: add Phase 5.1 to the recent-progress note, bump date, point next-up to 5.2–5.5 - Proof9DefaultsTab/WorkspaceConfigTab: stop destructuring isLoading (unused after the load/error guard reorder in the previous commit)

claude · 2026-05-11T18:14:49Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs

Good implementation overall. All acceptance criteria appear met and test coverage is solid. Prior CodeRabbit reviews flagged a few things; here is where things stand plus new observations.

What has been resolved

The first CodeRabbit review's critical issue — cached passed in run_proof_endpoint ignoring warn-mode semantics — is properly fixed. The endpoint now reads persisted_run.overall_passed rather than recomputing from raw gate results. The regression test in TestRunCachePassedSemantics locks it in.

Still open from previous CodeRabbit review

_helpers.py — fixed temp filename is a concurrent-write race

tmp = path.with_suffix(path.suffix + ".tmp")  # same name for every caller

Two simultaneous saves to the same config file collide on the temp path. Low probability for a settings endpoint, but easy to fix with tempfile.mkstemp to generate a unique path per write.

New observations

1. Empty enabled_gates produces a vacuously-passing proof run

When all gates are disabled (enabled_gates: []), every obligation is filtered out, executed = [], and all([]) == True, so the run records as passing with zero evidence. The API correctly accepts an empty list (test_put_allows_empty_gate_list) but the runner silently marks runs green with no gates executed.

Suggestion: log a warning when all obligations are filtered out, or add a test that explicitly documents vacuous-pass as intentional.

2. workspace_root field is editable but has no effect

The workspace_v2.py docstring says editing does not relocate the workspace, but the form label reads "Absolute path on the host where this workspace lives" — a user will expect editing it to move the workspace. The current description does not convey that saving is inert.

Suggestion: either make the field read-only in the UI (disabled Input) or add an inline note such as "Display only — workspace path is fixed at init time."

3. Config filename constant is duplicated

PROOF_CONFIG_FILENAME lives in codeframe/core/proof/models.py and is shared by runner and router. But _WORKSPACE_CONFIG_FILENAME = "workspace_config.json" is local to workspace_v2.py. Fine today since no runner reads it, but inconsistent with the proof-config pattern. Worth moving to a core module if a future runner needs it.

4. Missing blank line before _GATE_TO_CORE in runner.py

The new _load_proof_config function has no trailing blank line before the module-level _GATE_TO_CORE dict. Cosmetic, but ruff E302 may flag it.

Test coverage

Good: defaults, round-trip, corrupt-file fallback, gate validation, strictness semantics, and the warn-mode cache regression are all exercised. Two gaps:

No test for empty enabled_gates + runner — documents whether vacuous-pass is intentional.
test_proof_config.py and test_workspace_config.py override get_v2_workspace and skip the workspace_path query param entirely. Tests work, but the real dependency resolution path is not exercised. Low risk, worth a comment in the fixture.

Summary

The feature is solid. Two items worth addressing before merge: the temp-file race in atomic_write_json, and making workspace_root clearly display-only in the UI. The empty-gates vacuous-pass should have either a logged warning or a test confirming the behavior is intentional.

- _helpers.atomic_write_json now uses tempfile.mkstemp for a unique per-call temp filename so concurrent writers do not collide on a shared ".tmp" suffix. - workspace_root input is now readOnly + disabled in the UI with a clear "Display only" note; the field served no functional purpose and the previous label invited the user to expect a relocation. - proof runner logs a warning when all gates are disabled, so the vacuous-pass case (zero obligations executed, run reports pass) is not silent. Test in tests/core/test_proof_runner_config.py locks in the documented behavior.

frankbria · 2026-05-11T18:30:43Z

Round 3: addressed claude review in 13f18e4.

Temp-file race: atomic_write_json now uses tempfile.mkstemp for a per-call unique temp name, so concurrent writers can't collide on a shared .tmp suffix.
workspace_root clearly display-only: input is now readOnly + disabled with the helper text "Display only — workspace path is fixed at init time."
Vacuous-pass warning: runner logs a warning when all 9 gates are disabled in proof_config.json. New test TestEmptyEnabledGates in tests/core/test_proof_runner_config.py documents and locks in the intentional behavior (zero gates run → vacuous pass, warning emitted).

Deferred (cleanup follow-ups, not blockers):

DRY test_workspace fixture across tests/ui/ — existing pattern already replicated in several other files; would be a separate cleanup PR.
Moving _WORKSPACE_CONFIG_FILENAME to a core module — no core consumer today.

All 109 backend + 30 frontend settings tests pass; ruff clean; build clean.

claude · 2026-05-11T18:33:18Z

Code Review — PR #589: feat(settings): PROOF9 defaults + workspace config tabs

This is a clean, well-scoped implementation of Phase 5.1 (#556). The layering is correct (headless core constants, thin router adapters, SWR-backed UI), tests are comprehensive, and the atomic-write helper solves the right problem. A few items worth addressing before merge:

Issues

1. Missing blank line in runner.py (PEP 8 / E302)

After _load_proof_config, there is only one blank line before # Map PROOF9 gates to existing core/gates.py gate names. PEP 8 and ruff's E302 require two blank lines between top-level definitions. The PR states ruff check . passes, so this may be fine depending on ruff config, but it's worth confirming.

    return enabled, strictness
                                    ← only one blank line here
# Map PROOF9 gates to existing core/gates.py gate names
_GATE_TO_CORE: dict[Gate, str] = {

2. _WORKSPACE_CONFIG_FILENAME is a router-local constant

PROOF_CONFIG_FILENAME lives in codeframe/core/proof/models.py and is shared between the router and runner.py. But _WORKSPACE_CONFIG_FILENAME = "workspace_config.json" is defined only in workspace_v2.py. This is fine as long as workspace config is only read by the router, but it creates an inconsistency: if a core module ever needs to read workspace config (similar to how runner.py reads proof config), the constant would need to be lifted to a shared module. Worth at least a note or moving it to core/workspace.py for symmetry.

3. Error message in _validate_gates uses alphabetical sort instead of canonical order

raise ValueError(
    f"Unknown gate(s): {unknown}. Valid: {sorted(_VALID_GATES)}"
)

sorted(_VALID_GATES) produces ['a11y', 'contract', 'demo', 'e2e', 'manual', 'perf', 'sec', 'unit', 'visual'] — alphabetical, not the PROOF9 canonical order. The canonical order is already available as PROOF9_GATE_ORDER. Suggestion:

raise ValueError(
    f"Unknown gate(s): {unknown}. Valid: {list(PROOF9_GATE_ORDER)}"
)

This keeps the error consistent with the order users see everywhere else in the UI.

4. Untested fallback path in run_proof_endpoint

persisted_run = get_run(workspace, run_id)
if persisted_run is not None:
    passed = persisted_run.overall_passed
else:
    # Fallback only if persistence unexpectedly failed
    passed = all(...)

The else branch has no test coverage. TestRunCachePassedSemantics in test_proof_config.py correctly covers the happy path, but if save_run() ever silently failed, the fallback would recompute passed without strictness awareness and surface a different result. Not high risk, but a single test asserting the happy path exists (which it does) doesn't protect the fallback from regressing. Worth adding a test or an inline comment noting the fallback bypasses strictness.

Observations (no action required)

isDirty in Proof9DefaultsTab: The combination of length !== length guard + one-way .some() membership check is mathematically correct (set equality via length + subset). Good.
Vacuous-pass warning: The explicit logger.warning when all gates are disabled is a good UX safeguard that's easy to miss in the test output — the test covers it with caplog, which is exactly right.
atomic_write_json: Correct pattern — mkstemp in the same directory avoids cross-device os.replace errors; exception handler cleans up the temp file. Solid.
SWR cache key is [prefix, workspacePath]: Scoped per-workspace, so switching workspaces gets fresh data. Good.
canSave guard in WorkspaceConfigTab: The extra workspace_root.trim() !== '' and default_branch.trim() !== '' checks match the backend's min_length=1 validators. Correct pairing.
Test fixture duplication: Both test_proof_config.py and test_workspace_config.py define identical test_workspace fixtures. A shared conftest under tests/ui/ would reduce duplication — not a blocker, but worth noting for the next PR touching this area.

Summary

The core logic (runner integration, atomic write, endpoint design) is sound and the test coverage is good. The three actionable items above are all minor — the missing blank line and constant placement are cosmetic/structural, and the error message sort is a small UX fix. None are blocking.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/core/test_proof_runner_config.py (1)

62-64: ⚡ Quick win

Use the shared config filename constant in tests

These tests hardcode "proof_config.json" in multiple places. Import and use PROOF_CONFIG_FILENAME to keep tests aligned with the runner/model source of truth and avoid drift if the filename changes.

Suggested refactor

 from codeframe.core.proof.models import (
     Gate,
     Obligation,
+    PROOF_CONFIG_FILENAME,
     Requirement,
     RequirementScope,
     ReqStatus,
     Severity,
     Source,
 )
@@
-        (workspace.state_dir / "proof_config.json").write_text(
+        (workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(
             json.dumps({"enabled_gates": ["unit"], "strictness": "strict"})
         )
@@
-        (workspace.state_dir / "proof_config.json").write_text(
+        (workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(
             json.dumps({"enabled_gates": [], "strictness": "strict"})
         )
@@
-        (workspace.state_dir / "proof_config.json").write_text(
+        (workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(
             json.dumps({"enabled_gates": ["unit"], "strictness": "strict"})
         )
@@
-        (workspace.state_dir / "proof_config.json").write_text(
+        (workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(
             json.dumps({"enabled_gates": ["unit"], "strictness": "warn"})
         )

Also applies to: 98-100, 130-132, 150-152

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_proof_runner_config.py` around lines 62 - 64, Replace
hardcoded "proof_config.json" usages in tests with the shared
PROOF_CONFIG_FILENAME constant: import PROOF_CONFIG_FILENAME from the module
that defines it (the proof runner/model module) and use (workspace.state_dir /
PROOF_CONFIG_FILENAME).write_text(...) and any path constructions or reads
instead of string literals; update all occurrences (the ones around the current
blocks and the ones noted at 98-100, 130-132, 150-152) so tests use the single
source-of-truth constant.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@codeframe/core/proof/runner.py`:
- Around line 39-47: The code assumes parsed JSON "data" is a dict and calls
data.get(...), which will crash if the JSON root is not an object; modify the
logic in runner.py (around the variables data, gates_raw, enabled_gates and
PROOF_CONFIG_FILENAME) to first verify isinstance(data, dict) before using
data.get, and if not a dict treat the config as invalid (log a warning similar
to the existing one) and fall back to the defaults (return None, "strict" or the
existing default path) so enabled remains None and downstream code won’t raise
AttributeError.

In `@web-ui/src/components/settings/WorkspaceConfigTab.tsx`:
- Around line 67-76: The code validates trimmed strings in canSave but passes
raw draft values to workspaceConfigApi.updateConfig, risking persisting
whitespace; update handleSave to normalize inputs before sending by trimming
draft.workspace_root, draft.default_branch, and draft.tech_stack_override (use
the trimmed values or default to '' for override) so the payload to
workspaceConfigApi.updateConfig contains trimmed/normalized strings; ensure
canSave still uses the same trimmed checks so validation and persisted data
match.

---

Nitpick comments:
In `@tests/core/test_proof_runner_config.py`:
- Around line 62-64: Replace hardcoded "proof_config.json" usages in tests with
the shared PROOF_CONFIG_FILENAME constant: import PROOF_CONFIG_FILENAME from the
module that defines it (the proof runner/model module) and use
(workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(...) and any path
constructions or reads instead of string literals; update all occurrences (the
ones around the current blocks and the ones noted at 98-100, 130-132, 150-152)
so tests use the single source-of-truth constant.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7461eda8-e8cc-4076-bc8e-b1342797cb3e

📥 Commits

Reviewing files that changed from the base of the PR and between 6a45ec2 and 13f18e4.

📒 Files selected for processing (4)

codeframe/core/proof/runner.py
codeframe/ui/routers/_helpers.py
tests/core/test_proof_runner_config.py
web-ui/src/components/settings/WorkspaceConfigTab.tsx

🚧 Files skipped from review as they are similar to previous changes (1)

codeframe/ui/routers/_helpers.py

coderabbitai · 2026-05-11T18:37:49Z

+    try:
+        data = json.loads(path.read_text())
+    except (OSError, json.JSONDecodeError) as exc:
+        logger.warning("Invalid %s — using defaults: %s", PROOF_CONFIG_FILENAME, exc)
+        return None, "strict"
+
+    enabled: Optional[set[Gate]] = None
+    gates_raw = data.get("enabled_gates")
+    if isinstance(gates_raw, list):


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle non-object JSON configs to avoid runtime crashes

At Line 46, data.get(...) assumes the parsed JSON is a dict. If proof_config.json is valid JSON but not an object (e.g., []), this raises AttributeError and aborts the run instead of falling back to defaults.

Suggested fix

try: data = json.loads(path.read_text()) except (OSError, json.JSONDecodeError) as exc: logger.warning("Invalid %s — using defaults: %s", PROOF_CONFIG_FILENAME, exc) return None, "strict" + if not isinstance(data, dict): + logger.warning( + "Invalid %s shape (expected JSON object) — using defaults", + PROOF_CONFIG_FILENAME, + ) + return None, "strict" enabled: Optional[set[Gate]] = None gates_raw = data.get("enabled_gates")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try:

data = json.loads(path.read_text())

except (OSError, json.JSONDecodeError) as exc:

logger.warning("Invalid %s — using defaults: %s", PROOF_CONFIG_FILENAME, exc)

return None, "strict"

enabled: Optional[set[Gate]] = None

gates_raw = data.get("enabled_gates")

if isinstance(gates_raw, list):

try:

data = json.loads(path.read_text())

except (OSError, json.JSONDecodeError) as exc:

logger.warning("Invalid %s — using defaults: %s", PROOF_CONFIG_FILENAME, exc)

return None, "strict"

if not isinstance(data, dict):

logger.warning(

"Invalid %s shape (expected JSON object) — using defaults",

PROOF_CONFIG_FILENAME,

)

return None, "strict"

enabled: Optional[set[Gate]] = None

gates_raw = data.get("enabled_gates")

if isinstance(gates_raw, list):

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codeframe/core/proof/runner.py` around lines 39 - 47, The code assumes parsed JSON "data" is a dict and calls data.get(...), which will crash if the JSON root is not an object; modify the logic in runner.py (around the variables data, gates_raw, enabled_gates and PROOF_CONFIG_FILENAME) to first verify isinstance(data, dict) before using data.get, and if not a dict treat the config as invalid (log a warning similar to the existing one) and fall back to the defaults (return None, "strict" or the existing default path) so enabled remains None and downstream code won’t raise AttributeError.

coderabbitai · 2026-05-11T18:37:49Z

+  const canSave = dirty && !saving && draft.workspace_root.trim() !== '' && draft.default_branch.trim() !== '';
+
+  const handleSave = async () => {
+    setSaving(true);
+    try {
+      const saved = await workspaceConfigApi.updateConfig(workspacePath, {
+        workspace_root: draft.workspace_root,
+        default_branch: draft.default_branch,
+        auto_detect_tech_stack: draft.auto_detect_tech_stack,
+        tech_stack_override: draft.tech_stack_override,


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist normalized text values, not raw draft strings.

Line 67 validates default_branch with trim(), but Line 74 sends the untrimmed value; similarly Line 76 can persist whitespace-only override text. This can store invalid config and cause downstream branch resolution issues.

Suggested fix

const handleSave = async () => { setSaving(true); try { + const normalizedDefaultBranch = draft.default_branch.trim(); + const normalizedTechStackOverride = + draft.tech_stack_override?.trim() || null; + const saved = await workspaceConfigApi.updateConfig(workspacePath, { workspace_root: draft.workspace_root, - default_branch: draft.default_branch, + default_branch: normalizedDefaultBranch, auto_detect_tech_stack: draft.auto_detect_tech_stack, - tech_stack_override: draft.tech_stack_override, + tech_stack_override: normalizedTechStackOverride, });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const canSave = dirty && !saving && draft.workspace_root.trim() !== '' && draft.default_branch.trim() !== '';

const handleSave = async () => {

setSaving(true);

try {

const saved = await workspaceConfigApi.updateConfig(workspacePath, {

workspace_root: draft.workspace_root,

default_branch: draft.default_branch,

auto_detect_tech_stack: draft.auto_detect_tech_stack,

tech_stack_override: draft.tech_stack_override,

const canSave = dirty && !saving && draft.workspace_root.trim() !== '' && draft.default_branch.trim() !== '';

const handleSave = async () => {

setSaving(true);

try {

const normalizedDefaultBranch = draft.default_branch.trim();

const normalizedTechStackOverride =

draft.tech_stack_override?.trim() || null;

const saved = await workspaceConfigApi.updateConfig(workspacePath, {

workspace_root: draft.workspace_root,

default_branch: normalizedDefaultBranch,

auto_detect_tech_stack: draft.auto_detect_tech_stack,

tech_stack_override: normalizedTechStackOverride,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web-ui/src/components/settings/WorkspaceConfigTab.tsx` around lines 67 - 76, The code validates trimmed strings in canSave but passes raw draft values to workspaceConfigApi.updateConfig, risking persisting whitespace; update handleSave to normalize inputs before sending by trimming draft.workspace_root, draft.default_branch, and draft.tech_stack_override (use the trimmed values or default to '' for override) so the payload to workspaceConfigApi.updateConfig contains trimmed/normalized strings; ensure canSave still uses the same trimmed checks so validation and persisted data match.

- Add the missing PEP 8 blank line between _load_proof_config and the _GATE_TO_CORE module-level mapping. - _validate_gates error message now lists gates in PROOF9_GATE_ORDER (the same order the UI shows) instead of sorted alphabetically. - Comment on the fallback branch in run_proof_endpoint documenting why the raw recompute is acceptable there.

frankbria · 2026-05-11T18:47:45Z

Round 4: addressed the three minor items in 1e77321.

Blank line between _load_proof_config and _GATE_TO_CORE (PEP 8 E302).
_validate_gates error message now lists gates in PROOF9_GATE_ORDER instead of alphabetical.
Inline comment on the fallback branch in run_proof_endpoint documenting why the raw recompute is acceptable there (it only fires if save_run silently failed — a deeper bug we'd want to surface).

Deferred (per reviewer: not blocking): move _WORKSPACE_CONFIG_FILENAME to a core module — no core consumer today; DRY test_workspace fixture across tests/ui/ — existing pattern already in 3+ other files.

claude · 2026-05-11T18:51:16Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs (Phase 5.1)

Summary: Solid implementation. Atomic writes, clean strictness semantics, good test coverage across all three layers. A few targeted items below.

Architecture: `_WORKSPACE_CONFIG_FILENAME` is in the wrong layer

File: codeframe/ui/routers/workspace_v2.py

PROOF_CONFIG_FILENAME correctly lives in codeframe/core/proof/models.py, which is why runner.py (core) can import it without touching UI code. _WORKSPACE_CONFIG_FILENAME is buried in workspace_v2.py (UI layer). If any core module ever needs to read workspace config (the natural next step, given the runner already reads proof config), it would have to import from the UI layer — breaking Architecture Rule 1.

Suggest moving it to codeframe/core/workspace.py or alongside PROOF_CONFIG_FILENAME in a shared constants file before this pattern proliferates.

Silent gate-name drops in `_load_proof_config`

File: codeframe/core/proof/runner.py

Unknown gate names in enabled_gates are silently skipped. If a user has a typo like "unittest" instead of "unit", enabled becomes an empty set, triggers the vacuous-pass warning, and runs nothing — but the warning message says "all 9 gates are disabled" which is confusing when a typo is the real cause. A targeted warning per unknown name would help:

if name not in valid_values:
    logger.warning("Unknown gate name '%s' in proof_config.json — skipped", name)

`workspace_root` is writeable via API but has no server-side effect

File: codeframe/ui/routers/workspace_v2.py, UpdateWorkspaceConfigRequest

The PUT endpoint accepts workspace_root and persists it, but the docstring notes it is display-only. An API client could reasonably think PUT-ing a new path relocates the workspace. Options: (a) drop workspace_root from the request model and derive it from the live workspace in the response, or (b) return 400 if the sent value differs from workspace.repo_path. Current shape works but is a footgun for API consumers.

Minor: `Proof9DefaultsTab` Save does not guard against all-gates-disabled

File: web-ui/src/components/settings/Proof9DefaultsTab.tsx

WorkspaceConfigTab blocks Save when default_branch is empty. Proof9DefaultsTab has no guard against saving with zero gates selected. Since the runner already emits a warning, this may be intentional, but a UI-level warning when enabled_gates is empty would improve discoverability.

Minor: duplicated `test_workspace` fixture

tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py both define an identical test_workspace fixture. A shared conftest.py fixture would eliminate the duplication.

What is well done

Atomic writes in _helpers.atomic_write_json — correct pattern (unique temp name + os.replace), proper cleanup on failure.
Strictness semantics — overall_passed is read back from the persisted run record rather than recomputed, so warn-mode is consistent end-to-end.
Vacuous-pass warning — good defensive log before the obligation loop.
Test coverage — TestRunCachePassedSemantics in test_proof_config.py is exactly the right regression test for the strictness/cache fix.
isDirty in both tabs — length check before the set-subset check correctly handles both additions and removals.
canSave in WorkspaceConfigTab — client-side validation mirrors the min_length=1 server constraint.

The architecture concern is the most important item to address before merge (to keep the headless-core contract intact). The rest are improvements worth tracking.

- Move WORKSPACE_CONFIG_FILENAME from workspace_v2.py (UI layer) to codeframe/core/workspace.py alongside CODEFRAME_DIR. Symmetric with PROOF_CONFIG_FILENAME in core/proof/models.py and keeps the door open for a core consumer without violating the headless-core rule. - _load_proof_config logs a targeted warning for each unknown gate name in proof_config.json, so a typo like "unittest" no longer masquerades as "all 9 gates are disabled".

frankbria · 2026-05-11T19:39:51Z

Round 5 in 2f81b20:

Architecture (Add Claude Code GitHub Workflow #1): WORKSPACE_CONFIG_FILENAME moved from codeframe/ui/routers/workspace_v2.py to codeframe/core/workspace.py next to CODEFRAME_DIR. Symmetric with PROOF_CONFIG_FILENAME in core/proof/models.py — any future core consumer can read it without importing UI.
Unknown gate names (Brainstorming: Integrate remaining general concepts into specification #2): _load_proof_config now logs a targeted warning per unknown gate name in proof_config.json. A typo like "unittest" now logs "Unknown gate name 'unittest' — skipped" instead of silently devolving into the all-disabled vacuous-pass path.

Deferred (low priority): workspace_root PUT footgun (input is already readOnly in the UI, mitigating the primary surface); zero-gates Save guard in the tab (backend already warns); fixture DRY in tests/ui/.

This is the last round of feedback addressed. Merging once CI is green.

claude · 2026-05-11T19:42:28Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config Tabs

Overall this is a clean, well-scoped implementation. The architecture is sound: atomic writes, graceful fallbacks, proper separation between core runner logic and the API layer, and solid test coverage across both Python and TypeScript. Previous CodeRabbit issues (concurrent mkstemp, passed/overall_passed strictness alignment) were correctly addressed.

Remaining item from prior reviews

Hardcoded "proof_config.json" in tests (flagged in the last CodeRabbit pass, not yet resolved)

tests/core/test_proof_runner_config.py writes "proof_config.json" directly in multiple places. If PROOF_CONFIG_FILENAME ever changes, these tests silently become wrong. Use the constant:

# Before
(workspace.state_dir / "proof_config.json").write_text(...)

# After
from codeframe.core.proof.models import PROOF_CONFIG_FILENAME
(workspace.state_dir / PROOF_CONFIG_FILENAME).write_text(...)

New observations

1. workspace_root staleness in WorkspaceConfigResponse

GET /api/v2/workspaces/config returns the value stored in workspace_config.json, not the live workspace.repo_path. If the workspace is later moved or the file predates a path change, the stored value drifts silently. Since the UI makes this field read-only, the persisted value never gets corrected either.

Consider always overriding workspace_root from the live workspace on GET:

config = WorkspaceConfigResponse(**data)
# Ensure workspace_root always reflects reality
config.workspace_root = str(workspace.repo_path)
return config

Or at minimum, ignore the stored workspace_root on read and inject it from workspace.repo_path. The PUT already notes it's "display-only" — storing it at all buys little.

2. Parallel JSON parsing between runner._load_proof_config and proof_v2.get_proof_config

Both read and validate proof_config.json independently with different code paths. If the schema evolves (e.g., new field, renamed key), both need updating. The runner needs to stay headless per architecture rules, so a shared helper in codeframe/core/proof/ (not importing from ui/) would be the clean fix. Not blocking, but worth tracking as the config schema grows.

3. Dead validation in WorkspaceConfigTab.canSave

const canSave = dirty && !saving && draft.workspace_root.trim() !== '' && draft.default_branch.trim() !== '';

workspace_root is readOnly + disabled in the UI, so draft.workspace_root.trim() !== '' can never be false for a live workspace. The check is harmless but can be simplified to remove the dead branch:

const canSave = dirty && !saving && draft.default_branch.trim() !== '';

4. Optional type annotation missing import in runner.py

_load_proof_config returns tuple[Optional[set[Gate]], str] but the Optional import may not be present at the top of the file (the diff adds from typing import lines only selectively). Verify Optional is imported; if it isn't, use set[Gate] | None (Python 3.10+ union syntax, consistent with the rest of the codebase).

Minor / non-blocking

tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py duplicate the test_workspace fixture. Moving it to a conftest.py would reduce copy-paste drift.
web-ui/src/lib/proof.ts's SYNC: comment is the right mitigation for the frontend/backend gate-name duplication. Consider adding a backend route (or build step) to emit the gate list dynamically if the gate count grows frequently — but the comment approach is fine for now.
The vacuous-pass warning in runner.py hardcodes "all 9 gates" in the log message. Since it fires only when enabled_gates is an empty set, the count is always correct, but a dynamic len(Gate) reference would make it self-maintaining.

Summary: The core implementation is solid and CI-clean. The main actionable items are (1) using PROOF_CONFIG_FILENAME in tests (carryover from the last review), and (2) the workspace_root staleness design question. Everything else is cosmetic.

- workspace_root in GET /workspaces/config is now sourced from the live workspace.repo_path on every read; any value previously stored in workspace_config.json is ignored. This closes a subtle drift bug: if the workspace ever moves, the persisted value would be silently stale, but the UI's readOnly input never lets the user correct it. Regression test test_workspace_root_always_reflects_live_path locks this in. - tests/core/test_proof_runner_config.py now uses PROOF_CONFIG_FILENAME instead of the hardcoded "proof_config.json" string, removing the silent test-vs-implementation drift risk flagged in the previous review.

frankbria · 2026-05-11T19:52:01Z

Round 6 in e794150 — addressing the two highest-value items from the latest review:

workspace_root staleness: GET /api/v2/workspaces/config now sources workspace_root from the live workspace.repo_path on every read. Any value previously persisted in workspace_config.json is ignored. Closes the drift scenario where a workspace moved + UI readOnly input could never correct the stale stored value. New regression test test_workspace_root_always_reflects_live_path.
PROOF_CONFIG_FILENAME in tests: tests/core/test_proof_runner_config.py now uses the constant instead of the hardcoded string.

Skipped (low priority / cosmetic):

Lifting JSON parsing into a shared core helper — schema is currently 2 fields, would be over-engineering today.
workspace_root.trim() !== '' dead branch in canSave — harmless, and removing it adds churn without changing behavior.
Vacuous-pass warning hard-coded "9" — accurate since enabled_gates is empty; len(Gate) swap is trivial future polish.

This should be the final round. Merging when CI lands green.

claude · 2026-05-11T19:55:30Z

PR 589 Review - feat(settings): PROOF9 defaults + workspace config tabs

Overall: solid implementation. The core mechanics are all well-executed. A few items worth addressing before merge.

ISSUES

Duplicate gate names accepted by PUT /api/v2/proof/config

UpdateProofConfigRequest._validate_gates only rejects unknown gate names, not duplicates. A payload with enabled_gates containing duplicate values (e.g., unit twice) passes validation, gets persisted verbatim, and is returned by GET as-is. The UI de-dupes via enabledSet so visually fine, but the stored file is inconsistent.

Fix: add a dedup step in proof_v2.py after the unknown-gates check - use dict.fromkeys(v) to preserve order while removing dupes, then return list(seen).

Deferred imports inside test function bodies (tests/core/test_proof_runner_config.py lines 613, 624, 651, 672)

import logging and from codeframe.core.proof.ledger import get_run calls inside test methods should be at the module top-level. Not a correctness bug, but unusual and ruff will flag it once import-ordering rules are enabled.

Vacuous-pass is invisible in the UI

When all 9 gates are unchecked the runner records overall_passed=True with zero evidence; the warning goes to server logs only. Consider an inline warning banner in Proof9DefaultsTab when draft.enabled_gates.length === 0. Usability concern, not a blocker, but mirrors the backend warning.

MINOR OBSERVATIONS (non-blocking)

web-ui/src/lib/proof.ts PROOF9_GATES is a hand-maintained copy of the Python Gate enum. The SYNC comment is good but easy to miss. A CI check would prevent silent drift if gates are ever added/removed.
update_workspace_config stores the submitted workspace_root on disk even though GET always overrides it with the live path. Stripping it before writing (since it is display-only) would make the stored file less confusing to inspect manually.
run_proof_endpoint now does an extra get_run DB read per run to read back the persisted overall_passed. The inline comment explains the trade-off well. Acceptable for now; worth revisiting if proof-run latency becomes a concern.
_default_workspace_config hardcodes main as the branch default. Reading actual git HEAD would be better in future. Not in scope for this PR.

WHAT IS DONE WELL

Atomic JSON writes with unique temp names and cleanup on failure - correct.
Runner reads config directly from the filesystem (headless, no FastAPI import) - follows architecture rule 1.
_load_proof_config returns (None, strict) for a missing file vs (set(), strict) for empty enabled_gates - the no-filter vs filter-all distinction is semantically correct and tested.
Pydantic Literal strictness validation catches invalid values at the API boundary cleanly.
Frontend isDirty preserves canonical gate order via PROOF9_GATES.filter.
Thorough test coverage: defaults, round-trips, corrupted-JSON fallback, vacuous pass, strictness semantics, and the warn-mode run-cache regression.
Docs and roadmap updated in the same PR.

- PUT /api/v2/proof/config now de-dupes enabled_gates while preserving submission order, so the persisted file never carries the same gate twice. Test: test_put_dedupes_gates. - Move in-function imports (logging, get_run, save_requirement, model classes) to module top-level in tests/core/test_proof_runner_config.py and tests/ui/test_proof_config.py — consistent with the rest of the test suite and friendly to future import-ordering lint rules. - Proof9DefaultsTab now shows an inline yellow warning banner when no gates are enabled, mirroring the runner's server-side warning so the vacuous-pass case is visible to the user. Tests added.

frankbria · 2026-05-11T19:59:51Z

Round 7 in 62d51f3:

Duplicate gate names: UpdateProofConfigRequest._validate_gates now de-dupes via list(dict.fromkeys(v)) (order-preserving). {"enabled_gates": ["unit", "sec", "unit", "e2e", "sec"]} is persisted as ["unit", "sec", "e2e"]. Test: test_put_dedupes_gates.
Deferred imports in tests: Moved import logging, get_run, save_requirement, and the proof model classes to module top-level in tests/core/test_proof_runner_config.py and tests/ui/test_proof_config.py.
Vacuous-pass UI banner: Proof9DefaultsTab now shows an inline yellow role="alert" banner when draft.enabled_gates.length === 0, mirroring the server-side warning. Two new tests cover the show/hide behavior.

claude · 2026-05-11T20:02:59Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config Tabs

Overall: This is a well-structured, complete feature addition. The layering (shared constants in core/proof/models.py, atomic helper in _helpers.py, clean REST endpoints, SWR-based UI components) is consistent with the project's architecture rules, and test coverage is genuinely thorough. A few items worth addressing before merge:

Findings

1. workspace_root is validated, stored, but always discarded on read

In workspace_v2.py, UpdateWorkspaceConfigRequest validates workspace_root with min_length=1 and persists it via body.model_dump(). But get_workspace_config always overrides the stored value with workspace.repo_path. This means a client can PUT an arbitrary path, see it accepted with 200, then GET it back with a different value — which is misleading API behaviour.

The simplest fix is to strip the field before persisting:

payload = body.model_dump(exclude={'workspace_root'})
payload['workspace_root'] = str(workspace.repo_path)  # keep it authoritative
atomic_write_json(_workspace_config_path(workspace), payload)

The UI already disables the field and the docstring explains the semantics — but the fix makes storage consistent with the read behaviour.

2. Vacuous-pass (all gates disabled) is silent at the API level

When enabled_gates=[], run_proof() emits a warning to the logger, but the RunProofResponse returned by run_proof_endpoint has no indication of this condition. The Phase 4A merge gate will need to distinguish "passed because all checks ran" from "passed vacuously because nothing ran".

Consider adding a vacuous_pass: bool field to RunProofResponse (or ProofRun) now, while the plumbing is fresh, even if the merge gate doesn't consume it yet. At minimum, document the gap as a known limitation in docs/PRODUCT_ROADMAP.md under 4A.

3. Test fixtures: prefer tmp_path over manual tempfile.mkdtemp()

Both tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py use manual mkdtemp + shutil.rmtree. Pytest's built-in tmp_path fixture handles lifetime and cleanup automatically. The tests/core/test_proof_runner_config.py already uses tmp_path correctly — the two UI test files can follow the same pattern.

4. atomic_write_json: no fsync before os.replace

Correct for the common case. For crash-safe durability you'd add f.flush(); os.fsync(f.fileno()) before closing the fd. Acceptable as-is for config files where a crash-during-write just falls back to defaults — but worth noting for anyone who copies this helper for more critical data.

Positives worth highlighting

Atomic writes via mkstemp + os.replace with per-call unique temp names and exception-path cleanup is exactly the right approach.
enabled_gates=None vs empty set distinction in _load_proof_config cleanly separates "no config file" (all gates allowed) from "explicitly disabled all gates" — and the vacuous-pass warning is appropriately loud.
get_run(workspace, run_id) read-back in run_proof_endpoint to pick up the strictness-aware overall_passed is the right fix; TestRunCachePassedSemantics is a valuable regression guard.
Gate order preservation in toggleGate via PROOF9_GATES.filter(...) keeps the stored list in canonical order regardless of UI interaction sequence.
workspace_root live-override on GET is correctly tested by test_workspace_root_always_reflects_live_path.
SYNC comment in web-ui/src/lib/proof.ts proactively flags the manual sync requirement between the Python Gate enum and the TS constants.

Checklist summary

Item	Status
Architecture rules (core headless, CLI-first)	✅
Rate-limit decorators on new endpoints	✅
Atomic file writes	✅
Validation (gate names, strictness enum)	✅
Fallback to defaults on corrupt config	✅
Tests: backend (16 new) + frontend (14 new)	✅
`workspace_root` PUT/GET round-trip consistency	⚠️ See #1
Vacuous-pass surfaced in API response	⚠️ See #2
Test fixture cleanup (minor)	💡 See #3

Items #1 and #2 are the ones I'd want resolved before merge; #3 and #4 are suggestions.

- PUT /api/v2/workspaces/config now strips workspace_root from the client request and stores str(workspace.repo_path) instead, so PUT and GET return the same value and a client cannot persist a path that differs from the live workspace. Regression test: test_put_ignores_client_workspace_root. - Document the vacuous-pass API surface gap as a known follow-up under Phase 4A in PRODUCT_ROADMAP.md (the merge gate is the right place to consume a vacuous_pass flag; adding the field here would be speculative API surface for #556).

frankbria · 2026-05-11T20:10:49Z

Round 8 in 46f072a — addressing the two pre-merge items from the latest review:

Add Claude Code GitHub Workflow #1 workspace_root PUT/GET consistency: PUT now drops the client-sent workspace_root (model_dump(exclude={"workspace_root"})) and stores str(workspace.repo_path) instead. PUT and GET now return the same value, and a client cannot persist a path that differs from the live workspace. Test: test_put_ignores_client_workspace_root.
Brainstorming: Integrate remaining general concepts into specification #2 vacuous-pass surfaced in API: documented as a known follow-up under Phase 4A in docs/PRODUCT_ROADMAP.md. Adding vacuous_pass to RunProofResponse now would be speculative API surface — the merge gate that consumes it is the right place to introduce the field, and it ships in Phase 4A. The runner warning + Settings page banner already surface the condition to humans today.

Skipping (explicitly flagged "suggestions" by reviewer):

feat(sprint-4): Multi-Agent Coordination System - P0 & P1 Complete #3 tmp_path instead of mkdtemp in two UI tests — cosmetic; existing pattern in many other tests/ui/ files.
feat: Project schema refactoring with source types and workspace management #4 fsync before os.replace — acceptable for config files where a crash mid-write falls back to defaults; documented in the helper docstring.

23 backend + 16 frontend settings tests pass; ruff clean; build clean.

claude · 2026-05-11T20:14:04Z

Follow-up Review: feat(settings): PROOF9 defaults + workspace config tabs (#556)

All items from my previous review have been addressed — thank you for the thorough follow-through. Notes below.

Resolved (from prior review)

✅ Add Claude Code GitHub Workflow #1 Duplicate filename constant — PROOF_CONFIG_FILENAME and PROOF9_GATE_ORDER now live in models.py and are imported by both runner.py and proof_v2.py
✅ Brainstorming: Integrate remaining general concepts into specification #2 Missing OSError in GET handlers — both GET endpoints now catch (OSError, json.JSONDecodeError, ValueError)
✅ feat(sprint-4): Multi-Agent Coordination System - P0 & P1 Complete #3 Non-atomic writes — atomic_write_json in _helpers.py handles temp-file + os.replace correctly, including cleanup on failure
✅ feat: Project schema refactoring with source types and workspace management #4 workspace_root path validation — now documented and enforced: server ignores the client-sent value and always stores/returns str(workspace.repo_path)
✅ feat: Project schema refactoring with API endpoint integration #5 Gate order alignment — _default_proof_config() returns list(PROOF9_GATE_ORDER) which matches the frontend's PROOF9_GATES array
✅ Project Schema Refactoring - Flexible Source Types & Deployment Modes #6 Strictness description — UI copy now correctly says "any gate failure fails the proof run" / "gate failures are recorded but overall_passed stays True"
✅ Improve workspace cleanup in project creation rollback #7 Order-sensitive assertion — test_returns_existing_config now uses set() comparison

Still open

#8 Duplicate test_workspace fixture (unchanged from prior review)

tests/ui/test_proof_config.py and tests/ui/test_workspace_config.py carry identical test_workspace fixtures. Extract to tests/ui/conftest.py when convenient.

New: Pydantic `ValidationError` not caught by `ValueError` in GET endpoints

The GET handlers catch (OSError, json.JSONDecodeError, ValueError). A config file that is valid JSON but has an invalid field value (e.g. {"enabled_gates": ["unit"], "strictness": "bad"}) would cause ProofConfigResponse(**data) to raise a Pydantic ValidationError. In Pydantic v2, ValidationError does not subclass ValueError, so it would propagate as a 500 instead of falling back to defaults.

runner.py avoids this by reading fields manually with data.get(). The routers should do the same, or catch pydantic.ValidationError explicitly:

from pydantic import ValidationError

except (OSError, json.JSONDecodeError, ValueError, ValidationError) as e:

This is a real correctness gap only if Pydantic v2 is in use (v1 makes ValidationError a subclass of ValueError, so the current handler would cover it). Worth confirming and patching accordingly.

Optional observation

The test_workspace fixture in test_proof_config.py does not call init_proof_tables(), but TestRunCachePassedSemantics.test_warn_mode_run_cache_passed_is_true calls save_requirement() which needs proof tables. This works today only if save_requirement auto-initializes tables internally — if that ever changes, the test will fail silently. The test_proof_runner_config.py fixture correctly calls init_proof_tables; aligning test_proof_config.py to do the same would make the dependency explicit.

Bottom line: The Pydantic ValidationError gap is the only new item worth addressing before merge. Everything else is clean — the implementation is solid, well-layered, and well-tested.

- Add ValidationError explicitly to both GET handlers' except clause. In pydantic 2.12.2 ValidationError is still a subclass of ValueError (so the prior code worked), but the explicit catch makes the intent self-documenting and survives any future pydantic-internals change. - Lock in the behavior with two new tests: test_invalid_field_value_falls_back_to_defaults (proof config) and test_invalid_field_type_falls_back_to_defaults (workspace config).

frankbria · 2026-05-11T20:29:03Z

Round 9 in 45c085e — the lone pre-merge item:

ValidationError not caught by ValueError: factual correction first — in pydantic 2.12.2 (the version pinned in this repo) ValidationError IS still a subclass of ValueError, so the prior code was working. That said, I added ValidationError explicitly to both GET handlers to make intent self-documenting and survive any future pydantic-internals change. Two new tests lock in the fallback-on-invalid-field-value behavior: test_invalid_field_value_falls_back_to_defaults (strictness="bogus") and test_invalid_field_type_falls_back_to_defaults (default_branch=123).

Optional observation deferred: save_requirement already initializes proof tables internally so the test_workspace fixture in test_proof_config.py doesn't need to call init_proof_tables explicitly. The cited test passes today.

#8 (DRY test_workspace fixture) explicitly tagged "when convenient" — follow-up.

All 25 backend + 16 frontend settings tests pass; ruff clean; build clean. Will merge when CI lands green.

claude · 2026-05-11T20:31:31Z

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs

Solid implementation. The feature is well-scoped, architecture rules are respected (constants in core, persistence in UI layer, no FastAPI in core), and test coverage is thorough. A few items worth addressing before merge.

Correctness

Vacuous-pass is a merge-gate risk (noted but under-specified)

docs/PRODUCT_ROADMAP.md calls this out as a known follow-up, which is good — but the current gap is worth making explicit: when enabled_gates is empty, RunProofResponse returns success=True, passed=True with an empty results dict. The Phase 4A merge gate will need to distinguish this from a real pass. The warning log + UI banner are helpful guard rails, but the API has no structural signal. A vacuous_pass: bool field on RunProofResponse (defaulting to False) would let the merge gate enforce this without a schema change later. Not a blocker for this PR since the roadmap item is documented, but worth tracking explicitly.

_load_proof_config is called on every run_proof() invocation

# runner.py line ~132
enabled_gates, strictness = _load_proof_config(workspace)

This is fine for the current file-based config. Worth a note: if the runner is ever called in a tight loop (batch mode), each call hits the filesystem. A per-workspace cache invalidated by mtime would be a cheap win later. Not urgent.

Design

workspace_root round-trips through UpdateWorkspaceConfigRequest but is always ignored

The PUT request model requires workspace_root: str with min_length=1, but the handler immediately drops it:

payload = body.model_dump(exclude={"workspace_root"})
payload["workspace_root"] = str(workspace.repo_path)

This is safe, but it's confusing API design — the client must supply a field the server ignores. Consider making it Optional[str] = None on the request model with a docstring note, so clients that omit it get the same behavior without being forced to invent a value. The UI always sends it, but CLI or third-party callers would appreciate not being required to.

Pydantic models for config responses are defined inline in routers

ProofConfigResponse, UpdateProofConfigRequest, WorkspaceConfigResponse, and UpdateWorkspaceConfigRequest are all defined in the router files. Per the project's existing pattern (codeframe/ui/models.py), shared response models belong there. Low-priority but worth a follow-up to keep the routers thin.

Security

Gate name injection is correctly blocked — the _validate_gates field validator rejects anything not in _VALID_GATES. Good.

workspace_root path override is correctly neutralized — the GET always re-sources from workspace.repo_path; the PUT stores the live path regardless of what the client sends. The test_put_ignores_client_workspace_root test explicitly covers the attacker-controlled-path case. Well done.

atomic_write_json — one edge case

If os.replace(tmp_name, path) fails after the file is fully written (cross-device rename on some Linux configs), the cleanup path calls os.unlink(tmp_name). But tmp_name is already fully written — no data is lost, and the target is untouched, so this is safe. However, os.replace raises OSError, which is caught by the broad except Exception, so it will be cleaned up. Confirmed correct — just flagging the mental model.

Test Coverage

Coverage is comprehensive. A few gaps:

test_proof_config.py / test_workspace_config.py use tempfile.mkdtemp() directly — the rest of the test suite uses pytest's tmp_path fixture, which is cleaner and handles cleanup automatically. Minor inconsistency.
No test for the _load_proof_config fallback when strictness is an invalid string in the JSON — the code defaults to "strict", which is correct behavior, but it's not asserted by any test. Easy to add.
TestRunCachePassedSemantics in test_proof_config.py is a great integration regression test. The get_run lookup to verify the persisted overall_passed correctly mirrors strictness semantics is exactly the right thing to test here.

Frontend

isDirty in Proof9DefaultsTab is correct but could be simplified

function isDirty(a: ProofConfigResponse, b: ProofConfigResponse): boolean {
  if (a.strictness !== b.strictness) return true;
  if (a.enabled_gates.length !== b.enabled_gates.length) return true;
  const setA = new Set(a.enabled_gates);
  return b.enabled_gates.some((g) => !setA.has(g));
}

This is logically correct. The length check + set-membership check covers symmetric difference. No change needed, just confirming the logic is sound.

useEffect resets draft on every SWR revalidation

useEffect(() => {
  if (data) {
    setDraft({ ...data, enabled_gates: [...data.enabled_gates] });
  }
}, [data]);

If SWR revalidates in the background while the user has unsaved changes, the draft will be silently reset. This is acceptable for a settings page (low-frequency edits, no concurrent writers), but a user who leaves the tab open long enough might lose in-progress changes. A guard like if (data && !dirty) would prevent this. The current behavior at least isn't data-destroying (the server state wins), but worth noting.

web-ui/src/lib/proof.ts sync risk

The gate list is duplicated from codeframe/core/proof/models.py::Gate. The SYNC comment is the right mitigation. Consider adding a backend test that asserts PROOF9_GATE_ORDER matches a known list — that way a new gate added to the enum produces a test failure as a reminder to update the frontend too.

Summary

Area	Status
Architecture compliance	✅ Core headless, UI thin adapter
Atomic writes	✅ Correct temp-file pattern
Gate filtering + strictness	✅ Correctly wired end-to-end
Security (path override)	✅ Correctly neutralized
Test coverage	✅ Comprehensive — minor gaps noted
`workspace_root` in PUT schema	⚠️ Optional[str] would be cleaner
Vacuous-pass API signal	⚠️ Documented follow-up — needs `vacuous_pass` flag before 4A merge gate
Frontend draft reset on revalidate	⚠️ Low-risk but worth a guard

The two ⚠️ items with the highest impact are the workspace_root request model cleanup and the vacuous_pass flag — both are low-effort and would prevent confusion downstream. The rest are polish.

frankbria · 2026-05-11T20:40:05Z

Round 10: reviewer's items are all explicitly non-blocking — "Not a blocker for this PR", "Not urgent", "Low-priority but worth a follow-up". The vacuous-pass API surface is already documented in the roadmap under Phase 4A. The other items (tmp_path adoption, request-model location, optional workspace_root field, _load_proof_config caching, missing test for invalid-strictness fallback) belong on a follow-up PR — keeping this one focused on the #556 acceptance criteria.

Proceeding with merge.

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread web-ui/src/components/settings/Proof9DefaultsTab.tsx Outdated

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

frankbria merged commit 9c19e0d into main May 11, 2026
11 checks passed

frankbria deleted the feature/issue-556-settings-proof9-workspace branch May 11, 2026 20:40

Conversation

frankbria commented May 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Acceptance criteria

Plan adaptation

Test plan

New tests

Demo (run locally)

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

claude Bot commented May 11, 2026

Review: feat(settings): PROOF9 defaults + workspace config tabs (#556)

Issues

Minor

What looks good

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

frankbria commented May 11, 2026

Uh oh!

claude Bot commented May 11, 2026

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs

Agreeing with CodeRabbit's two flags

Additional observations

What's solid

Uh oh!

frankbria commented May 11, 2026

Uh oh!

claude Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Follow-up Review

One open item from the previous review

One new minor observation

Summary

This is in good shape. The atomic-write and OSError fixes address the two most important correctness issues from the prior review. The remaining items are low-priority cleanup. Ready to merge once CI is green.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 11, 2026

Code Review — PR #589: PROOF9 Defaults + Workspace Config tabs

What has been resolved

Still open from previous CodeRabbit review

New observations

Test coverage

Summary

Uh oh!

frankbria commented May 11, 2026

Uh oh!

claude Bot commented May 11, 2026

Code Review — PR #589: feat(settings): PROOF9 defaults + workspace config tabs

Issues

Observations (no action required)

Summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

frankbria commented May 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 11, 2026 •

edited

Loading

claude Bot commented May 11, 2026 •

edited

Loading

Architecture: `_WORKSPACE_CONFIG_FILENAME` is in the wrong layer

Silent gate-name drops in `_load_proof_config`

`workspace_root` is writeable via API but has no server-side effect

Minor: `Proof9DefaultsTab` Save does not guard against all-gates-disabled

Minor: duplicated `test_workspace` fixture

New: Pydantic `ValidationError` not caught by `ValueError` in GET endpoints