Slice 1 — Mock Call, Mock CRM (feature 001) by brettheap · Pull Request #1 · opensoft/openCloser

brettheap · 2026-05-20T15:11:00Z

Summary

First end-to-end product slice for openCloser: a single ALF prospect queue record moves through eligibility → mock call → scripted persona conversation → normalized result → local persistence → mock CRM write-back, entirely fixture-driven. No SignalWire, no Dataverse, no UI, no live LLM — those are deferred to later slices. Proves the five FR-033 module boundaries hold before real telephony/CRM swap in.

Built via the Spec Kit workflow (specify → clarify → plan → checklist → tasks → analyze → implement), then hardened by a 6-agent code review and a full 3-wave remediation.

Spec/plan/contracts/checklists under specs/001-mock-call-mock-crm/ — 36 FRs, 9 SCs, 5 module contracts, 15 quality checklists, project constitution.
Implementation — src/opencloser/ across the 5 boundary modules (eligibility, transport, persona, crm, core) + state/artifacts/CLI. All 75 tasks in tasks.md complete.
Review remediation — a multi-agent review surfaced 78 findings (6 CRITICAL, 13 HIGH, rest MEDIUM/LOW); all 78 are fixed across 5 commits. The branch went from 8 failing tests + 2 broken CI gates to fully green.

What's in scope (Slice 1)

One queue record per CLI invocation; SQLite local state; fixture-driven mock transport + scripted persona.
6 FR-004 eligibility rules, 11 dispositions, FR-019/020/021 idempotency + conflicting-event audit, FR-031/032 per-disposition write-back, deterministic JSON artifacts.

Out of scope (later slices)

SignalWire, Dataverse, Pipecat, admin UI, live LLM persona, multi-worker scaling, batch processing — all explicitly deferred per the constitution.

Quality gates

Gate	Result
Tests	183 passed, 0 failed
Coverage	97.26% (floor 90%)
ruff lint + format	clean
Review findings	78/78 remediated

Notable remediation fixes

FR-020 conflicting-event audit channel — orchestrator was returning on the first terminal event; rewritten to drain the full stream.
FR-036 persona rules 6/7/10 — were unreachable behind an over-eager escalation; intent classifier broadened and escalation now defers to the owning rule.
Hardening — state DB owner-only file modes, transport-payload key allowlisting, event-loop idempotency atomicity, partial-session-on-exception fix.
All 5 boundary Protocols made @runtime_checkable; CLI test coverage added to clear the gate.

Test plan

uv sync && uv run pytest — 183 pass, coverage ≥ 90%
uv run ruff check . && uv run ruff format --check . — clean
uv run opencloser init-state && uv run opencloser load-queue-item --file tests/fixtures/queue_items/alf-prospect-001.json && uv run opencloser run-one --queue-item-id alf-prospect-001 --conversation-fixture tests/fixtures/conversations/interested_callback_requested.json --transport-fixture tests/fixtures/transport_events/connected.json — produces 5 artifacts, disposition interested_callback_requested
Walk specs/001-mock-call-mock-crm/quickstart.md §1–8

🤖 Generated with Claude Code

First thin MVP slice spec proving the local end-to-end product loop: eligibility → mock call transport → ALF appointment-setter persona → normalized result → mock CRM write-back, with JSON artifact export and idempotent duplicate-event handling. Includes Constitution Alignment for all five principles, four prioritized user stories, 27 functional requirements, 9 measurable success criteria, and a requirements quality checklist (all items passing, no NEEDS CLARIFICATION markers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds plan/research/data-model/tasks/quickstart, 5 module contracts, 14 domain checklists, and the project constitution. Integrates 12 cross-artifact analyzer findings; pipeline is now analyzer-clean (0 CRITICAL/HIGH/MEDIUM) and ready for /speckit.implement. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ice 1 Encodes 21 decisions from the post-walk Q&A (open-questions.md): pins canonical disclosure wording, allowed/disallowed claim categories, PHI data classes, DNC trigger phrases, refusal_topics enum, FR-035 trigger conditions, call window semantics, summary format, Mock Call Event payload sub-schemas, and Task assigned_to field. Adds DST edge case. Updates all 5 contracts with a syntax note and the pinned disclosure validator. Plan gains a Slice 2 forward-looking carry-overs section. All 531 checklist items now ticked. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds pyproject.toml (uv + Typer + Pydantic v2 + pytest + ruff per research.md), config/slice1.toml with default call window / max attempts / default timezone / artifact dir / persona version, .gitignore anchored to repo root for /artifacts/ and /state/, minimal README pointing to quickstart, and the empty src/opencloser/{core,eligibility,transport, persona,crm,state,artifacts}/ + tests/{unit,integration,fixtures/...}/ package skeleton. Ticks T001-T007 in tasks.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…er + tests (T008-T016) T008: Pydantic v2 entity models — CallableStatus / Disposition / HumanReviewReason + 4 supporting enums + 14 entity classes including the new Q19 assigned_to field on TaskPayload, the Q15 EventType + FailureReason payload enums, and the Q9 RefusalTopic enum. Mutual-exclusion validators on NormalizedResult and Extraction; task-kind validator on TaskPayload. T009: state/schema.sql — 12 SQLite tables + indexes + CHECK constraints + schema_meta seed; PRAGMAs (WAL, FK on, sync NORMAL) applied at connection time by store.connect. T010: state/store.py — connect/transaction/init_schema + per-table CRUD for queue_items, sessions, eligibility_decisions, mock_call_events, idempotency_keys (try_record), conflicting_event_audit_records, and the three write-back tables + normalized_results. T011: core/ids.py — UUID4-prefixed ID generators for session, mock provider call, decision, task, audit. T012: core/clock.py — Clock protocol + SystemClock + FrozenClock with the canonical ISO 8601 UTC ms formatter per FR-014. T013: core/idempotency.py — IdempotencyKey dataclass + compute_key + record_or_skip (UNIQUE-violation = duplicate no-op per FR-019). T014: core/config.py — tomllib loader + OPENCLOSER_* env-var overrides returning a validated SliceConfig. T015: artifacts/writer.py — write_session_artifacts emits the 6 per-session files (session-result, writeback, task, transcript, eligibility-decision, conflicting-events) with deterministic 2-space-indented sorted-keys UTF-8/LF JSON via tempfile + os.replace for atomicity. Supports SC-005 byte-identity across reruns. T016: tests/unit/{test_models, test_state_store, test_idempotency, test_artifacts_writer}.py + tests/conftest.py — coverage of every public function, every Pydantic validator, the FK cascade, the UNIQUE constraint, and the SC-005 byte-identity property. All 24 Python files pass ast.parse. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The demoable Slice 1 MVP. 22 tasks across 8 modules + 7 test files + 3 fixtures, ~3,100 lines: T017-T020: Protocol-based ABCs for the 4 boundary modules (eligibility, transport, persona, crm-writeback) per FR-033. T021-T022: BuiltinEligibilityEvaluator applies all 6 FR-004 rules in canonical order with no short-circuit; default-tz fallback per Edge Cases. Unit tests cover every rule, multi-rule failure, and the inclusive 20:00 boundary. T023-T024: FixtureDrivenTransport reads JSON fixtures and yields events verbatim (duplicates handed through; orchestrator dedups). Unknown event_types are skipped per Edge Case. T025-T029: Persona module — extraction (FR-034 rule-based), escalation (FR-035 9-code dispatch), disposition_rules (FR-036 10-rule precedence, first-match-wins), ALFAppointmentSetterPersona with disclosure validator (Q2 canonical string exact-match). Tests cover every FR-036 rule + the disclosure validator + determinism. T030-T031: MockWriteBackAdapter persists payloads to SQLite + maintains per-session WriteBack aggregate. emit_task belt-and-suspenders per FR-018 silently drops tasks for the 5 excluded dispositions. Test covers all five. T032-T033: Orchestrator wires the full loop. Block branch (FR-005 always creates a `blocked` session, queue-status-only writeback). Allow branch (place_call → in_flight → event stream with FR-019 idempotency-key dedup → persona on `connected` → FR-031 emission + FR-032 new_status → artifacts). FR-020 conflicting-late-event audit channel via `conflicting_events.json`. Unit tests cover happy path + block + unknown queue-item-id. T034: Typer CLI — init-state, load-queue-item, run-one. FR-027 output surface with wall_time_ms per Q1 tightening. T035-T037: alf-prospect-001 queue fixture, interested_callback_requested conversation fixture (using the Q2 canonical disclosure), connected transport fixture. T038: US1 Story 1 integration test asserts all (a)-(f) of the spec's Independent Test bullet against real state-store + real artifact writer + frozen clock. All 38 Python files (24 src + 14 tests) and 3 JSON fixtures parse cleanly. Phase 3 ticked in tasks.md. Slice 1 is now demoable end-to-end on the happy path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…rage, polish (T039-T075) Phase 4 — US2 block-path (T039-T045): 5 queue-item fixtures covering each FR-004 blocking condition (DNC, after-hours, max-attempts, missing-phone, callable_status!='ready'). Parametrized integration test asserts the spec's Story 2 Independent Test (a)-(e) per condition + a multi-rule failure case (a/d/e/f). Phase 5 — US3 paths + idempotency (T046-T056): 6 transport fixtures (no_answer, voicemail, failed, duplicate_connected, duplicate_callback, conflicting_failed_after_completed). Orchestrator-level idempotency unit tests (test_idempotency_orchestrator.py) + integration test (test_us3_paths.py) covering FR-019 duplicate no-ops, FR-020 audit channel, FR-021 exactly-once attempt-count increment. Phase 6 — US4 disposition coverage (T057-T066): 9 conversation fixtures exercising every connected disposition (interested_email_captured, Q5 email+callback, needs_human_review variants, do_not_call mid-call, wrong_number, not_interested, call_back_later, script_truncated). Parametrized integration test asserts FR-014 fields, FR-031 write-back shape, and FR-032 new_status per disposition. Phase 7 — Polish (T067-T075): - T067: tests/test_imports.py — SC-009 dep-direction lint walks the AST of every src module and enforces contracts/*.md allowed-import rules - T068: tests/integration/test_sc001_budget.py — parametrized over every conversation fixture; asserts wall_time_ms < 60_000 - T069: tests/integration/test_sc005_determinism.py — two-run byte- identity check (modulo runtime IDs) - T070: tests/integration/test_sc006_no_false_activity.py — no_answer /voicemail/failed phone_call_activity summaries don't claim connected-flow content - T071: 90% coverage floor already wired in pyproject.toml - T072: tests/integration/test_constitution_sync.py — verifies the authored constitution file is present + lists the 5 principles + references the right FRs/SCs - T073: README already authored in Phase 1 - T074: ruff config already wired (run via `uv run ruff check .`) - T075: SC008_REVIEW.md documents the plan-time forward-compat review of each Slice 1 → Slice 2 substitution All 75 tasks now ticked. 51 Python files + 23 JSON fixtures parse cleanly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Multi-agent review (swarm + external) found the branch failing 8 of its own tests with a broken FR-020 audit channel and 3 unreachable FR-036 disposition rules. This commit fixes every CRITICAL/P1 finding. Orchestrator (core/orchestrator.py): - C1/P1-1: _run_allowed_session no longer returns on the first terminal event. The event loop drains the full stream; conflicting late events now reach the FR-020 audit branch. Restores conflicting-event auditing. - H1/P2-2: each event-loop iteration's idempotency-key INSERT + state mutation now run in one store.transaction() — no crash window that records a key without persisting the event. - P2-1/N4: _finalize_session writes normalized-result + CRM rows + artifacts BEFORE flipping session.state=FINALIZED (last commit), so a write-back/artifact failure can't leave a finalized session with no operator evidence. - P1-5/N3: transport_fixture_id is validated before any session row is created; a missing conversation fixture at `connected` finalizes the session cleanly as `failed` instead of raising mid-stream. Persona (persona/{extraction,escalation,alf_appointment_setter}.py): - C2: _classify_intent broadened — email-volunteering and affirmative- engagement phrases now classify as INTERESTED instead of falling to UNCERTAIN. - C3/C4/C5: escalation defers — uncertain_role/uncertain_intent return None when a captured email (FR-036 rules 6/7) or a signal-starved script (rule 10) already owns the conversation. Safety escalations (PHI, legal, etc.) are never suppressed. The 10-rule precedence in contracts/persona.md now holds literally; rules 6/7/10 are reachable. - N2: "we'll get back" / "i'm busy right now" / "not the best time" removed from the ambiguous-DNC set — spec FR-010/Q5 says these route to call_back_later, not needs_human_review. Tests: - P1-4/H2: test_orchestrator_happy.py — mkdir before fixture write. - test_sc005_determinism.py — same mkdir-ordering bug class. - C6: new test_us3_duplicate_callback_event_does_not_emit_second_task exercises the previously-unused duplicate_callback_requested fixture (Story 3 AS5 / FR-019 callback idempotency). Also commits uv.lock for reproducible installs. Full suite: 8 failing -> 0 failing, 127 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Wave 2 of the multi-agent review remediation. Core / state / models: - N5: state DB + directory chmod'd to 0o600/0o700 (best-effort) so local CRM data is owner-only; chmod failure is logged, not fatal. - N6: MockCallEvent gains a payload-key allowlist validator — per event type, only the Q15-schema keys survive; arbitrary keys are dropped before they can reach conflicting-events.json (FR-024 "no secrets"). - H3: EligibilityDecision + ExportedEligibilityDecision gain a default_tz_applied bool; schema, DAO, evaluator, orchestrator, and contracts/eligibility.md all updated so a default-timezone substitution is visible in the persisted decision (spec Edge Case). Transport: - H4: unknown event types now logged via logging.warning before skip (spec Edge Case "MUST log it"); misleading comment corrected; the unit test asserts the warning via caplog. Persona: - H5: the FR-035 escalation reason-code priority order is now documented in escalation.py's docstring and contracts/persona.md. - H6: over-broad "i handle " decision-maker phrase narrowed to four role-specific phrases — no more "I handle the laundry" false positive. Integration tests: - H7: all 9 FR-035 reason codes now pinned — 3 new conversation fixtures (ambiguous_dnc, non_clinical_topic_escalation, uncertain_intent) + a parametrized test. - H8: new duplicate_conflicting_failed fixture + test proving FR-019 beats FR-020 on overlap (a redelivered late event adds no 2nd audit row). - H9/H10: test_us2_blocked now asserts queue_status_updates has exactly 1 row (FR-029) and task_payloads has 0 rows (FR-018) for blocked runs. Full suite: 127 -> 137 passed, 0 failed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Wave 3 of the multi-agent review remediation — the long tail across all 6 modules. ~57 MEDIUM/LOW findings plus N7 (CLI coverage gate). Cross-cutting: - All 5 boundary Protocols (EligibilityEvaluator, CallTransport, Persona, WriteBackAdapter) are now @runtime_checkable — closes the spec Q23 "ABCs are the runtime enforcement" gap; isinstance smoke tests added. Eligibility: validate the configured default_timezone (a bogus default now fails rule b instead of silently passing); drop the datetime placeholder import; re-export from __init__; +5 tests (start boundary, DST, invalid-default-tz, envelope shape). Transport: document the session_id-carries-call-id rewrite contract + one-shot event_stream behavior; contracts/transport.md reconciled to the 5-field runtime MockCallEvent; +10 tests (Q15 payload pass-through, all 6 event types). Persona: narrow over-broad _WRONG_NUMBER_PHRASES; tighten hour_re so a bare number isn't a callback window; populate all 7 RefusalTopic enum values; drop dead code in disposition rule 10; disclosure validator is now byte-exact (no .strip()); +1 Q8 edge-case test. CRM: emit_task belt-and-suspenders now fails closed; contract recap gains assigned_to; +6 tests (protocol isinstance, assigned_to default, FR-029 double-emit, Q5 carve-out, fail-closed branches). Core/models: TaskPayload validator now forbids reason_code on callback tasks and captured_email on review tasks (FR-030); insert_normalized_ result gated by a NORMALIZED_RESULT idempotency key; schema FK gains DEFERRABLE INITIALLY DEFERRED; _resolve_final_disposition dead branch removed; conflicting-event fallback asserts instead of masking; new test_config.py (5 tests); spec.md FR-033 signatures reconciled to the contracts. Integration + N7: new test_cli.py takes cli.py 0%->100% and config.py 0%->100% — overall coverage 87.7%->97.26%, clearing the configured --cov-fail-under=90 gate. SC-006 asserts the structured disposition field; SC-005 covers transcript + conflicting-events artifacts; SC-001 covers terminal + blocked paths; SC-009 lint has an explicit top-level carve-out; multi-rule blocking test exercises 5 rules; constitution- sync test checks canonical order + Traceability section; SC008_REVIEW field counts corrected. Full suite: 137 -> 183 passed, 0 failed. Coverage 97.26%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The branch failed its own configured `ruff` gate (pyproject.toml [tool.ruff]) with 26 errors. This commit clears them so `ruff check` and `ruff format --check` both pass. - 19 auto-fixed: quoted-annotation (UP037), unsorted-imports (I001), unused-noqa (RUF100), unused-import (F401), deprecated-import (UP035). - 20 files reformatted by `ruff format`. - 8 manual fixes: 2 collapsible-if (SIM102) in orchestrator.py + test_imports.py; 1 unused unpacked variable (RUF059) in the orchestrator finalize call; 5 ambiguous EN DASH characters (RUF002/ RUF003) in docstrings/comments replaced with ASCII hyphens. No behavior change. Full suite: 183 passed, coverage 97.26%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The multi-rule-failure test's docstring and inline comments claimed rule (b) "cannot fail by design". That became inaccurate when the eligibility evaluator was hardened to fail rule (b) on a bogus configured default_timezone. The test's behavior and assertions are unchanged and correct (it uses a valid tz, so (b) passes here) — only the explanatory prose needed to match the current evaluator + contracts/eligibility.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

sourcery-ai

Sorry @brettheap, your pull request is larger than the review limit of 500000 diff characters

Copilot

Pull request overview

Implements Slice 1 (“Mock Call, Mock CRM”) end-to-end for openCloser: a single queued ALF prospect flows through eligibility → mock transport events → scripted persona → normalized result → artifact export → mock CRM persistence, all fixture-driven, plus accompanying specs/contracts/checklists and a comprehensive test suite.

Changes:

Added Slice 1 runtime modules (core/orchestrator, eligibility, transport, persona, CRM adapter, artifacts writer) plus CLI/config loader and ID/idempotency utilities.
Added fixture catalog (queue items, conversation scripts, transport event streams) and unit/integration tests covering determinism, idempotency, performance budget, CLI, and module-boundary import rules.
Added Spec Kit artifacts: constitution, contracts, checklists, SC-008 review, and git/worktree helper scripts.

Reviewed changes

Copilot reviewed 139 out of 148 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
tests/unit/test_idempotency.py	Unit coverage for FR-019 idempotency-key computation and duplicate-detection persistence.
tests/unit/test_config.py	Unit coverage for TOML + env-var config loading and env int coercion.
tests/unit/init.py	Unit test package marker.
tests/test_imports.py	AST-based dependency-direction lint enforcing FR-033 boundary import rules.
tests/integration/test_us1_happy_path.py	Story 1 end-to-end acceptance test for the happy path and artifact presence.
tests/integration/test_sc006_no_false_activity.py	Integration guardrail for SC-006 (no false “connected-call” activity on terminal paths).
tests/integration/test_sc005_determinism.py	Integration test for SC-005 deterministic artifacts across reruns.
tests/integration/test_sc001_budget.py	Integration budget tests ensuring fixture runs complete under SC-001 timing constraints.
tests/integration/test_constitution_sync.py	Integration smoke check keeping constitution headings/traceability in sync.
tests/integration/test_cli.py	Integration tests for Typer CLI subcommands and exit-code behavior.
tests/integration/init.py	Integration test package marker.
tests/fixtures/transport_events/voicemail.json	Transport fixture for voicemail terminal event path.
tests/fixtures/transport_events/no_answer.json	Transport fixture for no-answer terminal event path.
tests/fixtures/transport_events/failed.json	Transport fixture for failed terminal event path.
tests/fixtures/transport_events/duplicate_connected.json	Transport fixture covering duplicate connected/completed events.
tests/fixtures/transport_events/duplicate_conflicting_failed.json	Transport fixture covering duplicated conflicting late failed events.
tests/fixtures/transport_events/duplicate_callback_requested.json	Transport fixture covering duplicate callback_requested events.
tests/fixtures/transport_events/connected.json	Transport fixture for connected → completed happy path.
tests/fixtures/transport_events/conflicting_failed_after_completed.json	Transport fixture for FR-020 conflicting late event audit path.
tests/fixtures/transport_events/.gitkeep	Keeps the transport fixtures directory in version control.
tests/fixtures/queue_items/alf-prospect-not-ready.json	Queue-item fixture for eligibility blocking via callable_status.
tests/fixtures/queue_items/alf-prospect-missing-phone.json	Queue-item fixture for eligibility blocking via missing phone.
tests/fixtures/queue_items/alf-prospect-max-attempts.json	Queue-item fixture for eligibility blocking via max attempts.
tests/fixtures/queue_items/alf-prospect-dnc.json	Queue-item fixture for eligibility blocking via DNC.
tests/fixtures/queue_items/alf-prospect-after-hours.json	Queue-item fixture for eligibility blocking via call-window.
tests/fixtures/queue_items/alf-prospect-001.json	Primary queue-item fixture for happy-path runs.
tests/fixtures/queue_items/.gitkeep	Keeps the queue-item fixtures directory in version control.
tests/fixtures/conversations/wrong_number.json	Persona conversation fixture for wrong_number disposition.
tests/fixtures/conversations/uncertain_intent.json	Persona conversation fixture for needs_human_review (uncertain intent).
tests/fixtures/conversations/script_truncated.json	Persona conversation fixture for script_truncated escalation/disposition.
tests/fixtures/conversations/not_interested.json	Persona conversation fixture for not_interested disposition.
tests/fixtures/conversations/non_clinical_topic_escalation.json	Persona fixture for non_clinical_topic_escalation reason.
tests/fixtures/conversations/needs_human_review_uncertain_role.json	Persona fixture for uncertain_role escalation.
tests/fixtures/conversations/needs_human_review_email_invalid.json	Persona fixture for FR-036 rule 7 (unverified email, no callback).
tests/fixtures/conversations/interested_email_captured.json	Persona fixture for interested_email_captured disposition.
tests/fixtures/conversations/interested_email_and_callback.json	Persona fixture for Q5 “email + callback” carve-out behavior.
tests/fixtures/conversations/interested_callback_requested.json	Persona fixture for interested_callback_requested happy path.
tests/fixtures/conversations/do_not_call_mid_call.json	Persona fixture for dnc_stated → do_not_call behavior.
tests/fixtures/conversations/call_back_later.json	Persona fixture for call_back_later disposition.
tests/fixtures/conversations/ambiguous_dnc.json	Persona fixture for ambiguous_dnc escalation path.
tests/fixtures/conversations/.gitkeep	Keeps the conversation fixtures directory in version control.
tests/fixtures/init.py	Test fixtures package marker.
tests/conftest.py	Shared pytest fixtures (SQLite temp DB, frozen clock, temp artifact dir).
tests/init.py	Tests package marker.
src/opencloser/transport/mock.py	Fixture-driven mock transport yielding scripted call events.
src/opencloser/transport/base.py	CallTransport protocol defining the transport boundary surface.
src/opencloser/transport/init.py	Transport boundary package marker.
src/opencloser/state/init.py	State boundary package marker.
src/opencloser/persona/escalation.py	Pure escalation-reason derivation per FR-035 with deterministic priority.
src/opencloser/persona/disposition_rules.py	Deterministic disposition precedence rules per FR-036.
src/opencloser/persona/base.py	Persona protocol and shared dataclasses for persona boundary I/O.
src/opencloser/persona/alf_appointment_setter.py	Slice 1 deterministic persona implementation (disclosure, extraction, disposition).
src/opencloser/persona/init.py	Persona boundary package marker.
src/opencloser/eligibility/evaluator.py	Builtin eligibility evaluator implementing FR-004 rules and timezone fallback.
src/opencloser/eligibility/init.py	Eligibility boundary protocol + evaluator export.
src/opencloser/crm/mock.py	Mock write-back adapter persisting payloads to SQLite and building writeback aggregate.
src/opencloser/crm/base.py	WriteBackAdapter protocol defining CRM boundary surface.
src/opencloser/crm/init.py	CRM boundary package marker.
src/opencloser/core/ids.py	UUID-based ID generators for session/call/decision/task/audit identifiers.
src/opencloser/core/idempotency.py	FR-019 idempotency-key helper functions integrated with the state store.
src/opencloser/core/config.py	TOML + env override configuration loader returning validated SliceConfig.
src/opencloser/core/clock.py	Clock protocol + System/Frozen implementations and UTC-ms formatting.
src/opencloser/core/init.py	Core boundary package marker.
src/opencloser/core/orchestrator.py	Orchestrates Slice 1 session lifecycle, event handling, persistence, and artifact export.
src/opencloser/cli.py	Typer CLI entrypoint (init-state, load-queue-item, run-one).
src/opencloser/artifacts/writer.py	Deterministic, atomic artifact writer for JSON/text outputs.
src/opencloser/artifacts/init.py	Artifacts module package marker.
src/opencloser/init.py	Package metadata/version for opencloser.
specs/001-mock-call-mock-crm/SC008_REVIEW.md	Plan-time forward-compat review for Slice 2 substitutions (SC-008).
specs/001-mock-call-mock-crm/contracts/transport.md	Transport boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/orchestrator.md	Orchestrator boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/eligibility.md	Eligibility boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/crm-writeback.md	CRM write-back boundary contract documentation.
specs/001-mock-call-mock-crm/checklists/transport.md	Requirements quality checklist for transport boundary.
specs/001-mock-call-mock-crm/checklists/tests-strategy.md	Quality checklist for testing strategy and fixture catalog.
specs/001-mock-call-mock-crm/checklists/requirements.md	Specification quality checklist.
specs/001-mock-call-mock-crm/checklists/plan-quality.md	Plan-phase artifact quality checklist.
specs/001-mock-call-mock-crm/checklists/eligibility.md	Requirements quality checklist for eligibility boundary.
specs/001-mock-call-mock-crm/checklists/contracts.md	Quality checklist for boundary contract documents.
specs/001-mock-call-mock-crm/checklists/cli.md	Requirements quality checklist for CLI/operator surface.
README.md	Repository-level overview, links to Slice 1 spec artifacts, and bootstrap commands.
QWEN.md	Agent context and Spec Kit workflow rules.
KIMI.md	Agent context and Spec Kit workflow rules.
GEMINI.md	Agent context and Spec Kit workflow rules.
CLAUDE.md	Agent context and Spec Kit workflow rules + active plan link.
AGENTS.md	Agent context and Spec Kit workflow rules.
.github/copilot-instructions.md	Repository Copilot instructions referencing constitution/plan workflow.
pyproject.toml	Python packaging, dependencies, ruff configuration, and pytest/coverage gates.
config/slice1.toml	Default Slice 1 configuration values (call window, eligibility, artifacts, persona, state DB).
.gitignore	Ignores runtime artifacts/state DB files and Python/tooling outputs.
.specify/memory/constitution.md	Repository constitution (binding principles) derived from Slice 1 spec alignment.
.specify/feature.json	Declares current feature directory for Spec Kit.
.specify/extensions.yml	Spec Kit extensions configuration enabling git hooks.
.specify/shell/select-worktree.sh	Helper script to select/list Spec Kit worktrees.
.specify/shell/ct.zsh	Zsh helper sourcing worktree tools.
.specify/extensions/git/README.md	Documentation for bundled Spec Kit git extension and configuration.
.specify/extensions/git/extension.yml	Git extension metadata, hooks, and configuration schema.
.specify/extensions/git/git-config.yml	Repo’s git extension configuration (branch numbering, worktree root, auto-commit toggles).
.specify/extensions/git/config-template.yml	Template for git extension configuration.
.specify/extensions/git/commands/speckit.git.validate.md	Spec Kit command doc: validate feature branch naming.
.specify/extensions/git/commands/speckit.git.remote.md	Spec Kit command doc: detect remote URL.
.specify/extensions/git/commands/speckit.git.initialize.md	Spec Kit command doc: initialize git repo.
.specify/extensions/git/commands/speckit.git.feature.md	Spec Kit command doc: create feature checkout/worktree.
.specify/extensions/git/commands/speckit.git.commit.md	Spec Kit command doc: auto-commit hook behavior.
.specify/extensions/git/scripts/bash/initialize-repo.sh	Bash implementation of git init hook.
.specify/extensions/git/scripts/bash/git-common.sh	Bash shared functions for git detection and branch validation.
.specify/extensions/git/scripts/bash/auto-commit.sh	Bash auto-commit hook implementation.
.specify/extensions/git/scripts/powershell/initialize-repo.ps1	PowerShell implementation of git init hook.
.specify/extensions/git/scripts/powershell/git-common.ps1	PowerShell shared functions for git detection and branch validation.
.specify/extensions/git/scripts/powershell/auto-commit.ps1	PowerShell auto-commit hook implementation.
.specify/extensions/git/scripts/powershell/git-common.ps1	PowerShell shared functions for branch validation and git detection.
.agents/skills/speckit-git-feature/SKILL.md	Agent skill doc mirroring the git feature-checkout command behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+import json
+import sqlite3
+from pathlib import Path


+    )
+
+
+_ = sqlite3  # keep import for state-level callers; CLI itself doesn't use it directly


@@ -0,0 +1 @@
+"""Interaction Core / Orchestrator — FR-033 module boundary #5."""


+## Status
+
+Slice 1 spec, plan, tasks, contracts, and checklists are complete. Implementation is in progress per `tasks.md`.
+


+**Module boundary**: FR-033, principle #5 (write-back side)
+**Implementation**: `src/opencloser/crm/base.py` (interface) + `src/opencloser/crm/mock.py` (Slice 1 mock)
+**Owns**: payload assembly per FR-028 / FR-029 / FR-030, FR-031 per-disposition mapping, FR-032 queue-status mapping
+**MUST NOT contain**: persona logic, eligibility logic, transport-event interpretation, session lifecycle management


+
+mapfile -t WORKTREE_LINES < <(find "$WORKTREE_ROOT" -maxdepth 1 -mindepth 1 -type d -printf '%T@|%f|%p\n' 2>/dev/null | sort -t'|' -k1,1nr)
+if [ "${#WORKTREE_LINES[@]}" -eq 0 ]; then


+def process_one_queue_item(
+    queue_item_id: str,
+    *,
+    conn: sqlite3.Connection,
+    config: SliceConfig,
+    eligibility: BuiltinEligibilityEvaluator,
+    transport: FixtureDrivenTransport,
+    persona: ALFAppointmentSetterPersona,
+    conversation_fixture: ConversationFixture | None,
+    transport_fixture_id: str | None,
+    clock: Clock | None = None,


+    """FR-005 block path: queue-status update only, no Phone Call activity, no task."""
+    crm = MockWriteBackAdapter(conn)
+    new_status = queue_item.callable_status  # FR-032: blocked → unchanged


Address all 8 Copilot review comments on PR #1: - Inject the WriteBackAdapter into process_one_queue_item instead of constructing MockWriteBackAdapter inside core, and type the eligibility/ transport/persona/crm params against their boundary Protocols rather than the concrete Slice 1 classes — decouples core from the mock per SC-008. - Add build_writeback to the WriteBackAdapter Protocol + crm-writeback contract so the orchestrator depends only on the documented surface. - Reconcile the crm-writeback "Owns" line: the adapter persists payloads and assembles the WriteBack aggregate; FR-031/FR-032 decisions stay in the orchestrator. - Correct the core docstrings (FR-033 boundary #1, not #5). - Drop the dead sqlite3 import + keepalive from cli.py. - Update the stale README status line (Slice 1 implementation is complete). - Make select-worktree.sh portable: replace GNU `find -printf` + `mapfile` with a stat-based approach that also runs on macOS (BSD find, Bash 3.2). 183 tests passing, 97.25% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 4 comments.

+def _imported_names(tree: ast.AST) -> set[str]:
+    names: set[str] = set()
+    for node in ast.walk(tree):
+        if isinstance(node, ast.Import):
+            for alias in node.names:
+                names.add(alias.name)
+        elif isinstance(node, ast.ImportFrom) and node.module:
+            names.add(node.module)
+    return names


+        try:
+            report = process_one_queue_item(
+                queue_item_id,
+                conn=conn,
+                config=config,
+                eligibility=BuiltinEligibilityEvaluator(),
+                transport=FixtureDrivenTransport(transport_dir),
+                persona=ALFAppointmentSetterPersona(),
+                crm=MockWriteBackAdapter(conn),
+                conversation_fixture=conversation,
+                transport_fixture_id=transport_fixture_id,
+            )
+        except QueueItemNotFound as exc:
+            typer.echo(f"error:       queue_item_id not found: {exc}", err=True)
+            raise typer.Exit(code=2) from None
+


+        try:
+            report = process_one_queue_item(
+                queue_item_id,
+                conn=conn,
+                config=config,
+                eligibility=BuiltinEligibilityEvaluator(),
+                transport=FixtureDrivenTransport(transport_dir),
+                persona=ALFAppointmentSetterPersona(),
+                crm=MockWriteBackAdapter(conn),
+                conversation_fixture=conversation,
+                transport_fixture_id=transport_fixture_id,
+            )


+    "crm": {
+        "opencloser.models",
+        "opencloser.state",  # adapter persists; allowed
+        "opencloser.core",
+    },


Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 3 comments.

+    def _resolve_fixture_path(self, fixture_id: str) -> Path:
+        # Accept either a bare fixture_id ("no_answer") or a full filename ("no_answer.json").
+        if fixture_id.endswith(".json"):
+            return self._fixtures_dir / fixture_id
+        return self._fixtures_dir / f"{fixture_id}.json"


+def _is_within_call_window(*, tz_name: str, start_str: str, end_str: str, clock: Clock) -> bool:
+    """Q12 — `[start, end]` both ends inclusive at minute resolution; applies all 7 days (Q11)."""
+    try:
+        local = clock.now_local(tz_name)
+    except (ZoneInfoNotFoundError, ValueError):
+        return False
+    start = _parse_hhmm(start_str)
+    end = _parse_hhmm(end_str)
+    now_local_time = time(local.hour, local.minute)
+    return start <= now_local_time <= end
+
+
+def _parse_hhmm(value: str) -> time:
+    hours, minutes = value.split(":", 1)
+    return time(int(hours), int(minutes))


+Walks the AST of every `src/opencloser/*.py` and asserts the FR-033 boundary's
+dependency-allowed rules from `contracts/*.md` hold. Per boundary:


Address the 4 findings from Copilot's re-review of b18a679: - run-one now catches ValueError from process_one_queue_item (e.g. an eligible call invoked without --transport-fixture) and surfaces it as a clean `error:` line + exit code 2 instead of an uncaught traceback. - run-one validates config persona.version against the available persona and fails fast with a clear error when they disagree, instead of silently producing artifacts whose persona_version contradicts config. - The SC-009 dependency-direction lint now rejects relative imports: `_collect_imports` inspects node.level so a cross-boundary `from ..persona import X` can no longer bypass the gate. - Dropped opencloser.core from the crm allowlist + lint docstring so the gate matches crm-writeback.md, which forbids core for the write-back boundary (no crm module imports core today). Adds two CLI tests for the new error paths. 185 tests passing, 97.35% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 3 comments.

+    start = _parse_hhmm(start_str)
+    end = _parse_hhmm(end_str)
+    now_local_time = time(local.hour, local.minute)
+    return start <= now_local_time <= end
+
+
+def _parse_hhmm(value: str) -> time:
+    hours, minutes = value.split(":", 1)
+    return time(int(hours), int(minutes))


+    def place_call(self, queue_item: QueueItem, fixture_id: str) -> str:
+        """FR-007: assign a globally-unique mock_provider_call_id and stash the fixture path."""
+        del queue_item  # unused; the fixture itself carries the queue_item_ref
+        fixture_path = self._resolve_fixture_path(fixture_id)
+        if not fixture_path.exists():


+## Dependencies allowed
+
+- `opencloser.models` (for QueueItem and event shapes)
+- stdlib `json`, `uuid`, `pathlib`
+


…ation Address the 5 findings from Copilot's review of c950e69 / b18a679: - FixtureDrivenTransport._resolve_fixture_path now rejects fixture ids containing path separators or parent refs, so a fixture_id cannot escape the fixtures directory (path traversal). - CallWindowConfig validates that start/end are in-range HH:MM times. The field pattern only checked the digit shape, so a value like "99:99" would pass config load and then crash eligibility evaluation when _parse_hhmm built a datetime.time — now it fails fast at load. - transport.md "Dependencies allowed" now lists opencloser.core: the Slice 1 transport legitimately imports core.ids, and the lint already permits it — the contract was the stale side. - Corrected the place_call comment (the mock transport is queue-item- agnostic; the fixture does not carry a queue_item_ref) and the test_imports docstring (it walks src/opencloser/**/*.py via rglob). Adds tests for the traversal guard and the HH:MM range validator. 190 tests passing, 97.37% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 2 comments.

+        # Locate transport fixtures dir + fixture id.
+        if transport_fixture is not None:
+            transport_dir = transport_fixture.parent
+            transport_fixture_id = transport_fixture.stem
+        else:
+            transport_dir = Path("tests/fixtures/transport_events")
+            transport_fixture_id = None
+
+        # Load conversation fixture if provided.
+        conversation = (
+            _load_conversation_fixture(conversation_fixture) if conversation_fixture else None
+        )
+
+        try:
+            report = process_one_queue_item(
+                queue_item_id,
+                conn=conn,
+                config=config,
+                eligibility=BuiltinEligibilityEvaluator(),
+                transport=FixtureDrivenTransport(transport_dir),
+                persona=persona,
+                crm=MockWriteBackAdapter(conn),
+                conversation_fixture=conversation,
+                transport_fixture_id=transport_fixture_id,
+            )
+        except QueueItemNotFound as exc:
+            typer.echo(f"error:       queue_item_id not found: {exc}", err=True)
+            raise typer.Exit(code=2) from None
+        except ValueError as exc:
+            # process_one_queue_item raises ValueError for bad operator input
+            # (e.g. an allowed call with no --transport-fixture).
+            typer.echo(f"error:       {exc}", err=True)
+            raise typer.Exit(code=2) from None


+    "eligibility": {
+        "opencloser.models",
+        "opencloser.core",
+    },
+    "transport": {
+        "opencloser.models",
+        "opencloser.core",
+    },
+    "persona": {
+        "opencloser.models",
+        "opencloser.core",
+        "opencloser.persona",  # internal submodules
+    },
+    "crm": {
+        "opencloser.models",
+        "opencloser.state",  # adapter persists; allowed
+    },
+    "state": {
+        "opencloser.models",
+    },
+    "artifacts": {
+        "opencloser.models",
+    },


Address the 2 findings from Copilot's review of 6426c8b: - run-one now catches OSError as well as ValueError around fixture loading and the orchestrator call, and the conversation-fixture load moved inside that try. A typoed/unreadable --transport-fixture or --conversation-fixture path, or invalid fixture JSON, now produces a clean `error:` line + exit code 2 instead of an uncaught traceback (FileNotFoundError is an OSError; JSONDecodeError is a ValueError). - The dependency-direction allowlist now includes each boundary's own prefix (transport/crm/eligibility/state/artifacts), matching what core/persona already had. An intra-boundary import such as opencloser.transport.mock importing opencloser.transport.base is not a cross-boundary violation and must not be flagged. Adds a CLI test for the missing-fixture-path error path. 191 tests passing, 97.37% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

brettheap · 2026-05-21T03:19:07Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0063fc26e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+    if not read_back:
+        return False
+    confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
+    return any(_any_in(turn.text.lower(), confirmations) for turn in contact_turns[-3:])


Require post-readback confirmation before verifying email

_was_email_read_back_and_confirmed treats any confirmation phrase in the last three contact turns as sufficient, but it never checks that the confirmation happened after the persona read back the email. If a contact says "yes" earlier in the conversation for an unrelated prompt and never confirms after readback, this still marks the address as verified, which can incorrectly populate captured_email and drive the wrong disposition/writebacks instead of the unverified-email path required by FR-014.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+    if day and hour:
+        return text[day.start() : hour.end()].strip()


Handle callback windows when time appears before weekday

When both a day token and an hour token are found, _capture_window_fragment always slices text[day.start():hour.end()]. This assumes weekday appears before time; for inputs like "2 PM Thursday works" (time first), day.start() is greater than hour.end(), producing an empty string and losing the callback window signal. That degrades task payload quality and violates the intent to preserve the contact’s callback wording.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+    if not read_back:
+        return False
+    confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
+    return any(_any_in(turn.text.lower(), confirmations) for turn in contact_turns[-3:])


Match confirmation tokens as words, not substrings

Email verification currently uses substring matching via _any_in, so unrelated words like "yesterday" satisfy the "yes" confirmation token and can mark an email as verified even when no explicit confirmation was given. In conversations where the persona read back an email, this can incorrectly set captured_email (verified) instead of captured_email_unverified, affecting downstream disposition and CRM payloads.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+    """
+    path = Path(db_path)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    _restrict_permissions(path.parent, 0o700)


Avoid chmod on CWD when DB path has no parent

connect() always runs _restrict_permissions(path.parent, 0o700), but for a bare filename like state.db, path.parent is .. That causes the process to chmod the current working directory to 0700, unexpectedly changing permissions for the entire checkout/worktree instead of only the state directory, which can break other tooling and collaborator access.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+fi
+
+# Trim whitespace and validate description is not empty
+FEATURE_DESCRIPTION=$(echo "$FEATURE_DESCRIPTION" | xargs)


Replace xargs-based trimming for feature descriptions

Using xargs to trim FEATURE_DESCRIPTION makes branch creation fail on valid descriptions containing quotes/apostrophes (for example Bob's feature), because xargs treats quotes as syntax and exits with an unmatched-quote error. This turns a common input into a hard failure in speckit.git.feature and blocks feature-worktree creation.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T03:26:37Z

+    persona_turns: Sequence[ConversationTurn],
+    contact_turns: Sequence[ConversationTurn],
+) -> bool:
+    read_back = any(candidate in turn.text for turn in persona_turns)


Compare read-back email case-insensitively

Read-back detection uses candidate in turn.text with exact casing, so a persona readback like amy@example.com will not match a captured Amy@Example.com candidate and the email is marked unverified despite explicit confirmation. Because this path drives captured_email vs captured_email_unverified, case-only differences can incorrectly change disposition and writeback content.

Useful? React with 👍 / 👎.

Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 5 comments.

+    report = process_one_queue_item(
+        "alf-prospect-001",
+        conn=tmp_state_db,
+        config=_config(tmp_artifact_dir, tmp_path / "db"),
+        eligibility=BuiltinEligibilityEvaluator(),


+    report = process_one_queue_item(
+        "alf-prospect-001",
+        conn=tmp_state_db,
+        config=_config(tmp_artifact_dir, tmp_path / "db"),
+        eligibility=BuiltinEligibilityEvaluator(),


+    report = process_one_queue_item(
+        "alf-prospect-001",
+        conn=tmp_state_db,
+        config=_config(tmp_artifact_dir, tmp_path / "db"),
+        eligibility=BuiltinEligibilityEvaluator(),


+    report = process_one_queue_item(
+        "alf-prospect-001",
+        conn=tmp_state_db,
+        config=_config(tmp_artifact_dir, tmp_path / "db"),
+        eligibility=BuiltinEligibilityEvaluator(),


+    report = process_one_queue_item(
+        qi.queue_item_id,
+        conn=tmp_state_db,
+        config=_config(tmp_artifact_dir, tmp_path / "db"),
+        eligibility=BuiltinEligibilityEvaluator(),


Codex review of 0063fc2 (6 findings): - extraction: email is verified only when a contact confirmation comes AFTER the persona read it back — a stray earlier "yes" no longer verifies the address (P1). Read-back match is now case-insensitive. - extraction: confirmation tokens are matched as whole words, so "yesterday" no longer satisfies the "yes" token (P2). - extraction: _capture_window_fragment captures the callback window regardless of day/time order, so "2 PM Thursday" is no longer lost (P2). - store.connect: skip the directory chmod when db_path is a bare filename (parent "."), so it never chmods the working directory (P2). - create-new-feature.sh: trim the feature description with parameter expansion instead of xargs, which choked on apostrophes/quotes (P2). Copilot review of 0063fc2 (1 finding): - tests: use a real DB filename (slice1.db) instead of the extensionless tmp_path / "db" for the unused config.state.db value. Adds 5 tests covering the after-readback confirmation, word-boundary matching, case-insensitive read-back, time-before-day callback window, and the bare-filename connect path. 196 tests passing, 97.39% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

brettheap · 2026-05-21T03:50:00Z

@codex review
@copilot review

Copilot · 2026-05-21T03:51:19Z

@codex review

Reviewed the branch at 5e15f5b; there are no new actionable code-change requests in this thread, so no additional commit was needed.

@codex review
@copilot review

Reviewed the branch at 5e15f5b; there are no new actionable code-change requests in this thread, so no additional commit was needed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e15f5bf16

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T04:01:40Z

+    if path.parent != Path("."):
+        _restrict_permissions(path.parent, 0o700)


Avoid chmod on arbitrary parent directories

This unconditionally chmods path.parent for any path whose parent is not literal .. If state.db is configured as an absolute path in an existing directory (for example, the repo root or another shared location), the code will tighten that directory to 0700, which can unexpectedly break other processes/users that rely on prior permissions. The permission hardening should be limited to a dedicated state directory the app creates/owns, not any existing parent directory.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:01:40Z

+    except sqlite3.IntegrityError:
+        return False


Differentiate duplicate keys from other integrity errors

Catching all sqlite3.IntegrityError and returning False treats every constraint violation as a benign duplicate. The idempotency_keys insert is also subject to foreign-key and CHECK constraints, so non-duplicate data integrity failures can be silently downgraded to no-ops, masking real corruption/ordering bugs and causing skipped writes without surfacing an error. Only unique-key conflicts should map to False; other integrity failures should be raised.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:01:40Z

+    event_id: str
+    event_type: EventType
+    received_at: UtcMs
+    payload: dict[str, Any] = Field(default_factory=dict)


Enforce per-event payload value types

The model only allowlists payload keys but never validates payload value types, so malformed events like failed with {"failure_reason": "carrier_err"} or voicemail with a non-integer length are accepted and persisted/exported as if valid. This violates the Q15 payload schema contract and can break downstream logic that assumes enum/type-correct write-back and artifact data.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:01:40Z

+WORKTREE_ROOT=$(resolve_worktree_root "$REPO_ROOT")
+LATEST_WORKTREE=""
+if [ -d "$WORKTREE_ROOT" ]; then
+    LATEST_WORKTREE=$(find "$WORKTREE_ROOT" -maxdepth 1 -mindepth 1 -type d -printf '%T@ %p\n' 2>/dev/null | sort -nr | awk 'NR == 1 {print $2}')


Replace GNU-only find flag in worktree fallback

The fallback path discovery uses find ... -printf, which is GNU-specific; BSD/macOS find does not support this flag, so when the state file is missing the script can fail to detect existing worktrees and incorrectly exit with “no Speckit worktree handoff” even though directories are present. This breaks ct/ctp recovery behavior on non-GNU environments.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:01:40Z

+        fixture_path = self._pending.pop(mock_provider_call_id, None)
+        if fixture_path is None:
+            raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
+        data = json.loads(fixture_path.read_text(encoding="utf-8"))


Validate transport fixture JSON before mutating session state

event_stream() parses fixture JSON only after the orchestrator has already set the session to in_flight and incremented attempt_count; if the fixture file is malformed JSON, json.loads raises and the CLI reports a user-input error, but the database is left with a partially progressed session and consumed attempt. This creates inconsistent state from recoverable operator input errors and makes reruns non-idempotent at the workflow level.

Useful? React with 👍 / 👎.

Addresses 3 of the 5 Codex findings on 5e15f5b. Findings 3 (payload value-typing) and 5 (transport fixture-validation timing) are deferred as deliberate Slice-2 scope. - store.connect: harden the state directory only when this call creates it. The previous `!= "."` guard still chmod-ed any pre-existing parent (e.g. a shared dir an absolute db_path points into); now a pre-existing directory is left untouched and only a freshly-created one gets 0o700. - store.try_record_idempotency_key: only a PRIMARY KEY / UNIQUE conflict is the FR-019 duplicate no-op. Other sqlite3.IntegrityError causes (foreign-key, CHECK, NOT NULL) now re-raise instead of being silently swallowed as duplicates. - get-last-worktree.sh: replace the GNU-only `find -printf` newest- worktree discovery with a portable stat-based scan (works on BSD/macOS). Adds tests for the created-vs-pre-existing state dir and the non-duplicate IntegrityError re-raise. 198 tests passing, 97.40% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

brettheap · 2026-05-21T19:10:55Z

@codex review

Copilot

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 4 comments.

+        for raw_event in data.get("events", []):
+            event_type_str = raw_event["type"]
+            try:
+                event_type = EventType(event_type_str)
+            except ValueError:
+                # Edge Case "Mock transport emits an unknown event type": the system
+                # MUST log it, MUST NOT mutate state, MUST NOT crash. The transport
+                # skips the event (the orchestrator never sees it), so the transport
+                # itself is responsible for the log.
+                logger.warning(
+                    "transport fixture %s: unknown event type %r (event_id=%s) — skipping",
+                    fixture_path.name,
+                    event_type_str,
+                    raw_event.get("event_id"),
+                )
+                continue
+            yield MockCallEvent(
+                session_id=mock_provider_call_id,  # orchestrator rewrites to session_id on insert
+                event_id=raw_event["event_id"],
+                event_type=event_type,
+                received_at=raw_event["timestamp"],
+                payload=raw_event.get("payload", {}),
+            )


+def _python_files(root: Path) -> Iterable[Path]:
+    for p in root.rglob("*.py"):
+        if p.name == "__init__.py":
+            # Allow flexible re-exports in package roots; they typically just import for typing.
+            continue
+        yield p


+        if group not in _ALLOWED:
+            continue
+        allowed = _ALLOWED[group]


+        for imp in imported:
+            if not imp.startswith("opencloser."):
+                continue  # stdlib + third-party are unrestricted
+            if not any(imp == ok or imp.startswith(ok + ".") for ok in allowed):
+                violations.append(f"{path.relative_to(_SRC)}: imports {imp} (group={group})")


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92f3b96384

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T19:33:03Z

+        allowed = _ALLOWED_PAYLOAD_KEYS.get(self.event_type, frozenset())
+        filtered = {k: v for k, v in self.payload.items() if k in allowed}
+        if len(filtered) != len(self.payload):


Validate mock-event payload values per event type

The validator only strips unknown payload keys, but it never validates value types/enums for known keys. As a result, a fixture like {"type":"failed","payload":{"failure_reason":"bogus"}} or {"type":"voicemail","payload":{"voicemail_length_seconds":"abc"}} is accepted and persisted, producing artifacts that violate the declared transport contract and can break downstream consumers expecting the Q15 schema.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T19:33:03Z

+    confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
+    return any(
+        turn.role == "contact" and _contains_word(turn.text.lower(), confirmations)
+        for turn in turns[read_back_idx + 1 :]


Tie email verification to an explicit email confirmation

Email verification is currently satisfied by any later contact utterance containing tokens like "yes"/"correct", even if that response is to an unrelated question. In conversations where the persona reads back an email and then asks something else, this can incorrectly upgrade an unverified email to captured_email, which then propagates into normalized results and callback task payloads.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T19:33:03Z

+            raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
+        data = json.loads(fixture_path.read_text(encoding="utf-8"))
+        for raw_event in data.get("events", []):
+            event_type_str = raw_event["type"]


Raise ValueError for malformed transport event objects

The transport parser indexes required keys directly (type, event_id, timestamp), so a malformed fixture entry raises KeyError mid-stream. run-one only converts ValueError/OSError into clean operator errors, so this currently escapes as an unhandled traceback instead of returning the intended CLI error path for bad fixture input.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T19:33:03Z

+
+def _load_conversation_fixture(path: Path) -> ConversationFixture:
+    raw = json.loads(path.read_text(encoding="utf-8"))
+    turns = [ConversationTurn(role=t["role"], text=t["text"]) for t in raw.get("turns", [])]


Guard conversation turn parsing against missing keys

Conversation turns are parsed with t["role"] and t["text"] without validation, so a malformed fixture turn triggers KeyError. This is not caught by the command’s current (ValueError, OSError) handler, causing run-one to crash with a traceback instead of emitting a controlled user-facing input error.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T19:33:03Z

+        if fixture_path is None:
+            raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
+        data = json.loads(fixture_path.read_text(encoding="utf-8"))
+        for raw_event in data.get("events", []):


Reject transport fixtures missing the events array

Using data.get("events", []) silently treats a malformed fixture with no events key as an empty stream, which drives the session to a synthesized failed disposition instead of surfacing invalid operator input. That can produce misleading CRM/task artifacts from bad fixtures rather than failing fast with a user-correctable error.

Useful? React with 👍 / 👎.

Resolves all 9 findings from the Codex and Copilot reviews of 92f3b96. Transport / CLI robustness — malformed operator input fails cleanly: - FixtureDrivenTransport.event_stream raises ValueError for a fixture with no `events` array or an event missing type/event_id/timestamp, instead of a bare KeyError escaping as a traceback. - cli._load_conversation_fixture validates the fixture is a JSON object and each turn carries role/text, raising ValueError (caught by run-one as a clean operator error + exit 2). Correctness: - MockCallEvent now type-checks payload values: voicemail_length_seconds must be int, failure_reason must be a FailureReason member, window_hint a string (None allowed as "not provided"). - Email verification is tied to the contact's DIRECT response to the read-back — a later "yes" to an unrelated question no longer verifies. SC-009 dependency-lint hardening (test_imports.py): - `__init__.py` files are now parsed — a forbidden import can't hide in a package root. - An unknown boundary group fails the lint instead of being skipped. - `from opencloser import X` is expanded to `opencloser.X`; bare `import opencloser` is flagged — neither can bypass the gate. Adds 5 tests. 203 passing, 97.37% coverage, ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b63a8f2433

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-22T00:29:57Z

+        if not isinstance(data, dict) or "events" not in data:
+            raise ValueError(
+                f"transport fixture {fixture_path.name!r} is not a JSON object with an 'events' array"
+            )
+        for raw_event in data["events"]:


Validate events is a list before iterating

event_stream() only checks that the fixture object has an events key, then iterates data["events"] directly. If a malformed fixture sets "events": null (or another non-list type), this raises TypeError instead of the intended user-input ValueError, and run-one does not catch TypeError, so the CLI can crash with a traceback rather than returning a clean operator error.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-22T00:29:57Z

+    if not isinstance(raw, dict):
+        raise ValueError(f"conversation fixture {path.name!r} is not a JSON object")
+    turns: list[ConversationTurn] = []
+    for t in raw.get("turns", []):


Enforce turns array type in conversation fixtures

_load_conversation_fixture() iterates raw.get("turns", []) without validating that turns is actually an array. A malformed fixture like {"turns": null} raises TypeError at iteration time, which bypasses the command’s (ValueError, OSError) handling and surfaces as an unhandled traceback instead of a user-facing input error.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-22T00:29:57Z

+    if not isinstance(raw, dict):
+        raise ValueError(f"conversation fixture {path.name!r} is not a JSON object")
+    turns: list[ConversationTurn] = []
+    for t in raw.get("turns", []):


Reject fixtures that omit the turns array

When turns is missing, _load_conversation_fixture() falls back to [] via raw.get("turns", []), so malformed input is treated as an empty conversation instead of invalid input. In run-one, that can silently drive a synthetic needs_human_review outcome and artifact writes from a bad fixture, which is misleading for operators and harder to diagnose than a direct validation error.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-22T00:29:57Z

+            raise ValueError(
+                f"conversation fixture {path.name!r}: every turn needs 'role' and 'text'"
+            )
+        turns.append(ConversationTurn(role=t["role"], text=t["text"]))


Validate conversation turn roles against allowed enum

_load_conversation_fixture() only checks that each turn has role/text keys, but it never enforces role to be one of "persona" or "contact" (the contract shape). A typo like "Contact" or an unknown role is accepted and then silently ignored by downstream extraction/classification filters, producing incorrect dispositions and artifacts instead of a clear operator-facing fixture error.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-22T00:29:57Z

+            raise ValueError(
+                f"conversation fixture {path.name!r}: every turn needs 'role' and 'text'"
+            )
+        turns.append(ConversationTurn(role=t["role"], text=t["text"]))


Require turn text to be a string at fixture load

_load_conversation_fixture() does not validate the type of text, so non-string values (e.g., null or numbers) pass through into ConversationTurn. Later persona/extraction paths call string methods like .strip()/.lower() on text; with non-strings this can raise uncaught runtime exceptions and crash run-one instead of returning a controlled input-validation error.

Useful? React with 👍 / 👎.

Reconcile divergent main after PR #1 (Slice 1 — Mock Call, Mock CRM) was squash-merged on origin. Conflict resolution: - Kept local: constitution.md (formal v1.0.0), AGENTS/CLAUDE/GEMINI/KIMI/ QWEN.md, .github/copilot-instructions.md (newer scaffold content). - Kept origin: create-new-feature.sh, get-last-worktree.sh, select-worktree.sh (carry PR review fixes); executable bit re-applied. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1. Owner bind respects entity-set (Copilot #1): _resolve_task_owner now returns (owner_id, entity_set) and _owner_bind targets /teams(<id>) when the resolved owner verified as a team. A team override would otherwise POST under /systemusers and 400 against real Dataverse. 2. Active-enabled gate + default validation (Copilot #2): systemuser lookup now filters by isdisabled=false (teams have no such column). The configured default owner is verified too; if neither override nor default resolves to an active enabled user/team, Task emission is blocked with an operator-visible task_owner_default_unverifiable warning rather than writing an unverified id (FR-025 — never write an unverified owner). 3. Simplified --write flag (Copilot #3): dropped the misleading --dry-run alias on run-crm. The CLI now declares only --write and the help/error message explicitly point to US2 / T024-T026 for the dry-run wiring. Tests: - Contract test seeds the default owners as active enabled systemusers and adds: disabled-systemuser override falls back, team override binds to /teams, and default-owner-unverifiable blocks emission. - Integration test seeds both default owners as active enabled systemusers so the US1 happy/alternate/blocked paths verify under the stricter gate. 271 / 273 pass (28 in this PR's new suite; the 2 remaining failures are the pre-existing constitution-sync regressions inherited from fab3ebd). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot #1 + #2 — vacuous integration tests The original tests called validate_fixture / place_call directly and asserted on DB state, but neither touches the state store, so the "no session row / no attempt" assertions were tautologically true. To make them meaningful, drive process_one_queue_item with a malformed fixture and prove the orchestrator writes nothing — which required moving transport.place_call before session-row creation (the FR-014 allowed exception for fixture pre-validation). - src/opencloser/core/orchestrator.py: place_call now runs after eligibility + the transport_fixture_id None-check but BEFORE session insert. The session is inserted with mock_provider_call_id already set (was: None, backfilled later). _run_allowed_session receives the pre-placed mock_call_id and skips its own place_call; transport_fixture_id is no longer threaded into the helper. - tests/integration/test_us5_malformed_fixture.py: rewritten to drive process_one_queue_item end-to-end with each FR-020 rejection class. Asserts zero sessions, zero eligibility_decisions, zero mock_call_events, zero idempotency_keys, attempt_count == 0, callable_status unchanged, last_decision_at null. Also adds a CLI-handler smoke test asserting MalformedFixtureError surfaces as a ValueError through the full orchestrator stack. Copilot #3 — TOCTOU between place_call and event_stream Before: place_call validated the file; event_stream re-read it later. A fixture mutating on disk between the two calls could re-introduce a mid-stream failure. - src/opencloser/transport/mock.py: validate_fixture now returns the validated events list. place_call stashes (fixture_name, events) in _pending; event_stream consumes the in-memory snapshot instead of re-reading the file. The CallTransport public surface (place_call's signature and event_stream's externally observable behavior) is unchanged; only the internal implementation moves to an in-memory snapshot. - tests/unit/test_transport_fixture_validation.py: two new tests prove the snapshot behavior — event_stream yields the captured events even if the fixture is overwritten with garbage or deleted between place_call and event_stream. Updated the two accept tests to assert on the new return value. Sourcery #1 — error-type contract test - tests/unit/test_transport_fixture_validation.py: new test_malformed_fixture_error_is_a_value_error_subclass locks the ValueError-subclass contract that the Slice 1 CLI's (ValueError, OSError) handler depends on. Verification: uv run pytest → 270 passed, 2 pre-existing constitution_sync failures unrelated to this work. Ruff lint + format clean.

* [Spec Kit] Slice 2 implement — Phase 7 (T035-T036): US5 fixture pre-validation Adds FR-019/FR-020 transport-fixture structural pre-validation, resolving GitHub issue #2: a malformed mock-transport fixture is now rejected before any session row, attempt increment, or Dataverse queue change. T035 — src/opencloser/transport/mock.py (the FR-014 allowed exception): - New MalformedFixtureError (ValueError subclass — preserves the Slice 1 CLI's clean "error:" + exit-2 contract for bad operator input). - New module-level validate_fixture(fixture_path): rejects missing file, invalid JSON, missing/non-list 'events', non-dict events, and events missing type/event_id/timestamp. Structurally valid but semantically inconsistent fixtures pass (spec §Edge Cases). - place_call calls validate_fixture before allocating a mock_provider_call_id, so the orchestrator cannot record a call id or transition the session to in_flight for a malformed fixture. T036 — tests: - tests/unit/test_transport_fixture_validation.py: validate_fixture and place_call rejection cases, plus accept-cases for empty events and semantically inconsistent (out-of-order) event sequences. - tests/integration/test_us5_malformed_fixture.py: SC-006 — for each FR-020 rejection class (invalid JSON / no events / missing identity / missing file), assert no session row, no mock_call_event row, no attempt-count idempotency key, queue_item.attempt_count unchanged through both the runner-style validate_fixture gate and the transport's place_call gate. - tests/unit/test_transport_mock.py: updated three pre-existing tests whose semantics shifted from event_stream-time / FileNotFoundError to place_call-time / MalformedFixtureError. Verification: full pytest suite — 270 passed, 2 pre-existing constitution_sync failures unrelated to this phase. * Fix PR #5 review findings (Copilot x3 + Sourcery x1) Copilot #1 + #2 — vacuous integration tests The original tests called validate_fixture / place_call directly and asserted on DB state, but neither touches the state store, so the "no session row / no attempt" assertions were tautologically true. To make them meaningful, drive process_one_queue_item with a malformed fixture and prove the orchestrator writes nothing — which required moving transport.place_call before session-row creation (the FR-014 allowed exception for fixture pre-validation). - src/opencloser/core/orchestrator.py: place_call now runs after eligibility + the transport_fixture_id None-check but BEFORE session insert. The session is inserted with mock_provider_call_id already set (was: None, backfilled later). _run_allowed_session receives the pre-placed mock_call_id and skips its own place_call; transport_fixture_id is no longer threaded into the helper. - tests/integration/test_us5_malformed_fixture.py: rewritten to drive process_one_queue_item end-to-end with each FR-020 rejection class. Asserts zero sessions, zero eligibility_decisions, zero mock_call_events, zero idempotency_keys, attempt_count == 0, callable_status unchanged, last_decision_at null. Also adds a CLI-handler smoke test asserting MalformedFixtureError surfaces as a ValueError through the full orchestrator stack. Copilot #3 — TOCTOU between place_call and event_stream Before: place_call validated the file; event_stream re-read it later. A fixture mutating on disk between the two calls could re-introduce a mid-stream failure. - src/opencloser/transport/mock.py: validate_fixture now returns the validated events list. place_call stashes (fixture_name, events) in _pending; event_stream consumes the in-memory snapshot instead of re-reading the file. The CallTransport public surface (place_call's signature and event_stream's externally observable behavior) is unchanged; only the internal implementation moves to an in-memory snapshot. - tests/unit/test_transport_fixture_validation.py: two new tests prove the snapshot behavior — event_stream yields the captured events even if the fixture is overwritten with garbage or deleted between place_call and event_stream. Updated the two accept tests to assert on the new return value. Sourcery #1 — error-type contract test - tests/unit/test_transport_fixture_validation.py: new test_malformed_fixture_error_is_a_value_error_subclass locks the ValueError-subclass contract that the Slice 1 CLI's (ValueError, OSError) handler depends on. Verification: uv run pytest → 270 passed, 2 pre-existing constitution_sync failures unrelated to this work. Ruff lint + format clean. * Address PR #5 review round 2 (Codex P1+P2 + Copilot) Codex P1 — restore "session row before call attempt" ordering The previous commit moved transport.place_call before session-row creation. That works for the mock transport (place_call has no real side effect) but weakens the architecture for any future real transport: a DB failure after place_call but before insert_session would leave a placed call with no audit row and no idempotency context. Revert that reorder and add a side-effect-free hook on the transport that the orchestrator can call *before* DB writes without dialing anything: - src/opencloser/transport/base.py: extend the CallTransport protocol with pre_validate_fixture(fixture_id) — explicitly side-effect-free, raises MalformedFixtureError. FR-014's allowed exception covers adding this hook for FR-019/FR-020. - src/opencloser/transport/mock.py: implement pre_validate_fixture on FixtureDrivenTransport as `validate_fixture(self._resolve_fixture_path(...))`. No call-id allocation, no _pending mutation. - src/opencloser/core/orchestrator.py: process_one_queue_item now calls transport.pre_validate_fixture(transport_fixture_id) before session creation; place_call moves back to _run_allowed_session (its original location after session insert). The internal place_call still re-runs validate_fixture as defense in depth (idempotent with the pre-validate hook). - tests/unit/test_transport_fixture_validation.py: new tests for pre_validate_fixture — happy path, all FR-020 rejection classes, no call-id allocation, no _pending mutation, and the path-traversal guard. Codex P2 + Copilot — normalize all read/decode failures to MalformedFixtureError validate_fixture previously only caught JSONDecodeError. Non-UTF-8 bytes (UnicodeDecodeError), directories (IsADirectoryError), permission errors, and broken symlinks would bubble through as different exception types, bypassing the FR-020 single-class rejection contract. - src/opencloser/transport/mock.py: switch the existence check from path.exists() to path.is_file() (covers directories, broken symlinks, special files). Catch (OSError, UnicodeDecodeError) around read_text and re-raise as MalformedFixtureError (chained with `from exc` for diagnostics). - tests/unit/test_transport_fixture_validation.py: three new tests covering a directory at the fixture path, non-UTF-8 bytes, and an unreadable chmod-0 file (skipped when the test process can still read it, e.g. root). Verification: uv run pytest → 280 passed, 2 pre-existing constitution_sync failures unrelated to this work. Ruff lint + format clean. Stale bot comments (TOCTOU on mock.py / ValueError-subclass test) were already addressed in c5dee38; the bots re-posted them on the new commit anyway. * Address PR #5 review round 3 (Copilot) Copilot (suppressed, low-confidence — but real edge case): non-string event 'type' validate_fixture didn't enforce that 'type' is a string. A fixture with {"type": ["list"], ...} would pass pre-validation, then EventType(["list"]) in event_stream would raise TypeError (not ValueError) mid-stream, escape the unknown-type handler, escape the orchestrator, and bypass the CLI's (ValueError, OSError) handler — defeating the FR-020 single-class rejection contract. - src/opencloser/transport/mock.py: validate_fixture now requires isinstance(raw_event["type"], str) and raises MalformedFixtureError otherwise (with the offending event_id + actual type-name in the message for operator triage). - tests/unit/test_transport_fixture_validation.py: new parametrized test test_validate_fixture_rejects_non_string_event_type covers list, dict, int, null, and bool 'type' values. Copilot: stale-docstring sync - tests/integration/test_us5_malformed_fixture.py: the inline docstring on test_us5_orchestrator_rejects_malformed_fixture_with_no_state_mutations still credited "the orchestrator's pre-session place_call gate"; the actual gate is now `transport.pre_validate_fixture(...)`. Updated to match — including the regression-detection language so anyone tracing a future failure here lands on the right hook. Copilot: tasks.md / FR-014 wording - specs/002-mock-call-real-crm/tasks.md: T035 used to claim mock.py was "the only permitted change to the transport module (FR-014)". This PR also adds CallTransport.pre_validate_fixture in transport/base.py and wires it from core/orchestrator.py. Rewrote the task description to name all three files explicitly and note they jointly compose the FR-014 fixture-pre-validation exception (no other Slice-2 behavior added to those modules). Stale repeats (no action): - Copilot mock.py:88 (OSError on read_text): already fixed in 7f447e3 (is_file() + (OSError, UnicodeDecodeError) catch around read_text). Bot is pointing at the inner json.loads try and re-posting the previous round's text. Verification: uv run pytest → 285 passed, 2 pre-existing constitution_sync failures unrelated. Ruff lint + format clean. * Address PR #5 review round 4 (Copilot) Copilot (190bead, mock.py:112): validate event_id + timestamp shape too validate_fixture previously only checked that 'type' was a string. A fixture with a non-string event_id or a timestamp that doesn't match the canonical UtcMs pattern would pass pre-validation, then trigger pydantic.ValidationError when event_stream constructs MockCallEvent — *after* the orchestrator has already created the session row and incremented the attempt count. That reintroduces the partial-state problem FR-019/FR-020 are meant to prevent. - src/opencloser/transport/mock.py: validate_fixture now requires isinstance(raw_event["event_id"], str) and validates raw_event["timestamp"] against the canonical UtcMs schema via pydantic.TypeAdapter(UtcMs). Reusing the TypeAdapter keeps a single source of truth for the YYYY-MM-DDTHH:MM:SS.mmmZ pattern instead of duplicating the regex. - tests/unit/test_transport_fixture_validation.py: two new parametrized test classes — 5 non-string event_id shapes and 7 invalid-timestamp shapes (missing ms, space-separator, offset-not-Z, gibberish, empty, non-string, null) — plus an acceptance test for the canonical UtcMs format. Verification: uv run pytest → 298 passed, 2 pre-existing constitution_sync failures unrelated. Ruff lint + format clean. --------- Co-authored-by: brettheap <brett.heap@users.noreply.github.com>

* [Spec Kit] Slice 2 implement — Phase 3 (T017-T023): US1 MVP Stack on PR #3. Wires the end-to-end write-enabled loop against the in-process Dataverse fake — the constitution's CRM-first principle made operational. - T018: DataverseWriteBackAdapter — emit_phone_call_activity / emit_queue_status_update / emit_task / build_writeback against the Dataverse Web API, with FR-024 idempotency pre-query, FR-003 approved-update-field gating, FR-025 task-owner resolution (default + verified override + fallback warning), and crm_correlations / writeback_progress ledger writes. - T019: slice2/runner.py write-enabled coordinator — readiness gate, mapping approval check, verify() metadata gate, DataverseQueueLoader, stages the queue row locally, calls the unchanged process_one_queue_item (FR-014), stamps writeback_progress.run_status, maps to exit_status per cli-slice2.md. - T020: discover-crm CLI command — refreshes the mapping artifact via discover() and rewrites config/dataverse_mapping.json with approved=false. - T021: run-crm CLI command — explicit --write flag; reports exit_status / session_id / final_disposition / warnings. - T022: FR-034 non-E.164 phone warning — recorded into CrmRunReport and appended to queue.last_error via the adapter, never changing exit status. - T017: contract test — adapter satisfies the unchanged WriteBackAdapter Protocol and produces the right Dataverse interactions for every one of the 11 final dispositions (SC-011). - T023: US1 integration tests — happy path, interested_email_captured, needs_human_review, do_not_call, blocked, non-E.164 warning, and empty-queue no-op (SC-001 / SC-003 / SC-004 / SC-008). ruff + pytest pass (270 tests; pre-existing failures in test_constitution_sync.py inherited from fab3ebd are unrelated to this PR). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 review findings (Copilot) 1. Owner bind respects entity-set (Copilot #1): _resolve_task_owner now returns (owner_id, entity_set) and _owner_bind targets /teams(<id>) when the resolved owner verified as a team. A team override would otherwise POST under /systemusers and 400 against real Dataverse. 2. Active-enabled gate + default validation (Copilot #2): systemuser lookup now filters by isdisabled=false (teams have no such column). The configured default owner is verified too; if neither override nor default resolves to an active enabled user/team, Task emission is blocked with an operator-visible task_owner_default_unverifiable warning rather than writing an unverified id (FR-025 — never write an unverified owner). 3. Simplified --write flag (Copilot #3): dropped the misleading --dry-run alias on run-crm. The CLI now declares only --write and the help/error message explicitly point to US2 / T024-T026 for the dry-run wiring. Tests: - Contract test seeds the default owners as active enabled systemusers and adds: disabled-systemuser override falls back, team override binds to /teams, and default-owner-unverifiable blocks emission. - Integration test seeds both default owners as active enabled systemusers so the US1 happy/alternate/blocked paths verify under the stricter gate. 271 / 273 pass (28 in this PR's new suite; the 2 remaining failures are the pre-existing constitution-sync regressions inherited from fab3ebd). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 review findings (Sourcery + Codex + Copilot round 2) P1 — Use Dataverse EntitySetName for record API paths (Codex). The adapter now hits /phonecalls, /tasks, /medx_callqueueitems, /systemusers, /teams via a `_entity_set_name` helper (logical_name + "s" with override hooks for irregular plurals). Previously hitting /phonecall etc. would 404 against real Dataverse even when metadata verification succeeded. P1 — Drop scheduledend free-form-text write (Codex). preferred_callback_window is intentionally free-form ("Thursday afternoon"); writing to Dataverse `scheduledend` (Edm.DateTimeOffset) 400s on non-ISO input. The window now surfaces in the task description only. P1 — Normalize verify() failures into structured CrmRunReport (Codex). `run_one_crm_item` catches `DataverseError`/`MetadataError` from the readiness `verify()` call and returns exit_status=blocked instead of bubbling an unhandled exception to the CLI. P1 — Validate non-empty OData-EntityId after POST (Copilot). An empty header would otherwise lead to a CONFIRMED crm_correlations row with no record id, masking the failure on every later resume/audit. Now raises DataverseWriteBackError. P2 — Use mapped primary_id instead of f"{entity}id" (Sourcery + Codex). The adapter exposes a `_primary_id` helper that prefers the mapping's explicit `entities[...].primary_id` and falls back to the singular-form derivation only when the mapping omits it — important for activity tables whose primary key is `activityid`, not `<logical>id`. P2 — Catch queue-loader failures + Dataverse adapter exceptions in the run coordinator (Codex). `run-crm` now produces operator-visible exit_status=failed reports for QueueLoadError/MappingError/DataverseError /DataverseWriteBackError/missing fixture files instead of unhandled exceptions. P2 — Refresh ALL mutable queue_item fields when staging from Dataverse (Codex). _stage_queue_item now refreshes phone_number/timezone/ attempt_count in addition to callable_status/dnc_flag, so eligibility decisions are made against current Dataverse state rather than stale locally-cached data. State DAO `update_queue_item_status` extended additively with the new optional kwargs. P2 — discover-crm: load scaffold from configured artifact, not --out (Codex). `--out` is now strictly an output override; the scaffold is always read from slice2.toml's mapping_artifact, so `discover-crm --out new-path.json` works on the first run. P2 — `--next-ready` falls back to the configured [run].campaign (Codex). A missing `--campaign` no longer drops the campaign scope; the CLI loads slice2.toml first and uses `slice2_config.run.campaign` as the default. P2 — Active-enabled check (Codex + Copilot reflected). systemuser verification now filters by `isdisabled eq false`; a disabled user is treated as unverifiable. The default-owner verification stays — neither override nor default verifying blocks Task emission. Cleanup — Remove unused ReadinessError (Sourcery); update the runner doc comment that incorrectly described non-E.164 warnings as being written to transition_reason (they go to queue.last_error via the adapter's _compose_last_error). Test fixture changes — The in-process Dataverse fake now auto-aliases each registered entity with its `<name>s` form so the foundation queue_loader (logical names) and the adapter (entity-set names) hit the same data through one fake instance. `_handle_create`/`_handle_patch` derive the singular-form primary id so both URL forms resolve. Contract + integration tests now query fake.created/fake.patched via the entity-set names the adapter actually uses. 271 / 273 pass — only the pre-existing constitution-sync regressions inherited from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-3 review findings (Codex) P1 — Wrap Pydantic ValidationError as MappingError in `load_mapping`. Both the runner and `discover-crm` now see a single error type for "mapping artifact is wrong"; previously a schema-invalid JSON file escaped as an unhandled traceback. P1 — Require explicit `entities[...].primary_id`. The previous fallback derived `<entity_set>-rstrip('s')+"id"`, which silently produces `phonecallid` where Dataverse activity tables expect `activityid`. The adapter now raises `MappingError` if the mapping omits primary_id, pointing the operator at `dataverse_mapping.json`. The test fixture mapping populates primary_id explicitly for queue items, phone calls (phonecallid), and tasks (activityid). P1 — Add explicit `entity_set_name` to DataverseEntityRef. Real Dataverse exposes `EntityDefinition.EntitySetName` which can differ from `logical_name + "s"` for irregular plurals; the adapter prefers the mapping's value and falls back to a curated override map + `+s` only when the field is unset. The override map now covers the common irregular plurals (`opportunity`, `businessunit`, etc.). P1 — Record failed `crm_correlations` rows on POST/PATCH errors. `emit_phone_call_activity` / `emit_queue_status_update` / `emit_task` each now persist a `write_status=FAILED` correlation + stamp the resume ledger with `run_status=blocked` and `last_error` before re-raising. Audit/recovery flows can now distinguish "not attempted" from "attempted and failed" per CRM record kind. P2 — Null-preserving queue refresh. `_stage_queue_item` now uses a new `store.replace_queue_item_mutable_fields` DAO that overwrites every mutable column from the live Dataverse snapshot — null clears included. The old `update_queue_item_status` retains its "`None` means leave it alone" partial-update semantics for callers that want that behavior; the docstring documents the split. Test fixture changes: the in-process Dataverse fake now stamps `activityid` as an alias on activity-table creates (phonecall, task, email, appointment) so a mapping that addresses activities by the canonical `activityid` primary key resolves correctly. Activity test attribute sets include `activityid` so $select / $filter pass through the fake's strict known-field check. 271 / 273 pass; pre-existing constitution-sync failures inherited from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-4 review findings (Copilot) `--next-ready` now requires a non-empty campaign — the CLI checks the effective campaign (`--campaign` or `slice2.toml [run].campaign`) and exits with an actionable error when both are blank. This stops the loader from silently picking items outside the operator's intended campaign while the loader-side filter remains a queue-loader (T014) follow-up tracked in US3. Docstrings on `_derive_entity_set` and `DataverseEntityRef` now match the actual `discover-crm` behavior — it refreshes `_meta` only and does NOT auto-populate `entity_set_name` from `EntityDefinition.EntitySetName`. Operators set the field by hand on PR review; auto-population is tracked as a follow-on enhancement. New `replace_queue_item_mutable_fields` DAO is now covered by three unit tests: full overwrite, null-clearing of nullable columns, and FK-child preservation (sessions remain pinned to the refreshed queue row). These pin the new "always overwrite, null included" semantics that the runner depends on for accurate eligibility decisions. 274 / 276 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-5 review findings (Codex) P1 — Apply the campaign filter in queue_loader's NextReady path (contracts/dataverse-queue-loader.md). When the mapping carries a `queue.campaign` field the loader scopes the query to the selector's campaign. When the mapping omits the field — the current Slice 2 fixture — the loader stays campaign-agnostic and the CLI gate (`--next-ready` requires a non-empty campaign) is the user-facing signal. Two unit tests pin the new behavior: campaign scoping when mapped, campaign-agnostic when unmapped. P2 — Catch MappingError from adapter emit paths in the runner's write-back try/except. The adapter's `_primary_id` raises MappingError when the mapping omits `entities[...].primary_id`; the runner now normalizes that into an operator-visible exit_status=failed report alongside DataverseError/DataverseWriteBackError. 276 / 278 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-6 review findings (Copilot) P1/security — Escape single quotes in OData `$filter` string literals. A new `errors.odata_string_literal(value)` helper wraps a value in single quotes with embedded `'` doubled per OData v4 spec (section 5.1.1.6.1). Used by: - queue_loader.py — `campaign` filter on the NextReady path - adapter.py — idempotency pre-query on every emit_* Without this escape, a campaign like `O'Brien` would terminate the literal and invalidate the whole filter clause; the helper also shuts the door on OData-filter injection for user-supplied values. Three unit tests pin the escape semantics: basic wrap, embedded single-quote doubling, empty string. Docstrings — Two more stale "discover-crm reads EntitySetName" / "queue-loader is campaign-agnostic" remnants corrected. The `_entity_set` docstring now matches reality (`entity_set_name` set by hand on PR review today), and the cli.py comment notes the loader DOES filter by campaign when the mapping carries `queue.campaign`. 279 / 281 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-7 review findings (Codex) P1 — queue_loader uses entity-set names for record queries. The `DataverseQueueLoader.load` path was sending OData record GETs to the table's logical name (e.g. `/medx_callqueueitem`); real Dataverse keys record endpoints by the entity-set name (`/medx_callqueueitems`) and would 404 the request before the row could be loaded. P1 — Facility-name lookup now derives the entity-set form for the mapping's `lookup_target` too. `lookup_target` is a Dataverse logical- name reference (e.g. `account`), so a record query needs the entity-set form (`accounts`); the previous code 404'd against real environments on every queue intake that needed facility hydration. Refactor — Centralized entity-set resolution in `mapping.py` as `resolve_entity_set(mapping, entity_key)` (prefers `entity_set_name`, falls back to `derive_entity_set(logical)`). The override map for known-irregular plurals (`activitypointers`, `opportunities`, `businessunits`, `transactioncurrencies`) moved with it. Adapter + queue_loader now share one source of truth; the duplicated `_derive_entity_set` / `_ENTITY_SET_OVERRIDES` in `adapter.py` are gone. 279 / 281 pass; pre-existing constitution-sync failures from fab3ebd remain. The in-process Dataverse fake's `<name>` ↔ `<name>s` alias keeps the new entity-set queries resolving against the same records the foundation tests register by logical name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-8 review findings (Copilot) Wrap the WHOLE write-back path in the failure-recording try/except for each emit_* method, not just the POST/PATCH itself. The `_resolve_task_owner` / `_lookup_owner_override` / `_owner_entity_set` calls can raise DataverseError before `_idempotent_create` runs; the queue-status `_fetch_queue_last_session` pre-query can too. Those failures previously bypassed `_record_failure`, leaving the audit / resume ledger inconsistent. The exception tuple is now `(DataverseError, DataverseWriteBackError, MappingError)` — the base `DataverseError` covers both `Permanent…` and `Transient…` so any Dataverse access failure in emit_phone_call_activity / emit_queue_status_update / emit_task records a consistent `crm_correlations(write_status=failed)` row plus `writeback_progress(run_status=blocked)`. Fake — pre-allocate one shared records list per registered entity and alias both `<name>` and `<name>s` to it. Previously the constructor only aliased entities that had seed rows in the initial `records` dict, so a POST against an empty `phonecall`/`task` table would create a list only under the entity-set key (`phonecalls`/`tasks`), and a later GET under the singular alias couldn't see the row. The new logic guarantees the alias on every registered entity, with seed rows overlaid into the shared list. 279 / 281 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-9 review findings (Copilot) Validate persona version in the Slice 2 runner. `run_one_crm_item` defaults to `ALFAppointmentSetterPersona()` but never checked that `slice1_config.persona.version` matched the persona's actual version; a misconfigured slice1.toml would silently tag every write-back with the wrong persona version. The runner now fails fast with `exit_status=blocked` on mismatch, mirroring the Slice 1 CLI's `run-one` check. Validate `--queue-item-id` shape at the CLI. `--queue-item-id` is interpolated into OData `$filter` clauses and record URLs downstream; accepting an arbitrary string opened a filter-injection vector for operator-supplied input and could turn typos into 404s from the queue loader. The CLI now validates against the Dataverse GUID shape (8-4-4-4-12 hex with hyphens) and exits 2 with a clear message otherwise. Two new test_cli integration tests pin the new behavior: - `run-crm --queue-item-id <not-a-guid>` exits 2 with the GUID error. - `run-crm` without `--write` still exits 2 with the dry-run pointer. 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-10 review findings (Copilot + Codex) P1 (Codex) — emit_task now SURFACES unverifiable-owner as a failed task write rather than silently returning. The earlier behavior let a disposition that requires a Task (interested_callback_requested, needs_human_review, etc.) complete the run as `completed` while the required Task never wrote, hiding broken human-follow-up handoff. When `_resolve_task_owner` returns None the adapter now raises `DataverseWriteBackError`; the outer try/except records a failed task correlation + stamps `writeback_progress(run_status=blocked)`, and the runner normalizes that into `exit_status=failed`. P1 (Codex) — `_owner_entity_set` no longer swallows every PermanentDataverseError. Only HTTP 404 (the table genuinely absent in this environment) falls through to the next entity set; 401/403 permission regressions and 400 malformed queries propagate so emit_task records a failed task correlation rather than silently degrading to "owner unverifiable → fallback". P1 (Codex) — queue_loader rejects NextReady when `queue.campaign` is missing from the mapping. In a multi-campaign environment a campaign-agnostic query would pick the wrong row; failing fast at queue-load with a clear `QueueLoadError` is the safer default. Existing tests updated to seed the queue.campaign mapping where campaign-scoped selection is expected. Copilot — `_record_failure` no longer downgrades a prior CONFIRMED correlation to FAILED, and no longer flips `*_done` from True to False. A later transient failure on an already-confirmed record (e.g. a duplicate emit retry) preserves the confirmed state and the completed `*_done` flag; only when the prior state was unset/not-done does the failure path mark them. P2 (Codex) — `_ENTITY_SET_OVERRIDES` had the wrong logical-name key (`"currency"` instead of `"transactioncurrency"`). With that key, `derive_entity_set("transactioncurrency")` fell through to the `+s` rule and produced `transactioncurrencys` — invalid Web API path. Fixed. Contract test updated: `test_task_blocked_when_default_owner_unverifiable` now expects `DataverseWriteBackError` AND verifies the failed `crm_correlations` row. 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-11 review finding (Codex) P2 — Catch Pydantic `ValidationError` from queue_loader.load() in the run coordinator. `DataverseQueueLoader.load(...)` constructs a `QueueItem` from the Dataverse row, and a schema-corrupted row (negative `attempt_count`, etc.) surfaces as `pydantic.ValidationError` from that constructor. The runner's loader try/except only caught `MappingError`/`QueueLoadError`/`DataverseError`, so a corrupted row escaped as an unhandled traceback instead of producing the documented operator-visible `exit_status=failed` report. 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-12 review findings (Copilot + Codex) P1 (Codex×3) — Persist `_record_failure` writes outside the orchestrator's rolling-back transaction. The orchestrator wraps every `crm.emit_*` in `with store.transaction(conn)`; when an emit raises, the ROLLBACK undoes ANY rows the adapter wrote — including the crm_correlations(failed) / writeback_progress(blocked) markers the new failure ledger depends on. The adapter now accepts an optional `failure_conn_factory` callable and the runner wires it to a peer SQLite connection on the same WAL db. Each failure write commits independently of the rolling-back transaction so resume/audit can distinguish "not attempted" from "attempted and failed" in production. Copilot — Catch `pydantic.ValidationError` in `discover-crm` and `run-crm` CLI config loads. `load_config` / `load_slice2_config` raise ValidationError (not ValueError) on schema-invalid TOML; the previous handlers crashed with a traceback. Both commands now exit 2 with a clean error message. Copilot — Preserve a terminal `COMPLETED` run_status in `_record_failure`. The same pattern as the CONFIRMED-correlation preservation: a later transient failure on an already-completed session must not regress the ledger from COMPLETED back to BLOCKED. Copilot — Fix stale cli.py comment that claimed the queue-loader falls back campaign-agnostic when `queue.campaign` is unmapped. The loader now raises `QueueLoadError` in that case; the CLI gate is just an earlier-exit convenience for missing operator input. Copilot — Fix the E.164 regex comment. `^\+[1-9]\d{1,14}$` is 2-15 digits (mandatory non-zero leading digit), not 1-15. Copilot — Fix step numbering in `run_one_crm_item` after the earlier refactor renumbered the steps but missed updating the trailing comment (5 → 6, was incorrectly 7). 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-13 review finding (Codex) P1 — Replace the peer-connection failure-ledger fix with an in-memory staging + deferred-write model. SQLite is single-writer, so the round-12 "open a second connection during emit_*" approach would deadlock (or hit SQLITE_BUSY) because the orchestrator's `with store.transaction(conn)` block already holds the write lock while the emit is in flight. The peer connection writes never reached disk in production environments — masking the original Dataverse failure with an OperationalError on top. New shape: - `_record_failure` (called inside the failing emit) appends to a new `self._pending_failures: list[_PendingFailure]` instead of touching the DB. The orchestrator's rollback fires without erasing anything. - The runner's `except (DataverseError, DataverseWriteBackError, MappingError)` handler now calls `adapter.flush_pending_failures()` AFTER the exception bubbles up (the rollback has released the write lock). Each staged failure is persisted via the shared connection with the same preserve-CONFIRMED / preserve-`*_done` / preserve-COMPLETED semantics from round-12. - The `failure_conn_factory` constructor parameter is gone — no more peer connections, no more lock contention. Contract test `test_task_blocked_when_default_owner_unverifiable` updated: calls `adapter.flush_pending_failures()` explicitly to verify the failed `crm_correlations` row that downstream resume/audit will see (the test bypasses the runner, so it has to drive flush itself, just as the runner does in production). 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Address PR #6 round-14 review finding (Copilot) Preserve terminal `run_status` in `_mark_progress`. The default of `RunStatus.IN_PROGRESS` was unconditionally overwriting the column on every per-emit `*_done` flip, so a follow-up call (idempotent retry, mid-run progress update) could regress an already-terminal `COMPLETED` or `BLOCKED` ledger row back to `IN_PROGRESS`. The default is now `None` ("leave the existing value alone"); the first call for a brand-new session still gets `IN_PROGRESS` via `_new_progress`, and explicit terminal transitions (the runner's `finalize_progress` / `flush_pending_failures`) pass the value they want. 281 / 283 pass; pre-existing constitution-sync failures from fab3ebd remain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Fix Sonar issues after US1 umbrella merge --------- Co-authored-by: brettheap <brett.heap@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…closes review HIGH #1/#2/#3) Closes 3 HIGH findings from the 2026-05-24 swarm code-review pass: HIGH #1 (security-secrets): tenant_id leaked via OAuth endpoint URL in auth.py error messages. `_wrap_auth_transport_error` and `_auth_status_error` interpolated `_TOKEN_ENDPOINT.format(tenant=tenant_id)` into every Permanent/TransientDataverseError. The string lands in `writeback_progress.last_error`, `CrmRunReport.message`, stdout, and SQLite rows on any auth failure. HIGH #2 (security-secrets): env_url host leaked via response.request.url in errors.py for every non-2xx response. The pre-existing redaction stripped only the query string; the host (= DATAVERSE_ENV_URL) survived. HIGH #3 (security-secrets): T047 only tested the success path. It asserted `exit_status == "completed"` before walking the inspection surface and stubbed the OAuth flow, so neither HIGH #1 nor HIGH #2 could be caught. Fixes: - auth.py: new module-level `_REDACTED_TOKEN_ENDPOINT` constant ("https://login.microsoftonline.com/<redacted-tenant>/oauth2/v2.0/token"). `_wrap_auth_transport_error` + `_auth_status_error` use the constant; they no longer take an `endpoint` parameter. The real endpoint stays internal to `_acquire` for the POST only. Underlying exception is preserved via `from exc` chain (debug visibility) but never serialized via `{exc!r}` — uses `type(exc).__name__` instead. - errors.py:96-104: strip BOTH host and query from response URL; only path survives. Prefixed with `<env>` placeholder so the redaction is visible at a glance. - tests/contract/test_no_secrets_in_artifacts.py: split into 5 tests covering: (a) success path with tightened sanity list (task.json, transcript.txt added; SQLite row-count assertions), (b) Dataverse-401 failure path that drives the runner's error-formatter code path, (c-e) unit-level tests pinning the redaction contract on `_wrap_auth_transport_error`, `_auth_status_error`, and `raise_for_dataverse_response` directly so a regression surfaces without needing a working OAuth e2e fake. Result: 611 pass (+4 new T047 tests) / 2 pre-existing constitution_sync failures unrelated. ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

brettheap and others added 13 commits May 18, 2026 22:12

chore: add speckit worktree workflow

d295891

Copilot AI review requested due to automatic review settings May 20, 2026 15:11

sourcery-ai Bot reviewed May 20, 2026

View reviewed changes

Copilot started reviewing on behalf of brettheap May 20, 2026 15:13 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

brettheap requested a review from Copilot May 20, 2026 18:56

Copilot started reviewing on behalf of brettheap May 20, 2026 18:57 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

brettheap requested a review from Copilot May 21, 2026 01:21

Copilot started reviewing on behalf of brettheap May 21, 2026 01:21 View session