Skip to content

Slice 1 — Mock Call, Mock CRM (feature 001)#1

Merged
brettheap merged 20 commits into
mainfrom
001-mock-call-mock-crm
May 22, 2026
Merged

Slice 1 — Mock Call, Mock CRM (feature 001)#1
brettheap merged 20 commits into
mainfrom
001-mock-call-mock-crm

Conversation

@brettheap
Copy link
Copy Markdown
Contributor

Summary

First end-to-end product slice for openCloser: a single ALF prospect queue record moves through eligibility → mock call → scripted persona conversation → normalized result → local persistence → mock CRM write-back, entirely fixture-driven. No SignalWire, no Dataverse, no UI, no live LLM — those are deferred to later slices. Proves the five FR-033 module boundaries hold before real telephony/CRM swap in.

Built via the Spec Kit workflow (specify → clarify → plan → checklist → tasks → analyze → implement), then hardened by a 6-agent code review and a full 3-wave remediation.

  • Spec/plan/contracts/checklists under specs/001-mock-call-mock-crm/ — 36 FRs, 9 SCs, 5 module contracts, 15 quality checklists, project constitution.
  • Implementationsrc/opencloser/ across the 5 boundary modules (eligibility, transport, persona, crm, core) + state/artifacts/CLI. All 75 tasks in tasks.md complete.
  • Review remediation — a multi-agent review surfaced 78 findings (6 CRITICAL, 13 HIGH, rest MEDIUM/LOW); all 78 are fixed across 5 commits. The branch went from 8 failing tests + 2 broken CI gates to fully green.

What's in scope (Slice 1)

  • One queue record per CLI invocation; SQLite local state; fixture-driven mock transport + scripted persona.
  • 6 FR-004 eligibility rules, 11 dispositions, FR-019/020/021 idempotency + conflicting-event audit, FR-031/032 per-disposition write-back, deterministic JSON artifacts.

Out of scope (later slices)

SignalWire, Dataverse, Pipecat, admin UI, live LLM persona, multi-worker scaling, batch processing — all explicitly deferred per the constitution.

Quality gates

Gate Result
Tests 183 passed, 0 failed
Coverage 97.26% (floor 90%)
ruff lint + format clean
Review findings 78/78 remediated

Notable remediation fixes

  • FR-020 conflicting-event audit channel — orchestrator was returning on the first terminal event; rewritten to drain the full stream.
  • FR-036 persona rules 6/7/10 — were unreachable behind an over-eager escalation; intent classifier broadened and escalation now defers to the owning rule.
  • Hardening — state DB owner-only file modes, transport-payload key allowlisting, event-loop idempotency atomicity, partial-session-on-exception fix.
  • All 5 boundary Protocols made @runtime_checkable; CLI test coverage added to clear the gate.

Test plan

  • uv sync && uv run pytest — 183 pass, coverage ≥ 90%
  • uv run ruff check . && uv run ruff format --check . — clean
  • uv run opencloser init-state && uv run opencloser load-queue-item --file tests/fixtures/queue_items/alf-prospect-001.json && uv run opencloser run-one --queue-item-id alf-prospect-001 --conversation-fixture tests/fixtures/conversations/interested_callback_requested.json --transport-fixture tests/fixtures/transport_events/connected.json — produces 5 artifacts, disposition interested_callback_requested
  • Walk specs/001-mock-call-mock-crm/quickstart.md §1–8

🤖 Generated with Claude Code

brettheap and others added 13 commits May 18, 2026 22:12
First thin MVP slice spec proving the local end-to-end product loop:
eligibility → mock call transport → ALF appointment-setter persona →
normalized result → mock CRM write-back, with JSON artifact export and
idempotent duplicate-event handling. Includes Constitution Alignment for
all five principles, four prioritized user stories, 27 functional
requirements, 9 measurable success criteria, and a requirements quality
checklist (all items passing, no NEEDS CLARIFICATION markers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds plan/research/data-model/tasks/quickstart, 5 module contracts, 14 domain
checklists, and the project constitution. Integrates 12 cross-artifact analyzer
findings; pipeline is now analyzer-clean (0 CRITICAL/HIGH/MEDIUM) and ready
for /speckit.implement.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ice 1

Encodes 21 decisions from the post-walk Q&A (open-questions.md): pins
canonical disclosure wording, allowed/disallowed claim categories, PHI
data classes, DNC trigger phrases, refusal_topics enum, FR-035 trigger
conditions, call window semantics, summary format, Mock Call Event
payload sub-schemas, and Task assigned_to field. Adds DST edge case.
Updates all 5 contracts with a syntax note and the pinned disclosure
validator. Plan gains a Slice 2 forward-looking carry-overs section.
All 531 checklist items now ticked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds pyproject.toml (uv + Typer + Pydantic v2 + pytest + ruff per
research.md), config/slice1.toml with default call window / max attempts
/ default timezone / artifact dir / persona version, .gitignore anchored
to repo root for /artifacts/ and /state/, minimal README pointing to
quickstart, and the empty src/opencloser/{core,eligibility,transport,
persona,crm,state,artifacts}/ + tests/{unit,integration,fixtures/...}/
package skeleton. Ticks T001-T007 in tasks.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…er + tests (T008-T016)

T008: Pydantic v2 entity models — CallableStatus / Disposition / HumanReviewReason
+ 4 supporting enums + 14 entity classes including the new Q19 assigned_to
field on TaskPayload, the Q15 EventType + FailureReason payload enums, and
the Q9 RefusalTopic enum. Mutual-exclusion validators on NormalizedResult and
Extraction; task-kind validator on TaskPayload.

T009: state/schema.sql — 12 SQLite tables + indexes + CHECK constraints
+ schema_meta seed; PRAGMAs (WAL, FK on, sync NORMAL) applied at connection
time by store.connect.

T010: state/store.py — connect/transaction/init_schema + per-table CRUD for
queue_items, sessions, eligibility_decisions, mock_call_events,
idempotency_keys (try_record), conflicting_event_audit_records, and the
three write-back tables + normalized_results.

T011: core/ids.py — UUID4-prefixed ID generators for session, mock provider
call, decision, task, audit.

T012: core/clock.py — Clock protocol + SystemClock + FrozenClock with the
canonical ISO 8601 UTC ms formatter per FR-014.

T013: core/idempotency.py — IdempotencyKey dataclass + compute_key +
record_or_skip (UNIQUE-violation = duplicate no-op per FR-019).

T014: core/config.py — tomllib loader + OPENCLOSER_* env-var overrides
returning a validated SliceConfig.

T015: artifacts/writer.py — write_session_artifacts emits the 6 per-session
files (session-result, writeback, task, transcript, eligibility-decision,
conflicting-events) with deterministic 2-space-indented sorted-keys UTF-8/LF
JSON via tempfile + os.replace for atomicity. Supports SC-005 byte-identity
across reruns.

T016: tests/unit/{test_models, test_state_store, test_idempotency,
test_artifacts_writer}.py + tests/conftest.py — coverage of every public
function, every Pydantic validator, the FK cascade, the UNIQUE constraint,
and the SC-005 byte-identity property. All 24 Python files pass ast.parse.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The demoable Slice 1 MVP. 22 tasks across 8 modules + 7 test files + 3
fixtures, ~3,100 lines:

T017-T020: Protocol-based ABCs for the 4 boundary modules (eligibility,
transport, persona, crm-writeback) per FR-033.

T021-T022: BuiltinEligibilityEvaluator applies all 6 FR-004 rules in
canonical order with no short-circuit; default-tz fallback per Edge Cases.
Unit tests cover every rule, multi-rule failure, and the inclusive 20:00
boundary.

T023-T024: FixtureDrivenTransport reads JSON fixtures and yields events
verbatim (duplicates handed through; orchestrator dedups). Unknown
event_types are skipped per Edge Case.

T025-T029: Persona module — extraction (FR-034 rule-based), escalation
(FR-035 9-code dispatch), disposition_rules (FR-036 10-rule precedence,
first-match-wins), ALFAppointmentSetterPersona with disclosure validator
(Q2 canonical string exact-match). Tests cover every FR-036 rule + the
disclosure validator + determinism.

T030-T031: MockWriteBackAdapter persists payloads to SQLite + maintains
per-session WriteBack aggregate. emit_task belt-and-suspenders per
FR-018 silently drops tasks for the 5 excluded dispositions. Test covers
all five.

T032-T033: Orchestrator wires the full loop. Block branch (FR-005 always
creates a `blocked` session, queue-status-only writeback). Allow branch
(place_call → in_flight → event stream with FR-019 idempotency-key dedup
→ persona on `connected` → FR-031 emission + FR-032 new_status →
artifacts). FR-020 conflicting-late-event audit channel via
`conflicting_events.json`. Unit tests cover happy path + block + unknown
queue-item-id.

T034: Typer CLI — init-state, load-queue-item, run-one. FR-027 output
surface with wall_time_ms per Q1 tightening.

T035-T037: alf-prospect-001 queue fixture, interested_callback_requested
conversation fixture (using the Q2 canonical disclosure), connected
transport fixture.

T038: US1 Story 1 integration test asserts all (a)-(f) of the spec's
Independent Test bullet against real state-store + real artifact writer
+ frozen clock.

All 38 Python files (24 src + 14 tests) and 3 JSON fixtures parse cleanly.
Phase 3 ticked in tasks.md. Slice 1 is now demoable end-to-end on the
happy path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rage, polish (T039-T075)

Phase 4 — US2 block-path (T039-T045): 5 queue-item fixtures covering each
FR-004 blocking condition (DNC, after-hours, max-attempts, missing-phone,
callable_status!='ready'). Parametrized integration test asserts the
spec's Story 2 Independent Test (a)-(e) per condition + a multi-rule
failure case (a/d/e/f).

Phase 5 — US3 paths + idempotency (T046-T056): 6 transport fixtures
(no_answer, voicemail, failed, duplicate_connected, duplicate_callback,
conflicting_failed_after_completed). Orchestrator-level idempotency
unit tests (test_idempotency_orchestrator.py) + integration test
(test_us3_paths.py) covering FR-019 duplicate no-ops, FR-020 audit
channel, FR-021 exactly-once attempt-count increment.

Phase 6 — US4 disposition coverage (T057-T066): 9 conversation fixtures
exercising every connected disposition (interested_email_captured, Q5
email+callback, needs_human_review variants, do_not_call mid-call,
wrong_number, not_interested, call_back_later, script_truncated).
Parametrized integration test asserts FR-014 fields, FR-031 write-back
shape, and FR-032 new_status per disposition.

Phase 7 — Polish (T067-T075):
- T067: tests/test_imports.py — SC-009 dep-direction lint walks the AST
  of every src module and enforces contracts/*.md allowed-import rules
- T068: tests/integration/test_sc001_budget.py — parametrized over every
  conversation fixture; asserts wall_time_ms < 60_000
- T069: tests/integration/test_sc005_determinism.py — two-run byte-
  identity check (modulo runtime IDs)
- T070: tests/integration/test_sc006_no_false_activity.py — no_answer
  /voicemail/failed phone_call_activity summaries don't claim
  connected-flow content
- T071: 90% coverage floor already wired in pyproject.toml
- T072: tests/integration/test_constitution_sync.py — verifies the
  authored constitution file is present + lists the 5 principles +
  references the right FRs/SCs
- T073: README already authored in Phase 1
- T074: ruff config already wired (run via `uv run ruff check .`)
- T075: SC008_REVIEW.md documents the plan-time forward-compat review
  of each Slice 1 → Slice 2 substitution

All 75 tasks now ticked. 51 Python files + 23 JSON fixtures parse cleanly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Multi-agent review (swarm + external) found the branch failing 8 of its
own tests with a broken FR-020 audit channel and 3 unreachable FR-036
disposition rules. This commit fixes every CRITICAL/P1 finding.

Orchestrator (core/orchestrator.py):
- C1/P1-1: _run_allowed_session no longer returns on the first terminal
  event. The event loop drains the full stream; conflicting late events
  now reach the FR-020 audit branch. Restores conflicting-event auditing.
- H1/P2-2: each event-loop iteration's idempotency-key INSERT + state
  mutation now run in one store.transaction() — no crash window that
  records a key without persisting the event.
- P2-1/N4: _finalize_session writes normalized-result + CRM rows +
  artifacts BEFORE flipping session.state=FINALIZED (last commit), so a
  write-back/artifact failure can't leave a finalized session with no
  operator evidence.
- P1-5/N3: transport_fixture_id is validated before any session row is
  created; a missing conversation fixture at `connected` finalizes the
  session cleanly as `failed` instead of raising mid-stream.

Persona (persona/{extraction,escalation,alf_appointment_setter}.py):
- C2: _classify_intent broadened — email-volunteering and affirmative-
  engagement phrases now classify as INTERESTED instead of falling to
  UNCERTAIN.
- C3/C4/C5: escalation defers — uncertain_role/uncertain_intent return
  None when a captured email (FR-036 rules 6/7) or a signal-starved
  script (rule 10) already owns the conversation. Safety escalations
  (PHI, legal, etc.) are never suppressed. The 10-rule precedence in
  contracts/persona.md now holds literally; rules 6/7/10 are reachable.
- N2: "we'll get back" / "i'm busy right now" / "not the best time"
  removed from the ambiguous-DNC set — spec FR-010/Q5 says these route
  to call_back_later, not needs_human_review.

Tests:
- P1-4/H2: test_orchestrator_happy.py — mkdir before fixture write.
- test_sc005_determinism.py — same mkdir-ordering bug class.
- C6: new test_us3_duplicate_callback_event_does_not_emit_second_task
  exercises the previously-unused duplicate_callback_requested fixture
  (Story 3 AS5 / FR-019 callback idempotency).

Also commits uv.lock for reproducible installs.

Full suite: 8 failing -> 0 failing, 127 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 2 of the multi-agent review remediation.

Core / state / models:
- N5: state DB + directory chmod'd to 0o600/0o700 (best-effort) so local
  CRM data is owner-only; chmod failure is logged, not fatal.
- N6: MockCallEvent gains a payload-key allowlist validator — per event
  type, only the Q15-schema keys survive; arbitrary keys are dropped
  before they can reach conflicting-events.json (FR-024 "no secrets").
- H3: EligibilityDecision + ExportedEligibilityDecision gain a
  default_tz_applied bool; schema, DAO, evaluator, orchestrator, and
  contracts/eligibility.md all updated so a default-timezone
  substitution is visible in the persisted decision (spec Edge Case).

Transport:
- H4: unknown event types now logged via logging.warning before skip
  (spec Edge Case "MUST log it"); misleading comment corrected; the
  unit test asserts the warning via caplog.

Persona:
- H5: the FR-035 escalation reason-code priority order is now documented
  in escalation.py's docstring and contracts/persona.md.
- H6: over-broad "i handle " decision-maker phrase narrowed to four
  role-specific phrases — no more "I handle the laundry" false positive.

Integration tests:
- H7: all 9 FR-035 reason codes now pinned — 3 new conversation
  fixtures (ambiguous_dnc, non_clinical_topic_escalation,
  uncertain_intent) + a parametrized test.
- H8: new duplicate_conflicting_failed fixture + test proving FR-019
  beats FR-020 on overlap (a redelivered late event adds no 2nd audit
  row).
- H9/H10: test_us2_blocked now asserts queue_status_updates has exactly
  1 row (FR-029) and task_payloads has 0 rows (FR-018) for blocked runs.

Full suite: 127 -> 137 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 3 of the multi-agent review remediation — the long tail across all
6 modules. ~57 MEDIUM/LOW findings plus N7 (CLI coverage gate).

Cross-cutting:
- All 5 boundary Protocols (EligibilityEvaluator, CallTransport, Persona,
  WriteBackAdapter) are now @runtime_checkable — closes the spec Q23
  "ABCs are the runtime enforcement" gap; isinstance smoke tests added.

Eligibility: validate the configured default_timezone (a bogus default
now fails rule b instead of silently passing); drop the datetime
placeholder import; re-export from __init__; +5 tests (start boundary,
DST, invalid-default-tz, envelope shape).

Transport: document the session_id-carries-call-id rewrite contract +
one-shot event_stream behavior; contracts/transport.md reconciled to
the 5-field runtime MockCallEvent; +10 tests (Q15 payload pass-through,
all 6 event types).

Persona: narrow over-broad _WRONG_NUMBER_PHRASES; tighten hour_re so a
bare number isn't a callback window; populate all 7 RefusalTopic enum
values; drop dead code in disposition rule 10; disclosure validator is
now byte-exact (no .strip()); +1 Q8 edge-case test.

CRM: emit_task belt-and-suspenders now fails closed; contract recap
gains assigned_to; +6 tests (protocol isinstance, assigned_to default,
FR-029 double-emit, Q5 carve-out, fail-closed branches).

Core/models: TaskPayload validator now forbids reason_code on callback
tasks and captured_email on review tasks (FR-030); insert_normalized_
result gated by a NORMALIZED_RESULT idempotency key; schema FK gains
DEFERRABLE INITIALLY DEFERRED; _resolve_final_disposition dead branch
removed; conflicting-event fallback asserts instead of masking; new
test_config.py (5 tests); spec.md FR-033 signatures reconciled to the
contracts.

Integration + N7: new test_cli.py takes cli.py 0%->100% and config.py
0%->100% — overall coverage 87.7%->97.26%, clearing the configured
--cov-fail-under=90 gate. SC-006 asserts the structured disposition
field; SC-005 covers transcript + conflicting-events artifacts; SC-001
covers terminal + blocked paths; SC-009 lint has an explicit top-level
carve-out; multi-rule blocking test exercises 5 rules; constitution-
sync test checks canonical order + Traceability section; SC008_REVIEW
field counts corrected.

Full suite: 137 -> 183 passed, 0 failed. Coverage 97.26%.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The branch failed its own configured `ruff` gate (pyproject.toml
[tool.ruff]) with 26 errors. This commit clears them so `ruff check`
and `ruff format --check` both pass.

- 19 auto-fixed: quoted-annotation (UP037), unsorted-imports (I001),
  unused-noqa (RUF100), unused-import (F401), deprecated-import (UP035).
- 20 files reformatted by `ruff format`.
- 8 manual fixes: 2 collapsible-if (SIM102) in orchestrator.py +
  test_imports.py; 1 unused unpacked variable (RUF059) in the
  orchestrator finalize call; 5 ambiguous EN DASH characters (RUF002/
  RUF003) in docstrings/comments replaced with ASCII hyphens.

No behavior change. Full suite: 183 passed, coverage 97.26%.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The multi-rule-failure test's docstring and inline comments claimed rule
(b) "cannot fail by design". That became inaccurate when the eligibility
evaluator was hardened to fail rule (b) on a bogus configured
default_timezone. The test's behavior and assertions are unchanged and
correct (it uses a valid tz, so (b) passes here) — only the explanatory
prose needed to match the current evaluator + contracts/eligibility.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 20, 2026 15:11
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @brettheap, your pull request is larger than the review limit of 500000 diff characters

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements Slice 1 (“Mock Call, Mock CRM”) end-to-end for openCloser: a single queued ALF prospect flows through eligibility → mock transport events → scripted persona → normalized result → artifact export → mock CRM persistence, all fixture-driven, plus accompanying specs/contracts/checklists and a comprehensive test suite.

Changes:

  • Added Slice 1 runtime modules (core/orchestrator, eligibility, transport, persona, CRM adapter, artifacts writer) plus CLI/config loader and ID/idempotency utilities.
  • Added fixture catalog (queue items, conversation scripts, transport event streams) and unit/integration tests covering determinism, idempotency, performance budget, CLI, and module-boundary import rules.
  • Added Spec Kit artifacts: constitution, contracts, checklists, SC-008 review, and git/worktree helper scripts.

Reviewed changes

Copilot reviewed 139 out of 148 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/unit/test_idempotency.py Unit coverage for FR-019 idempotency-key computation and duplicate-detection persistence.
tests/unit/test_config.py Unit coverage for TOML + env-var config loading and env int coercion.
tests/unit/init.py Unit test package marker.
tests/test_imports.py AST-based dependency-direction lint enforcing FR-033 boundary import rules.
tests/integration/test_us1_happy_path.py Story 1 end-to-end acceptance test for the happy path and artifact presence.
tests/integration/test_sc006_no_false_activity.py Integration guardrail for SC-006 (no false “connected-call” activity on terminal paths).
tests/integration/test_sc005_determinism.py Integration test for SC-005 deterministic artifacts across reruns.
tests/integration/test_sc001_budget.py Integration budget tests ensuring fixture runs complete under SC-001 timing constraints.
tests/integration/test_constitution_sync.py Integration smoke check keeping constitution headings/traceability in sync.
tests/integration/test_cli.py Integration tests for Typer CLI subcommands and exit-code behavior.
tests/integration/init.py Integration test package marker.
tests/fixtures/transport_events/voicemail.json Transport fixture for voicemail terminal event path.
tests/fixtures/transport_events/no_answer.json Transport fixture for no-answer terminal event path.
tests/fixtures/transport_events/failed.json Transport fixture for failed terminal event path.
tests/fixtures/transport_events/duplicate_connected.json Transport fixture covering duplicate connected/completed events.
tests/fixtures/transport_events/duplicate_conflicting_failed.json Transport fixture covering duplicated conflicting late failed events.
tests/fixtures/transport_events/duplicate_callback_requested.json Transport fixture covering duplicate callback_requested events.
tests/fixtures/transport_events/connected.json Transport fixture for connected → completed happy path.
tests/fixtures/transport_events/conflicting_failed_after_completed.json Transport fixture for FR-020 conflicting late event audit path.
tests/fixtures/transport_events/.gitkeep Keeps the transport fixtures directory in version control.
tests/fixtures/queue_items/alf-prospect-not-ready.json Queue-item fixture for eligibility blocking via callable_status.
tests/fixtures/queue_items/alf-prospect-missing-phone.json Queue-item fixture for eligibility blocking via missing phone.
tests/fixtures/queue_items/alf-prospect-max-attempts.json Queue-item fixture for eligibility blocking via max attempts.
tests/fixtures/queue_items/alf-prospect-dnc.json Queue-item fixture for eligibility blocking via DNC.
tests/fixtures/queue_items/alf-prospect-after-hours.json Queue-item fixture for eligibility blocking via call-window.
tests/fixtures/queue_items/alf-prospect-001.json Primary queue-item fixture for happy-path runs.
tests/fixtures/queue_items/.gitkeep Keeps the queue-item fixtures directory in version control.
tests/fixtures/conversations/wrong_number.json Persona conversation fixture for wrong_number disposition.
tests/fixtures/conversations/uncertain_intent.json Persona conversation fixture for needs_human_review (uncertain intent).
tests/fixtures/conversations/script_truncated.json Persona conversation fixture for script_truncated escalation/disposition.
tests/fixtures/conversations/not_interested.json Persona conversation fixture for not_interested disposition.
tests/fixtures/conversations/non_clinical_topic_escalation.json Persona fixture for non_clinical_topic_escalation reason.
tests/fixtures/conversations/needs_human_review_uncertain_role.json Persona fixture for uncertain_role escalation.
tests/fixtures/conversations/needs_human_review_email_invalid.json Persona fixture for FR-036 rule 7 (unverified email, no callback).
tests/fixtures/conversations/interested_email_captured.json Persona fixture for interested_email_captured disposition.
tests/fixtures/conversations/interested_email_and_callback.json Persona fixture for Q5 “email + callback” carve-out behavior.
tests/fixtures/conversations/interested_callback_requested.json Persona fixture for interested_callback_requested happy path.
tests/fixtures/conversations/do_not_call_mid_call.json Persona fixture for dnc_stated → do_not_call behavior.
tests/fixtures/conversations/call_back_later.json Persona fixture for call_back_later disposition.
tests/fixtures/conversations/ambiguous_dnc.json Persona fixture for ambiguous_dnc escalation path.
tests/fixtures/conversations/.gitkeep Keeps the conversation fixtures directory in version control.
tests/fixtures/init.py Test fixtures package marker.
tests/conftest.py Shared pytest fixtures (SQLite temp DB, frozen clock, temp artifact dir).
tests/init.py Tests package marker.
src/opencloser/transport/mock.py Fixture-driven mock transport yielding scripted call events.
src/opencloser/transport/base.py CallTransport protocol defining the transport boundary surface.
src/opencloser/transport/init.py Transport boundary package marker.
src/opencloser/state/init.py State boundary package marker.
src/opencloser/persona/escalation.py Pure escalation-reason derivation per FR-035 with deterministic priority.
src/opencloser/persona/disposition_rules.py Deterministic disposition precedence rules per FR-036.
src/opencloser/persona/base.py Persona protocol and shared dataclasses for persona boundary I/O.
src/opencloser/persona/alf_appointment_setter.py Slice 1 deterministic persona implementation (disclosure, extraction, disposition).
src/opencloser/persona/init.py Persona boundary package marker.
src/opencloser/eligibility/evaluator.py Builtin eligibility evaluator implementing FR-004 rules and timezone fallback.
src/opencloser/eligibility/init.py Eligibility boundary protocol + evaluator export.
src/opencloser/crm/mock.py Mock write-back adapter persisting payloads to SQLite and building writeback aggregate.
src/opencloser/crm/base.py WriteBackAdapter protocol defining CRM boundary surface.
src/opencloser/crm/init.py CRM boundary package marker.
src/opencloser/core/ids.py UUID-based ID generators for session/call/decision/task/audit identifiers.
src/opencloser/core/idempotency.py FR-019 idempotency-key helper functions integrated with the state store.
src/opencloser/core/config.py TOML + env override configuration loader returning validated SliceConfig.
src/opencloser/core/clock.py Clock protocol + System/Frozen implementations and UTC-ms formatting.
src/opencloser/core/init.py Core boundary package marker.
src/opencloser/core/orchestrator.py Orchestrates Slice 1 session lifecycle, event handling, persistence, and artifact export.
src/opencloser/cli.py Typer CLI entrypoint (init-state, load-queue-item, run-one).
src/opencloser/artifacts/writer.py Deterministic, atomic artifact writer for JSON/text outputs.
src/opencloser/artifacts/init.py Artifacts module package marker.
src/opencloser/init.py Package metadata/version for opencloser.
specs/001-mock-call-mock-crm/SC008_REVIEW.md Plan-time forward-compat review for Slice 2 substitutions (SC-008).
specs/001-mock-call-mock-crm/contracts/transport.md Transport boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/orchestrator.md Orchestrator boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/eligibility.md Eligibility boundary contract documentation.
specs/001-mock-call-mock-crm/contracts/crm-writeback.md CRM write-back boundary contract documentation.
specs/001-mock-call-mock-crm/checklists/transport.md Requirements quality checklist for transport boundary.
specs/001-mock-call-mock-crm/checklists/tests-strategy.md Quality checklist for testing strategy and fixture catalog.
specs/001-mock-call-mock-crm/checklists/requirements.md Specification quality checklist.
specs/001-mock-call-mock-crm/checklists/plan-quality.md Plan-phase artifact quality checklist.
specs/001-mock-call-mock-crm/checklists/eligibility.md Requirements quality checklist for eligibility boundary.
specs/001-mock-call-mock-crm/checklists/contracts.md Quality checklist for boundary contract documents.
specs/001-mock-call-mock-crm/checklists/cli.md Requirements quality checklist for CLI/operator surface.
README.md Repository-level overview, links to Slice 1 spec artifacts, and bootstrap commands.
QWEN.md Agent context and Spec Kit workflow rules.
KIMI.md Agent context and Spec Kit workflow rules.
GEMINI.md Agent context and Spec Kit workflow rules.
CLAUDE.md Agent context and Spec Kit workflow rules + active plan link.
AGENTS.md Agent context and Spec Kit workflow rules.
.github/copilot-instructions.md Repository Copilot instructions referencing constitution/plan workflow.
pyproject.toml Python packaging, dependencies, ruff configuration, and pytest/coverage gates.
config/slice1.toml Default Slice 1 configuration values (call window, eligibility, artifacts, persona, state DB).
.gitignore Ignores runtime artifacts/state DB files and Python/tooling outputs.
.specify/memory/constitution.md Repository constitution (binding principles) derived from Slice 1 spec alignment.
.specify/feature.json Declares current feature directory for Spec Kit.
.specify/extensions.yml Spec Kit extensions configuration enabling git hooks.
.specify/shell/select-worktree.sh Helper script to select/list Spec Kit worktrees.
.specify/shell/ct.zsh Zsh helper sourcing worktree tools.
.specify/extensions/git/README.md Documentation for bundled Spec Kit git extension and configuration.
.specify/extensions/git/extension.yml Git extension metadata, hooks, and configuration schema.
.specify/extensions/git/git-config.yml Repo’s git extension configuration (branch numbering, worktree root, auto-commit toggles).
.specify/extensions/git/config-template.yml Template for git extension configuration.
.specify/extensions/git/commands/speckit.git.validate.md Spec Kit command doc: validate feature branch naming.
.specify/extensions/git/commands/speckit.git.remote.md Spec Kit command doc: detect remote URL.
.specify/extensions/git/commands/speckit.git.initialize.md Spec Kit command doc: initialize git repo.
.specify/extensions/git/commands/speckit.git.feature.md Spec Kit command doc: create feature checkout/worktree.
.specify/extensions/git/commands/speckit.git.commit.md Spec Kit command doc: auto-commit hook behavior.
.specify/extensions/git/scripts/bash/initialize-repo.sh Bash implementation of git init hook.
.specify/extensions/git/scripts/bash/git-common.sh Bash shared functions for git detection and branch validation.
.specify/extensions/git/scripts/bash/auto-commit.sh Bash auto-commit hook implementation.
.specify/extensions/git/scripts/powershell/initialize-repo.ps1 PowerShell implementation of git init hook.
.specify/extensions/git/scripts/powershell/git-common.ps1 PowerShell shared functions for git detection and branch validation.
.specify/extensions/git/scripts/powershell/auto-commit.ps1 PowerShell auto-commit hook implementation.
.specify/extensions/git/scripts/powershell/git-common.ps1 PowerShell shared functions for branch validation and git detection.
.agents/skills/speckit-git-feature/SKILL.md Agent skill doc mirroring the git feature-checkout command behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/opencloser/cli.py
Comment on lines +11 to +13
import json
import sqlite3
from pathlib import Path
Comment thread src/opencloser/cli.py Outdated
)


_ = sqlite3 # keep import for state-level callers; CLI itself doesn't use it directly
Comment thread src/opencloser/core/__init__.py Outdated
@@ -0,0 +1 @@
"""Interaction Core / Orchestrator — FR-033 module boundary #5."""
Comment thread README.md
Comment on lines +19 to +22
## Status

Slice 1 spec, plan, tasks, contracts, and checklists are complete. Implementation is in progress per `tasks.md`.

Comment on lines +5 to +8
**Module boundary**: FR-033, principle #5 (write-back side)
**Implementation**: `src/opencloser/crm/base.py` (interface) + `src/opencloser/crm/mock.py` (Slice 1 mock)
**Owns**: payload assembly per FR-028 / FR-029 / FR-030, FR-031 per-disposition mapping, FR-032 queue-status mapping
**MUST NOT contain**: persona logic, eligibility logic, transport-event interpretation, session lifecycle management
Comment on lines +73 to +75

mapfile -t WORKTREE_LINES < <(find "$WORKTREE_ROOT" -maxdepth 1 -mindepth 1 -type d -printf '%T@|%f|%p\n' 2>/dev/null | sort -t'|' -k1,1nr)
if [ "${#WORKTREE_LINES[@]}" -eq 0 ]; then
Comment on lines +105 to +115
def process_one_queue_item(
queue_item_id: str,
*,
conn: sqlite3.Connection,
config: SliceConfig,
eligibility: BuiltinEligibilityEvaluator,
transport: FixtureDrivenTransport,
persona: ALFAppointmentSetterPersona,
conversation_fixture: ConversationFixture | None,
transport_fixture_id: str | None,
clock: Clock | None = None,
Comment on lines +213 to +215
"""FR-005 block path: queue-status update only, no Phone Call activity, no task."""
crm = MockWriteBackAdapter(conn)
new_status = queue_item.callable_status # FR-032: blocked → unchanged
Address all 8 Copilot review comments on PR #1:

- Inject the WriteBackAdapter into process_one_queue_item instead of
  constructing MockWriteBackAdapter inside core, and type the eligibility/
  transport/persona/crm params against their boundary Protocols rather than
  the concrete Slice 1 classes — decouples core from the mock per SC-008.
- Add build_writeback to the WriteBackAdapter Protocol + crm-writeback
  contract so the orchestrator depends only on the documented surface.
- Reconcile the crm-writeback "Owns" line: the adapter persists payloads
  and assembles the WriteBack aggregate; FR-031/FR-032 decisions stay in
  the orchestrator.
- Correct the core docstrings (FR-033 boundary #1, not #5).
- Drop the dead sqlite3 import + keepalive from cli.py.
- Update the stale README status line (Slice 1 implementation is complete).
- Make select-worktree.sh portable: replace GNU `find -printf` + `mapfile`
  with a stat-based approach that also runs on macOS (BSD find, Bash 3.2).

183 tests passing, 97.25% coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 4 comments.

Comment thread tests/test_imports.py Outdated
Comment on lines +91 to +99
def _imported_names(tree: ast.AST) -> set[str]:
names: set[str] = set()
for node in ast.walk(tree):
if isinstance(node, ast.Import):
for alias in node.names:
names.add(alias.name)
elif isinstance(node, ast.ImportFrom) and node.module:
names.add(node.module)
return names
Comment thread src/opencloser/cli.py
Comment on lines +116 to +131
try:
report = process_one_queue_item(
queue_item_id,
conn=conn,
config=config,
eligibility=BuiltinEligibilityEvaluator(),
transport=FixtureDrivenTransport(transport_dir),
persona=ALFAppointmentSetterPersona(),
crm=MockWriteBackAdapter(conn),
conversation_fixture=conversation,
transport_fixture_id=transport_fixture_id,
)
except QueueItemNotFound as exc:
typer.echo(f"error: queue_item_id not found: {exc}", err=True)
raise typer.Exit(code=2) from None

Comment thread src/opencloser/cli.py
Comment on lines +116 to +127
try:
report = process_one_queue_item(
queue_item_id,
conn=conn,
config=config,
eligibility=BuiltinEligibilityEvaluator(),
transport=FixtureDrivenTransport(transport_dir),
persona=ALFAppointmentSetterPersona(),
crm=MockWriteBackAdapter(conn),
conversation_fixture=conversation,
transport_fixture_id=transport_fixture_id,
)
Comment thread tests/test_imports.py
Comment on lines +49 to +53
"crm": {
"opencloser.models",
"opencloser.state", # adapter persists; allowed
"opencloser.core",
},
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 3 comments.

Comment on lines +88 to +92
def _resolve_fixture_path(self, fixture_id: str) -> Path:
# Accept either a bare fixture_id ("no_answer") or a full filename ("no_answer.json").
if fixture_id.endswith(".json"):
return self._fixtures_dir / fixture_id
return self._fixtures_dir / f"{fixture_id}.json"
Comment on lines +121 to +135
def _is_within_call_window(*, tz_name: str, start_str: str, end_str: str, clock: Clock) -> bool:
"""Q12 — `[start, end]` both ends inclusive at minute resolution; applies all 7 days (Q11)."""
try:
local = clock.now_local(tz_name)
except (ZoneInfoNotFoundError, ValueError):
return False
start = _parse_hhmm(start_str)
end = _parse_hhmm(end_str)
now_local_time = time(local.hour, local.minute)
return start <= now_local_time <= end


def _parse_hhmm(value: str) -> time:
hours, minutes = value.split(":", 1)
return time(int(hours), int(minutes))
Comment thread tests/test_imports.py Outdated
Comment on lines +3 to +4
Walks the AST of every `src/opencloser/*.py` and asserts the FR-033 boundary's
dependency-allowed rules from `contracts/*.md` hold. Per boundary:
Address the 4 findings from Copilot's re-review of b18a679:

- run-one now catches ValueError from process_one_queue_item (e.g. an
  eligible call invoked without --transport-fixture) and surfaces it as a
  clean `error:` line + exit code 2 instead of an uncaught traceback.
- run-one validates config persona.version against the available persona
  and fails fast with a clear error when they disagree, instead of
  silently producing artifacts whose persona_version contradicts config.
- The SC-009 dependency-direction lint now rejects relative imports:
  `_collect_imports` inspects node.level so a cross-boundary
  `from ..persona import X` can no longer bypass the gate.
- Dropped opencloser.core from the crm allowlist + lint docstring so the
  gate matches crm-writeback.md, which forbids core for the write-back
  boundary (no crm module imports core today).

Adds two CLI tests for the new error paths. 185 tests passing, 97.35%
coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 3 comments.

Comment on lines +127 to +135
start = _parse_hhmm(start_str)
end = _parse_hhmm(end_str)
now_local_time = time(local.hour, local.minute)
return start <= now_local_time <= end


def _parse_hhmm(value: str) -> time:
hours, minutes = value.split(":", 1)
return time(int(hours), int(minutes))
Comment on lines +29 to +33
def place_call(self, queue_item: QueueItem, fixture_id: str) -> str:
"""FR-007: assign a globally-unique mock_provider_call_id and stash the fixture path."""
del queue_item # unused; the fixture itself carries the queue_item_ref
fixture_path = self._resolve_fixture_path(fixture_id)
if not fixture_path.exists():
Comment on lines +110 to +114
## Dependencies allowed

- `opencloser.models` (for QueueItem and event shapes)
- stdlib `json`, `uuid`, `pathlib`

…ation

Address the 5 findings from Copilot's review of c950e69 / b18a679:

- FixtureDrivenTransport._resolve_fixture_path now rejects fixture ids
  containing path separators or parent refs, so a fixture_id cannot
  escape the fixtures directory (path traversal).
- CallWindowConfig validates that start/end are in-range HH:MM times.
  The field pattern only checked the digit shape, so a value like
  "99:99" would pass config load and then crash eligibility evaluation
  when _parse_hhmm built a datetime.time — now it fails fast at load.
- transport.md "Dependencies allowed" now lists opencloser.core: the
  Slice 1 transport legitimately imports core.ids, and the lint already
  permits it — the contract was the stale side.
- Corrected the place_call comment (the mock transport is queue-item-
  agnostic; the fixture does not carry a queue_item_ref) and the
  test_imports docstring (it walks src/opencloser/**/*.py via rglob).

Adds tests for the traversal guard and the HH:MM range validator.
190 tests passing, 97.37% coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brettheap brettheap requested a review from Copilot May 21, 2026 02:50
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 2 comments.

Comment thread src/opencloser/cli.py
Comment on lines +111 to +143
# Locate transport fixtures dir + fixture id.
if transport_fixture is not None:
transport_dir = transport_fixture.parent
transport_fixture_id = transport_fixture.stem
else:
transport_dir = Path("tests/fixtures/transport_events")
transport_fixture_id = None

# Load conversation fixture if provided.
conversation = (
_load_conversation_fixture(conversation_fixture) if conversation_fixture else None
)

try:
report = process_one_queue_item(
queue_item_id,
conn=conn,
config=config,
eligibility=BuiltinEligibilityEvaluator(),
transport=FixtureDrivenTransport(transport_dir),
persona=persona,
crm=MockWriteBackAdapter(conn),
conversation_fixture=conversation,
transport_fixture_id=transport_fixture_id,
)
except QueueItemNotFound as exc:
typer.echo(f"error: queue_item_id not found: {exc}", err=True)
raise typer.Exit(code=2) from None
except ValueError as exc:
# process_one_queue_item raises ValueError for bad operator input
# (e.g. an allowed call with no --transport-fixture).
typer.echo(f"error: {exc}", err=True)
raise typer.Exit(code=2) from None
Comment thread tests/test_imports.py
Comment on lines +38 to +60
"eligibility": {
"opencloser.models",
"opencloser.core",
},
"transport": {
"opencloser.models",
"opencloser.core",
},
"persona": {
"opencloser.models",
"opencloser.core",
"opencloser.persona", # internal submodules
},
"crm": {
"opencloser.models",
"opencloser.state", # adapter persists; allowed
},
"state": {
"opencloser.models",
},
"artifacts": {
"opencloser.models",
},
Address the 2 findings from Copilot's review of 6426c8b:

- run-one now catches OSError as well as ValueError around fixture
  loading and the orchestrator call, and the conversation-fixture load
  moved inside that try. A typoed/unreadable --transport-fixture or
  --conversation-fixture path, or invalid fixture JSON, now produces a
  clean `error:` line + exit code 2 instead of an uncaught traceback
  (FileNotFoundError is an OSError; JSONDecodeError is a ValueError).
- The dependency-direction allowlist now includes each boundary's own
  prefix (transport/crm/eligibility/state/artifacts), matching what
  core/persona already had. An intra-boundary import such as
  opencloser.transport.mock importing opencloser.transport.base is not
  a cross-boundary violation and must not be flagged.

Adds a CLI test for the missing-fixture-path error path. 191 tests
passing, 97.37% coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brettheap
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0063fc26e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/opencloser/persona/extraction.py Outdated
if not read_back:
return False
confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
return any(_any_in(turn.text.lower(), confirmations) for turn in contact_turns[-3:])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Require post-readback confirmation before verifying email

_was_email_read_back_and_confirmed treats any confirmation phrase in the last three contact turns as sufficient, but it never checks that the confirmation happened after the persona read back the email. If a contact says "yes" earlier in the conversation for an unrelated prompt and never confirms after readback, this still marks the address as verified, which can incorrectly populate captured_email and drive the wrong disposition/writebacks instead of the unverified-email path required by FR-014.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/persona/extraction.py Outdated
Comment on lines +274 to +275
if day and hour:
return text[day.start() : hour.end()].strip()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle callback windows when time appears before weekday

When both a day token and an hour token are found, _capture_window_fragment always slices text[day.start():hour.end()]. This assumes weekday appears before time; for inputs like "2 PM Thursday works" (time first), day.start() is greater than hour.end(), producing an empty string and losing the callback window signal. That degrades task payload quality and violates the intent to preserve the contact’s callback wording.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/persona/extraction.py Outdated
if not read_back:
return False
confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
return any(_any_in(turn.text.lower(), confirmations) for turn in contact_turns[-3:])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match confirmation tokens as words, not substrings

Email verification currently uses substring matching via _any_in, so unrelated words like "yesterday" satisfy the "yes" confirmation token and can mark an email as verified even when no explicit confirmation was given. In conversations where the persona read back an email, this can incorrectly set captured_email (verified) instead of captured_email_unverified, affecting downstream disposition and CRM payloads.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/state/store.py Outdated
"""
path = Path(db_path)
path.parent.mkdir(parents=True, exist_ok=True)
_restrict_permissions(path.parent, 0o700)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid chmod on CWD when DB path has no parent

connect() always runs _restrict_permissions(path.parent, 0o700), but for a bare filename like state.db, path.parent is .. That causes the process to chmod the current working directory to 0700, unexpectedly changing permissions for the entire checkout/worktree instead of only the state directory, which can break other tooling and collaborator access.

Useful? React with 👍 / 👎.

fi

# Trim whitespace and validate description is not empty
FEATURE_DESCRIPTION=$(echo "$FEATURE_DESCRIPTION" | xargs)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Replace xargs-based trimming for feature descriptions

Using xargs to trim FEATURE_DESCRIPTION makes branch creation fail on valid descriptions containing quotes/apostrophes (for example Bob's feature), because xargs treats quotes as syntax and exits with an unmatched-quote error. This turns a common input into a hard failure in speckit.git.feature and blocks feature-worktree creation.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/persona/extraction.py Outdated
persona_turns: Sequence[ConversationTurn],
contact_turns: Sequence[ConversationTurn],
) -> bool:
read_back = any(candidate in turn.text for turn in persona_turns)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Compare read-back email case-insensitively

Read-back detection uses candidate in turn.text with exact casing, so a persona readback like amy@example.com will not match a captured Amy@Example.com candidate and the email is marked unverified despite explicit confirmation. Because this path drives captured_email vs captured_email_unverified, case-only differences can incorrectly change disposition and writeback content.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 5 comments.

Comment on lines +80 to +84
report = process_one_queue_item(
"alf-prospect-001",
conn=tmp_state_db,
config=_config(tmp_artifact_dir, tmp_path / "db"),
eligibility=BuiltinEligibilityEvaluator(),
Comment on lines +70 to +74
report = process_one_queue_item(
"alf-prospect-001",
conn=tmp_state_db,
config=_config(tmp_artifact_dir, tmp_path / "db"),
eligibility=BuiltinEligibilityEvaluator(),
Comment on lines +73 to +77
report = process_one_queue_item(
"alf-prospect-001",
conn=tmp_state_db,
config=_config(tmp_artifact_dir, tmp_path / "db"),
eligibility=BuiltinEligibilityEvaluator(),
Comment on lines +100 to +104
report = process_one_queue_item(
"alf-prospect-001",
conn=tmp_state_db,
config=_config(tmp_artifact_dir, tmp_path / "db"),
eligibility=BuiltinEligibilityEvaluator(),
Comment on lines +141 to +145
report = process_one_queue_item(
qi.queue_item_id,
conn=tmp_state_db,
config=_config(tmp_artifact_dir, tmp_path / "db"),
eligibility=BuiltinEligibilityEvaluator(),
Codex review of 0063fc2 (6 findings):

- extraction: email is verified only when a contact confirmation comes
  AFTER the persona read it back — a stray earlier "yes" no longer
  verifies the address (P1). Read-back match is now case-insensitive.
- extraction: confirmation tokens are matched as whole words, so
  "yesterday" no longer satisfies the "yes" token (P2).
- extraction: _capture_window_fragment captures the callback window
  regardless of day/time order, so "2 PM Thursday" is no longer lost (P2).
- store.connect: skip the directory chmod when db_path is a bare
  filename (parent "."), so it never chmods the working directory (P2).
- create-new-feature.sh: trim the feature description with parameter
  expansion instead of xargs, which choked on apostrophes/quotes (P2).

Copilot review of 0063fc2 (1 finding):

- tests: use a real DB filename (slice1.db) instead of the extensionless
  tmp_path / "db" for the unused config.state.db value.

Adds 5 tests covering the after-readback confirmation, word-boundary
matching, case-insensitive read-back, time-before-day callback window,
and the bare-filename connect path. 196 tests passing, 97.39% coverage,
ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brettheap
Copy link
Copy Markdown
Contributor Author

@codex review
@copilot review

Copy link
Copy Markdown

Copilot AI commented May 21, 2026

@codex review

Reviewed the branch at 5e15f5b; there are no new actionable code-change requests in this thread, so no additional commit was needed.

@codex review
@copilot review

Reviewed the branch at 5e15f5b; there are no new actionable code-change requests in this thread, so no additional commit was needed.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e15f5bf16

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/opencloser/state/store.py Outdated
Comment on lines +59 to +60
if path.parent != Path("."):
_restrict_permissions(path.parent, 0o700)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid chmod on arbitrary parent directories

This unconditionally chmods path.parent for any path whose parent is not literal .. If state.db is configured as an absolute path in an existing directory (for example, the repo root or another shared location), the code will tighten that directory to 0700, which can unexpectedly break other processes/users that rely on prior permissions. The permission hardening should be limited to a dedicated state directory the app creates/owns, not any existing parent directory.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/state/store.py Outdated
Comment on lines +379 to +380
except sqlite3.IntegrityError:
return False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Differentiate duplicate keys from other integrity errors

Catching all sqlite3.IntegrityError and returning False treats every constraint violation as a benign duplicate. The idempotency_keys insert is also subject to foreign-key and CHECK constraints, so non-duplicate data integrity failures can be silently downgraded to no-ops, masking real corruption/ordering bugs and causing skipped writes without surfacing an error. Only unique-key conflicts should map to False; other integrity failures should be raised.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/models.py
event_id: str
event_type: EventType
received_at: UtcMs
payload: dict[str, Any] = Field(default_factory=dict)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce per-event payload value types

The model only allowlists payload keys but never validates payload value types, so malformed events like failed with {"failure_reason": "carrier_err"} or voicemail with a non-integer length are accepted and persisted/exported as if valid. This violates the Q15 payload schema contract and can break downstream logic that assumes enum/type-correct write-back and artifact data.

Useful? React with 👍 / 👎.

WORKTREE_ROOT=$(resolve_worktree_root "$REPO_ROOT")
LATEST_WORKTREE=""
if [ -d "$WORKTREE_ROOT" ]; then
LATEST_WORKTREE=$(find "$WORKTREE_ROOT" -maxdepth 1 -mindepth 1 -type d -printf '%T@ %p\n' 2>/dev/null | sort -nr | awk 'NR == 1 {print $2}')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Replace GNU-only find flag in worktree fallback

The fallback path discovery uses find ... -printf, which is GNU-specific; BSD/macOS find does not support this flag, so when the state file is missing the script can fail to detect existing worktrees and incorrectly exit with “no Speckit worktree handoff” even though directories are present. This breaks ct/ctp recovery behavior on non-GNU environments.

Useful? React with 👍 / 👎.

fixture_path = self._pending.pop(mock_provider_call_id, None)
if fixture_path is None:
raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
data = json.loads(fixture_path.read_text(encoding="utf-8"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate transport fixture JSON before mutating session state

event_stream() parses fixture JSON only after the orchestrator has already set the session to in_flight and incremented attempt_count; if the fixture file is malformed JSON, json.loads raises and the CLI reports a user-input error, but the database is left with a partially progressed session and consumed attempt. This creates inconsistent state from recoverable operator input errors and makes reruns non-idempotent at the workflow level.

Useful? React with 👍 / 👎.

Addresses 3 of the 5 Codex findings on 5e15f5b. Findings 3 (payload
value-typing) and 5 (transport fixture-validation timing) are deferred
as deliberate Slice-2 scope.

- store.connect: harden the state directory only when this call creates
  it. The previous `!= "."` guard still chmod-ed any pre-existing parent
  (e.g. a shared dir an absolute db_path points into); now a pre-existing
  directory is left untouched and only a freshly-created one gets 0o700.
- store.try_record_idempotency_key: only a PRIMARY KEY / UNIQUE conflict
  is the FR-019 duplicate no-op. Other sqlite3.IntegrityError causes
  (foreign-key, CHECK, NOT NULL) now re-raise instead of being silently
  swallowed as duplicates.
- get-last-worktree.sh: replace the GNU-only `find -printf` newest-
  worktree discovery with a portable stat-based scan (works on BSD/macOS).

Adds tests for the created-vs-pre-existing state dir and the
non-duplicate IntegrityError re-raise. 198 tests passing, 97.40%
coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 21, 2026 19:10
@brettheap
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 139 out of 148 changed files in this pull request and generated 4 comments.

Comment thread src/opencloser/transport/mock.py Outdated
Comment on lines +64 to +86
for raw_event in data.get("events", []):
event_type_str = raw_event["type"]
try:
event_type = EventType(event_type_str)
except ValueError:
# Edge Case "Mock transport emits an unknown event type": the system
# MUST log it, MUST NOT mutate state, MUST NOT crash. The transport
# skips the event (the orchestrator never sees it), so the transport
# itself is responsible for the log.
logger.warning(
"transport fixture %s: unknown event type %r (event_id=%s) — skipping",
fixture_path.name,
event_type_str,
raw_event.get("event_id"),
)
continue
yield MockCallEvent(
session_id=mock_provider_call_id, # orchestrator rewrites to session_id on insert
event_id=raw_event["event_id"],
event_type=event_type,
received_at=raw_event["timestamp"],
payload=raw_event.get("payload", {}),
)
Comment thread tests/test_imports.py Outdated
Comment on lines +85 to +90
def _python_files(root: Path) -> Iterable[Path]:
for p in root.rglob("*.py"):
if p.name == "__init__.py":
# Allow flexible re-exports in package roots; they typically just import for typing.
continue
yield p
Comment thread tests/test_imports.py
Comment on lines +136 to +138
if group not in _ALLOWED:
continue
allowed = _ALLOWED[group]
Comment thread tests/test_imports.py Outdated
Comment on lines +146 to +150
for imp in imported:
if not imp.startswith("opencloser."):
continue # stdlib + third-party are unrestricted
if not any(imp == ok or imp.startswith(ok + ".") for ok in allowed):
violations.append(f"{path.relative_to(_SRC)}: imports {imp} (group={group})")
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92f3b96384

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/opencloser/models.py
Comment on lines +231 to +233
allowed = _ALLOWED_PAYLOAD_KEYS.get(self.event_type, frozenset())
filtered = {k: v for k, v in self.payload.items() if k in allowed}
if len(filtered) != len(self.payload):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate mock-event payload values per event type

The validator only strips unknown payload keys, but it never validates value types/enums for known keys. As a result, a fixture like {"type":"failed","payload":{"failure_reason":"bogus"}} or {"type":"voicemail","payload":{"voicemail_length_seconds":"abc"}} is accepted and persisted, producing artifacts that violate the declared transport contract and can break downstream consumers expecting the Q15 schema.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/persona/extraction.py Outdated
Comment on lines +325 to +328
confirmations = ("yes", "that's right", "that is right", "correct", "yep", "yeah")
return any(
turn.role == "contact" and _contains_word(turn.text.lower(), confirmations)
for turn in turns[read_back_idx + 1 :]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Tie email verification to an explicit email confirmation

Email verification is currently satisfied by any later contact utterance containing tokens like "yes"/"correct", even if that response is to an unrelated question. In conversations where the persona reads back an email and then asks something else, this can incorrectly upgrade an unverified email to captured_email, which then propagates into normalized results and callback task payloads.

Useful? React with 👍 / 👎.

raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
data = json.loads(fixture_path.read_text(encoding="utf-8"))
for raw_event in data.get("events", []):
event_type_str = raw_event["type"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Raise ValueError for malformed transport event objects

The transport parser indexes required keys directly (type, event_id, timestamp), so a malformed fixture entry raises KeyError mid-stream. run-one only converts ValueError/OSError into clean operator errors, so this currently escapes as an unhandled traceback instead of returning the intended CLI error path for bad fixture input.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/cli.py Outdated

def _load_conversation_fixture(path: Path) -> ConversationFixture:
raw = json.loads(path.read_text(encoding="utf-8"))
turns = [ConversationTurn(role=t["role"], text=t["text"]) for t in raw.get("turns", [])]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard conversation turn parsing against missing keys

Conversation turns are parsed with t["role"] and t["text"] without validation, so a malformed fixture turn triggers KeyError. This is not caught by the command’s current (ValueError, OSError) handler, causing run-one to crash with a traceback instead of emitting a controlled user-facing input error.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/transport/mock.py Outdated
if fixture_path is None:
raise ValueError(f"No fixture pending for {mock_provider_call_id!r}")
data = json.loads(fixture_path.read_text(encoding="utf-8"))
for raw_event in data.get("events", []):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject transport fixtures missing the events array

Using data.get("events", []) silently treats a malformed fixture with no events key as an empty stream, which drives the session to a synthesized failed disposition instead of surfacing invalid operator input. That can produce misleading CRM/task artifacts from bad fixtures rather than failing fast with a user-correctable error.

Useful? React with 👍 / 👎.

Resolves all 9 findings from the Codex and Copilot reviews of 92f3b96.

Transport / CLI robustness — malformed operator input fails cleanly:
- FixtureDrivenTransport.event_stream raises ValueError for a fixture
  with no `events` array or an event missing type/event_id/timestamp,
  instead of a bare KeyError escaping as a traceback.
- cli._load_conversation_fixture validates the fixture is a JSON object
  and each turn carries role/text, raising ValueError (caught by run-one
  as a clean operator error + exit 2).

Correctness:
- MockCallEvent now type-checks payload values: voicemail_length_seconds
  must be int, failure_reason must be a FailureReason member, window_hint
  a string (None allowed as "not provided").
- Email verification is tied to the contact's DIRECT response to the
  read-back — a later "yes" to an unrelated question no longer verifies.

SC-009 dependency-lint hardening (test_imports.py):
- `__init__.py` files are now parsed — a forbidden import can't hide in a
  package root.
- An unknown boundary group fails the lint instead of being skipped.
- `from opencloser import X` is expanded to `opencloser.X`; bare
  `import opencloser` is flagged — neither can bypass the gate.

Adds 5 tests. 203 passing, 97.37% coverage, ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brettheap brettheap merged commit 2030e83 into main May 22, 2026
1 check passed
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b63a8f2433

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +67 to +71
if not isinstance(data, dict) or "events" not in data:
raise ValueError(
f"transport fixture {fixture_path.name!r} is not a JSON object with an 'events' array"
)
for raw_event in data["events"]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate events is a list before iterating

event_stream() only checks that the fixture object has an events key, then iterates data["events"] directly. If a malformed fixture sets "events": null (or another non-list type), this raises TypeError instead of the intended user-input ValueError, and run-one does not catch TypeError, so the CLI can crash with a traceback rather than returning a clean operator error.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/cli.py
if not isinstance(raw, dict):
raise ValueError(f"conversation fixture {path.name!r} is not a JSON object")
turns: list[ConversationTurn] = []
for t in raw.get("turns", []):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce turns array type in conversation fixtures

_load_conversation_fixture() iterates raw.get("turns", []) without validating that turns is actually an array. A malformed fixture like {"turns": null} raises TypeError at iteration time, which bypasses the command’s (ValueError, OSError) handling and surfaces as an unhandled traceback instead of a user-facing input error.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/cli.py
if not isinstance(raw, dict):
raise ValueError(f"conversation fixture {path.name!r} is not a JSON object")
turns: list[ConversationTurn] = []
for t in raw.get("turns", []):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject fixtures that omit the turns array

When turns is missing, _load_conversation_fixture() falls back to [] via raw.get("turns", []), so malformed input is treated as an empty conversation instead of invalid input. In run-one, that can silently drive a synthetic needs_human_review outcome and artifact writes from a bad fixture, which is misleading for operators and harder to diagnose than a direct validation error.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/cli.py
raise ValueError(
f"conversation fixture {path.name!r}: every turn needs 'role' and 'text'"
)
turns.append(ConversationTurn(role=t["role"], text=t["text"]))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate conversation turn roles against allowed enum

_load_conversation_fixture() only checks that each turn has role/text keys, but it never enforces role to be one of "persona" or "contact" (the contract shape). A typo like "Contact" or an unknown role is accepted and then silently ignored by downstream extraction/classification filters, producing incorrect dispositions and artifacts instead of a clear operator-facing fixture error.

Useful? React with 👍 / 👎.

Comment thread src/opencloser/cli.py
raise ValueError(
f"conversation fixture {path.name!r}: every turn needs 'role' and 'text'"
)
turns.append(ConversationTurn(role=t["role"], text=t["text"]))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Require turn text to be a string at fixture load

_load_conversation_fixture() does not validate the type of text, so non-string values (e.g., null or numbers) pass through into ConversationTurn. Later persona/extraction paths call string methods like .strip()/.lower() on text; with non-strings this can raise uncaught runtime exceptions and crash run-one instead of returning a controlled input-validation error.

Useful? React with 👍 / 👎.

brettheap pushed a commit that referenced this pull request May 22, 2026
Reconcile divergent main after PR #1 (Slice 1 — Mock Call, Mock CRM) was
squash-merged on origin. Conflict resolution:
- Kept local: constitution.md (formal v1.0.0), AGENTS/CLAUDE/GEMINI/KIMI/
  QWEN.md, .github/copilot-instructions.md (newer scaffold content).
- Kept origin: create-new-feature.sh, get-last-worktree.sh,
  select-worktree.sh (carry PR review fixes); executable bit re-applied.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brettheap brettheap deleted the 001-mock-call-mock-crm branch May 22, 2026 01:29
brettheap pushed a commit that referenced this pull request May 23, 2026
1. Owner bind respects entity-set (Copilot #1): _resolve_task_owner now
   returns (owner_id, entity_set) and _owner_bind targets /teams(<id>) when
   the resolved owner verified as a team. A team override would otherwise
   POST under /systemusers and 400 against real Dataverse.

2. Active-enabled gate + default validation (Copilot #2): systemuser lookup
   now filters by isdisabled=false (teams have no such column). The
   configured default owner is verified too; if neither override nor default
   resolves to an active enabled user/team, Task emission is blocked with an
   operator-visible task_owner_default_unverifiable warning rather than
   writing an unverified id (FR-025 — never write an unverified owner).

3. Simplified --write flag (Copilot #3): dropped the misleading --dry-run
   alias on run-crm. The CLI now declares only --write and the help/error
   message explicitly point to US2 / T024-T026 for the dry-run wiring.

Tests:
- Contract test seeds the default owners as active enabled systemusers and
  adds: disabled-systemuser override falls back, team override binds to
  /teams, and default-owner-unverifiable blocks emission.
- Integration test seeds both default owners as active enabled systemusers
  so the US1 happy/alternate/blocked paths verify under the stricter gate.

271 / 273 pass (28 in this PR's new suite; the 2 remaining failures are the
pre-existing constitution-sync regressions inherited from fab3ebd).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
brettheap pushed a commit that referenced this pull request May 23, 2026
Copilot #1 + #2 — vacuous integration tests
  The original tests called validate_fixture / place_call directly and asserted
  on DB state, but neither touches the state store, so the "no session row /
  no attempt" assertions were tautologically true. To make them meaningful,
  drive process_one_queue_item with a malformed fixture and prove the
  orchestrator writes nothing — which required moving transport.place_call
  before session-row creation (the FR-014 allowed exception for fixture
  pre-validation).

  - src/opencloser/core/orchestrator.py: place_call now runs after eligibility
    + the transport_fixture_id None-check but BEFORE session insert. The
    session is inserted with mock_provider_call_id already set (was: None,
    backfilled later). _run_allowed_session receives the pre-placed
    mock_call_id and skips its own place_call; transport_fixture_id is no
    longer threaded into the helper.
  - tests/integration/test_us5_malformed_fixture.py: rewritten to drive
    process_one_queue_item end-to-end with each FR-020 rejection class.
    Asserts zero sessions, zero eligibility_decisions, zero
    mock_call_events, zero idempotency_keys, attempt_count == 0,
    callable_status unchanged, last_decision_at null. Also adds a CLI-handler
    smoke test asserting MalformedFixtureError surfaces as a ValueError
    through the full orchestrator stack.

Copilot #3 — TOCTOU between place_call and event_stream
  Before: place_call validated the file; event_stream re-read it later. A
  fixture mutating on disk between the two calls could re-introduce a
  mid-stream failure.

  - src/opencloser/transport/mock.py: validate_fixture now returns the
    validated events list. place_call stashes (fixture_name, events) in
    _pending; event_stream consumes the in-memory snapshot instead of
    re-reading the file. The CallTransport public surface (place_call's
    signature and event_stream's externally observable behavior) is
    unchanged; only the internal implementation moves to an in-memory
    snapshot.
  - tests/unit/test_transport_fixture_validation.py: two new tests prove
    the snapshot behavior — event_stream yields the captured events even
    if the fixture is overwritten with garbage or deleted between
    place_call and event_stream. Updated the two accept tests to assert
    on the new return value.

Sourcery #1 — error-type contract test
  - tests/unit/test_transport_fixture_validation.py: new
    test_malformed_fixture_error_is_a_value_error_subclass locks the
    ValueError-subclass contract that the Slice 1 CLI's (ValueError,
    OSError) handler depends on.

Verification: uv run pytest → 270 passed, 2 pre-existing constitution_sync
failures unrelated to this work. Ruff lint + format clean.
brettheap added a commit that referenced this pull request May 23, 2026
* [Spec Kit] Slice 2 implement — Phase 7 (T035-T036): US5 fixture pre-validation

Adds FR-019/FR-020 transport-fixture structural pre-validation, resolving
GitHub issue #2: a malformed mock-transport fixture is now rejected before
any session row, attempt increment, or Dataverse queue change.

T035 — src/opencloser/transport/mock.py (the FR-014 allowed exception):
  - New MalformedFixtureError (ValueError subclass — preserves the Slice 1
    CLI's clean "error:" + exit-2 contract for bad operator input).
  - New module-level validate_fixture(fixture_path): rejects missing file,
    invalid JSON, missing/non-list 'events', non-dict events, and events
    missing type/event_id/timestamp. Structurally valid but semantically
    inconsistent fixtures pass (spec §Edge Cases).
  - place_call calls validate_fixture before allocating a
    mock_provider_call_id, so the orchestrator cannot record a call id
    or transition the session to in_flight for a malformed fixture.

T036 — tests:
  - tests/unit/test_transport_fixture_validation.py: validate_fixture and
    place_call rejection cases, plus accept-cases for empty events and
    semantically inconsistent (out-of-order) event sequences.
  - tests/integration/test_us5_malformed_fixture.py: SC-006 — for each
    FR-020 rejection class (invalid JSON / no events / missing identity /
    missing file), assert no session row, no mock_call_event row, no
    attempt-count idempotency key, queue_item.attempt_count unchanged
    through both the runner-style validate_fixture gate and the
    transport's place_call gate.
  - tests/unit/test_transport_mock.py: updated three pre-existing tests
    whose semantics shifted from event_stream-time / FileNotFoundError to
    place_call-time / MalformedFixtureError.

Verification: full pytest suite — 270 passed, 2 pre-existing
constitution_sync failures unrelated to this phase.

* Fix PR #5 review findings (Copilot x3 + Sourcery x1)

Copilot #1 + #2 — vacuous integration tests
  The original tests called validate_fixture / place_call directly and asserted
  on DB state, but neither touches the state store, so the "no session row /
  no attempt" assertions were tautologically true. To make them meaningful,
  drive process_one_queue_item with a malformed fixture and prove the
  orchestrator writes nothing — which required moving transport.place_call
  before session-row creation (the FR-014 allowed exception for fixture
  pre-validation).

  - src/opencloser/core/orchestrator.py: place_call now runs after eligibility
    + the transport_fixture_id None-check but BEFORE session insert. The
    session is inserted with mock_provider_call_id already set (was: None,
    backfilled later). _run_allowed_session receives the pre-placed
    mock_call_id and skips its own place_call; transport_fixture_id is no
    longer threaded into the helper.
  - tests/integration/test_us5_malformed_fixture.py: rewritten to drive
    process_one_queue_item end-to-end with each FR-020 rejection class.
    Asserts zero sessions, zero eligibility_decisions, zero
    mock_call_events, zero idempotency_keys, attempt_count == 0,
    callable_status unchanged, last_decision_at null. Also adds a CLI-handler
    smoke test asserting MalformedFixtureError surfaces as a ValueError
    through the full orchestrator stack.

Copilot #3 — TOCTOU between place_call and event_stream
  Before: place_call validated the file; event_stream re-read it later. A
  fixture mutating on disk between the two calls could re-introduce a
  mid-stream failure.

  - src/opencloser/transport/mock.py: validate_fixture now returns the
    validated events list. place_call stashes (fixture_name, events) in
    _pending; event_stream consumes the in-memory snapshot instead of
    re-reading the file. The CallTransport public surface (place_call's
    signature and event_stream's externally observable behavior) is
    unchanged; only the internal implementation moves to an in-memory
    snapshot.
  - tests/unit/test_transport_fixture_validation.py: two new tests prove
    the snapshot behavior — event_stream yields the captured events even
    if the fixture is overwritten with garbage or deleted between
    place_call and event_stream. Updated the two accept tests to assert
    on the new return value.

Sourcery #1 — error-type contract test
  - tests/unit/test_transport_fixture_validation.py: new
    test_malformed_fixture_error_is_a_value_error_subclass locks the
    ValueError-subclass contract that the Slice 1 CLI's (ValueError,
    OSError) handler depends on.

Verification: uv run pytest → 270 passed, 2 pre-existing constitution_sync
failures unrelated to this work. Ruff lint + format clean.

* Address PR #5 review round 2 (Codex P1+P2 + Copilot)

Codex P1 — restore "session row before call attempt" ordering
  The previous commit moved transport.place_call before session-row creation.
  That works for the mock transport (place_call has no real side effect) but
  weakens the architecture for any future real transport: a DB failure after
  place_call but before insert_session would leave a placed call with no
  audit row and no idempotency context.

  Revert that reorder and add a side-effect-free hook on the transport that
  the orchestrator can call *before* DB writes without dialing anything:

  - src/opencloser/transport/base.py: extend the CallTransport protocol with
    pre_validate_fixture(fixture_id) — explicitly side-effect-free, raises
    MalformedFixtureError. FR-014's allowed exception covers adding this hook
    for FR-019/FR-020.
  - src/opencloser/transport/mock.py: implement pre_validate_fixture on
    FixtureDrivenTransport as `validate_fixture(self._resolve_fixture_path(...))`.
    No call-id allocation, no _pending mutation.
  - src/opencloser/core/orchestrator.py: process_one_queue_item now calls
    transport.pre_validate_fixture(transport_fixture_id) before session
    creation; place_call moves back to _run_allowed_session (its original
    location after session insert). The internal place_call still re-runs
    validate_fixture as defense in depth (idempotent with the pre-validate
    hook).
  - tests/unit/test_transport_fixture_validation.py: new tests for
    pre_validate_fixture — happy path, all FR-020 rejection classes, no
    call-id allocation, no _pending mutation, and the path-traversal guard.

Codex P2 + Copilot — normalize all read/decode failures to MalformedFixtureError
  validate_fixture previously only caught JSONDecodeError. Non-UTF-8 bytes
  (UnicodeDecodeError), directories (IsADirectoryError), permission errors,
  and broken symlinks would bubble through as different exception types,
  bypassing the FR-020 single-class rejection contract.

  - src/opencloser/transport/mock.py: switch the existence check from
    path.exists() to path.is_file() (covers directories, broken symlinks,
    special files). Catch (OSError, UnicodeDecodeError) around read_text and
    re-raise as MalformedFixtureError (chained with `from exc` for diagnostics).
  - tests/unit/test_transport_fixture_validation.py: three new tests covering
    a directory at the fixture path, non-UTF-8 bytes, and an unreadable
    chmod-0 file (skipped when the test process can still read it, e.g. root).

Verification: uv run pytest → 280 passed, 2 pre-existing constitution_sync
failures unrelated to this work. Ruff lint + format clean.

Stale bot comments (TOCTOU on mock.py / ValueError-subclass test) were already
addressed in c5dee38; the bots re-posted them on the new commit anyway.

* Address PR #5 review round 3 (Copilot)

Copilot (suppressed, low-confidence — but real edge case): non-string event 'type'
  validate_fixture didn't enforce that 'type' is a string. A fixture with
  {"type": ["list"], ...} would pass pre-validation, then EventType(["list"])
  in event_stream would raise TypeError (not ValueError) mid-stream, escape
  the unknown-type handler, escape the orchestrator, and bypass the CLI's
  (ValueError, OSError) handler — defeating the FR-020 single-class rejection
  contract.

  - src/opencloser/transport/mock.py: validate_fixture now requires
    isinstance(raw_event["type"], str) and raises MalformedFixtureError
    otherwise (with the offending event_id + actual type-name in the message
    for operator triage).
  - tests/unit/test_transport_fixture_validation.py: new parametrized test
    test_validate_fixture_rejects_non_string_event_type covers list, dict,
    int, null, and bool 'type' values.

Copilot: stale-docstring sync
  - tests/integration/test_us5_malformed_fixture.py: the inline docstring
    on test_us5_orchestrator_rejects_malformed_fixture_with_no_state_mutations
    still credited "the orchestrator's pre-session place_call gate"; the
    actual gate is now `transport.pre_validate_fixture(...)`. Updated to
    match — including the regression-detection language so anyone tracing
    a future failure here lands on the right hook.

Copilot: tasks.md / FR-014 wording
  - specs/002-mock-call-real-crm/tasks.md: T035 used to claim mock.py was
    "the only permitted change to the transport module (FR-014)". This PR
    also adds CallTransport.pre_validate_fixture in transport/base.py and
    wires it from core/orchestrator.py. Rewrote the task description to
    name all three files explicitly and note they jointly compose the
    FR-014 fixture-pre-validation exception (no other Slice-2 behavior
    added to those modules).

Stale repeats (no action):
  - Copilot mock.py:88 (OSError on read_text): already fixed in 7f447e3
    (is_file() + (OSError, UnicodeDecodeError) catch around read_text).
    Bot is pointing at the inner json.loads try and re-posting the
    previous round's text.

Verification: uv run pytest → 285 passed, 2 pre-existing constitution_sync
failures unrelated. Ruff lint + format clean.

* Address PR #5 review round 4 (Copilot)

Copilot (190bead, mock.py:112): validate event_id + timestamp shape too
  validate_fixture previously only checked that 'type' was a string. A
  fixture with a non-string event_id or a timestamp that doesn't match the
  canonical UtcMs pattern would pass pre-validation, then trigger
  pydantic.ValidationError when event_stream constructs MockCallEvent —
  *after* the orchestrator has already created the session row and
  incremented the attempt count. That reintroduces the partial-state
  problem FR-019/FR-020 are meant to prevent.

  - src/opencloser/transport/mock.py: validate_fixture now requires
    isinstance(raw_event["event_id"], str) and validates raw_event["timestamp"]
    against the canonical UtcMs schema via pydantic.TypeAdapter(UtcMs).
    Reusing the TypeAdapter keeps a single source of truth for the
    YYYY-MM-DDTHH:MM:SS.mmmZ pattern instead of duplicating the regex.
  - tests/unit/test_transport_fixture_validation.py: two new parametrized
    test classes — 5 non-string event_id shapes and 7 invalid-timestamp
    shapes (missing ms, space-separator, offset-not-Z, gibberish, empty,
    non-string, null) — plus an acceptance test for the canonical UtcMs
    format.

Verification: uv run pytest → 298 passed, 2 pre-existing constitution_sync
failures unrelated. Ruff lint + format clean.

---------

Co-authored-by: brettheap <brett.heap@users.noreply.github.com>
brettheap added a commit that referenced this pull request May 23, 2026
* [Spec Kit] Slice 2 implement — Phase 3 (T017-T023): US1 MVP

Stack on PR #3. Wires the end-to-end write-enabled loop against the
in-process Dataverse fake — the constitution's CRM-first principle made
operational.

- T018: DataverseWriteBackAdapter — emit_phone_call_activity / emit_queue_status_update
  / emit_task / build_writeback against the Dataverse Web API, with FR-024
  idempotency pre-query, FR-003 approved-update-field gating, FR-025
  task-owner resolution (default + verified override + fallback warning), and
  crm_correlations / writeback_progress ledger writes.
- T019: slice2/runner.py write-enabled coordinator — readiness gate, mapping
  approval check, verify() metadata gate, DataverseQueueLoader, stages the
  queue row locally, calls the unchanged process_one_queue_item (FR-014),
  stamps writeback_progress.run_status, maps to exit_status per cli-slice2.md.
- T020: discover-crm CLI command — refreshes the mapping artifact via discover()
  and rewrites config/dataverse_mapping.json with approved=false.
- T021: run-crm CLI command — explicit --write flag; reports exit_status /
  session_id / final_disposition / warnings.
- T022: FR-034 non-E.164 phone warning — recorded into CrmRunReport and
  appended to queue.last_error via the adapter, never changing exit status.
- T017: contract test — adapter satisfies the unchanged WriteBackAdapter
  Protocol and produces the right Dataverse interactions for every one of the
  11 final dispositions (SC-011).
- T023: US1 integration tests — happy path, interested_email_captured,
  needs_human_review, do_not_call, blocked, non-E.164 warning, and empty-queue
  no-op (SC-001 / SC-003 / SC-004 / SC-008).

ruff + pytest pass (270 tests; pre-existing failures in test_constitution_sync.py
inherited from fab3ebd are unrelated to this PR).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 review findings (Copilot)

1. Owner bind respects entity-set (Copilot #1): _resolve_task_owner now
   returns (owner_id, entity_set) and _owner_bind targets /teams(<id>) when
   the resolved owner verified as a team. A team override would otherwise
   POST under /systemusers and 400 against real Dataverse.

2. Active-enabled gate + default validation (Copilot #2): systemuser lookup
   now filters by isdisabled=false (teams have no such column). The
   configured default owner is verified too; if neither override nor default
   resolves to an active enabled user/team, Task emission is blocked with an
   operator-visible task_owner_default_unverifiable warning rather than
   writing an unverified id (FR-025 — never write an unverified owner).

3. Simplified --write flag (Copilot #3): dropped the misleading --dry-run
   alias on run-crm. The CLI now declares only --write and the help/error
   message explicitly point to US2 / T024-T026 for the dry-run wiring.

Tests:
- Contract test seeds the default owners as active enabled systemusers and
  adds: disabled-systemuser override falls back, team override binds to
  /teams, and default-owner-unverifiable blocks emission.
- Integration test seeds both default owners as active enabled systemusers
  so the US1 happy/alternate/blocked paths verify under the stricter gate.

271 / 273 pass (28 in this PR's new suite; the 2 remaining failures are the
pre-existing constitution-sync regressions inherited from fab3ebd).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 review findings (Sourcery + Codex + Copilot round 2)

P1 — Use Dataverse EntitySetName for record API paths (Codex). The
adapter now hits /phonecalls, /tasks, /medx_callqueueitems, /systemusers,
/teams via a `_entity_set_name` helper (logical_name + "s" with override
hooks for irregular plurals). Previously hitting /phonecall etc. would
404 against real Dataverse even when metadata verification succeeded.

P1 — Drop scheduledend free-form-text write (Codex). preferred_callback_window
is intentionally free-form ("Thursday afternoon"); writing to Dataverse
`scheduledend` (Edm.DateTimeOffset) 400s on non-ISO input. The window
now surfaces in the task description only.

P1 — Normalize verify() failures into structured CrmRunReport (Codex).
`run_one_crm_item` catches `DataverseError`/`MetadataError` from the
readiness `verify()` call and returns exit_status=blocked instead of
bubbling an unhandled exception to the CLI.

P1 — Validate non-empty OData-EntityId after POST (Copilot). An empty
header would otherwise lead to a CONFIRMED crm_correlations row with no
record id, masking the failure on every later resume/audit. Now raises
DataverseWriteBackError.

P2 — Use mapped primary_id instead of f"{entity}id" (Sourcery + Codex).
The adapter exposes a `_primary_id` helper that prefers the mapping's
explicit `entities[...].primary_id` and falls back to the singular-form
derivation only when the mapping omits it — important for activity tables
whose primary key is `activityid`, not `<logical>id`.

P2 — Catch queue-loader failures + Dataverse adapter exceptions in the
run coordinator (Codex). `run-crm` now produces operator-visible
exit_status=failed reports for QueueLoadError/MappingError/DataverseError
/DataverseWriteBackError/missing fixture files instead of unhandled
exceptions.

P2 — Refresh ALL mutable queue_item fields when staging from Dataverse
(Codex). _stage_queue_item now refreshes phone_number/timezone/
attempt_count in addition to callable_status/dnc_flag, so eligibility
decisions are made against current Dataverse state rather than stale
locally-cached data. State DAO `update_queue_item_status` extended
additively with the new optional kwargs.

P2 — discover-crm: load scaffold from configured artifact, not --out
(Codex). `--out` is now strictly an output override; the scaffold is
always read from slice2.toml's mapping_artifact, so `discover-crm
--out new-path.json` works on the first run.

P2 — `--next-ready` falls back to the configured [run].campaign (Codex).
A missing `--campaign` no longer drops the campaign scope; the CLI
loads slice2.toml first and uses `slice2_config.run.campaign` as the
default.

P2 — Active-enabled check (Codex + Copilot reflected). systemuser
verification now filters by `isdisabled eq false`; a disabled user is
treated as unverifiable. The default-owner verification stays — neither
override nor default verifying blocks Task emission.

Cleanup — Remove unused ReadinessError (Sourcery); update the runner
doc comment that incorrectly described non-E.164 warnings as being
written to transition_reason (they go to queue.last_error via the
adapter's _compose_last_error).

Test fixture changes — The in-process Dataverse fake now auto-aliases
each registered entity with its `<name>s` form so the foundation
queue_loader (logical names) and the adapter (entity-set names) hit the
same data through one fake instance. `_handle_create`/`_handle_patch`
derive the singular-form primary id so both URL forms resolve. Contract
+ integration tests now query fake.created/fake.patched via the
entity-set names the adapter actually uses.

271 / 273 pass — only the pre-existing constitution-sync regressions
inherited from fab3ebd remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-3 review findings (Codex)

P1 — Wrap Pydantic ValidationError as MappingError in `load_mapping`.
Both the runner and `discover-crm` now see a single error type for
"mapping artifact is wrong"; previously a schema-invalid JSON file
escaped as an unhandled traceback.

P1 — Require explicit `entities[...].primary_id`. The previous
fallback derived `<entity_set>-rstrip('s')+"id"`, which silently
produces `phonecallid` where Dataverse activity tables expect
`activityid`. The adapter now raises `MappingError` if the mapping
omits primary_id, pointing the operator at `dataverse_mapping.json`.
The test fixture mapping populates primary_id explicitly for queue
items, phone calls (phonecallid), and tasks (activityid).

P1 — Add explicit `entity_set_name` to DataverseEntityRef. Real
Dataverse exposes `EntityDefinition.EntitySetName` which can differ
from `logical_name + "s"` for irregular plurals; the adapter prefers
the mapping's value and falls back to a curated override map +
`+s` only when the field is unset. The override map now covers the
common irregular plurals (`opportunity`, `businessunit`, etc.).

P1 — Record failed `crm_correlations` rows on POST/PATCH errors.
`emit_phone_call_activity` / `emit_queue_status_update` / `emit_task`
each now persist a `write_status=FAILED` correlation + stamp the
resume ledger with `run_status=blocked` and `last_error` before
re-raising. Audit/recovery flows can now distinguish "not attempted"
from "attempted and failed" per CRM record kind.

P2 — Null-preserving queue refresh. `_stage_queue_item` now uses a
new `store.replace_queue_item_mutable_fields` DAO that overwrites
every mutable column from the live Dataverse snapshot — null clears
included. The old `update_queue_item_status` retains its
"`None` means leave it alone" partial-update semantics for callers
that want that behavior; the docstring documents the split.

Test fixture changes: the in-process Dataverse fake now stamps
`activityid` as an alias on activity-table creates (phonecall, task,
email, appointment) so a mapping that addresses activities by the
canonical `activityid` primary key resolves correctly. Activity test
attribute sets include `activityid` so $select / $filter pass through
the fake's strict known-field check.

271 / 273 pass; pre-existing constitution-sync failures inherited
from fab3ebd remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-4 review findings (Copilot)

`--next-ready` now requires a non-empty campaign — the CLI checks the
effective campaign (`--campaign` or `slice2.toml [run].campaign`) and
exits with an actionable error when both are blank. This stops the
loader from silently picking items outside the operator's intended
campaign while the loader-side filter remains a queue-loader (T014)
follow-up tracked in US3.

Docstrings on `_derive_entity_set` and `DataverseEntityRef` now match
the actual `discover-crm` behavior — it refreshes `_meta` only and does
NOT auto-populate `entity_set_name` from `EntityDefinition.EntitySetName`.
Operators set the field by hand on PR review; auto-population is
tracked as a follow-on enhancement.

New `replace_queue_item_mutable_fields` DAO is now covered by three
unit tests: full overwrite, null-clearing of nullable columns, and
FK-child preservation (sessions remain pinned to the refreshed queue
row). These pin the new "always overwrite, null included" semantics
that the runner depends on for accurate eligibility decisions.

274 / 276 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-5 review findings (Codex)

P1 — Apply the campaign filter in queue_loader's NextReady path
(contracts/dataverse-queue-loader.md). When the mapping carries a
`queue.campaign` field the loader scopes the query to the selector's
campaign. When the mapping omits the field — the current Slice 2
fixture — the loader stays campaign-agnostic and the CLI gate
(`--next-ready` requires a non-empty campaign) is the user-facing
signal. Two unit tests pin the new behavior: campaign scoping when
mapped, campaign-agnostic when unmapped.

P2 — Catch MappingError from adapter emit paths in the runner's
write-back try/except. The adapter's `_primary_id` raises
MappingError when the mapping omits `entities[...].primary_id`; the
runner now normalizes that into an operator-visible exit_status=failed
report alongside DataverseError/DataverseWriteBackError.

276 / 278 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-6 review findings (Copilot)

P1/security — Escape single quotes in OData `$filter` string literals.
A new `errors.odata_string_literal(value)` helper wraps a value in
single quotes with embedded `'` doubled per OData v4 spec (section
5.1.1.6.1). Used by:
  - queue_loader.py — `campaign` filter on the NextReady path
  - adapter.py — idempotency pre-query on every emit_*

Without this escape, a campaign like `O'Brien` would terminate the
literal and invalidate the whole filter clause; the helper also
shuts the door on OData-filter injection for user-supplied values.
Three unit tests pin the escape semantics: basic wrap, embedded
single-quote doubling, empty string.

Docstrings — Two more stale "discover-crm reads EntitySetName" /
"queue-loader is campaign-agnostic" remnants corrected. The
`_entity_set` docstring now matches reality (`entity_set_name` set by
hand on PR review today), and the cli.py comment notes the loader
DOES filter by campaign when the mapping carries `queue.campaign`.

279 / 281 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-7 review findings (Codex)

P1 — queue_loader uses entity-set names for record queries. The
`DataverseQueueLoader.load` path was sending OData record GETs to the
table's logical name (e.g. `/medx_callqueueitem`); real Dataverse
keys record endpoints by the entity-set name (`/medx_callqueueitems`)
and would 404 the request before the row could be loaded.

P1 — Facility-name lookup now derives the entity-set form for the
mapping's `lookup_target` too. `lookup_target` is a Dataverse logical-
name reference (e.g. `account`), so a record query needs the
entity-set form (`accounts`); the previous code 404'd against real
environments on every queue intake that needed facility hydration.

Refactor — Centralized entity-set resolution in `mapping.py` as
`resolve_entity_set(mapping, entity_key)` (prefers
`entity_set_name`, falls back to `derive_entity_set(logical)`).
The override map for known-irregular plurals
(`activitypointers`, `opportunities`, `businessunits`,
`transactioncurrencies`) moved with it. Adapter + queue_loader now
share one source of truth; the duplicated `_derive_entity_set` /
`_ENTITY_SET_OVERRIDES` in `adapter.py` are gone.

279 / 281 pass; pre-existing constitution-sync failures from fab3ebd
remain. The in-process Dataverse fake's `<name>` ↔ `<name>s` alias
keeps the new entity-set queries resolving against the same records
the foundation tests register by logical name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-8 review findings (Copilot)

Wrap the WHOLE write-back path in the failure-recording try/except for
each emit_* method, not just the POST/PATCH itself. The
`_resolve_task_owner` / `_lookup_owner_override` / `_owner_entity_set`
calls can raise DataverseError before `_idempotent_create` runs; the
queue-status `_fetch_queue_last_session` pre-query can too. Those
failures previously bypassed `_record_failure`, leaving the audit /
resume ledger inconsistent. The exception tuple is now
`(DataverseError, DataverseWriteBackError, MappingError)` — the base
`DataverseError` covers both `Permanent…` and `Transient…` so any
Dataverse access failure in emit_phone_call_activity /
emit_queue_status_update / emit_task records a consistent
`crm_correlations(write_status=failed)` row plus
`writeback_progress(run_status=blocked)`.

Fake — pre-allocate one shared records list per registered entity and
alias both `<name>` and `<name>s` to it. Previously the constructor
only aliased entities that had seed rows in the initial `records`
dict, so a POST against an empty `phonecall`/`task` table would
create a list only under the entity-set key (`phonecalls`/`tasks`),
and a later GET under the singular alias couldn't see the row. The
new logic guarantees the alias on every registered entity, with
seed rows overlaid into the shared list.

279 / 281 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-9 review findings (Copilot)

Validate persona version in the Slice 2 runner. `run_one_crm_item`
defaults to `ALFAppointmentSetterPersona()` but never checked that
`slice1_config.persona.version` matched the persona's actual version;
a misconfigured slice1.toml would silently tag every write-back with
the wrong persona version. The runner now fails fast with
`exit_status=blocked` on mismatch, mirroring the Slice 1 CLI's
`run-one` check.

Validate `--queue-item-id` shape at the CLI. `--queue-item-id` is
interpolated into OData `$filter` clauses and record URLs downstream;
accepting an arbitrary string opened a filter-injection vector for
operator-supplied input and could turn typos into 404s from the queue
loader. The CLI now validates against the Dataverse GUID shape
(8-4-4-4-12 hex with hyphens) and exits 2 with a clear message
otherwise.

Two new test_cli integration tests pin the new behavior:
- `run-crm --queue-item-id <not-a-guid>` exits 2 with the GUID error.
- `run-crm` without `--write` still exits 2 with the dry-run pointer.

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-10 review findings (Copilot + Codex)

P1 (Codex) — emit_task now SURFACES unverifiable-owner as a failed
task write rather than silently returning. The earlier behavior let a
disposition that requires a Task (interested_callback_requested,
needs_human_review, etc.) complete the run as `completed` while the
required Task never wrote, hiding broken human-follow-up handoff.
When `_resolve_task_owner` returns None the adapter now raises
`DataverseWriteBackError`; the outer try/except records a failed task
correlation + stamps `writeback_progress(run_status=blocked)`, and the
runner normalizes that into `exit_status=failed`.

P1 (Codex) — `_owner_entity_set` no longer swallows every
PermanentDataverseError. Only HTTP 404 (the table genuinely absent
in this environment) falls through to the next entity set; 401/403
permission regressions and 400 malformed queries propagate so
emit_task records a failed task correlation rather than silently
degrading to "owner unverifiable → fallback".

P1 (Codex) — queue_loader rejects NextReady when `queue.campaign`
is missing from the mapping. In a multi-campaign environment a
campaign-agnostic query would pick the wrong row; failing fast at
queue-load with a clear `QueueLoadError` is the safer default.
Existing tests updated to seed the queue.campaign mapping where
campaign-scoped selection is expected.

Copilot — `_record_failure` no longer downgrades a prior CONFIRMED
correlation to FAILED, and no longer flips `*_done` from True to
False. A later transient failure on an already-confirmed record (e.g.
a duplicate emit retry) preserves the confirmed state and the
completed `*_done` flag; only when the prior state was unset/not-done
does the failure path mark them.

P2 (Codex) — `_ENTITY_SET_OVERRIDES` had the wrong logical-name key
(`"currency"` instead of `"transactioncurrency"`). With that key,
`derive_entity_set("transactioncurrency")` fell through to the
`+s` rule and produced `transactioncurrencys` — invalid Web API
path. Fixed.

Contract test updated: `test_task_blocked_when_default_owner_unverifiable`
now expects `DataverseWriteBackError` AND verifies the failed
`crm_correlations` row.

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-11 review finding (Codex)

P2 — Catch Pydantic `ValidationError` from queue_loader.load() in the
run coordinator. `DataverseQueueLoader.load(...)` constructs a
`QueueItem` from the Dataverse row, and a schema-corrupted row
(negative `attempt_count`, etc.) surfaces as `pydantic.ValidationError`
from that constructor. The runner's loader try/except only caught
`MappingError`/`QueueLoadError`/`DataverseError`, so a corrupted row
escaped as an unhandled traceback instead of producing the documented
operator-visible `exit_status=failed` report.

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-12 review findings (Copilot + Codex)

P1 (Codex×3) — Persist `_record_failure` writes outside the
orchestrator's rolling-back transaction. The orchestrator wraps every
`crm.emit_*` in `with store.transaction(conn)`; when an emit raises,
the ROLLBACK undoes ANY rows the adapter wrote — including the
crm_correlations(failed) / writeback_progress(blocked) markers the
new failure ledger depends on. The adapter now accepts an optional
`failure_conn_factory` callable and the runner wires it to a peer
SQLite connection on the same WAL db. Each failure write commits
independently of the rolling-back transaction so resume/audit can
distinguish "not attempted" from "attempted and failed" in production.

Copilot — Catch `pydantic.ValidationError` in `discover-crm` and
`run-crm` CLI config loads. `load_config` / `load_slice2_config` raise
ValidationError (not ValueError) on schema-invalid TOML; the previous
handlers crashed with a traceback. Both commands now exit 2 with a
clean error message.

Copilot — Preserve a terminal `COMPLETED` run_status in
`_record_failure`. The same pattern as the CONFIRMED-correlation
preservation: a later transient failure on an already-completed
session must not regress the ledger from COMPLETED back to BLOCKED.

Copilot — Fix stale cli.py comment that claimed the queue-loader
falls back campaign-agnostic when `queue.campaign` is unmapped. The
loader now raises `QueueLoadError` in that case; the CLI gate is
just an earlier-exit convenience for missing operator input.

Copilot — Fix the E.164 regex comment. `^\+[1-9]\d{1,14}$` is
2-15 digits (mandatory non-zero leading digit), not 1-15.

Copilot — Fix step numbering in `run_one_crm_item` after the earlier
refactor renumbered the steps but missed updating the trailing
comment (5 → 6, was incorrectly 7).

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-13 review finding (Codex)

P1 — Replace the peer-connection failure-ledger fix with an in-memory
staging + deferred-write model. SQLite is single-writer, so the round-12
"open a second connection during emit_*" approach would deadlock (or
hit SQLITE_BUSY) because the orchestrator's
`with store.transaction(conn)` block already holds the write lock
while the emit is in flight. The peer connection writes never reached
disk in production environments — masking the original Dataverse
failure with an OperationalError on top.

New shape:
- `_record_failure` (called inside the failing emit) appends to a new
  `self._pending_failures: list[_PendingFailure]` instead of touching
  the DB. The orchestrator's rollback fires without erasing anything.
- The runner's
  `except (DataverseError, DataverseWriteBackError, MappingError)`
  handler now calls `adapter.flush_pending_failures()` AFTER the
  exception bubbles up (the rollback has released the write lock).
  Each staged failure is persisted via the shared connection with the
  same preserve-CONFIRMED / preserve-`*_done` / preserve-COMPLETED
  semantics from round-12.
- The `failure_conn_factory` constructor parameter is gone — no more
  peer connections, no more lock contention.

Contract test `test_task_blocked_when_default_owner_unverifiable`
updated: calls `adapter.flush_pending_failures()` explicitly to
verify the failed `crm_correlations` row that downstream resume/audit
will see (the test bypasses the runner, so it has to drive flush
itself, just as the runner does in production).

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address PR #6 round-14 review finding (Copilot)

Preserve terminal `run_status` in `_mark_progress`. The default of
`RunStatus.IN_PROGRESS` was unconditionally overwriting the column on
every per-emit `*_done` flip, so a follow-up call (idempotent retry,
mid-run progress update) could regress an already-terminal `COMPLETED`
or `BLOCKED` ledger row back to `IN_PROGRESS`. The default is now
`None` ("leave the existing value alone"); the first call for a
brand-new session still gets `IN_PROGRESS` via `_new_progress`, and
explicit terminal transitions (the runner's `finalize_progress` /
`flush_pending_failures`) pass the value they want.

281 / 283 pass; pre-existing constitution-sync failures from fab3ebd
remain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Fix Sonar issues after US1 umbrella merge

---------

Co-authored-by: brettheap <brett.heap@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
brettheap pushed a commit that referenced this pull request May 24, 2026
…closes review HIGH #1/#2/#3)

Closes 3 HIGH findings from the 2026-05-24 swarm code-review pass:

HIGH #1 (security-secrets): tenant_id leaked via OAuth endpoint URL in
auth.py error messages. `_wrap_auth_transport_error` and `_auth_status_error`
interpolated `_TOKEN_ENDPOINT.format(tenant=tenant_id)` into every
Permanent/TransientDataverseError. The string lands in `writeback_progress.last_error`,
`CrmRunReport.message`, stdout, and SQLite rows on any auth failure.

HIGH #2 (security-secrets): env_url host leaked via response.request.url in
errors.py for every non-2xx response. The pre-existing redaction stripped
only the query string; the host (= DATAVERSE_ENV_URL) survived.

HIGH #3 (security-secrets): T047 only tested the success path. It asserted
`exit_status == "completed"` before walking the inspection surface and stubbed
the OAuth flow, so neither HIGH #1 nor HIGH #2 could be caught.

Fixes:
- auth.py: new module-level `_REDACTED_TOKEN_ENDPOINT` constant ("https://login.microsoftonline.com/<redacted-tenant>/oauth2/v2.0/token"). `_wrap_auth_transport_error` + `_auth_status_error` use the constant; they no longer take an `endpoint` parameter. The real endpoint stays internal to `_acquire` for the POST only. Underlying exception is preserved via `from exc` chain (debug visibility) but never serialized via `{exc!r}` — uses `type(exc).__name__` instead.
- errors.py:96-104: strip BOTH host and query from response URL; only path survives. Prefixed with `<env>` placeholder so the redaction is visible at a glance.
- tests/contract/test_no_secrets_in_artifacts.py: split into 5 tests covering: (a) success path with tightened sanity list (task.json, transcript.txt added; SQLite row-count assertions), (b) Dataverse-401 failure path that drives the runner's error-formatter code path, (c-e) unit-level tests pinning the redaction contract on `_wrap_auth_transport_error`, `_auth_status_error`, and `raise_for_dataverse_response` directly so a regression surfaces without needing a working OAuth e2e fake.

Result: 611 pass (+4 new T047 tests) / 2 pre-existing constitution_sync
failures unrelated. ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants