Slice 2 — US4 idempotency + resume coordinator (Phase 6, T031/T032/T033/T034)#9
Conversation
…33/T034)
Lands four of the six US4 tasks. T045 (mid-run CRM-state conflict
detection) and T046 (its integration test) are intentionally DEFERRED to
a follow-up sub-PR — the conflict-detection design (queue snapshot
threading + per-field diffing + adapter integration) is a distinct
~300 LoC scope that benefits from its own focused review pass.
- T031 — verified the FR-024 idempotency pre-query is already implemented
by US1's `DataverseWriteBackAdapter._idempotent_create` (pre-query then
POST) and `_fetch_queue_last_session` (PATCH skip). US4's
test_us4_adapter_emit_phone_call_twice_idempotent confirms the
short-circuit holds.
- T032 — NEW `src/opencloser/slice2/resume.py` module. `resume_session()`
loads the persisted `writeback_progress` row + `writeback.json`
artifact, replays ONLY the missing emit_* operations via a fresh
adapter (whose `_idempotent_create` pre-query provides
belt-and-suspenders against duplicates), and stamps run_status=completed
on success. The Slice 1 orchestrator is intentionally NOT re-invoked
per FR-014 — resume completes only the missing CRM writes. ResumeError
for pre-flight failures (missing progress row, missing writeback.json);
ResumeReport with `operations_replayed` listing what was done.
- T033 — `run-crm --resume <session-id>` CLI flag. When supplied:
requires --write (resume is a write-back continuation), is incompatible
with --queue-item-id / --next-ready / --transport-fixture (resume does
not re-run the orchestrator), and routes to the new `_run_crm_resume`
helper that loads configs + mapping, constructs DataverseClient +
MappingTranslator, and calls resume_session(). Exit codes:
completed/no-resume-needed → 0, failed/preflight → 2.
- T034 — NEW `tests/integration/test_us4_idempotency.py` (6 scenarios):
1. within-session FR-024 pre-query (emit_phone_call_activity twice with
the same session_id payload → exactly one POST in the fake)
2. transient-failure retry reuses correlation (fake.fail_next(1, 503),
then succeeds; one record total)
3. resume after exhausted retry budget (simulate resume_needed; call
resume_session; exactly one record kept; progress flips to completed)
4. resume on a finalized session is a no-op (no Dataverse writes)
5. ResumeError when writeback.json missing (FR-023 pre-flight)
6. ResumeError when session id unknown (no progress row)
Cross-session de-dup is documented as BY DESIGN in the test file: each
run-crm invocation creates a new session with its own idempotency key.
The spec's "no duplicates" guarantee is per-session — enforced by
resume for the resume-needed case and by the queue loader's
callable-status filter for the "don't re-process a finalized item" case.
Regression: 426/428 (added 6 new tests; same 2 pre-existing
constitution-sync failures). Ruff clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reviewer's GuideImplements the US4 resume coordinator and CLI resume flow, wires it into the existing Dataverse adapter idempotency behavior, and adds focused integration tests plus spec updates while deferring conflict-detection work to a later PR. Sequence diagram for the new run_crm --resume flow and resume_session coordinatorsequenceDiagram
actor Operator
participant CliRunCrm as run_crm
participant RunCrmResume as _run_crm_resume
participant ResumeSession as resume_session
participant Store as store
participant Adapter as DataverseWriteBackAdapter
Operator->>CliRunCrm: run_crm --resume <session_id> --write
CliRunCrm->>CliRunCrm: validate resume and transport_fixture options
CliRunCrm->>RunCrmResume: _run_crm_resume(session_id, config_paths)
RunCrmResume->>RunCrmResume: _load_run_crm_configs
RunCrmResume->>RunCrmResume: load_dataverse_secrets
RunCrmResume->>RunCrmResume: load_mapping
RunCrmResume->>ResumeSession: resume_session(session_id, conn, artifact_root, client, translator, task_owners, clock)
ResumeSession->>Store: get_writeback_progress(conn, session_id)
Store-->>ResumeSession: progress
alt [no progress]
ResumeSession-->>RunCrmResume: raise ResumeError
RunCrmResume-->>CliRunCrm: typer.Exit(code=2)
else [progress.run_status is not RESUME_NEEDED]
ResumeSession-->>RunCrmResume: ResumeReport(exit_status no-resume-needed)
RunCrmResume-->>CliRunCrm: typer.Exit(code=0)
else [progress.run_status is RESUME_NEEDED]
ResumeSession->>ResumeSession: load WriteBack from writeback.json
ResumeSession->>Adapter: DataverseWriteBackAdapter(...)
par [per operation]
opt [!progress.phone_call_activity_done]
ResumeSession->>Adapter: emit_phone_call_activity
end
opt [!progress.queue_status_update_done]
ResumeSession->>Adapter: emit_queue_status_update
end
opt [!progress.task_done]
ResumeSession->>Adapter: emit_task
end
end
alt [DataverseError or DataverseWriteBackError or MappingError]
ResumeSession->>Adapter: flush_pending_failures
ResumeSession-->>RunCrmResume: ResumeReport(exit_status failed)
RunCrmResume-->>CliRunCrm: typer.Exit(code=2)
else [all replays succeed]
ResumeSession->>Adapter: finalize_progress(session_id, RunStatus.COMPLETED)
ResumeSession-->>RunCrmResume: ResumeReport(exit_status completed)
RunCrmResume-->>CliRunCrm: typer.Exit(code=0)
end
end
CliRunCrm-->>Operator: print exit_status, session_id, artifact_dir, operations_replayed
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
run-crmresume path duplicates the reporting/exit-code logic already present in_print_crm_reportand the mainrun_crmflow; consider refactoring to share a common helper so that output format and status-to-exit-code mapping stay in sync. - The
ResumeReport.operations_replayedvalues are currently bare strings that are repeated in tests and implementation (e.g."phone_call_activity","queue_status_update","task"); introducing a small enum or typed constants for these would reduce the chance of drift or typos between the coordinator and its callers/tests.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `run-crm` resume path duplicates the reporting/exit-code logic already present in `_print_crm_report` and the main `run_crm` flow; consider refactoring to share a common helper so that output format and status-to-exit-code mapping stay in sync.
- The `ResumeReport.operations_replayed` values are currently bare strings that are repeated in tests and implementation (e.g. `"phone_call_activity"`, `"queue_status_update"`, `"task"`); introducing a small enum or typed constants for these would reduce the chance of drift or typos between the coordinator and its callers/tests.
## Individual Comments
### Comment 1
<location path="src/opencloser/slice2/resume.py" line_range="116-117" />
<code_context>
+ "without persisted payloads (FR-023). Inspect local audit-artifact "
+ "retention (FR-035) and re-run the original `run-crm` instead."
+ )
+ writeback = WriteBack.model_validate_json(
+ writeback_path.read_text(encoding="utf-8")
+ )
+
</code_context>
<issue_to_address>
**suggestion:** Unvalidated writeback.json parsing can surface as an unhandled exception
If `writeback.json` is truncated, corrupted, or schema-incompatible, `model_validate_json` will raise (e.g., `ValidationError`) and currently terminates as an unhandled exception instead of a controlled pre-flight failure. Please catch JSON/validation errors here and wrap them in a `ResumeError` so the CLI emits a clear `error: ...` message and exits with code 2, consistent with the other resume pre-flight checks.
Suggested implementation:
```python
try:
writeback = WriteBack.model_validate_json(
writeback_path.read_text(encoding="utf-8")
)
except (ValidationError, ValueError) as exc:
raise ResumeError(
f"writeback.json under {session_dir!r} is invalid or corrupted; "
"cannot resume without a valid persisted payload. Inspect the local "
"audit-artifact (FR-035) and re-run the original `run-crm` instead."
) from exc
```
To compile and run correctly, ensure that:
1. `ValidationError` is imported in this module, e.g.:
`from pydantic import ValidationError`
If it is already imported elsewhere in the file, no further import changes are needed.
2. `ResumeError` is defined in this module or imported from its defining module, consistent with the other resume pre-flight checks which also raise `ResumeError`.
</issue_to_address>
### Comment 2
<location path="tests/integration/test_us4_idempotency.py" line_range="1" />
<code_context>
+"""US4 — Idempotent CRM write-back across duplicate events and retries.
+
+End-to-end integration tests that exercise the FR-021..FR-024 idempotency
</code_context>
<issue_to_address>
**suggestion (testing):** Add CLI-level tests for `run-crm --resume` to exercise the new entrypoint and error handling
Current tests cover `resume_session` and idempotency but don’t exercise the `run-crm --resume` CLI path (including `_run_crm_resume`). Please add CLI-level tests that:
- Run `run-crm --resume --write` and assert the exit code and output fields for `completed`, `failed`, and `no-resume-needed` cases.
- Verify `--resume` without `--write` returns the documented error and exit code 2.
- Verify `--resume` combined with any of `--queue-item-id`, `--next-ready`, or `--transport-fixture` yields the incompatible-options error and exit code 2.
- Verify a `ResumeError` from `resume_session` is exposed as a clear CLI error message and exit code 2.
These can go next to this file or in a dedicated CLI integration test module to lock in the user-facing behavior of the new flag.
Suggested implementation:
```python
"""CLI integration tests for `run-crm --resume`.
These tests exercise the end-user CLI behavior for the `--resume` flag,
including normal completion, no-op resumes, failure propagation, and
argument validation errors.
"""
import json
from typing import Sequence
import pytest
from click.testing import CliRunner
# NOTE: Adjust these imports to match your actual CLI entrypoint.
# The tests assume a Click-based CLI with a root command `cli`
# exposing a `run-crm` subcommand.
from slice2.cli import cli # type: ignore[import]
def _invoke_run_crm(args: Sequence[str]):
"""Helper to invoke the `run-crm` CLI command via Click's test runner."""
runner = CliRunner()
result = runner.invoke(cli, ["run-crm", *args])
return result
@pytest.mark.integration
def test_run_crm_resume_write_completed_success(prepared_resumable_session):
"""
`run-crm --resume --write` completes a resumable session and reports
a successful completion (no failures, no "no-resume-needed").
"""
# `prepared_resumable_session` should create a pending resume-able session.
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
# The CLI is expected to emit JSON with `completed`, `failed`,
# and `no-resume-needed` fields.
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] >= 1
assert payload["failed"] == 0
assert payload.get("no-resume-needed", 0) == 0
@pytest.mark.integration
def test_run_crm_resume_write_no_resume_needed(empty_resume_queue):
"""
`run-crm --resume --write` when there is nothing to resume should
exit successfully and report `no-resume-needed`.
"""
# `empty_resume_queue` should ensure there are no resumable sessions.
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] == 0
assert payload["failed"] == 0
assert payload["no-resume-needed"] >= 1
@pytest.mark.integration
def test_run_crm_resume_write_failed(resumable_session_that_will_fail):
"""
`run-crm --resume --write` when the resumed session fails should still
exit with code 0 but report failures in the JSON output.
"""
# `resumable_session_that_will_fail` should stage a session that will fail
# during resume (e.g. deterministic validation error in the CRM write-back).
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] >= 0
assert payload["failed"] >= 1
# In a failing resume, there should not be a "no-resume-needed" increment.
assert payload.get("no-resume-needed", 0) == 0
@pytest.mark.integration
def test_run_crm_resume_without_write_errors():
"""
`run-crm --resume` without `--write` should fail fast with the documented
message and exit code 2.
"""
result = _invoke_run_crm(["--resume"])
assert result.exit_code == 2
# Message text should match the documented behavior for this error.
assert "must be used together with --write" in result.output.lower()
@pytest.mark.integration
@pytest.mark.parametrize(
"extra_args",
[
["--queue-item-id", "dummy-id"],
["--next-ready"],
["--transport-fixture", "dummy-fixture"],
],
)
def test_run_crm_resume_incompatible_option_errors(extra_args):
"""
`--resume` combined with incompatible options should return an error and
exit code 2.
"""
result = _invoke_run_crm(["--resume", "--write", *extra_args])
assert result.exit_code == 2
assert "cannot be combined with --resume" in result.output.lower()
@pytest.mark.integration
def test_run_crm_resume_propagates_resume_error(monkeypatch, prepared_resumable_session):
"""
A `ResumeError` from `resume_session` should be exposed as a clear CLI
error and exit code 2.
"""
from slice2.resume import ResumeError, resume_session # type: ignore[import]
def _raise_resume_error(*args, **kwargs):
raise ResumeError("synthetic resume failure for testing")
# Monkeypatch the lower-level coordinator used by the CLI entrypoint.
monkeypatch.setattr(
"slice2.resume.resume_session", _raise_resume_error, raising=True
)
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 2
assert "synthetic resume failure for testing" in result.output
```
1. Update the import of the CLI entrypoint at the top of `test_run_crm_resume_cli.py`:
- If your root Click group is not `cli` or not located in `slice2.cli`, adjust `from slice2.cli import cli` and the `runner.invoke(cli, ...)` calls to match your actual CLI module and command group.
2. Wire up or adjust the fixtures referenced in these tests:
- `prepared_resumable_session` should create a resumable session that will complete successfully when resumed.
- `empty_resume_queue` should ensure there are no resumable sessions.
- `resumable_session_that_will_fail` should stage a resumable session that deterministically fails when resumed.
- If you already have similar fixtures in `tests/integration/test_us4_idempotency.py` or `conftest.py`, re-use or alias them instead of creating new ones.
3. Align the expected JSON contract if your CLI output format differs:
- If `run-crm --resume --write` does not emit JSON, or the keys are named differently, adapt the JSON parsing and assertions to match the actual output (for example, using `Result.stdout` patterns instead of JSON).
4. Ensure the error message substrings used in the assertions (`"must be used together with --write"`, `"cannot be combined with --resume"`) match the precise text produced by `_run_crm_resume` in your implementation; adjust the strings to keep the tests stable but still specific enough.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| @@ -0,0 +1,385 @@ | |||
| """US4 — Idempotent CRM write-back across duplicate events and retries. | |||
There was a problem hiding this comment.
suggestion (testing): Add CLI-level tests for run-crm --resume to exercise the new entrypoint and error handling
Current tests cover resume_session and idempotency but don’t exercise the run-crm --resume CLI path (including _run_crm_resume). Please add CLI-level tests that:
- Run
run-crm --resume --writeand assert the exit code and output fields forcompleted,failed, andno-resume-neededcases. - Verify
--resumewithout--writereturns the documented error and exit code 2. - Verify
--resumecombined with any of--queue-item-id,--next-ready, or--transport-fixtureyields the incompatible-options error and exit code 2. - Verify a
ResumeErrorfromresume_sessionis exposed as a clear CLI error message and exit code 2.
These can go next to this file or in a dedicated CLI integration test module to lock in the user-facing behavior of the new flag.
Suggested implementation:
"""CLI integration tests for `run-crm --resume`.
These tests exercise the end-user CLI behavior for the `--resume` flag,
including normal completion, no-op resumes, failure propagation, and
argument validation errors.
"""
import json
from typing import Sequence
import pytest
from click.testing import CliRunner
# NOTE: Adjust these imports to match your actual CLI entrypoint.
# The tests assume a Click-based CLI with a root command `cli`
# exposing a `run-crm` subcommand.
from slice2.cli import cli # type: ignore[import]
def _invoke_run_crm(args: Sequence[str]):
"""Helper to invoke the `run-crm` CLI command via Click's test runner."""
runner = CliRunner()
result = runner.invoke(cli, ["run-crm", *args])
return result
@pytest.mark.integration
def test_run_crm_resume_write_completed_success(prepared_resumable_session):
"""
`run-crm --resume --write` completes a resumable session and reports
a successful completion (no failures, no "no-resume-needed").
"""
# `prepared_resumable_session` should create a pending resume-able session.
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
# The CLI is expected to emit JSON with `completed`, `failed`,
# and `no-resume-needed` fields.
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] >= 1
assert payload["failed"] == 0
assert payload.get("no-resume-needed", 0) == 0
@pytest.mark.integration
def test_run_crm_resume_write_no_resume_needed(empty_resume_queue):
"""
`run-crm --resume --write` when there is nothing to resume should
exit successfully and report `no-resume-needed`.
"""
# `empty_resume_queue` should ensure there are no resumable sessions.
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] == 0
assert payload["failed"] == 0
assert payload["no-resume-needed"] >= 1
@pytest.mark.integration
def test_run_crm_resume_write_failed(resumable_session_that_will_fail):
"""
`run-crm --resume --write` when the resumed session fails should still
exit with code 0 but report failures in the JSON output.
"""
# `resumable_session_that_will_fail` should stage a session that will fail
# during resume (e.g. deterministic validation error in the CRM write-back).
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 0, result.output
payload = json.loads(result.output.strip().splitlines()[-1])
assert payload["completed"] >= 0
assert payload["failed"] >= 1
# In a failing resume, there should not be a "no-resume-needed" increment.
assert payload.get("no-resume-needed", 0) == 0
@pytest.mark.integration
def test_run_crm_resume_without_write_errors():
"""
`run-crm --resume` without `--write` should fail fast with the documented
message and exit code 2.
"""
result = _invoke_run_crm(["--resume"])
assert result.exit_code == 2
# Message text should match the documented behavior for this error.
assert "must be used together with --write" in result.output.lower()
@pytest.mark.integration
@pytest.mark.parametrize(
"extra_args",
[
["--queue-item-id", "dummy-id"],
["--next-ready"],
["--transport-fixture", "dummy-fixture"],
],
)
def test_run_crm_resume_incompatible_option_errors(extra_args):
"""
`--resume` combined with incompatible options should return an error and
exit code 2.
"""
result = _invoke_run_crm(["--resume", "--write", *extra_args])
assert result.exit_code == 2
assert "cannot be combined with --resume" in result.output.lower()
@pytest.mark.integration
def test_run_crm_resume_propagates_resume_error(monkeypatch, prepared_resumable_session):
"""
A `ResumeError` from `resume_session` should be exposed as a clear CLI
error and exit code 2.
"""
from slice2.resume import ResumeError, resume_session # type: ignore[import]
def _raise_resume_error(*args, **kwargs):
raise ResumeError("synthetic resume failure for testing")
# Monkeypatch the lower-level coordinator used by the CLI entrypoint.
monkeypatch.setattr(
"slice2.resume.resume_session", _raise_resume_error, raising=True
)
result = _invoke_run_crm(["--resume", "--write"])
assert result.exit_code == 2
assert "synthetic resume failure for testing" in result.output- Update the import of the CLI entrypoint at the top of
test_run_crm_resume_cli.py:- If your root Click group is not
clior not located inslice2.cli, adjustfrom slice2.cli import cliand therunner.invoke(cli, ...)calls to match your actual CLI module and command group.
- If your root Click group is not
- Wire up or adjust the fixtures referenced in these tests:
prepared_resumable_sessionshould create a resumable session that will complete successfully when resumed.empty_resume_queueshould ensure there are no resumable sessions.resumable_session_that_will_failshould stage a resumable session that deterministically fails when resumed.- If you already have similar fixtures in
tests/integration/test_us4_idempotency.pyorconftest.py, re-use or alias them instead of creating new ones.
- Align the expected JSON contract if your CLI output format differs:
- If
run-crm --resume --writedoes not emit JSON, or the keys are named differently, adapt the JSON parsing and assertions to match the actual output (for example, usingResult.stdoutpatterns instead of JSON).
- If
- Ensure the error message substrings used in the assertions (
"must be used together with --write","cannot be combined with --resume") match the precise text produced by_run_crm_resumein your implementation; adjust the strings to keep the tests stable but still specific enough.
There was a problem hiding this comment.
Pull request overview
Adds Slice 2 “resume” support and integration coverage for US4 idempotent write-back behavior, plus a run-crm --resume <session-id> CLI entrypoint intended to replay only missing emit_* operations from persisted writeback.json + writeback_progress.
Changes:
- Introduce
slice2.resume.resume_session()coordinator to replay incomplete write-back operations for a session. - Extend
run-crmCLI with--resumemode and adjust--transport-fixturerequirements accordingly. - Add US4 integration tests covering duplicate emits, transient retries, resume flows, and resume preflight failures.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/opencloser/slice2/resume.py |
New resume coordinator to replay missing Dataverse write-back operations based on persisted artifacts and progress ledger. |
src/opencloser/cli.py |
Adds run-crm --resume <session-id> path and makes --transport-fixture optional only when resuming. |
tests/integration/test_us4_idempotency.py |
New US4 integration tests for idempotency/retry/resume/no-op resume + preflight errors. |
specs/002-mock-call-real-crm/tasks.md |
Marks US4 tasks T031–T034 complete and annotates T045/T046 as deferred. |
| if progress.run_status is not RunStatus.RESUME_NEEDED: | ||
| return ResumeReport( | ||
| exit_status="no-resume-needed", | ||
| session_id=session_id, | ||
| message=( | ||
| f"session is in {progress.run_status.value!r}, not 'resume_needed'; " |
| except (DataverseError, DataverseWriteBackError, MappingError) as exc: | ||
| # The replay itself failed (transient exhausted again, mapping | ||
| # invalidated, permission regression, etc.). Persist staged failures | ||
| # exactly as the write-enabled runner does, leaving the session in a | ||
| # state another resume invocation can pick up if appropriate. | ||
| adapter.flush_pending_failures() | ||
| return ResumeReport( | ||
| exit_status="failed", | ||
| session_id=session_id, | ||
| message=f"resume replay failed: {exc}", | ||
| artifact_dir=session_dir, | ||
| operations_replayed=replayed, | ||
| ) |
| if resume is not None: | ||
| if not write: | ||
| typer.echo( | ||
| "error: --resume requires --write (resume is a write-back continuation)", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) | ||
| if queue_item_id or next_ready or transport_fixture is not None: | ||
| typer.echo( | ||
| "error: --resume is incompatible with --queue-item-id / --next-ready / " | ||
| "--transport-fixture (resume operates on persisted write-back payloads only)", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) | ||
| _run_crm_resume( | ||
| session_id=resume, | ||
| config_path=config_path, | ||
| slice2_config_path=slice2_config_path, | ||
| ) | ||
| return |
| """Force enough 503s to exhaust the retry budget AFTER the Phone Call | ||
| activity is created but BEFORE the queue-status PATCH lands; the runner | ||
| should exit `failed` (the orchestrator transaction rolls back). Manually | ||
| stamp `writeback_progress(run_status=resume_needed)` to simulate the | ||
| real exhaust path, then invoke `resume_session()` against a healthy fake; | ||
| the missing operations replay and Dataverse holds exactly one record of | ||
| each kind (SC-014).""" |
| """Force one 503 on the next Dataverse request; the bounded retry budget | ||
| (max_retries=3 by default, but raised here to ensure recovery on first | ||
| retry) must succeed on retry WITHOUT creating a second record. The | ||
| idempotency-key pre-query on retry sees no prior record (because the | ||
| first attempt 503'd before POST landed), so the retry POSTs once.""" |
| if progress.run_status is not RunStatus.RESUME_NEEDED: | ||
| return ResumeReport( | ||
| exit_status="no-resume-needed", | ||
| session_id=session_id, | ||
| message=( | ||
| f"session is in {progress.run_status.value!r}, not 'resume_needed'; " | ||
| "no replay performed" | ||
| ), | ||
| ) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 17ba33f882
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # invalidated, permission regression, etc.). Persist staged failures | ||
| # exactly as the write-enabled runner does, leaving the session in a | ||
| # state another resume invocation can pick up if appropriate. | ||
| adapter.flush_pending_failures() |
There was a problem hiding this comment.
Keep sessions resumable after a replay failure
When a resumed write fails, this path calls flush_pending_failures(), which currently persists writeback_progress.run_status=blocked for non-completed sessions (src/opencloser/crm/dataverse/adapter.py), then returns failed. On the next --resume attempt, resume_session refuses to replay because the status is no longer resume_needed, so transient outages during resume can permanently strand a session instead of allowing another retry cycle.
Useful? React with 👍 / 👎.
| if progress.run_status is not RunStatus.RESUME_NEEDED: | ||
| return ResumeReport( | ||
| exit_status="no-resume-needed", |
There was a problem hiding this comment.
Reject non-resumable statuses instead of returning success
This branch treats every status other than resume_needed as no-resume-needed, including blocked and in_progress. In the CLI, no-resume-needed maps to exit code 0 (_run_crm_resume), so operators can get a success exit even when the session is actually in a failure state that did not complete. The no-op success behavior should be limited to already-completed sessions (FR-021), with other statuses surfaced as actionable errors.
Useful? React with 👍 / 👎.
| try: | ||
| secrets = load_dataverse_secrets() | ||
| except Slice2ConfigError as exc: |
There was a problem hiding this comment.
Skip Dataverse auth for already-completed resume sessions
run-crm --resume loads Dataverse secrets before checking local writeback_progress, so a re-invocation of a finalized session can fail with missing/invalid credentials instead of being the FR-021 no-op. This makes the resume command depend on live write credentials even when no CRM call should happen.
Useful? React with 👍 / 👎.
| writeback = WriteBack.model_validate_json( | ||
| writeback_path.read_text(encoding="utf-8") | ||
| ) |
There was a problem hiding this comment.
Normalize malformed writeback artifacts into ResumeError
If writeback.json exists but contains malformed JSON/schema, WriteBack.model_validate_json(...) raises directly and bypasses the ResumeError handling in _run_crm_resume, resulting in an unstructured crash path instead of the intended operator-visible pre-flight failure. Corrupt artifacts are a realistic resume scenario and should be surfaced as a controlled error.
Useful? React with 👍 / 👎.
| # mapping the original run used. Schema/approval errors are operator- | ||
| # visible-blocked (mirrors `_verify_readiness`'s first two gates). | ||
| try: | ||
| mapping = load_mapping(slice2_config.dataverse.mapping_artifact) |
There was a problem hiding this comment.
Enforce mapping continuity for resumed write-back
Resume re-translates persisted conceptual payloads with the current mapping file from config, but it does not verify this is the same mapping used by the original failed run. If mappings changed in between (field logical names, option-set bindings, owner override field), resume can write different columns than originally intended, violating idempotent recovery guarantees for that session.
Useful? React with 👍 / 👎.
3 real correctness bugs and 1 P2 surface-fix from the first review pass on commit 17ba33f: P1 (Copilot MED) — runner never sets RESUME_NEEDED: - Previously every write failure in slice2/runner.py was funnelled into `adapter.flush_pending_failures()` which always stamped BLOCKED. The resume coordinator could therefore never be triggered by a real failed run — only by tests that manually stamped writeback_progress. - Parameterized `flush_pending_failures(failure_run_status=...)` on the adapter, default still BLOCKED (preserves existing callers). - Runner now branches: `TransientDataverseError` → RESUME_NEEDED (the client's retry budget exhausted; replay is appropriate); anything else (DataverseWriteBackError, MappingError, permanent errors) → BLOCKED. - Runner's `exit_status` also gains a `resume_needed` value so the CLI reports the same semantic distinction. P1 (Codex) — resume escalates transients to permanent: - When `resume_session`'s replay itself failed transiently, it called `flush_pending_failures()` with the default BLOCKED, making the session permanently non-resumable on any flake. Now it mirrors the runner: TransientDataverseError → keep RESUME_NEEDED, anything else → BLOCKED. P1 (Codex + Copilot) — resume silently no-op'd on non-RESUME_NEEDED: - `resume_session` collapsed every non-`resume_needed` state into `no-resume-needed` (CLI exit 0). Now distinguishes per-state: COMPLETED → `no-resume-needed` (true idempotent no-op, FR-021); BLOCKED → `blocked` (permanent error; operator must intervene); IN_PROGRESS → `failed` (possible concurrent run; refuse to race); RESUME_NEEDED → replay. P1 (Codex) — CLI loaded secrets before checking progress: - `run-crm --resume` loaded Dataverse secrets BEFORE consulting `writeback_progress`, so a finalized-session re-invocation (which should be a credentials-free no-op) failed with a secret-load error. Reordered: pre-flight checks the progress row first; if COMPLETED/BLOCKED/IN_PROGRESS, short-circuit before any secrets are read or any Dataverse round-trip is made. P2 (Codex + Sourcery) — malformed writeback.json: - `WriteBack.model_validate_json` raised `ValidationError` directly, bypassing the structured `ResumeError` surface. Wrapped in try/except (ValueError, OSError) → `ResumeError` with a clear message. Copilot quick fixes: - `--resume` now warns when combined with `--conversation-fixture` / `--campaign` (those have no effect in resume mode). - Two test docstrings corrected: scenario 2 acknowledges the shared config's max_retries=0 baseline; scenario 3 acknowledges it synthesizes the RESUME_NEEDED state rather than driving the transient sequence. New tests added: - `test_us4_adapter_flush_pending_failures_supports_resume_needed` — pins the adapter's new RESUME_NEEDED target path (the runner diff is covered by inspection — driving the runner's TransientDataverseError catch with the fake's request-counting fail-injection is brittle). - `test_us4_resume_raises_when_writeback_json_malformed` — Codex P2. - `test_us4_resume_rejects_non_resumable_states` parametrized over BLOCKED and IN_PROGRESS — Codex P1. Deferred (Codex P2): - "Enforce mapping continuity for resumed write-back" — currently resume re-translates with the CURRENT mapping file rather than the one the original run used. Version-stamping the writeback.json + verifying on resume is a separate scope; tracked for a follow-up. Regression: 430/432 (added 3 new tests; same 2 pre-existing constitution-sync failures). Ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b83f1e2637
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return CrmRunReport( | ||
| exit_status="failed", | ||
| exit_status="resume_needed" | ||
| if failure_run_status is RunStatus.RESUME_NEEDED | ||
| else "failed", | ||
| message=f"dataverse write failed: {exc}", |
There was a problem hiding this comment.
Populate session_id on resume-needed failure reports
When a Dataverse write exhausts retries, this branch now returns exit_status="resume_needed" but does not include the failed run’s session_id. The new resume flow requires run-crm --resume <session-id>, and _print_crm_report only prints session_id when present, so operators cannot reliably resume from CLI output after a transient failure without manually querying state tables/artifacts. Capture and return the in-flight session id in this report path so the resume command is actionable.
Useful? React with 👍 / 👎.
SonarCloud flagged 4.3% Duplication on New Code (required ≤ 3%) on commit b83f1e2 — caused by the round-1 fixes adding three new tests that all duplicated the "drive run + restamp writeback_progress" preamble + the 7-arg `resume_session(...)` invocation block. Extracted two helpers: - `_stamp_progress(conn, session_id, *, run_status, **flags)` — wraps the `store.upsert_writeback_progress` + `WriteBackProgress(...)` boilerplate. Six call sites collapse from ~12 lines to 2-6 each. - `_invoke_resume(*, session_id, conn, artifact_root, fake, mapping)` — wraps the standard `resume_session(...)` call. Five call sites collapse from 8 lines to 6. No behavior change; all 10 US4 tests still pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| When a write-enabled run exhausts its retry budget mid-write-back the runner | ||
| exits with ``run_status = resume_needed`` and persists three things: | ||
|
|
||
| 1. ``writeback_progress(session_id, phone_call_activity_done, | ||
| queue_status_update_done, task_done, run_status)`` — which of the three | ||
| write-back ops succeeded. | ||
| 2. ``writeback.json`` (under ``<artifact_root>/<session_id>/``) — the | ||
| conceptual payloads the run intended to emit; written atomically per the | ||
| Slice 1 artifact contract. | ||
| 3. ``crm_correlations`` rows — the confirmed CRM record IDs and any | ||
| ``failed`` rows from the partial run, so a later resume's pre-query | ||
| short-circuits cleanly even if a record was created but the local | ||
| correlation row was rolled back by a transient error. |
| @dataclass | ||
| class ResumeReport: | ||
| """The resume coordinator's outcome — superset of CrmRunReport. | ||
|
|
||
| The slice2 runner converts this to a CrmRunReport when called from the CLI. | ||
| """ | ||
|
|
||
| exit_status: str # "completed" | "failed" | "blocked" | "no-resume-needed" | ||
| session_id: str | ||
| message: str | None = None | ||
| artifact_dir: Path | None = None | ||
| operations_replayed: list[str] | None = None |
| except (DataverseError, DataverseWriteBackError, MappingError) as exc: | ||
| # A permanent Dataverse write failure mid-run, or a missing/invalid | ||
| # mapping entry surfaced by the adapter (e.g. `primary_id` not set), | ||
| # is operator-visible-failed. The orchestrator's | ||
| # `with store.transaction(conn)` around the failing emit_* has | ||
| # already rolled back; the SQLite write lock is free again, so we can | ||
| # now safely persist the failure markers the adapter staged. The | ||
| # resume coordinator (US4) reads from these rows. | ||
| adapter.flush_pending_failures() | ||
| # A Dataverse write failure mid-run, or a missing/invalid mapping entry | ||
| # surfaced by the adapter (e.g. `primary_id` not set), is | ||
| # operator-visible-failed. The orchestrator's `with | ||
| # store.transaction(conn)` around the failing emit_* has already | ||
| # rolled back; the SQLite write lock is free again, so we can now | ||
| # safely persist the failure markers the adapter staged. | ||
| # | ||
| # Distinguish TRANSIENT (retry-budget-exhausted) from PERMANENT for | ||
| # the resume ledger (Copilot PR #9 review: previously every failure | ||
| # got `BLOCKED`, so the resume coordinator could never be triggered | ||
| # by a real failed run). The client's `_request` retries on transient | ||
| # errors until budget exhausted then re-raises, so seeing a | ||
| # TransientDataverseError here means retry-exhaust; that's exactly | ||
| # the FR-023 resume-needed case. | ||
| failure_run_status = ( | ||
| RunStatus.RESUME_NEEDED | ||
| if isinstance(exc, TransientDataverseError) | ||
| else RunStatus.BLOCKED | ||
| ) | ||
| adapter.flush_pending_failures(failure_run_status=failure_run_status) | ||
| return CrmRunReport( | ||
| exit_status="failed", | ||
| exit_status="resume_needed" | ||
| if failure_run_status is RunStatus.RESUME_NEEDED | ||
| else "failed", | ||
| message=f"dataverse write failed: {exc}", | ||
| warnings=adapter.warnings(), | ||
| ) |
| if resume is not None: | ||
| if not write: | ||
| typer.echo( | ||
| "error: --resume requires --write (resume is a write-back continuation)", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) | ||
| if queue_item_id or next_ready or transport_fixture is not None: | ||
| typer.echo( | ||
| "error: --resume is incompatible with --queue-item-id / --next-ready / " | ||
| "--transport-fixture (resume operates on persisted write-back payloads only)", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) |
| The full transient-exhaust → RESUME_NEEDED flow is exercised | ||
| end-to-end by `test_us4_runner_sets_resume_needed_on_transient_exhaust` | ||
| below.""" |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d37e24efea
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| adapter = DataverseWriteBackAdapter( | ||
| conn=conn, | ||
| client=client, | ||
| translator=translator, | ||
| task_owners=task_owners, | ||
| now_utc_ms=clk.now_utc_ms, | ||
| ) |
There was a problem hiding this comment.
Preserve warning context before replaying queue status
The resume path builds a fresh DataverseWriteBackAdapter without restoring any prior warnings, then may replay emit_queue_status_update. Because DataverseWriteBackAdapter._compose_last_error derives queue.last_error from self._warnings, a resumed run that still needs the queue-status PATCH (for example, a transient failure before queue update) will drop FR-034 warning text from the actual CRM update even though the original run had that warning. This breaks warning continuity across resume and can hide data-quality context operators are supposed to see on the queue row.
Useful? React with 👍 / 👎.
| try: | ||
| secrets = load_dataverse_secrets() | ||
| except Slice2ConfigError as exc: |
There was a problem hiding this comment.
Validate local resume artifacts before loading secrets
_run_crm_resume loads Dataverse credentials before checking whether writeback.json is present/valid for the session, but those artifact checks happen in resume_session and can fail without any CRM call. For a resume_needed session with missing/corrupt local artifacts, operators currently get a credential error first if secrets are unavailable, which masks the real pre-flight failure and makes a purely local resume diagnosis depend on live auth.
Useful? React with 👍 / 👎.
| # visible-blocked (mirrors `_verify_readiness`'s first two gates). | ||
| try: | ||
| mapping = load_mapping(slice2_config.dataverse.mapping_artifact) | ||
| except MappingError as exc: | ||
| conn.close() | ||
| typer.echo(f"error: mapping artifact error: {exc}", err=True) | ||
| raise typer.Exit(code=2) from None | ||
| if not mapping.meta.approved: | ||
| conn.close() | ||
| typer.echo( | ||
| "error: mapping artifact " | ||
| f"{slice2_config.dataverse.mapping_artifact!r} is not approved; " | ||
| "re-run `discover-crm` and have a reviewer flip _meta.approved to true", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) |
| @dataclass | ||
| class ResumeReport: | ||
| """The resume coordinator's outcome — superset of CrmRunReport. | ||
|
|
||
| The slice2 runner converts this to a CrmRunReport when called from the CLI. | ||
| """ | ||
|
|
||
| exit_status: str # "completed" | "failed" | "blocked" | "no-resume-needed" | ||
| session_id: str | ||
| message: str | None = None | ||
| artifact_dir: Path | None = None | ||
| operations_replayed: list[str] | None = None | ||
|
|
||
|
|
||
| class ResumeError(RuntimeError): | ||
| """Pre-flight resume failure (writeback.json missing, progress row absent, | ||
| session not in resume_needed state). Distinct from a Dataverse write | ||
| failure during replay, which produces a ``failed`` ResumeReport.""" |
| # Scenario 4b — runner sets RESUME_NEEDED on transient exhaust (Copilot PR #9) | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| def test_us4_adapter_flush_pending_failures_supports_resume_needed( |
Round-2 review on commit d37e24e (the dedup commit) produced 8 new Copilot inline findings. Triage + fix: P1 (real architectural limitation — documented): - resume_session loads `writeback.json` to replay payloads, but the Slice 1 orchestrator only WRITES writeback.json AFTER all emit_* succeed. A natural transient-exhaust RESUME_NEEDED state therefore has no writeback.json on disk; resume_session correctly raises ResumeError("writeback.json missing …"). Documented as a KNOWN LIMITATION at the top of src/opencloser/slice2/resume.py — full fix requires orchestrator changes (incremental writeback.json write or a "planned-writeback.json" sidecar before emit_*) and is independent of US4's scope. Tracked as a follow-up. P1 (real fix — session_id surfaced on RESUME_NEEDED): - Runner returned CrmRunReport(exit_status="resume_needed", session_id=None) on transient exhaust, so the operator had no way to know which session to pass to `--resume`. Added `DataverseWriteBackAdapter.pending_failure_session_ids()` to extract the staged session ids before flush; runner threads the first one through to CrmRunReport.session_id. Other Copilot findings — fixed: - ResumeReport.exit_status docstring missing "resume_needed" — added full enum list with CLI exit-code mappings (and corrected the earlier "converted by the runner" claim — resume_session is called directly from the CLI by _run_crm_resume). - Truthiness check on queue_item_id let `--queue-item-id ""` slip through. Switched to `is not None`. - Test docstring referenced a renamed/removed test `test_us4_runner_sets_resume_needed_on_transient_exhaust`. Updated to reference the actual adapter-level test + the documented writeback.json limitation. - Section header "Scenario 4b — runner sets RESUME_NEEDED on transient exhaust" was misleading (test exercises the adapter directly). Renamed to match. - CLI _run_crm_resume mapping-artifact errors exited 2/`failed` with a plain error line; mismatched the structured `blocked` report shape of normal write-enabled run. Now produces `exit_status: blocked` reports with exit 1. Regression: 20/20 affected tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 415d7b4a01
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| exit_status="resume_needed" | ||
| if failure_run_status is RunStatus.RESUME_NEEDED | ||
| else "failed", |
There was a problem hiding this comment.
Ensure resumable artifacts exist before returning resume_needed
This branch marks transient write failures as resume_needed, but those failures occur before the orchestrator writes writeback.json (artifact writing happens only after all emit_* calls succeed in process_one_queue_item). The new resume coordinator hard-requires that file and raises ResumeError when it is missing (src/opencloser/slice2/resume.py), so a natural retry-exhausted run can be reported as resumable and then be impossible to resume. In practice this strands real transient failures instead of enabling FR-023 recovery.
Useful? React with 👍 / 👎.
| exit_status="resume_needed" | ||
| if failure_run_status is RunStatus.RESUME_NEEDED | ||
| else "failed", | ||
| session_id=session_id, |
There was a problem hiding this comment.
Return blocked when replay permanently fails
In resume_session, permanent replay errors are explicitly escalated to RunStatus.BLOCKED, but the returned report maps that path to exit_status="failed" instead of "blocked". This produces an inconsistent operator signal (and different exit code) for the same terminal state in writeback_progress, especially when a mapping/permission regression causes a non-transient replay failure.
Useful? React with 👍 / 👎.
* ci: add Claude Code Review workflow (anthropics/claude-code-action@v0) Activates Claude reviews on every PR by adding the workflow file to the default branch. The Claude action has a safety check that validates the workflow file exists with identical content on the default branch; until this PR merges, Claude cannot run even though it's installed on the repo (see PRs #8 and #9 where it queued/skipped). Configuration: - Triggers on `pull_request: [opened, synchronize]` — every new PR and every push to an existing PR. - Authenticates via `CLAUDE_CODE_OAUTH_TOKEN` repo secret (set up via the GitHub App install flow). Billing flows to the Claude.ai Max subscription that minted the token. - Permissions are the minimum needed: contents:read, pull-requests:write, issues:write, id-token:write. - Both action versions pinned to SHA per SonarCloud's supply-chain hotspot rule (actions/checkout@v4.2.2, anthropics/claude-code-action@v0 latest). This file is identical to the one already in the umbrella branch `002-mock-call-real-crm` (merged via PR #8). Landing it on `main` separately so Claude reviews activate immediately rather than waiting on the larger umbrella → main merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ci: upgrade claude-code-action v0 → v1, add prompt + fork guard Addresses PR #10 round-1 review (Codex P1+P2, Copilot, Sourcery): - Codex P1 — missing prompt: the v0 action without a `prompt` stays in `mode: tag` (waits for `@claude` mentions). With no comment-trigger workflow on this repo, that means PR #10's stated goal — automatic Claude reviews on every PR — would not actually happen. Upgrading to v1 (pinned to SHA 537ffff2) and providing the documented `prompt:` block for automated review (see anthropics/claude-code-action docs/solutions.md §Automatic PR Code Review). - Codex P2 + Copilot — fork PRs: GitHub does not provide repo secrets to workflows triggered from forks (or Dependabot PRs), so the action would fail with an opaque auth error. Added an `if:` guard on the job that skips fork PRs explicitly. If fork reviews are needed later, the conversion to `pull_request_target` belongs in a separate PR with careful sandboxing review. - Sourcery — permission scoping: moved the `permissions:` block from workflow-level to job-level (so future jobs added to this workflow don't inherit elevated rights) and dropped `issues: write` (the v1 auto-review path doesn't need it; only contents:read, pull-requests:write, and id-token:write are required). Copilot — checkout merge-commit vs head: intentionally NOT changing. The default `actions/checkout@v4` behavior (checkout the PR merge commit) gives Claude the post-merge state, which is what reviewers actually want to assess. Renamed the step from "Checkout PR head" to "Checkout PR" to remove the misleading label. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ci: rename step + document checkout-merge-commit decision Follow-up to 94f2b7f: the previous commit's message claimed a step rename and a checkout-decision comment that I forgot to actually make. Adding them now: - Step renamed "Checkout PR head" → "Checkout PR" (the default pull_request checkout produces the MERGE commit, not the head, so "head" was misleading). - Inline comment documents why we keep the default rather than setting `with.ref: ${{ github.event.pull_request.head.sha }}` per Copilot's PR #10 suggestion — reviewing the merge state is the right default. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: brettheap <brett.heap@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Empty commit to re-trigger the `pull_request: synchronize` workflow so Claude reviews this PR for the first time. The previous claude-review runs on this PR's commits were intentionally skipped by the action's workflow-validation guard (the workflow file wasn't on the default branch until PR #10 merged in commit e369717). No code changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts: # .github/workflows/claude-code-review.yml
| exit_status="resume_needed" | ||
| if failure_run_status is RunStatus.RESUME_NEEDED | ||
| else "failed", |
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3d822d8de1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| writeback = WriteBack.model_validate_json( | ||
| writeback_path.read_text(encoding="utf-8") | ||
| ) |
There was a problem hiding this comment.
Verify writeback session_id matches the resume target
The resume path loads writeback.json and immediately replays its payloads without checking that the artifact belongs to the requested session_id. If the artifact directory is stale/cross-copied (a realistic corruption/ops scenario), emit_* will execute using the embedded payload session IDs while this function still finalizes the requested session, which can write CRM updates/correlations under the wrong session and mark the wrong progress row completed. Add an explicit equality check (top-level and nested payload session IDs) before replay.
Useful? React with 👍 / 👎.
Copilot PR #9 round-3 inline on src/opencloser/slice2/resume.py:235 — when the resume's replay fails with a permanent error, `failure_run_status` is correctly set to `RunStatus.BLOCKED` and persisted via `flush_pending_failures(...)`, but the returned `ResumeReport.exit_status` was still "failed" (exit 2) instead of "blocked" (exit 1). The CLI exit-code mapping declares blocked→1 and failed→2, so this surfaced the wrong exit code for a now-blocked session. Fixed: explicit ladder mapping `failure_run_status` (the persisted progress state) to `exit_status` — RESUME_NEEDED→"resume_needed", BLOCKED→"blocked", anything else→"failed". The state and the report now agree. Regression: 10/10 US4 tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| if resume is not None: | ||
| if not write: | ||
| typer.echo( | ||
| "error: --resume requires --write (resume is a write-back continuation)", | ||
| err=True, | ||
| ) | ||
| raise typer.Exit(code=2) | ||
| # Explicit `is not None` checks (Copilot PR #9 round-2 review): |
| """Pre-flight resume failure (writeback.json missing, progress row absent, | ||
| session not in resume_needed state). Distinct from a Dataverse write | ||
| failure during replay, which produces a ``failed`` ResumeReport.""" |
| session_dir = artifact_root / session_id | ||
| writeback_path = session_dir / _WRITEBACK_FILENAME | ||
| if not writeback_path.exists(): | ||
| raise ResumeError( | ||
| f"writeback.json missing under {session_dir!r}; cannot resume " | ||
| "without persisted payloads (FR-023). Inspect local audit-artifact " | ||
| "retention (FR-035) and re-run the original `run-crm` instead." | ||
| ) |
| @@ -343,9 +363,67 @@ def run_crm( | |||
| mapping, exercise persona + transport, capture planned write-back artifacts, | |||
| zero CRM mutations). ``--write`` enables the write-enabled path. There is no | |||
| way to mutate Dataverse without ``--write`` (SC-013). | |||
|
|
|||
| ``--resume <session-id>`` (FR-023, T033): when supplied, the CLI replays | |||
| only the missing ``emit_*`` operations for a ``resume_needed`` session | |||
| using the persisted ``writeback.json`` and ``writeback_progress`` rows. | |||
| Requires ``--write`` (resume is by definition a write-back continuation) | |||
| and is incompatible with ``--queue-item-id`` / ``--next-ready`` / | |||
| ``--transport-fixture`` (resume does not re-run the orchestrator). | |||
| """ | |||
| run_mode = RunMode.WRITE_ENABLED if write else RunMode.DRY_RUN | |||
|
|
|||
| if resume is not None: | |||
| if not write: | |||
| typer.echo( | |||
| "error: --resume requires --write (resume is a write-back continuation)", | |||
| err=True, | |||
| ) | |||
| raise typer.Exit(code=2) | |||
| # Explicit `is not None` checks (Copilot PR #9 round-2 review): | |||
| # truthiness alone would let `--queue-item-id ""` slip through as | |||
| # falsy, silently ignored rather than flagged as an incompatible | |||
| # combination. | |||
| if ( | |||
| queue_item_id is not None | |||
| or next_ready | |||
| or transport_fixture is not None | |||
| ): | |||
| typer.echo( | |||
| "error: --resume is incompatible with --queue-item-id / --next-ready / " | |||
| "--transport-fixture (resume operates on persisted write-back payloads only)", | |||
| err=True, | |||
| ) | |||
| raise typer.Exit(code=2) | |||
| # Copilot PR #9 review: warn when other orchestrator-input flags | |||
| # are silently ignored. These have no effect in resume mode because | |||
| # the resume coordinator does not re-run process_one_queue_item. | |||
| if conversation_fixture is not None: | |||
| typer.echo( | |||
| "warning: --conversation-fixture is ignored when --resume is set " | |||
| "(resume does not re-run the orchestrator)", | |||
| err=True, | |||
| ) | |||
| if campaign is not None: | |||
| typer.echo( | |||
| "warning: --campaign is ignored when --resume is set " | |||
| "(resume operates on the session id only)", | |||
| err=True, | |||
| ) | |||
| _run_crm_resume( | |||
| session_id=resume, | |||
| config_path=config_path, | |||
| slice2_config_path=slice2_config_path, | |||
| ) | |||
| return | |||
| try: | ||
| writeback = WriteBack.model_validate_json( | ||
| writeback_path.read_text(encoding="utf-8") | ||
| ) | ||
| except (ValueError, OSError) as exc: |
| # record WAS created in the partial run but the local correlation row | ||
| # was rolled back, the existing CRM record is reused (FR-024). | ||
| if not progress.phone_call_activity_done and writeback.phone_call_activity is not None: | ||
| adapter.emit_phone_call_activity(writeback.phone_call_activity) | ||
| replayed.append("phone_call_activity") | ||
|
|
||
| # Queue status update — replay only if not already done. The adapter's | ||
| # _fetch_queue_last_session check skips the PATCH if the row already | ||
| # carries this session's id. | ||
| if not progress.queue_status_update_done: | ||
| adapter.emit_queue_status_update(writeback.queue_status_update) | ||
| replayed.append("queue_status_update") | ||
|
|
||
| # Task — replay only if not already done. emit_task's FR-018 exclusion | ||
| # path will skip the operation if the session's final_disposition is in | ||
| # the no-task set; otherwise pre-query + create or reuse. | ||
| if not progress.task_done and writeback.task is not None: | ||
| adapter.emit_task(writeback.task) |
Copilot round-3 (on commit 7b693dc) flagged 2 real items: SECURITY (Copilot PR #9 round-3, P1 in spirit): - `--resume <session-id>` was passed straight into a filesystem path (`<artifact_root>/<session_id>/writeback.json`) without validation. A crafted value containing path separators (e.g. `--resume "../../etc/passwd"`) would have caused the resume coordinator to read outside the artifact root. - Added `_SESSION_ID_RE = re.compile(r"^ses_[0-9a-f]{32}$")` mirroring the format minted by `opencloser.core.ids.new_session_id()` (`ses_` + uuid4().hex). Validated via `re.fullmatch` (not `re.match` — `$` would tolerate a trailing newline and defeat the check) in `_run_crm_resume` BEFORE any filesystem operation. - 8-case parametrized test `test_cli_run_crm_resume_rejects_malformed_session_id` covers path traversal (`../../etc/passwd`, `ses_../../...`), absolute paths, too-short, non-hex, wrong prefix, empty, and trailing-newline cases. DOC FIX (Copilot, LOW): - `ResumeError`'s docstring claimed it covers "session not in resume_needed state", but `resume_session()` returns a structured `ResumeReport` for COMPLETED/BLOCKED/IN_PROGRESS rather than raising. Updated to describe actual behavior: ResumeError is raised only for missing-progress-row and missing-or-malformed-writeback.json cases; non-RESUME_NEEDED states produce a ResumeReport. Regression: 28/28 affected tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
| f"({type(exc).__name__}): {exc}. Cannot resume safely; re-run the " | ||
| "original `run-crm` instead." | ||
| ) from exc | ||
|
|
| session_dir = artifact_root / session_id | ||
| writeback_path = session_dir / _WRITEBACK_FILENAME | ||
| if not writeback_path.exists(): |
| adapter._record_failure( # type: ignore[attr-defined] # private but stable | ||
| session_id=report.session_id, | ||
| record_kind=CrmRecordKind.QUEUE_STATUS, | ||
| error=TransientDataverseError("503 simulated", status_code=503), | ||
| progress_key="queue_status_update_done", | ||
| dataverse_record_id=None, | ||
| ) | ||
| adapter.flush_pending_failures(failure_run_status=RunStatus.RESUME_NEEDED) |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12ad8ec6d6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| failure_run_status = ( | ||
| RunStatus.RESUME_NEEDED | ||
| if isinstance(exc, TransientDataverseError) | ||
| else RunStatus.BLOCKED |
There was a problem hiding this comment.
Avoid labeling transient write failures as resumable yet
This branch stamps transient emit failures as RunStatus.RESUME_NEEDED, but the same failure path occurs before writeback.json is written by the orchestrator (core/orchestrator.py writes artifacts only after all emit_* calls succeed), so real retry-exhausted runs cannot actually be resumed. The new resume coordinator hard-requires that artifact and raises ResumeError when it is missing (slice2/resume.py), which leaves operators with a resume_needed status that cannot be recovered via run-crm --resume. Either persist the planned writeback payload before emits, or keep transient failures in a non-resumable status until that artifact guarantee exists.
Useful? React with 👍 / 👎.
Copilot + Codex landed 4 real findings on commit 3f41a66: SECURITY (Copilot real): - `_detect_queue_conflict` interpolated `queue_item_id` directly into the `$filter` predicate — same OData-filter-injection vector Copilot flagged on `--resume` in PR #9. Added `_safe_odata_token()` (regex `^[A-Za-z0-9_-]+$` allowlist; raises DataverseWriteBackError on rejection) inline in adapter.py and applied at BOTH call sites (`_detect_queue_conflict` AND the pre-existing `_fetch_queue_last_session`). Duplicates queue_loader's `_odata_token` helper; consolidating into a dedicated `odata.py` module is a future cleanup. Keeping the duplication narrow rather than touching unrelated code in this PR. P2 (Copilot + Codex, both flagged): - Null-status conflict detection: `_detect_queue_conflict` only flagged a conflict when `current_status is not None`. If a human or sync job cleared the status field to null between claim and write, the check silently let it through and PATCHed over the cleared value. Removed the `is not None` guard; null is now treated as a conflict relative to a non-null snapshot. LOW (Copilot): - `QueueConflictError` docstring claimed conflict detection covers "any `preserve_if_present` field", but the implementation explicitly scopes to status + missing-row. Updated the docstring to describe current scope and the deferred preserve_if_present work. Test coverage gap (Copilot): - The runner's `QueueConflictError → exit_status="blocked"` mapping was new behavior with no test. Added `test_us4_t045_runner_maps_queue_conflict_to_blocked` that monkey- patches `emit_queue_status_update` to raise QueueConflictError directly, then asserts the runner produces exit_status="blocked", writeback_progress.run_status=BLOCKED, and the "queue conflict — manual reconciliation required" message prefix. Future refactors of the runner's exception ladder will now fail CI if they regress this operator-visible contract. Regression: 14/14 US4 tests pass (added 1 new); ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…atural-path test (closes review HIGH #8/#9) Closes 2 HIGH findings from the 2026-05-24 swarm code-review pass: HIGH #8 (test-pyramid): the runner's `TransientDataverseError → RESUME_NEEDED` branch was verified by inspection only — every existing resume test synthesized the state via `_stamp_progress`. A future refactor that swapped RESUME_NEEDED ↔ BLOCKED in the runner would silently pass CI. Added a true end-to-end natural-failure test that drives task POSTs through retry- budget exhaustion via a URL-pattern fake hook. HIGH #9 (test-pyramid): the fake's `$filter` / `$orderby` silently swallowed unknown field names — asymmetric with the strict `$select` validator that already caught the preserve_if_present registration gap. A misspelled filter field returned empty results (= "no row matches"), invisible to tests. Made both helpers raise `_UnknownODataField` → HTTP 400 → PermanentDataverseError, matching `$select` strictness. Implementation: - fake.py: new `_UnknownODataField` sentinel exception. `_matches_filter` and `_apply_orderby` now take a `valid_fields: set[str]` parameter and raise on unknown fields. `_handle_query` catches and returns 400. - fake.py: new `_DATAVERSE_SYSTEM_ATTRS` frozenset (`createdon`, `modifiedon`, `versionnumber`, `ownerid`) merged into every entity's attrs in `DataverseFake.__init__`. Real Dataverse always exposes these system fields; the new strictness would otherwise reject queries that reference them (e.g. NextReady's `createdon` tiebreaker). - test_us4_idempotency.py: `test_us4_natural_transient_exhaust_yields_resume_needed` uses a URL-pattern gate (succeed everywhere except POST to /tasks) to drive a real `TransientDataverseError` to retry-budget exhaustion. Asserts `exit_status="resume_needed"` + `writeback_progress.run_status=RESUME_NEEDED`. Then documents the KNOWN LIMITATION in resume.py: the subsequent `resume_session` raises `ResumeError("writeback.json missing")` because the orchestrator only writes `writeback.json` after all emit_* succeed. Both legs pinned so the future T052 fix will require updating this test. Result: 617 pass (+1 new Pass 1D test) / 2 pre-existing constitution_sync failures unrelated. ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>



Summary
Acceptance criteria mapping
test_us4_adapter_emit_phone_call_twice_idempotent(FR-024 pre-query short-circuit)test_us4_transient_failure_retry_reuses_idempotency_key(fake.fail_next + retry budget; exactly one record)test_us4_resume_completes_partial_writeback(SC-014)test_us4_resume_finalized_session_is_noop(FR-021)Test plan
uv run pytest tests/integration/test_us4_idempotency.py— 6/6 passing.uv run pytest tests/— 426/428. The 2 failures are the pre-existingtest_constitution_sync.pyfailures (unrelated).uv run ruff check src/ tests/— All checks passed.Files changed
src/opencloser/slice2/resume.py— resume coordinator (~170 LoC)src/opencloser/cli.py—--resume <session-id>flag +_run_crm_resumehelpertests/integration/test_us4_idempotency.py— 6 scenarios (~340 LoC)specs/002-mock-call-real-crm/tasks.md— T031/T032/T033/T034 marked complete; T045/T046 annotated as deferredDesign notes
run-crminvocation creates a new session with its own idempotency key. The spec's "no duplicates" guarantee is per-session — enforced byresume_sessionfor the resume-needed case, and by the queue loader's callable-status filter for the "don't re-process a finalized queue item" case. Documented in the test file._idempotent_createprovides belt-and-suspenders — if a record WAS created in the partial run but the local correlation row got rolled back, the resume's pre-query finds the existing record and reuses its id without creating a duplicate.🤖 Generated with Claude Code
Summary by Sourcery
Add a write-back resume coordinator for CRM sessions and wire it into the CLI, with integration tests validating idempotent behaviour across retries and resumes.
New Features:
run-crmCLI command with a--resume <session-id>flag that routes to the resume coordinator and relaxes the transport fixture requirement when resuming.Enhancements:
Tests: