
tests: live integration smoke for sync + async guards#5

Merged: amavashev merged 1 commit into main from tests/live-integration-smoke on May 13, 2026

amavashev commented on May 13, 2026

Summary

Adds tests/integration/test_live_ap2_guard.py — five tests (four sync + one async) exercising cycles_guard_payment and cycles_guard_payment_async end-to-end against a real Cycles server. Pattern mirrors cycles-client-python/tests/integration/test_live_server.py: module-level pytest.mark.skipif(not CYCLES_BASE_URL) so the whole file is skipped at collection time when env vars are unset.
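A minimal sketch of the env-var gate described above, not the actual contents of test_live_ap2_guard.py. The names `REQUIRED_ENV` and `missing_live_env` are hypothetical, and checking all three variables (rather than only CYCLES_BASE_URL) is an assumption. The comment at the bottom shows how the result might feed pytest's module-level `pytestmark`.

```python
import os

# Variables listed under "How to run"; requiring all three is an assumption.
REQUIRED_ENV = ("CYCLES_BASE_URL", "CYCLES_API_KEY", "CYCLES_TENANT")


def missing_live_env(environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV if not environ.get(name)]


missing = missing_live_env(os.environ)

# In the test module this could drive a collection-time skip, for example:
#   pytestmark = pytest.mark.skipif(
#       bool(missing), reason=f"live server env not set: {missing}"
#   )
```

Because the marker is evaluated at import, a default `pytest` run reports the tests as skipped without ever opening a connection.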

Why

Every existing test uses MagicMock / AsyncMock for the CyclesClient surface. A server-side rename of a Subject.dimensions key, a field that flips from string to enum, or a change to how response.is_success classifies status codes — none of those would surface in mock-based tests. This integration suite catches that class of regression when run against a real dev server.
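To illustrate the gap, here is a small sketch using only `unittest.mock`. The helper `tenant_of`, the payload layout, and the rename from `tenant` to `tenant_id` are all invented for the example; they are not taken from the Cycles wire format.

```python
from unittest.mock import MagicMock


def tenant_of(response):
    # Hypothetical helper: reads a "tenant" key from the subject dimensions.
    return response.body["subject"]["dimensions"]["tenant"]


# Mock-based test: the payload shape is whatever the test author typed in,
# so it keeps passing regardless of what the server actually sends.
mocked = MagicMock()
mocked.body = {"subject": {"dimensions": {"tenant": "ap2-integration"}}}
mock_result = tenant_of(mocked)

# Simulated live payload after an invented server-side rename of the key.
live = MagicMock()
live.body = {"subject": {"dimensions": {"tenant_id": "ap2-integration"}}}
try:
    tenant_of(live)
    drift_detected = False
except KeyError:
    # Only a payload that reflects the real server exposes the rename.
    drift_detected = True
```

The mocked path succeeds while the renamed payload raises `KeyError`, which is the kind of drift a run against a real server would be expected to surface.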

Cost

Zero in CI (skipped by env-var gate). When run locally, each test uses a fresh UUID-based transaction_id and a $0.00000001 amount, so the suite is idempotent across runs and costs essentially nothing in budget per execution.
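As a rough illustration of the cost claim, the sketch below generates a fresh identifier per test and computes an upper bound on spend. The `it-` prefix and the function names are hypothetical, and the bound assumes every one of the five tests charges the full amount, which overstates it because the dry-run and release paths are not expected to commit a charge.

```python
import uuid
from decimal import Decimal

# 1e-8 dollars, i.e. one micro-cent, as stated in the PR description.
AMOUNT_USD = Decimal("0.00000001")


def fresh_transaction_id() -> str:
    # A random UUID per test keeps reruns from colliding with earlier runs.
    return f"it-{uuid.uuid4()}"


def budget_upper_bound(runs: int, tests_per_run: int = 5) -> Decimal:
    # Worst case: every test in every run commits the full amount.
    return AMOUNT_USD * runs * tests_per_run
```

Under these assumptions, a thousand local runs would consume at most five thousandths of a cent of budget.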

Coverage

| Test | What it proves |
| --- | --- |
| test_clean_commit_against_live_server (sync) | Full lifecycle: reserve → commit → receipt fields populated, decision reflects server response, reservation_id non-null |
| test_exception_inside_with_releases (sync) | Exception inside with propagates and release is attempted; no deadlock |
| test_dry_run_raises_result_no_reservation_created (sync) | AP2DryRunResult raised with decision payload, body never executes |
| test_idempotent_replay_returns_same_reservation_id (sync) | Same mandate → same reserve key → server returns the original reservation OR surfaces finalized status (both prove the dedup key collided) |
| test_async_clean_commit_against_live_server (async) | Async lifecycle with open_mandate_hash set (exercises the AP2 §6 consume-once scope through the async path) |
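The behaviors in the table can be sketched with a toy context manager and an in-memory server. This is not the implementation of cycles_guard_payment: `FakeServer`, `reserve_key`, and `guard_payment` are hypothetical names, and deriving the reserve key from a SHA-256 digest of the mandate is an assumption made only to get a deterministic key.

```python
import hashlib
from contextlib import contextmanager


class FakeServer:
    """In-memory stand-in for a reservation server (not the Cycles API)."""

    def __init__(self):
        self.reservations = {}  # reserve key -> {"id": ..., "status": ...}

    def reserve(self, key):
        # Deduplicate on the key: a replay gets the original reservation back.
        if key not in self.reservations:
            self.reservations[key] = {
                "id": f"res_{len(self.reservations) + 1}",
                "status": "reserved",
            }
        return self.reservations[key]

    def finalize(self, key, status):
        self.reservations[key]["status"] = status


def reserve_key(mandate: bytes) -> str:
    # Assumption: the key is a deterministic digest of the mandate bytes.
    return hashlib.sha256(mandate).hexdigest()


@contextmanager
def guard_payment(server, mandate):
    key = reserve_key(mandate)
    reservation = server.reserve(key)
    try:
        yield reservation
    except BaseException:
        # The body failed: release the reservation, then re-raise.
        server.finalize(key, "released")
        raise
    else:
        # The body finished cleanly: commit the reservation.
        server.finalize(key, "committed")
```

In this toy model, entering the guard twice with the same mandate yields the same reservation id, and an exception raised inside the `with` body leaves the reservation in the released state while still propagating to the caller.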

How to run

CYCLES_BASE_URL=http://localhost:7878 \
CYCLES_API_KEY=cyc_dev_xxx \
CYCLES_TENANT=ap2-integration \
    pytest tests/integration -v

Tenant needs a budget with payment.charge permitted.

Test plan

  • Default pytest --cov=runcycles_ap2 --cov-fail-under=95: 147 passed, 5 skipped (integration suite skipped via env-var gate)
  • ruff check . && ruff format --check . — clean
  • mypy --strict runcycles_ap2 — 0 errors (8 files)
  • python -m build — sdist + wheel build cleanly
  • CI green on this PR (will repeat the above — integration tests stay skipped)
  • Manual run against a local dev Cycles server (reviewer / merger verifies)

What this PR is NOT

  • Not a version bump. No public API change. No wire-shape change. No CHANGELOG entry (only AUDIT records it because it's purely a test-surface addition).

Adds `tests/integration/test_live_ap2_guard.py` — five tests
(four sync + one async) exercising cycles_guard_payment and
cycles_guard_payment_async end-to-end against a real Cycles
server. Pattern mirrors cycles-client-python/tests/integration/
test_live_server.py: module-level `pytest.mark.skipif(not
CYCLES_BASE_URL)` so the whole file is skipped at collection
time when env vars are unset. Default `pytest` runs and CI
ignore it; verified locally — 147 passed, 5 skipped.

Why: every existing test uses MagicMock or AsyncMock for the
CyclesClient surface. A server-side rename of a `Subject.dimensions`
key, a field that flips from string to enum, or a change to
how response.is_success classifies status codes — none of those
would surface in mock-based tests. The integration suite catches
that class of regression when run against a real dev server.

Cost: zero in CI (skipped by env-var gate). When run, each test
uses a fresh UUID-based transaction_id and a $0.00000001 (1
micro-cent) amount, so the suite is idempotent across runs and
costs essentially nothing in budget per execution.

Covers:
- Sync clean commit (lifecycle + receipt fields)
- Sync exception → release (no deadlock, propagates)
- Sync dry-run → AP2DryRunResult with decision payload
- Sync idempotent replay (same mandate → same reserve key,
  server collapses or surfaces finalized status — both OK)
- Async clean commit with open_mandate_hash scope (exercises the
  AP2 §6 consume-once path through async)

To run locally:

    CYCLES_BASE_URL=http://localhost:7878 \
    CYCLES_API_KEY=cyc_dev_xxx \
    CYCLES_TENANT=ap2-integration \
        pytest tests/integration -v

Tenant needs a budget with `payment.charge` permitted.

README Development section + AUDIT entry both updated. No public
API change. No wire change. No version bump.

amavashev merged commit d893a5b into main on May 13, 2026
6 checks passed
amavashev deleted the tests/live-integration-smoke branch on May 13, 2026 at 21:54