test(sdk/py): salvage first-run contract tests from #591 #633
Merged
Pull request overview
Adds a new Python SDK “first-run” contract test suite under sdk/py/tests/contracts/, salvaging the durable regression/UX contracts from closed audit PR #591 so key v0.10.0 behaviors stay pinned in the normal test run.
Changes:
- Introduces `test_first_run.py` contract tests covering in-process create/sign/hash/verify, chain verification/tamper detection, daemon emitter round-trip (opt-in via `AGENTRECEIPTS_SOCKET`), silent-drop behavior, top-level `Emitter` Protocol behavior, and WAL retain/replay semantics.
- Adds the `tests/contracts` package initializer to integrate the suite into pytest discovery.
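The chain verification and tamper detection contract listed above can be pictured with a small hash chain. This is only a sketch under stated assumptions: `receipt_hash`, `make_receipt`, and `verify_chain` are hypothetical helpers, not the SDK's API, and signing is omitted entirely.

```python
from __future__ import annotations

import hashlib
import json
from typing import Optional


def receipt_hash(body: dict) -> str:
    """Hash a receipt body deterministically (sorted keys, compact separators)."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_receipt(action: str, prev_hash: Optional[str]) -> dict:
    """Build a hypothetical receipt that links to the previous receipt's hash."""
    body = {"action": action, "prev_hash": prev_hash}
    return {**body, "hash": receipt_hash(body)}


def verify_chain(chain: list[dict]) -> bool:
    """Return True only if every hash is valid and every link points at its predecessor."""
    prev = None
    for receipt in chain:
        body = {"action": receipt["action"], "prev_hash": receipt["prev_hash"]}
        if receipt["prev_hash"] != prev or receipt["hash"] != receipt_hash(body):
            return False
        prev = receipt["hash"]
    return True


# A 2-link chain verifies as built.
first = make_receipt("read_file", None)
second = make_receipt("write_file", first["hash"])
chain_ok = verify_chain([first, second])

# Editing a field without recomputing the hash breaks verification.
tampered_first = dict(first, action="delete_file")
tampered_ok = verify_chain([tampered_first, second])
```

A contract test in this style asserts both directions: the untouched chain verifies, and a single edited field makes verification fail.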
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| sdk/py/tests/contracts/test_first_run.py | Adds first-run contract tests pinning day-one SDK behaviors (in-process flow, daemon emitter contracts, and WAL semantics). |
| sdk/py/tests/contracts/__init__.py | Creates the contracts test package for pytest discovery/organization. |
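The WAL semantics named in the table (retain on failure, replay later) can be illustrated with an in-memory stand-in. This is a sketch, not the SDK's `WalEmitter`: `FlakyEmitter` and `SketchWalEmitter` are hypothetical names, the log lives in a list rather than on disk, and `ConnectionError` is an assumed failure type.

```python
from __future__ import annotations


class FlakyEmitter:
    """Hypothetical inner emitter that fails until `fail` is switched off."""

    def __init__(self) -> None:
        self.fail = True
        self.delivered: list[dict] = []

    def emit(self, receipt: dict) -> None:
        if self.fail:
            raise ConnectionError("daemon unavailable")
        self.delivered.append(receipt)


class SketchWalEmitter:
    """Logs each receipt first, then tries to deliver; undelivered receipts are retained."""

    def __init__(self, inner: FlakyEmitter) -> None:
        self._inner = inner
        self._pending: list[dict] = []

    def emit(self, receipt: dict) -> None:
        self._pending.append(receipt)  # write to the log before attempting delivery
        self.replay()

    def replay(self) -> int:
        """Deliver pending receipts in order; stop at the first failure. Returns the count delivered."""
        delivered = 0
        while self._pending:
            try:
                self._inner.emit(self._pending[0])
            except ConnectionError:
                break
            self._pending.pop(0)
            delivered += 1
        return delivered

    @property
    def pending(self) -> list[dict]:
        return list(self._pending)


inner = FlakyEmitter()
wal = SketchWalEmitter(inner)
wal.emit({"id": 1})
wal.emit({"id": 2})
retained = len(wal.pending)  # both receipts are retained while the inner emitter fails

inner.fail = False
replayed = wal.replay()  # delivery resumes in the original order
```

The property worth pinning is ordering plus retention: nothing is dropped while delivery fails, and replay hands receipts to the inner emitter in the order they were emitted.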
ojongerius added a commit that referenced this pull request May 26, 2026
Drops the `_is_protocol` attribute check and the exact TypeError message match in `test_top_level_emitter_is_now_a_protocol`. Both are `typing` implementation details that can drift across Python versions. The behavioural contract — that instantiating `Emitter` raises TypeError — is what matters from the user's perspective and is what the test now pins. Addresses Copilot feedback on #633.
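The behavioural contract described in this commit can be reproduced with the standard library alone. The sketch below uses a hypothetical `Emitter` protocol with an assumed `emit` method, not the SDK's real definition; it relies only on the fact that a `typing.Protocol` class raises `TypeError` when instantiated directly.

```python
from typing import Protocol


class Emitter(Protocol):
    """Hypothetical stand-in for the SDK's top-level Emitter protocol."""

    def emit(self, receipt: dict) -> None: ...


def instantiation_raises_type_error() -> bool:
    """Check only the exception type, not the message or private attributes."""
    try:
        Emitter()
    except TypeError:
        return True
    return False


result = instantiation_raises_type_error()
```

Asserting on the exception type alone keeps the test stable if the wording of the error or the private `_is_protocol` flag changes between Python versions.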
Salvages the regression tests from the v0.10.0 first-run audit (closed PR #591) into the sdk/py suite so the contracts they pin survive without the surrounding audit-report markdown.
Pinned contracts:
- In-process happy path (README Quick Start)
- 2-link chain verify + tamper detection
- Top-level `Emitter` is a Protocol (v0.10.0 breaking rename)
- `WalEmitter` retains on failure and replays (v0.10.0 WAL fix)
- Live-daemon round-trip (skipped unless AGENTRECEIPTS_SOCKET is set)
Two tests pin behaviour that is currently under decision and are flagged in module docstring to be flipped when the decision lands rather than silently kept:
- silent-drop on missing daemon (#599 emit-failure-contract)
- runtime_checkable / DaemonEmitter arity footgun (PY-P4)
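The `runtime_checkable` arity footgun in the last item belongs to a general class of problem that is easy to show in isolation. The sketch below assumes the issue is of that general kind; `TwoArgEmitter` is a hypothetical class and says nothing about `DaemonEmitter`'s real signature. An `isinstance` check against a `runtime_checkable` protocol only confirms that the named methods exist, not that their signatures match.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Emitter(Protocol):
    """Hypothetical protocol: emit() takes a single receipt."""

    def emit(self, receipt: dict) -> None: ...


class TwoArgEmitter:
    """Hypothetical emitter whose emit() needs an extra required argument."""

    def emit(self, receipt: dict, channel: str) -> None:
        pass


candidate = TwoArgEmitter()

# Passes: the check only looks for an attribute named `emit`.
passes_isinstance = isinstance(candidate, Emitter)

# Fails at call time: the protocol-shaped call is missing `channel`.
try:
    candidate.emit({"action": "read_file"})
    call_failed = False
except TypeError:
    call_failed = True
```

A wrapper that gates on `isinstance` alone would accept such an object and only fail on the first emit, which is why the current behaviour is pinned by a test rather than left implicit.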
501bdf6 to 3fea55e
ojongerius added a commit that referenced this pull request May 26, 2026
* chore: remove committed binary and expand .gitignore
  Removes the accidentally committed agent-receipts Mach-O binary and adds /agent-receipts, /bin/, and /daemon/bin/ to .gitignore so built binaries cannot be committed again.
* docs(ops): update current.md for 2026-05-26
  Mark closure-1 nodes shipped (#621/#623/#624/#625/#628), add daemon-setup-stale-api and py-readme-daemon-refresh nodes, record #632/#633/#634, note #592 trusted-publishing blocker, refresh Next farmable section.
Summary
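Salvages the regression tests from the v0.10.0 first-run audit (closed PR #591) into `sdk/py/tests/contracts/` so the contracts they pin survive without the surrounding audit-report markdown.
What's pinned
| Test | Contract |
|---|---|
| `test_in_process_happy_path` | In-process happy path (README Quick Start) |
| `test_two_receipt_chain` | 2-link chain verify + tamper detection |
| `test_daemon_emitter_roundtrip_against_live_daemon` | Live-daemon round-trip (skipped unless `AGENTRECEIPTS_SOCKET` is set) |
| `test_daemon_emitter_no_daemon_is_silent_drop` | Silent-drop on missing daemon (#599 emit-failure-contract) |
| `test_top_level_emitter_is_now_a_protocol` | v0.10.0 breaking rename (`Emitter` → Protocol) |
| `test_wal_emitter_retains_on_failure_and_replays` | `WalEmitter` retains on failure and replays (v0.10.0 WAL fix) |
| `test_wal_emitter_cannot_wrap_daemon_emitter` | `runtime_checkable` / DaemonEmitter arity footgun (PY-P4) |
Tests pinning behaviour under decision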
Two of the seven pin behaviour that's currently being reconsidered. Both are flagged in the module docstring so a future maintainer flips them when the decision lands rather than silently keeping a buggy default:
- `test_daemon_emitter_no_daemon_is_silent_drop`. Closure 2: Emit failure contract decided at the protocol level + propagated across SDKs #599 (emit-failure-contract) may decide emit MUST raise on transport failure; if so, flip this test to assert the new contract.
- `runtime_checkable` / DaemonEmitter arity footgun, pinned by `test_wal_emitter_cannot_wrap_daemon_emitter`. Tracked as PY-P4 in `docs/operations/current.md`, gated on Closure 2: Emit failure contract decided at the protocol level + propagated across SDKs #599. When PY-P4 lands, replace this with a positive assertion that `WalEmitter` wrapped around `DaemonEmitter` composes correctly.
Provenance
PR #591 surfaced these tests inside an `audit/` directory alongside a snapshot-dated report and a CI-example workflow. The report and example are not being salvaged (they're rot-prone). The tests are the durable asset and belong in the regular SDK suite.
The other audit findings have been refiled as #630 (closure-1 follow-on, `py-readme-daemon-refresh`) and #631 (background, "Using Agent Receipts in CI" docs page).
Test plan
- `uv run pytest tests/contracts/ -v`: 6 passed, 1 skipped
- `uv run pytest` (full sdk/py suite): 411 passed, 6 skipped
- `uv run ruff check tests/contracts/`: clean
- `uv run ruff format --check tests/contracts/`: clean