feat: verification modes + evidence fields + transport combo rejection (hosted parity) #18
Open
govindkavaturi-art wants to merge 1 commit into main from
Ports the outcome-verification feature from the hosted monorepo into
cueapi-core and fixes the partial /verify endpoint that PR #15 left
behind.
Schema:
- VerificationMode enum (none, require_external_id, require_result_url,
require_artifacts, manual) + VerificationPolicy on CueCreate/CueUpdate.
- OutcomeRequest accepts evidence fields inline (external_id,
result_url, result_ref, result_type, summary, artifacts). Legacy shape
still works.
Model:
- Migration 017: verification_mode column on cues (String(50), nullable,
CHECK-constrained enum). NULL == 'none'. evidence_* columns already
existed from PR #15 and are reused.
Services:
- outcome_service computes outcome_state from (success, mode, evidence).
Missing required evidence -> verification_failed. Manual mode parks in
verification_pending. Failure bypasses verification entirely.
- cue_service _check_transport_verification_combo rejects
worker+evidence at create and update. Lifted in a follow-up PR once
cueapi-worker 0.3.0 lands on PyPI.
Router:
- POST /v1/executions/{id}/verify now accepts {valid: bool, reason: str?}.
valid=true preserves legacy behavior; valid=false -> verification_failed
with reason recorded on evidence_summary. Accepted starting states
expanded to include reported_failure. Empty body defaults to valid=true
for full backward compat.
Tests:
- 35 new tests across 4 files (verification_modes,
transport_verification_combo, outcome_evidence, verify_endpoints).
- Amended test_execution_parity.py::test_verify_wrong_state to use a
pre-outcome state (reported_failure is now a valid starting state).
- Full-suite delta: +35 passing, 0 new failures. Pre-existing
SDK-integration failures (cueapi Python package not installed locally)
unchanged.
Alert firing for verification_failed deferred to PR 2 (alerts feature).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
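The two service-layer rules in the commit message above can be sketched roughly as follows. This is an illustration, not the code in `outcome_service` or `cue_service`: the function names, the `REQUIRED_EVIDENCE_FIELD` mapping from mode to evidence field, and the treatment of empty values as "missing" are assumptions. The state names and the error payload shape are taken from this PR's description.

```python
from typing import Optional

# Assumption: each evidence-requiring mode checks exactly one evidence field.
REQUIRED_EVIDENCE_FIELD = {
    "require_external_id": "external_id",
    "require_result_url": "result_url",
    "require_artifacts": "artifacts",
}
WORKER_SUPPORTED_MODES = ["none", "manual"]


def compute_outcome_state(success: bool, mode: Optional[str], evidence: dict) -> str:
    """Derive outcome_state from (success, mode, evidence)."""
    if not success:
        # Failure bypasses verification entirely.
        return "reported_failure"
    if mode is None or mode == "none":
        # NULL and 'none' are treated identically.
        return "reported_success"
    if mode == "manual":
        # Manual mode parks the execution until someone calls /verify.
        return "verification_pending"
    if mode not in REQUIRED_EVIDENCE_FIELD:
        raise ValueError(f"unknown verification mode: {mode!r}")
    # Assumption: an empty string or empty list counts as missing evidence.
    has_evidence = bool(evidence.get(REQUIRED_EVIDENCE_FIELD[mode]))
    return "verified_success" if has_evidence else "verification_failed"


def transport_combo_error(transport: str, mode: Optional[str]) -> Optional[dict]:
    """Return the rejection payload for worker + evidence mode, else None."""
    if transport == "worker" and mode in REQUIRED_EVIDENCE_FIELD:
        return {
            "error": {
                "code": "unsupported_verification_for_transport",
                "transport": transport,
                "verification_mode": mode,
                "supported_worker_modes": WORKER_SUPPORTED_MODES,
            }
        }
    return None
```

Returning a payload instead of raising keeps the sketch framework-neutral; a real service would presumably translate it into an HTTP error response.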
govindkavaturi-art pushed a commit that referenced this pull request Apr 17, 2026
Ports the alerts feature to OSS. Deliberately excludes SendGrid/email
— self-hosters configure alert_webhook_url and forward to their own
Slack/Discord/ntfy/SMTP relay. Hosted cueapi.ai keeps managed email.
Model + migrations:
- app/models/alert.py: id/user_id/cue_id/execution_id/alert_type/
severity/message/alert_metadata (column 'metadata')/acknowledged/
created_at. CHECK on alert_type IN ('outcome_timeout',
'verification_failed', 'consecutive_failures'). CHECK on severity.
Indexes: user_id, (user_id, created_at), execution_id.
- alembic 018: alerts table.
- alembic 019: users.alert_webhook_url (String 2048) +
alert_webhook_secret (String 64), both nullable.
- 018.down_revision = '016' intentionally — PR #18 introduces 017 but
isn't merged yet. When PR #18 merges first, rebase this PR to chain
017 -> 018. Documented in the migration docstring.
Services:
- app/services/alert_service.py: create_alert with 5-min dedup on
(user_id, alert_type, execution_id|cue_id). count_consecutive_failures
walks execution history backwards, stops at first non-failed.
Threshold = 3. Webhook delivery is fire-and-forget via
asyncio.create_task.
- app/services/alert_webhook.py: deliver_alert with HMAC-SHA256 over
'{timestamp}.{sorted_payload_json}', 10s timeout, SSRF re-resolve at
delivery, never raises. No-URL short-circuits silently. URL-without-
secret logs a warning and skips.
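
A rough sketch of the signing scheme described above, HMAC-SHA256 over '{timestamp}.{sorted_payload_json}'. The function names, the compact JSON separators, and the hex encoding of the digest are assumptions made for illustration; the commit message does not specify them, and a receiver has to match whatever alert_webhook.py actually emits.

```python
import hashlib
import hmac
import json


def sign_alert_payload(secret: str, payload: dict, timestamp: int) -> str:
    """Sign '{timestamp}.{sorted_payload_json}' with HMAC-SHA256."""
    # sort_keys makes the serialization independent of dict insertion order.
    # Assumption: compact separators; the real serializer may differ.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_alert_signature(secret: str, payload: dict, timestamp: int, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_alert_payload(secret, payload, timestamp)
    return hmac.compare_digest(expected, signature)
```

Including the timestamp in the signed message lets a receiver also reject stale deliveries, although the commit message does not say whether the example receiver does so.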
Router + auth:
- app/routers/alerts.py: GET /v1/alerts with alert_type/since/limit/
offset filters, 400 on invalid type, auth-scoped.
- app/routers/auth_routes.py: PATCH /me accepts alert_webhook_url
(empty string clears; SSRF-validated). GET /alert-webhook-secret
lazy-generates on first call. POST /alert-webhook-secret/regenerate
requires X-Confirm-Destructive.
Integration into outcome_service.record_outcome (post-commit):
- verification_failed alert fires when execution.outcome_state ==
'verification_failed'. Dormant on current main (the rule engine that
sets this state lives in PR #18); activates automatically once #18
merges. No rebase of integration code required — only the migration
chain needs updating.
- consecutive_failures alert fires when the streak reaches 3 on a
failed outcome. Independent of PR #18 — works on current main.
- outcome_timeout alert firing deferred — requires a deadline-checking
poller that cueapi-core doesn't have yet. CHECK constraint and
router already accept the type so the wiring is drop-in when that
poller lands.
- Alert firing is wrapped in try/except — must never break outcome
reporting.
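
The streak rule behind the consecutive_failures alert can be sketched as below. This is an assumption-laden illustration: the real count_consecutive_failures walks execution rows, while this version takes a plain list of failed flags ordered newest first, and the helper names are hypothetical. Whether the alert fires only at exactly 3 or at every failure from 3 onward is not stated; the sketch uses ">=" and relies on the 5-minute dedup mentioned earlier to suppress repeats.

```python
from typing import Iterable

CONSECUTIVE_FAILURE_THRESHOLD = 3  # threshold stated in the commit message


def count_consecutive_failures(failed_newest_first: Iterable[bool]) -> int:
    """Walk history backwards and stop at the first non-failed execution."""
    streak = 0
    for failed in failed_newest_first:
        if not failed:
            break
        streak += 1
    return streak


def should_fire_consecutive_failures(failed_newest_first: Iterable[bool]) -> bool:
    """True once the current failure streak reaches the threshold."""
    # Assumption: '>=' rather than '=='; the source only says 'reaches 3'.
    return count_consecutive_failures(failed_newest_first) >= CONSECUTIVE_FAILURE_THRESHOLD
```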
Tests (36 new, all passing):
- test_alert_model.py (6): CRUD, CHECK rejection for invalid
type/severity, parametrized valid types, index existence.
- test_alert_service.py (7): create persists, dedup within window,
dedup doesn't cross alert types, consecutive_failures counter +
streak-breaking + threshold constant.
- test_alert_webhook_delivery.py (7): no-URL short-circuit, URL-
without-secret skip, SSRF block, HMAC signature recomputation,
timeout/non-2xx/RuntimeError all swallowed.
- test_alerts_api.py (8): empty list, own alerts, type filter, invalid
type rejected, pagination, cross-user scoping, auth required.
- test_alert_webhook_config.py (6): set valid URL, empty string clears,
SSRF rejection at config, lazy secret generation, confirmation
required, rotation.
- test_outcome_triggers_alert.py (3): verification_failed end-to-end
(seeds outcome_state to exercise the integration path), consecutive
failures end-to-end, isolated failure does NOT fire.
Full-suite delta: +36 passing, 0 new failures. Pre-existing SDK-
integration failures (cueapi Python package not installed locally)
unchanged.
Docs:
- README 'Alerts' section with alert types, querying, webhook setup.
- examples/alert_webhook_receiver.py: 30-line Flask receiver with
signature verification.
- CHANGELOG [Unreleased] entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
argus-qa-ai (Collaborator) approved these changes Apr 17, 2026
All CI checks passing. Approved by Argus.
Summary

Ports the outcome-verification feature from the hosted monorepo into cueapi-core, and fixes the partially ported `/verify` endpoint from PR #15 so it honors the `{valid, reason}` contract documented by the hosted API.

What changed

Schema (`app/schemas/`)

- `VerificationMode` enum: `none`, `require_external_id`, `require_result_url`, `require_artifacts`, `manual`
- `VerificationPolicy` sub-object with a single `mode` field today (leaves room for future fields without breaking the shape)
- `verification: Optional[VerificationPolicy]` on `CueCreate` and `CueUpdate`
- `OutcomeRequest` extended with optional `external_id`, `result_url`, `result_ref`, `result_type`, `summary`, `artifacts`. Legacy shape (`{success, result, error, metadata}`) is unchanged

Model (`app/models/cue.py`)

- `verification_mode` column: `String(50)`, nullable, `CHECK` constraint over the enum values. `NULL` and `'none'` are treated identically
- `evidence_*` columns already exist (PR #15, "feat: port 10 missing endpoints for feature parity with hosted service") and are reused

Migration

- `017_add_verification_mode.py`: adds the column and the CHECK constraint. Applies cleanly on a blank DB (verified locally). Downgrade drops the constraint first, then the column

Services

`outcome_service.record_outcome` computes `outcome_state` from `(success, verification_mode, evidence)`:

| success | verification_mode | required evidence | outcome_state |
|---|---|---|---|
| false | any | not checked | `reported_failure` |
| true | `none` / `NULL` | not required | `reported_success` |
| true | `manual` | not checked | `verification_pending` |
| true | `require_external_id` | present | `verified_success` |
| true | `require_external_id` | missing | `verification_failed` |
| true | `require_result_url` | present | `verified_success` |
| true | `require_result_url` | missing | `verification_failed` |
| true | `require_artifacts` | present | `verified_success` |
| true | `require_artifacts` | missing | `verification_failed` |

`cue_service._check_transport_verification_combo` rejects worker transport paired with evidence-requiring modes at both create and update (see "Restriction" below).

Router

- `POST /v1/executions/{id}/verify` now accepts `{valid: bool, reason: str?}` via a typed `VerifyRequest` body
  - `valid=true` (default) → `verified_success` (legacy behavior preserved; an empty body still works)
  - `valid=false` → `verification_failed`, with `reason` recorded on `evidence_summary` (truncated to 500 chars, prepended to any existing summary)
- Accepted starting states now include `reported_failure`. This state was rejected before, but there was no semantic reason to reject it
- `OutcomeResponse` now surfaces `outcome_state`

Intentional behavior change

`POST /v1/executions/{id}/verify` with an explicit `{valid: false}` body now transitions to `verification_failed` instead of `verified_success`. Before this PR, the endpoint ignored the request body and always transitioned to `verified_success`, a silent-failure bug that made the `valid=false` branch impossible to exercise. Callers relying on the always-success behavior were getting broken semantics anyway. Empty-body requests remain `verified_success` (the previous default).

Restriction

Worker-transport cues cannot combine with `require_external_id` / `require_result_url` / `require_artifacts`. Attempting to do so at create or PATCH time returns:

    {
      "error": {
        "code": "unsupported_verification_for_transport",
        "transport": "worker",
        "verification_mode": "require_external_id",
        "supported_worker_modes": ["none", "manual"]
      }
    }

This is because cueapi-worker < 0.3.0 has no mechanism to attach evidence to the outcome POST. The restriction will be lifted in a follow-up PR once cueapi-worker 0.3.0 (evidence reporting via `CUEAPI_OUTCOME_FILE`) is published to PyPI.

Tests

35 new tests across four files:

- `tests/test_verification_modes.py`: 10 tests, 5 modes × (satisfied, unsatisfied / applicable variants)
- `tests/test_transport_verification_combo.py`: 13 tests: 3 evidence modes rejected × (create, PATCH) + 2 worker-compatible modes accepted + 5 webhook-always-allowed modes + 3 PATCH transitions
- `tests/test_outcome_evidence.py`: 4 tests: inline evidence persists, legacy shape still works, Pydantic length caps enforced, PATCH evidence still works
- `tests/test_verify_endpoints.py`: 8 tests covering both branches of `valid`, empty body default, reason-preserves-existing-summary, invalid-state rejections, and `/verification-pending`

Test-suite delta

- `test_execution_parity.py::TestVerify::test_verify_wrong_state` now uses a pre-outcome state (since `reported_failure` is now a valid starting state)
- Pre-existing failures in `test_sdk_integration.py` (7): `ModuleNotFoundError: No module named 'cueapi'`. Confirmed on clean `origin/main` (stashed this PR's changes and re-ran). These are environment-dependent tests that expect the Python SDK to be installed; CI handles that

Backward compatibility

- `POST /outcome` without evidence fields → identical behavior to before
- `POST /verify` with empty body → identical behavior to before (`verified_success`)
- No `verification` field → `verification_mode = NULL` → outcome-state engine treats it as `none` → same `reported_success` / `reported_failure` semantics as before
- `PATCH /v1/executions/{id}/evidence` → untouched, still accepts the two-step flow

References

- `app/schemas/cue.py` (`VerificationMode` / `VerificationConfig`), `app/schemas/outcome.py` (evidence fields), `app/services/outcome_service.py` (rule engine), `app/services/cue_service.py` (`_check_transport_verification_combo`), `app/routers/executions.py` (`/verify` body contract)
- `ABSENT` / `PARTIAL` items 1, 2, 3, 4, 5, 6, 8 from the cueapi-core drift re-audit. Items 7, 9–14 are out of scope (alerts + sync-discipline land in follow-up PRs)

Test plan

- `pytest tests/test_verification_modes.py tests/test_transport_verification_combo.py tests/test_outcome_evidence.py tests/test_verify_endpoints.py`
- `pytest tests/`: no new failures (SDK-integration failures pre-exist)
- `alembic upgrade head` from empty schema
- `\d cues`

🤖 Generated with Claude Code
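
As an illustration of the `/verify` contract described under Router, the state transition might look roughly like this. It is a sketch under stated assumptions, not the code in `app/routers/executions.py`: the function name, the exact set of accepted starting states (the PR only says that `reported_failure` was added), the newline used to join `reason` with an existing summary, and the `not_reported` state in the usage example are all hypothetical.

```python
from typing import Optional, Tuple

# Assumption: the full accepted set is not listed in the PR description.
ASSUMED_VERIFIABLE_STATES = {"reported_success", "reported_failure", "verification_pending"}
REASON_MAX_CHARS = 500  # truncation limit stated in the PR description


def apply_verify(
    outcome_state: str,
    evidence_summary: Optional[str],
    valid: bool = True,
    reason: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (new_outcome_state, new_evidence_summary) for a /verify call."""
    if outcome_state not in ASSUMED_VERIFIABLE_STATES:
        raise ValueError(f"cannot verify from state {outcome_state!r}")
    if valid:
        # Empty body defaults to valid=True, preserving the legacy behavior.
        return "verified_success", evidence_summary
    if not reason:
        return "verification_failed", evidence_summary
    truncated = reason[:REASON_MAX_CHARS]
    # Assumption: reason and existing summary are joined with a newline.
    summary = f"{truncated}\n{evidence_summary}" if evidence_summary else truncated
    return "verification_failed", summary
```

For example, `apply_verify("reported_failure", "old", valid=False, reason="bad")` yields `("verification_failed", "bad\nold")` under these assumptions, while a pre-outcome state such as the hypothetical `not_reported` raises `ValueError`.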