fix(control-plane): authorize execution note writes (#420) #575
📊 Coverage gate
Thresholds from
✅ Gate passed: No surface regressed past the allowed threshold and the aggregate stayed above the floor.
📐 Patch coverage gate
Threshold: 80% on lines this PR touches vs
✅ Patch gate passed: Every surface whose lines were touched by this PR has patch coverage at or above the threshold.
b9fb5f5 to 72f59ee
I fixed the patch coverage failure by adding targeted tests in What was added:
santoshkumarradha left a comment
🔴 PR-AF Review — Needs Major Rework
Automated multi-agent code review · PR-AF built with AgentField
10 findings · 🔴 3 critical · 🟠 4 important · 🔵 0 suggestions · ⚪ 0 nitpicks
PR Overview
Summary
Fixes an IDOR in the execution notes write endpoint by enforcing execution ownership before appending a note.
File-specific changes:
- `control-plane/internal/handlers/execution_notes.go`
  - Resolves the caller agent identity before writing a note.
  - Compares the caller agent ID with the execution owner, `execution.AgentNodeID`.
  - Returns `403 execution_ownership_mismatch` when the caller does not own the execution.
  - Supports DID-authenticated callers by resolving the verified caller DID to an agent ID.
- `control-plane/internal/server/middleware/auth.go`
  - Stores API-key caller identity in Gin context using `CallerAgentIDKey`.
  - Uses `X-Caller-Agent-ID` first, with `X-Agent-Node-ID` as fallback.
- `control-plane/internal/handlers/execution_notes_test.go`
  - Adds coverage for owner write success.
  - Adds coverage for non-owner API-key write returning `403`.
  - Adds coverage for DID-authenticated owner and non-owner behavior.
- `control-plane/internal/server/middleware/auth_test.go`
  - Adds coverage that API-key auth populates caller identity in Gin context.
- `control-plane/internal/handlers/coverage_handlers_90_test.go`
  - Updates the existing successful note-write coverage test to include matching caller identity.
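The ownership check described for `execution_notes.go` can be illustrated with a minimal sketch. This is not the PR's handler: the `Execution` struct and the `addNote` function are hypothetical, simplified stand-ins, and only `AgentNodeID`, the `403` status, and the `execution_ownership_mismatch` code come from the description above.

```go
package main

import (
	"fmt"
	"net/http"
)

// Execution is a hypothetical, simplified execution record.
type Execution struct {
	ID          string
	AgentNodeID string   // agent that owns the execution
	Notes       []string // notes appended so far
}

// addNote appends note only when callerAgentID owns exec.
// It returns the HTTP status and, on rejection, an error code.
func addNote(exec *Execution, callerAgentID, note string) (int, string) {
	if callerAgentID != exec.AgentNodeID {
		// Reject before touching storage so a blocked write leaves no trace.
		return http.StatusForbidden, "execution_ownership_mismatch"
	}
	exec.Notes = append(exec.Notes, note)
	return http.StatusOK, ""
}

func main() {
	execB := &Execution{ID: "exec-b", AgentNodeID: "agent-b"}

	// Agent A tries to write to Agent B's execution.
	status, code := addNote(execB, "agent-a", "not mine")
	fmt.Println(status, code, len(execB.Notes)) // 403 execution_ownership_mismatch 0

	// Agent B writes to its own execution.
	status, _ = addNote(execB, "agent-b", "mine")
	fmt.Println(status, len(execB.Notes)) // 200 1
}
```

The sketch mirrors the manual E2E check listed under Testing: the blocked write returns `403` and the target execution's notes stay empty.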
Testing
- `./scripts/test-all.sh`
- Additional verification:
  - `cd control-plane && go test ./internal/handlers ./internal/server/middleware`
  - `cd control-plane && go test ./internal/handlers ./internal/server/middleware -coverprofile=/tmp/issue-420.coverprofile`
  - `cd control-plane && golangci-lint run --new-from-rev=upstream/main ./internal/handlers ./internal/server/middleware`
  - Manual E2E curl verification:
    - Agent A adding a note to Agent A’s execution returns `200 OK`
    - Agent A adding a note to Agent B’s execution returns `403 Forbidden`
    - Agent B’s fresh execution remains with `notes: []` after the blocked write
Note: full repo lint currently reports pre-existing unrelated Go lint issues outside this PR’s changed files. Changed-line lint for the touched packages reports 0 issues.
Checklist
- I updated documentation where applicable.
- I added or updated tests (or none were needed).
- I updated `CHANGELOG.md` (or this change does not warrant a changelog entry).
Screenshots (if UI-related)
Not UI-related.
Related issues
Fixes #420
Key Findings
7 issue(s) should be addressed before merge:
- 🔴 Complete ownership enforcement bypass when APIKey is empty and DID auth is off, the default configuration (`control-plane/internal/server/middleware/auth.go:24`) - When `AuthConfig.APIKey` is empty (the default in ALL deployment configurations: `config/agentfield.yaml`, Docker Compose at `deployments/docker/docker-compose.yml`, and Helm at `deployments/helm/agen…
- 🔴 Three-tier identity fallback in execution notes handler has no fail-closed mechanism: raw-header tier silently becomes primary identity source when all upstream auth middleware is configuration-disabled (`control-plane/internal/handlers/execution_notes.go:184`) - The execution notes handler's `executionNoteCallerAgentID` implements a three-tier identity resolution cascade: (1) verified DID from `DIDAuthMiddleware`, (2) `CallerAgentIDKey` context value from APIKeyAut…
- 🔴 Fixing the default authentication bypass by enabling DID auth silently activates diverging DID resolution paths: a 'fix F4, expose F1' trap (`control-plane/internal/handlers/execution_notes.go:184`) - F4 establishes that under the default configuration (`APIKey` empty AND `did_auth_enabled` false), the execution notes handler accepts unvalidated raw headers as caller identity, a complete ownership enf…
- 🟠 GetExecutionNotesHandler leaks execution notes to any authenticated caller with no ownership enforcement (`control-plane/internal/handlers/execution_notes.go:235`) - The PR fixes an IDOR on the write path (`AddExecutionNoteHandler`) by enforcing execution ownership, but the read path (`GetExecutionNotesHandler`) remains completely open: any API-key-authenticated calle…
- 🟠 DID auth middleware provides no defense against no-auth bypass: attacker can simply omit DID headers (`control-plane/internal/server/middleware/did_auth.go:177`) - The assignment question asks: "Does the DIDAuthMiddleware at routes_middleware.go:77-88 provide any defense when API key auth is off?" No, not in any meaningful way. If `DID.Enabled && DIDAuthEna…
- 🟠 Two-tier DID resolution reads semantically-different field names (AgentID vs AgentNodeID) from independent tables with no structural equivalence guarantee (`control-plane/internal/handlers/execution_notes.go:206`) - The function `resolveExecutionNoteAgentIDByDID` at line 206 resolves a caller DID to an agent identifier through two independent code paths that read *differently-named columns from different tables…
- 🟠 Type-unsafe CallerAgentIDKey enables silent reversion to attacker-controlled raw-header identity when any middleware writes a non-string value, turning a compile-time type error into a runtime authentication bypass (`control-plane/internal/handlers/execution_notes.go:189`) - The combination of F2 (`CallerAgentIDKey` accepts any value type via Gin's `c.Set(key string, value any)`) and F4 (`executionNoteCallerAgentID` falls through to raw `X-Caller-Agent-ID` / `X-Agent-Node-ID` hea…
Files with findings: control-plane/internal/handlers/execution_notes.go, control-plane/internal/server/middleware/auth.go, control-plane/internal/server/middleware/did_auth.go
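The fail-closed behavior that the critical findings ask for can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `Request` struct, its field names, `resolveCaller`, and `ErrNoVerifiedCaller` are all hypothetical. The point is only that raw request headers are never consulted as an identity source, so a missing verified identity yields an error instead of a spoofable fallback.

```go
package main

import (
	"errors"
	"fmt"
)

// Request is a hypothetical view of what the handler can see.
type Request struct {
	VerifiedDIDAgentID string            // assumed to be set only after DID signature verification
	APIKeyAgentID      string            // assumed to be set only after a successful API-key check
	Headers            map[string]string // raw, caller-controlled values
}

// ErrNoVerifiedCaller signals that no trusted identity source produced a value.
var ErrNoVerifiedCaller = errors.New("no verified caller identity")

// resolveCaller trusts only identities established by authentication.
// It deliberately ignores r.Headers, so it fails closed when auth produced nothing.
func resolveCaller(r Request) (string, error) {
	if r.VerifiedDIDAgentID != "" {
		return r.VerifiedDIDAgentID, nil
	}
	if r.APIKeyAgentID != "" {
		return r.APIKeyAgentID, nil
	}
	return "", ErrNoVerifiedCaller
}

func main() {
	// A caller that only sets a header gets no identity at all.
	spoofed := Request{Headers: map[string]string{"X-Caller-Agent-ID": "agent-b"}}
	if _, err := resolveCaller(spoofed); err != nil {
		fmt.Println("rejected:", err)
	}

	// A caller authenticated by API key resolves normally.
	id, _ := resolveCaller(Request{APIKeyAgentID: "agent-a"})
	fmt.Println("caller:", id)
}
```

Returning an error rather than an empty string forces every call site to decide explicitly what an unauthenticated request is allowed to do.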
All Findings by Severity
🔴 Critical (3)
- Complete ownership enforcement bypass when APIKey is empty and DID auth is off, the default configuration
  `control-plane/internal/server/middleware/auth.go:24`
- Three-tier identity fallback in execution notes handler has no fail-closed mechanism: raw-header tier silently becomes primary identity source when all upstream auth middleware is configuration-disabled
  `control-plane/internal/handlers/execution_notes.go:184`
- Fixing the default authentication bypass by enabling DID auth silently activates diverging DID resolution paths: a 'fix F4, expose F1' trap
  `control-plane/internal/handlers/execution_notes.go:184`
🟠 Important (4)
- GetExecutionNotesHandler leaks execution notes to any authenticated caller with no ownership enforcement
  `control-plane/internal/handlers/execution_notes.go:235`
- DID auth middleware provides no defense against no-auth bypass: attacker can simply omit DID headers
  `control-plane/internal/server/middleware/did_auth.go:177`
- Two-tier DID resolution reads semantically-different field names (AgentID vs AgentNodeID) from independent tables with no structural equivalence guarantee
  `control-plane/internal/handlers/execution_notes.go:206`
- Type-unsafe CallerAgentIDKey enables silent reversion to attacker-controlled raw-header identity when any middleware writes a non-string value, turning a compile-time type error into a runtime authentication bypass
  `control-plane/internal/handlers/execution_notes.go:189`
Review Process Details
Dimensions Analyzed (15):
- No-Auth Mode Identity Spoofing Bypass — 4 file(s)
- CallerAgentIDKey Context Semantics Collision — 3 file(s)
- GET/POST Execution Notes Authorization Asymmetry — 2 file(s)
- DID Resolution Silent Degradation — Error vs Not-Found Conflation — 3 file(s)
- Context.Background() in GET Handler Bypasses Request Timeout — 2 file(s)
- DID resolution field-name divergence: AgentID vs AgentNodeID — 3 file(s)
- CallerAgentIDKey context type contract: non-string write → silent fallback — 3 file(s)
- No-auth middleware bypass: mechanical trace from empty APIKey to unauthenticated header reads — 4 file(s)
- Storage error propagation contract: errors.As dependency on UpdateExecutionRecord fidelity — 2 file(s)
- Context.Background() drift in GET handler: deadline propagation gap vs POST handler — 2 file(s)
- No-auth bypass of execution ownership enforcement via raw header fallback — 4 file(s)
- Triplicated caller resolution logic with diverged priority chains sharing one context namespace — 3 file(s)
- Auth error classification breaks if production storage provider wraps or replaces closure errors — 2 file(s)
- APIKeyAuth global broadcast of CallerAgentIDKey changes semantics for ALL authenticated routes — 2 file(s)
- DID-to-agent ID resolution produces different identifiers depending on resolution path (AgentID vs AgentNodeID field mismatch) — 3 file(s)
Meta-Dimension Lenses (3):
- Semantic — 5 dimension(s), 92% coverage confidence
- Mechanical — 5 dimension(s), 92% coverage confidence
- Systemic — 5 dimension(s), 92% coverage confidence
Cross-Reference & Adversary Analysis:
- 6 compound finding(s) synthesized
- 12 finding(s) adversarially tested: 12 confirmed, 0 challenged
Pipeline Stats
| Metric | Value |
|---|---|
| Duration | 4643.7s |
| Agent invocations | 65 |
| Coverage iterations | 0 |
| Estimated cost | N/A (provider does not report cost) |
| Budget exhausted | No |
| PR type | bugfix |
| Complexity | standard |
Review ID: rev_b6e41625c18a
@Luffy2208, just a heads-up that we're currently evaluating the review quality of https://github.com/Agent-Field/pr-af. If you notice any of these automated findings are off-base, noisy, or unhelpful, please let us know! Your feedback would be super helpful.
Closes the no-auth IDOR bypass flagged in PR Agent-Field#575 review.

The execution note write handler resolved caller identity from raw X-Caller-Agent-ID / X-Agent-Node-ID request headers as a last resort. Under the default config (no API key, DID auth off) that header read was the *sole* identity source, so any caller could spoof ownership and append notes to another agent's execution.

- executionNoteCallerAgentID now trusts only a verified DID (DIDAuthMiddleware) or the authenticated middleware context populated by APIKeyAuth after a successful key check. The raw-header fallback is removed.
- AddExecutionNoteHandler takes an ownershipEnforced flag. Ownership is enforced whenever an auth method is active; when the server is fully unauthenticated there is no trustworthy identity, so the guard is skipped (app.note() keeps working in local/dev) and a startup warning is logged.
- noteOwnershipEnforced() derives the flag from API-key / DID-auth config and drives the warning in applyGlobalMiddleware.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
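The flag derivation described in this commit might look roughly like the sketch below. The `AuthSettings` struct and its field names are assumptions made for illustration, and the function here takes the settings as a parameter, which may differ from the real `noteOwnershipEnforced()` signature. The commit only states that the flag is derived from API-key and DID-auth config and that a warning is logged when the server is fully unauthenticated.

```go
package main

import "log"

// AuthSettings is a hypothetical subset of the server's auth configuration.
type AuthSettings struct {
	APIKey         string // empty means API-key auth is off
	DIDEnabled     bool   // DID subsystem enabled
	DIDAuthEnabled bool   // DID-based request authentication enabled
}

// noteOwnershipEnforced reports whether any auth method is active,
// i.e. whether a trustworthy caller identity can exist at all.
func noteOwnershipEnforced(cfg AuthSettings) bool {
	apiKeyAuth := cfg.APIKey != ""
	didAuth := cfg.DIDEnabled && cfg.DIDAuthEnabled
	return apiKeyAuth || didAuth
}

func main() {
	cfg := AuthSettings{} // assumed default: no API key, DID auth off
	if !noteOwnershipEnforced(cfg) {
		// Warn once at startup so the skipped guard is visible to operators.
		log.Println("WARNING: no auth method configured; execution note ownership is not enforced")
	}
}
```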
…olver

Defense-in-depth for the DID->agent resolution used by execution-note ownership (PR Agent-Field#575 review).

The resolver accepted any non-error DID document and any matching agent_dids row regardless of revocation status, so a revoked did:key whose registry entry was marked revoked (auth still passes, it is self-verifying) could resolve to an agent identity.

- Skip DIDDocumentRecord results where IsRevoked() is true.
- Skip AgentDIDInfo entries with status "revoked".
- Document that DIDDocumentRecord.AgentID and AgentDIDInfo.AgentNodeID are the same value under differently-named fields, kept equivalent at registration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
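A rough sketch of the revocation-aware lookup described in this commit follows. The type names, `IsRevoked()`, `AgentID`, `AgentNodeID`, and the `"revoked"` status come from the commit message; the struct layouts, the in-memory slices standing in for the two tables, and the `resolveAgentIDByDID` name are assumptions for illustration only.

```go
package main

import "fmt"

// DIDDocumentRecord is a hypothetical shape for a registry document row.
type DIDDocumentRecord struct {
	DID     string
	AgentID string
	Revoked bool
}

// IsRevoked reports whether the registry marked this document revoked.
func (r DIDDocumentRecord) IsRevoked() bool { return r.Revoked }

// AgentDIDInfo is a hypothetical shape for an agent_dids row.
type AgentDIDInfo struct {
	DID         string
	AgentNodeID string // same value as DIDDocumentRecord.AgentID, per the commit
	Status      string
}

// resolveAgentIDByDID maps a verified DID to an agent ID, skipping revoked entries
// in both sources. It returns false when no non-revoked match exists.
func resolveAgentIDByDID(did string, docs []DIDDocumentRecord, agents []AgentDIDInfo) (string, bool) {
	for _, d := range docs {
		if d.DID == did && !d.IsRevoked() && d.AgentID != "" {
			return d.AgentID, true
		}
	}
	for _, a := range agents {
		if a.DID == did && a.Status != "revoked" && a.AgentNodeID != "" {
			return a.AgentNodeID, true
		}
	}
	return "", false
}

func main() {
	docs := []DIDDocumentRecord{{DID: "did:key:z1", AgentID: "agent-a", Revoked: true}}
	agents := []AgentDIDInfo{{DID: "did:key:z2", AgentNodeID: "agent-b", Status: "active"}}

	_, ok := resolveAgentIDByDID("did:key:z1", docs, agents)
	fmt.Println("revoked DID resolves:", ok)

	id, ok := resolveAgentIDByDID("did:key:z2", docs, agents)
	fmt.Println("active DID resolves:", ok, id)
}
```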
merging this pr to avoid it getting stale, pushed up the fixes, just waiting on CI