feat: waiting state with approval workflows, VC-based authorization, and multi-version reasoners #197
Merged
This commit introduces the foundation for the new VC-based authorization system that replaces API key distribution with admin-approved permissions.
Key components added:
- Architecture documentation (docs/VC_AUTHORIZATION_ARCHITECTURE.md)
- Database migrations for permission approvals, DID documents, and protected agents
- Core types for permissions and did:web support
- DIDWebService for did:web generation, storage, and resolution
- PermissionService for permission requests, approvals, and VC issuance
The system enables:
- Agents self-assigning tags (identity declaration)
- Admin approval workflow for protected agent access
- Real-time revocation via did:web
- Control plane as source of truth for approvals
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…entation
- Add DID authentication middleware with Ed25519 signature verification
- Add permission checking middleware for protected agent enforcement
- Implement admin API handlers for permission management (approve/reject/revoke)
- Add permission request and check API endpoints
- Implement storage layer for DID documents, permission approvals, protected agent rules
- Add comprehensive integration test suite (14 test functions covering all phases)
- Add Admin UI pages: PendingPermissions, PermissionHistory, ProtectedAgents
- Add Go SDK DID authentication support
- Add Python SDK DID authentication support
- Fix CI to enable FTS5 tests (previously all SQLite-dependent tests were skipped)
- Add security documentation for DID authentication
- Add implementation guide documentation
Co-Authored-By: Claude <noreply@anthropic.com>
TestGetNodeDetailsHandler_Structure expected HTTP 400 for missing route param but Gin returns 404. TestGetNodeStatusHandler_Structure was missing a mock expectation for GetAgentStatus causing a panic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CI workflow change from `go test ./...` to `go test -tags sqlite_fts5 ./...` caused previously-skipped tests to execute, revealing 15 pre-existing bugs:
- UI handler tests: Register agents in storage and configure mocks for GetAgentStatus calls; fix assertions to match actual behavior (health check failures mark agents inactive, not error the request)
- VC service tests: Fix GetWorkflowVC lookups to use workflow_vc_id not workflow_id; fix issuer mismatch test to tamper VCDocument JSON instead of metadata field; fix error message assertion for empty VC documents
- VC storage tests: Fix GetWorkflowVC key lookups; fix empty result assertions
- PresenceManager tests: Register agents in storage so markInactive -> UpdateAgentStatus -> GetAgentStatusSnapshot -> GetAgent succeeds; add proper sync.Mutex for callback vars; use require.Eventually instead of time.Sleep; set HardEvictTTL for lease deletion test
- Webhook storage: Fix hardcoded Pending status to use webhook.Status
- Execution records test: Fix LatestStarted assertion (CreateExecutionRecord overwrites updated_at with time.Now())
- Cleanup test: Wire countWorkflowRuns and deleteWorkflowRuns into workflow cleanup path
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ntext cancellation Multiple SSE tests called req.Context().Done() expecting it to cancel the context, but Done() only returns a channel — it doesn't cancel anything. This caused SSE handler goroutines to block forever, leaking and eventually causing a 10-minute test timeout in CI. Fixed all affected tests to use context.WithCancel + explicit cancel() call, matching the pattern already used by the working SSE tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
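The difference between reading `Done()` and calling `cancel()` can be shown with a minimal sketch. The names `streamUntilDone` and `handlerExits` are hypothetical stand-ins for an SSE handler and a test helper; they are not taken from the repository.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const (
	shortWait = 50 * time.Millisecond
	longWait  = time.Second
)

// streamUntilDone stands in for an SSE handler loop (hypothetical): it blocks
// until the request context is cancelled, then signals that it has exited.
func streamUntilDone(ctx context.Context, exited chan<- struct{}) {
	<-ctx.Done()
	close(exited)
}

// handlerExits reports whether the handler goroutine returns within timeout.
// When doCancel is false, the context is not cancelled before the timeout.
func handlerExits(doCancel bool, timeout time.Duration) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // releases the goroutine once this helper returns

	exited := make(chan struct{})
	go streamUntilDone(ctx, exited)

	_ = ctx.Done() // only returns the channel; nothing is cancelled here
	if doCancel {
		cancel() // the explicit call is what unblocks the handler
	}

	select {
	case <-exited:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	fmt.Println("Done() only:", handlerExits(false, shortWait))
	fmt.Println("with cancel():", handlerExits(true, longWait))
}
```

In a real test the same pattern would wrap the request with `req.WithContext(ctx)` so the handler observes the cancellation.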
…n config Add two example agents for manually testing the VC authorization system end-to-end: permission-agent-a (caller) and permission-agent-b (protected target). Enable authorization in the default config with seeded protection rules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ster_agent_with_did The previous commit added identity_package access and client credential wiring to _register_agent_with_did but didn't update the test fakes. _FakeDIDManager now provides a realistic identity_package and _FakeAgentFieldClient supports set_did_credentials, so the full registration path is exercised in tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Go permission test agents (caller + protected target with 3 reasoners)
- Add TS permission test agents (caller + tag-protected target with VC generation)
- Fix TS SDK DID auth: pass pre-serialized JSON string to axios to ensure signed bytes match what's sent on the wire
- Fix Python SDK test for async execution manager payload serialization change
- Add go-perm-target protection rule to config
- Gitignore compiled Go agent binaries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The execute() method now passes a JSON string instead of an object to axios for DID auth signing consistency. Update test assertion to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ror propagation
- Fix re-approval deadlock: expand auto-request condition to trigger for
revoked/rejected statuses, not just empty (permission.go)
- Fix empty caller_agent_id: add DID registry fallback in
ResolveAgentIDByDID for did:key resolution (did_service.go, did_web_service.go)
- Fix HTTP 200 for failed executions: return 502 with proper error details
when inner agent-to-agent calls fail (execute.go)
- Fix error propagation across all 3 SDKs:
  - Go SDK: add ExecuteError type preserving status code and error_details
  - TS SDK: propagate err.responseData as error_details in all error handlers
  - Python SDK: add ExecuteError class, extract JSON body from 4xx responses
    instead of losing it via raise_for_status(), propagate error_details in
    async callback payloads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
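As a rough illustration of the Go SDK change, an error type that preserves the status code and `error_details` might look like the sketch below. The field names, the `error` JSON key, and the `decodeResponse` helper are assumptions for illustration; only the `ExecuteError` name and the `error_details` key come from the commit message.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ExecuteError keeps the HTTP status and the raw error_details payload so
// callers can inspect a failed execution. Field names are assumptions.
type ExecuteError struct {
	StatusCode   int
	Message      string
	ErrorDetails json.RawMessage
}

func (e *ExecuteError) Error() string {
	return fmt.Sprintf("execute failed with status %d: %s", e.StatusCode, e.Message)
}

// decodeResponse (hypothetical helper) returns the body for 2xx responses and
// an *ExecuteError for everything else, keeping whatever JSON detail it finds.
func decodeResponse(statusCode int, body []byte) ([]byte, error) {
	if statusCode >= 200 && statusCode < 300 {
		return body, nil
	}
	var parsed struct {
		Error        string          `json:"error"` // assumed key name
		ErrorDetails json.RawMessage `json:"error_details"`
	}
	// A non-JSON body leaves parsed empty; the status code is still preserved.
	_ = json.Unmarshal(body, &parsed)
	return nil, &ExecuteError{
		StatusCode:   statusCode,
		Message:      parsed.Error,
		ErrorDetails: parsed.ErrorDetails,
	}
}

func main() {
	body := []byte(`{"error":"inner call failed","error_details":{"target":"agent-b"}}`)
	_, err := decodeResponse(502, body)

	var execErr *ExecuteError
	if errors.As(err, &execErr) {
		fmt.Println(execErr.StatusCode, string(execErr.ErrorDetails))
	}
}
```

Returning a typed error lets callers use `errors.As` to branch on the status code instead of parsing an error string.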
Tests expected the old 3-header format ({timestamp}:{bodyHash}) but the
implementation correctly uses 4 headers with nonce ({timestamp}:{nonce}:{bodyHash}),
matching Go and Python SDKs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
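The `{timestamp}:{nonce}:{bodyHash}` payload can be sketched in Go as follows. This assumes the body hash is a hex-encoded SHA-256 digest, which the commit message does not state; the function names are hypothetical, and header names are omitted because the source does not give them. Ed25519 is the signature scheme named in the earlier middleware commit.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signingPayload builds "{timestamp}:{nonce}:{bodyHash}".
// Assumption: bodyHash is the hex-encoded SHA-256 of the raw request body.
func signingPayload(timestamp, nonce string, body []byte) string {
	sum := sha256.Sum256(body)
	return timestamp + ":" + nonce + ":" + hex.EncodeToString(sum[:])
}

// signRequest signs the payload with the caller's Ed25519 private key.
func signRequest(priv ed25519.PrivateKey, timestamp, nonce string, body []byte) []byte {
	return ed25519.Sign(priv, []byte(signingPayload(timestamp, nonce, body)))
}

// verifyRequest rebuilds the payload from the received values and checks it.
func verifyRequest(pub ed25519.PublicKey, timestamp, nonce string, body, sig []byte) bool {
	return ed25519.Verify(pub, []byte(signingPayload(timestamp, nonce, body)), sig)
}

// newKeyPair generates a throwaway key pair for the demo.
func newKeyPair() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

func main() {
	pub, priv := newKeyPair()
	body := []byte(`{"input":{}}`)
	sig := signRequest(priv, "1700000000", "nonce-1", body)

	fmt.Println("valid:", verifyRequest(pub, "1700000000", "nonce-1", body, sig))
	fmt.Println("different nonce:", verifyRequest(pub, "1700000000", "nonce-2", body, sig))
}
```

Because the hash covers the exact body bytes, the signer has to hash the same serialized string that goes on the wire, which is the issue the earlier TS SDK fix addressed by passing a pre-serialized JSON string to axios.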
Addresses code scanning alert about missing rate limiting on the authorization route handler. Adds a sliding-window rate limiter (30 requests per IP per 60s) to the local verification middleware. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
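For readers unfamiliar with the technique, a sliding-window limiter with the stated parameters (30 requests per IP per 60s) can be sketched in Go. This is an illustration only: the limiter in the PR was a Map-based TypeScript middleware, and every name below is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const defaultWindow = 60 * time.Second

// slidingWindowLimiter tracks recent request timestamps per key (e.g. per IP).
type slidingWindowLimiter struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func newSlidingWindowLimiter(limit int, window time.Duration) *slidingWindowLimiter {
	return &slidingWindowLimiter{limit: limit, window: window, hits: map[string][]time.Time{}}
}

// allow drops timestamps older than the window, then admits the request only
// if fewer than limit requests remain inside the window.
func (l *slidingWindowLimiter) allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	cutoff := now.Add(-l.window)
	recent := l.hits[key][:0] // filter in place, reusing the backing array
	for _, t := range l.hits[key] {
		if t.After(cutoff) {
			recent = append(recent, t)
		}
	}
	if len(recent) >= l.limit {
		l.hits[key] = recent
		return false
	}
	l.hits[key] = append(recent, now)
	return true
}

// at returns a fixed instant sec seconds after the Unix epoch, for the demo.
func at(sec int) time.Time { return time.Unix(int64(sec), 0) }

func main() {
	l := newSlidingWindowLimiter(30, defaultWindow)
	allowed := 0
	for i := 0; i < 35; i++ {
		if l.allow("203.0.113.7", at(0)) {
			allowed++
		}
	}
	fmt.Println("allowed in first window:", allowed)
	fmt.Println("allowed after window passes:", l.allow("203.0.113.7", at(61)))
}
```

Passing `now` explicitly keeps the limiter deterministic in tests; a production middleware would call `time.Now()` and also evict idle keys.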
Replace custom Map-based rate limiter with express-rate-limit package, which CodeQL recognizes as a proper rate limiting implementation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflicts in async_execution_manager.py and client.py by keeping both auth_headers (from main) and did_authenticator (from this branch) parameters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts: # control-plane/agentfield-server
Resolve conflicts in Python SDK: keep both DID auth (feat/connector) and typed exceptions (main) imports. Use AgentFieldClientError for network errors while preserving detailed error body parsing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… for fresh DBs
Two CI failures:
1. linux-tests: stubStorage in server_routes_test.go was missing the DeleteAgentVersion method added to the StorageProvider interface by the multi-version work. Add the stub.
2. Functional Tests (postgres): migrateAgentNodesCompositePKPostgres tried to ALTER TABLE agent_nodes before GORM created it on fresh databases. The information_schema.columns query returns count=0 (not an error) when the table doesn't exist, so the function proceeded to run ALTER statements against a nonexistent table. Add an explicit table existence check matching the pattern already used by the SQLite migration path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve 4 conflicts from main's v0.1.43-rc.1 release merge:
- config.go: keep approval workflow env var overrides (new feature code)
- execute.go: keep renderStatusWithApproval() helper (actively called)
- server.go: adopt main's :agentId param naming for /agents/ route
- __init__.py: keep approval type exports (used by client.py/agent.py)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Performance
⚠ Regression detected:
The merge brought in a test from main that expected 42 columns in the workflow execution insert query, but the feature branch added approval_expires_at as the 43rd column. Update the test's column list and expected placeholder count to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ruff lint flagged the unused import (F401). The tests use httpx_mock fixture from pytest-httpx, not httpx directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Python SDK:
- Add pytest-httpx dependency (with Python >=3.10 constraint)
- Register httpx_mock marker for --strict-markers compatibility
- Add importorskip for graceful skip on Python <3.10
- Fix request_approval test calls to match actual API signature
TypeScript SDK:
- Call server.closeAllConnections() before server.close() in afterEach to prevent keep-alive connection timeout in tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After the reasoner name fix, @agent.reasoner(name="reports_generate") registers at /reasoners/reports_generate (the explicit name), not /reasoners/generate_report (the function name). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ag-vc endpoints The waiting-state feature added routes under /api/v1/agents/:node_id/... which conflicted with the existing tag-vc endpoint using :agentId as the parameter name. Gin requires consistent wildcard names for the same path segment, causing a panic on server startup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ional tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AbirAbbas added a commit that referenced this pull request on Mar 2, 2026
Main added an internalToken parameter to ExecuteHandler in PR #197. Update the two test call sites to pass empty string for the new param. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-merge-queue bot pushed a commit that referenced this pull request on Mar 2, 2026
* fix: allow empty input for parameterless skills/reasoners (#196)
Remove binding:"required" constraint on Input field in ExecuteRequest and ExecuteReasonerRequest structs. Gin interprets required on maps as "must be present AND non-empty", which rejects the valid {"input":{}} payload that SDKs send for parameterless calls. Also remove the explicit len(req.Input)==0 check in prepareExecution and add nil-input guards in the reasoner and skill handlers to match the existing pattern in execute.go.
Closes #196
* test: strengthen empty-input handler coverage
* fix: update empty_input_test.go for ExecuteHandler signature change
Main added an internalToken parameter to ExecuteHandler in PR #197. Update the two test call sites to pass empty string for the new param.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Abir Abbas <abirabbas1998@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
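The empty-input case can be illustrated with the standard library alone. This sketch assumes the nil-input guard simply substitutes an empty map; the commit message does not say exactly what the guard does, and the lowercase `executeRequest` type and `decodeExecuteRequest` helper are hypothetical simplifications of the real structs and handlers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// executeRequest is a simplified stand-in for ExecuteRequest after the fix:
// Input carries no binding:"required" tag, so an empty object is accepted.
type executeRequest struct {
	Input map[string]interface{} `json:"input"`
}

// decodeExecuteRequest parses the payload and applies a nil-input guard so
// downstream code always sees a usable map. The guard's behaviour here is an
// assumption, not the handlers' actual code.
func decodeExecuteRequest(payload string) (executeRequest, error) {
	var req executeRequest
	if err := json.Unmarshal([]byte(payload), &req); err != nil {
		return req, err
	}
	if req.Input == nil {
		// Covers both a missing "input" key and an explicit null.
		req.Input = map[string]interface{}{}
	}
	return req, nil
}

func main() {
	for _, payload := range []string{`{"input":{}}`, `{}`, `{"input":{"a":1}}`} {
		req, err := decodeExecuteRequest(payload)
		fmt.Println(payload, "->", len(req.Input), "input keys, err:", err)
	}
}
```

With the guard in place, `{"input":{}}` and a missing `input` key both reach the handler as an empty, non-nil map.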
Summary
Adds the waiting state execution model and human-in-the-loop approval workflows to the control plane, along with VC-based authorization, multi-version reasoner support, and connector sidecar management APIs.
Waiting State & Approval Workflows
- waiting: agents can pause mid-execution to request external approval
- POST /executions/:id/request-approval transitions execution to waiting and stores approval metadata (request ID, callback URL, expiry)
- GET /executions/:id/approval-status polls approval state
- POST /webhooks/approval receives approval decisions via webhook (supports both flat JSON and hax-sdk envelope formats)
- approved, rejected, request_changes, expired
- … already_processed)
VC-Based Authorization & Access Policies
- … AccessRulesTab, AgentTagsTab UI)
Multi-Version Reasoners
Connector Sidecar Management APIs
- … X-Connector-Token) with capability gating
SDK Updates (all three: Go, Python, TypeScript)
- ApprovalClient / approval API methods for requesting and polling approvals
- DIDAuthenticator for DID-based request signing
- LocalVerifier for offline VC verification
- waiting state
Web UI
- … ExecutionFilters, SearchWithFilters, etc.)
Examples
Infrastructure
- dev.sh
- … [skip ci] on version bumps)
Testing
Migrations
9 new migrations (018–026):
- 018_create_permission_approvals: approval tracking table
- 019_create_did_documents: DID document storage
- 020_create_protected_agents: agent protection flags
- 021_create_access_policies: tag-based access rules
- 022_create_agent_tag_vcs: VC-tag associations
- 023_drop_dead_tables: cleanup
- 024_execution_approval_state: waiting state columns on executions
- 025_approval_callback_url: callback URL column
- 026_approval_expires_at: expiry tracking column
Stats
241 files changed, 39,791 insertions(+), 12,014 deletions(-)