test: add 24 critical-path test files across control plane, SDKs by santoshkumarradha · Pull Request #352 · Agent-Field/agentfield

santoshkumarradha · 2026-04-07T12:41:31Z

Summary

24 new test files closing the highest-leverage gaps surfaced by a fresh test-coverage audit. Focus on functional / unit / integration tests on paths most likely to break user-facing behavior when AI writes code against this repo.

The internal TEST_COVERAGE_AUDIT.md was removed in the first commit on this branch — it had become stale and was not meant to ship in the repo.

Coverage added

Go control plane (8 files)

Services + middleware + server

services/did_web_service_test.go — DID parsing, generation, resolution, round-trip
services/ui_service_test.go — client subscription, dedupe, heartbeat, concurrent register/close
services/executions_ui_service_test.go — grouping, duration aggregation, status summary, filtering
server/config_db_test.go — storage section preservation, DB overlay, YAML round-trip
server/middleware/permission_test.go — caller DID precedence, body restoration, fail-closed, target parsing
server/middleware/connector_capability_test.go — disabled / read-only / nil-map handling

Handlers (large untested files)

handlers/reasoners_test.go — execution routing, header propagation, persistence, serverless payload
handlers/memory_events_test.go — WS upgrade, pattern filter, scope filter, disconnect cleanup, burst publish

Python SDK (4 files)

test_did_manager_error_paths.py — register_agent under timeout / 5xx / bad JSON / auth headers
test_vc_generator_error_paths.py — VC generation error paths
test_tool_calling_error_paths.py — malformed args, max turns, tool not found, mixed valid/invalid
test_agent_graceful_shutdown.py — idempotent stop, pending tasks, notify failures, cleanup

Go SDK (4 files)

agent/registration_integration_test.go — register handshake, fallback, approval polling, races
agent/verification_test.go — LocalVerifier refresh, did:key resolution, concurrent access (race-clean)
agent/memory_backend_test.go — scope-aware headers, error propagation, query params
harness/provider_error_integration_test.go — provider crash / timeout / malformed JSONL / env-var unset / missing binary

TypeScript SDK (8 files)

agentfield_client.test.ts — REST verbs, error envelope, headers, DID signing, timeouts
agent_lifecycle.test.ts — serve/shutdown, heartbeat scheduling, registration payloads, failures
execution_context_async.test.ts — AsyncLocalStorage propagation across nested + parallel runs
memory_client_scopes.test.ts — scope resolution, metadata passthrough, 404 contract
workflow_reporter_dag.test.ts — progress events, transitions, failure propagation
tool_calling_errors.test.ts — malformed args, missing tool, max turns, discovery filters
harness_runner_resilience.test.ts — transient retry classification, backoff, cost aggregation
agent_router_dispatch.test.ts — skill vs reasoner routing, schema validation, 404

Methodology

Five parallel discovery agents produced fresh per-area gap briefs (the April 5 audit was stale — many flagged files now have tests).
Six parallel codex workers wrote tests grouped by file ownership (no cross-worker overlap on source files).
Each suite verified locally; broken assertions and flaky timing fixed:
- config_db_test.go round-trip equality (nil-vs-empty slice via YAML normalization)
- reasoners_test.go ambiguous-selector compile error from over-embedded fake storage
- verification_test.go deadlocked errCh capacity (8 → 40)
- registration_integration_test.go over-strict ErrorIs assertion against a fixed-string source error
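
The `errCh` fix reflects a common pitfall in concurrent Go tests: when goroutines report failures on a buffered channel that is only drained after `wg.Wait()`, the buffer needs one slot per potential send, otherwise the extra senders block and the wait never returns. A minimal sketch of that pattern, assuming a hypothetical helper `runWorkers` (the worker count of 40 only mirrors the new capacity; this is not the actual test code):

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n goroutines that each report exactly one error and
// drains the channel only after all of them have finished. This is safe
// only because errCh has room for every send; with a smaller buffer the
// extra senders would block forever and wg.Wait() would never return.
func runWorkers(n int) []error {
	errCh := make(chan error, n) // one slot per potential sender
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			errCh <- fmt.Errorf("worker %d failed", id)
		}(i)
	}

	wg.Wait()
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	fmt.Println(len(runWorkers(40)))
}
```

An alternative design is to drain `errCh` in a separate goroutine while the workers run, which removes the dependency on the buffer size entirely.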

Test plan

cd control-plane && go test ./internal/services/... ./internal/server/... ./internal/handlers/... — green
cd sdk/go && go test ./agent/ ./harness/ (new tests) — green
cd sdk/python && python3 -m pytest tests/test_did_manager_error_paths.py tests/test_vc_generator_error_paths.py tests/test_tool_calling_error_paths.py tests/test_agent_graceful_shutdown.py — 23 passed, 5 intentional skips
cd sdk/typescript && npx vitest run <8 new test files> — 42 passed across 8 suites

Intentional skips (Python)

Five subtests are skipped with `source bug:` markers that document real defects discovered while writing the tests. These are targets for follow-up fixes in the implementation, not test bugs:

`execute_tool_call_loop` raises when tool call omits `function.arguments`
tool timeouts do not break the loop early
`Agent.stop()` is not implemented
graceful shutdown does not track or cancel in-flight tasks
graceful shutdown does not enforce timeout-based task cancellation

Notes

No source files were modified; only new `*_test.go` / `test_*.py` / `*.test.ts` files were added.
No new dependencies were added to go.mod, pyproject.toml, or package.json.
Total: ~3,600 lines of test code across 24 files.

github-actions · 2026-04-07T12:43:12Z

Performance

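| SDK | Memory | Memory Δ | Latency | Latency Δ | Tests | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Python | 7.9 KB | -13% | 0.39 µs | +11% | ✓ | ✓ |
| Go | 211 B | -25% | 0.57 µs | -43% | ✓ | ✓ |
| TS | 428 B | +22% | 2.70 µs | +35% | ✓ | ⚠ |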

⚠ Regression detected:

TypeScript memory: 350 B → 428 B (+22%)
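
The +22% figure follows directly from the two byte counts in the report. A short check of the arithmetic, with an illustrative helper name (the bot's actual regression threshold is not stated here, so none is assumed):

```go
package main

import "fmt"

// percentChange returns the relative change from before to after, in percent.
func percentChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// (428 - 350) / 350 = 0.2228..., which rounds to +22%.
	fmt.Printf("%+.0f%%\n", percentChange(350, 428))
}
```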

santoshkumarradha · 2026-04-07T12:44:43Z

Source bugs filed

The 5 `pytest.skip("source bug: ...")` markers in this PR have been filed as tracked issues:

| # | Test | Issue |
| --- | --- | --- |
| 1 | `test_tool_calling_error_paths.py::test_malformed_tool_call_missing_arguments_is_reported_and_loop_continues` | #353 |
| 2 | `test_tool_calling_error_paths.py::test_tool_execution_timeout_breaks_loop_early` | #354 |
| 3 | `test_agent_graceful_shutdown.py::test_agent_stop_is_idempotent` | #355 |
| 4 | `test_agent_graceful_shutdown.py::test_graceful_shutdown_cancels_in_flight_tasks_within_deadline` | #356 |
| 5 | `test_agent_graceful_shutdown.py::test_graceful_shutdown_force_cancels_tasks_after_timeout` | #357 |

Each issue contains the file path, repro snippet, expected behavior, and acceptance criteria. When the underlying source bug is fixed, the corresponding skipped test should be unskipped and will pass.

AbirAbbas · 2026-04-07T21:50:21Z

heads up, this branch is forked off v0.1.65-rc.3 and is currently 13 commits behind main. Key changes that landed since:

- feat(runs): pause/resume/cancel + unified status primitives + notification center #345 (major UI and control plane changes)
- refactor: remove all MCP code from codebase #359 (likely affects some of the test files in this PR)
- fix: remediate CodeQL security alerts #361
- feat(observability): add OpenTelemetry distributed tracing export #344

Some of the issues we've been seeing (SSE connection handling, execution hangs) have already been fixed on main but surface here because of the stale base. A rebase onto latest main should resolve those.

…verlay

Adds white-box unit tests for previously-untested control plane files:

services/
- did_web_service: ParseDIDWeb / GenerateDIDWeb round-trip and resolution
- ui_service: client subscription, dedupe, heartbeat, concurrent register/close
- executions_ui_service: grouping, duration aggregation, status summary, filtering

server/
- config_db: storage section preservation, DB overlay merge, YAML round-trip, invalid-payload handling

server/middleware/
- permission: caller DID precedence, request body restoration, fail-closed, pending-approval target, target param parsing
- connector_capability: disabled / read-only / nil-map handling, method gating

Also adds .plandb.db to .gitignore.

reasoners.go (~700 LOC, previously untested):
- malformed reasoner-id parsing
- node lookup, offline / unhealthy paths
- workflow execution record persistence on success and failure
- header propagation to proxied agent (X-Workflow-ID, X-Run-ID, etc.)
- serverless payload encoding

memory_events.go (WS + SSE memory subscriptions):
- WebSocket upgrade success and rejection
- Pattern filter matching, scope/scopeId filtering
- Client disconnect cleanup (no goroutine leak)
- Burst publish handling under slow reader

…roviders

agent/registration_integration_test.go
- happy-path register against httptest control plane
- 404 fallback to legacy /api/v1/nodes/register
- approval-pending exits cleanly when parent context ends
- empty AgentFieldURL produces a clear error
- concurrent RegisterNode does not race

agent/verification_test.go (LocalVerifier)
- Refresh populates policies, revocations, registered DIDs, admin pubkey
- Refresh failure preserves prior cache
- NeedsRefresh respects refreshInterval
- concurrent Refresh + CheckRevocation safe under -race
- did:key public key resolution and graceful malformed-input handling

agent/memory_backend_test.go (ControlPlaneMemoryBackend)
- scope-aware headers (workflow / session / global)
- 404 → not-found sentinel; 500 propagated cleanly
- Delete uses POST /api/v1/memory/delete
- list builds correct query params

harness/provider_error_integration_test.go
- provider crash with no stderr
- timeout under context deadline
- malformed JSONL middle line tolerated
- env var Env{KEY:""} unsets in subprocess
- missing binary returns FailureCrash with helpful message
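
The 404 fallback covered by registration_integration_test.go can be pictured with a small `httptest` sketch. This is an illustration under assumptions, not the SDK's implementation: `primaryPath` is a hypothetical placeholder (the real primary endpoint is not named in this PR), and `register` / `registerAgainstLegacyServer` are made-up helper names. Only the legacy path `/api/v1/nodes/register` comes from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// primaryPath is a hypothetical placeholder; legacyPath is the fallback
// route quoted in the commit message.
const (
	primaryPath = "/hypothetical/primary/register"
	legacyPath  = "/api/v1/nodes/register"
)

// register posts to the primary endpoint and retries on the legacy one
// only when the primary answers 404. It returns the path that succeeded.
func register(baseURL string) (string, error) {
	for _, path := range []string{primaryPath, legacyPath} {
		resp, err := http.Post(baseURL+path, "application/json", strings.NewReader(`{}`))
		if err != nil {
			return "", err
		}
		resp.Body.Close()
		switch {
		case resp.StatusCode == http.StatusNotFound && path == primaryPath:
			continue // older control plane: try the legacy route
		case resp.StatusCode >= 200 && resp.StatusCode < 300:
			return path, nil
		default:
			return "", fmt.Errorf("register via %s: status %d", path, resp.StatusCode)
		}
	}
	return "", fmt.Errorf("no registration endpoint accepted the request")
}

// registerAgainstLegacyServer runs register against a fake control plane
// that only knows the legacy route, so the primary request gets a 404.
func registerAgainstLegacyServer() (string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc(legacyPath, func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()
	return register(srv.URL)
}

func main() {
	path, err := registerAgainstLegacyServer()
	fmt.Println(path, err)
}
```

Restricting the fallback to a 404 on the primary route keeps other failures (5xx, auth errors) visible instead of silently retrying them on the legacy endpoint.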

…tdown

Python SDK has good happy-path coverage; these add failure-mode tests:

test_did_manager_error_paths.py
- network timeout / 5xx / truncated JSON during register_agent
- X-API-Key header forwarded when configured
- agent continues functioning after registration failure (silent degrade)

test_vc_generator_error_paths.py
- generate_execution_vc / create_workflow_vc under timeout / 5xx / bad JSON
- disabled generator makes no HTTP calls

test_tool_calling_error_paths.py
- malformed tool args, invalid arg types, mixed valid/invalid in one turn
- max_turns enforcement
- tool not found does not crash the loop

test_agent_graceful_shutdown.py
- idempotent re-entrant stop
- pending in-flight task handling
- notification failure during shutdown
- resource cleanup

Five subtests are intentionally skipped with 'source bug:' markers documenting real defects discovered while writing the tests (Agent.stop() unimplemented, graceful_shutdown does not track in-flight tasks, etc.). These are targets for follow-up fixes in the implementation, not test bugs.

…atures

The TS SDK had only ~6 real test files for ~50 source files. This adds behavior tests for the most-critical surfaces:

Core client
- agentfield_client: REST verbs, error envelope parsing, header propagation, DID-signed requests, timeout behavior
- agent_lifecycle: serve()/shutdown(), heartbeat scheduling, registration payload, registration-failure handling
- execution_context_async: AsyncLocalStorage propagation across nested and parallel runs, isolation guarantees
- memory_client_scopes: workflow/session/global scope resolution, metadata passthrough headers, 404→undefined contract

Features
- workflow_reporter_dag: progress() / state transitions / failure propagation
- tool_calling_errors: malformed JSON args, missing tool, max turns, max tool calls, discovery filters
- harness_runner_resilience: transient retry classification, backoff, cost aggregation across attempts
- agent_router_dispatch: skill vs reasoner routing, schema validation, 404

42 tests across 8 suites, all green via vitest.

- memory_events_test.go: The SSE handler does not flush response headers until it writes the first event, so http.Client.Do blocks indefinitely when no event is published before the request begins. Run the request in a goroutine, wait for the subscription to register, then publish.
- reasoners_test.go: Drop X-Agent-Node-ID propagation assertion. The serverless execution path does not forward this caller header to the downstream agent request, so the original assertion was incorrect.

The SSE handler in memory_events.go defers header flushing until the first matching event is written, and uses the deprecated CloseNotify() for client disconnect detection. Both behaviors interact poorly with httptest in CI: http.Client.Do blocks until the handler writes, and the test never completes within the CI test deadline. The other tests in this file (WS happy path, invalid-pattern cleanup, backpressure disconnect, upgrade rejection) already cover the same code paths, so skipping just this one is a clean win. Tracked source fix: #358

… is fixed

The earlier deadlock was fixed in 7c81c53 by running the request in a goroutine and publishing after the subscription registers. The follow-up skip in c8992cd was redundant: the restructured test passes locally and in CI. Source-side flush refactor still tracked in #358.
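
The restructuring described in these commits (issue the request in a goroutine, wait for the subscription to register, then publish) can be sketched as follows. Everything here is a hypothetical stand-in: the `bus` type, its `subscribed` signal, and the `firstEvent` helper are assumptions for illustration and do not reflect the real memory_events.go handler or its test.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// bus is a hypothetical stand-in for the memory event bus.
type bus struct {
	subscribed chan struct{} // closed when the handler has subscribed
	events     chan string
}

// sse mimics the behavior described above: nothing is written, so no
// response headers are flushed, until the first event arrives.
// It supports a single request only (subscribed is closed once).
func (b *bus) sse(w http.ResponseWriter, r *http.Request) {
	close(b.subscribed)
	select {
	case ev := <-b.events:
		w.Header().Set("Content-Type", "text/event-stream")
		fmt.Fprintf(w, "data: %s\n\n", ev)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
	case <-r.Context().Done():
	}
}

// firstEvent issues the request in a goroutine, waits for the subscription
// to register, publishes one event, and returns the first line received.
func firstEvent(timeoutMs int) (string, error) {
	timeout := time.Duration(timeoutMs) * time.Millisecond
	b := &bus{subscribed: make(chan struct{}), events: make(chan string, 1)}
	srv := httptest.NewServer(http.HandlerFunc(b.sse))
	defer srv.Close()

	type result struct {
		line string
		err  error
	}
	done := make(chan result, 1)
	go func() {
		resp, err := http.Get(srv.URL) // blocks until headers arrive
		if err != nil {
			done <- result{err: err}
			return
		}
		defer resp.Body.Close()
		line, err := bufio.NewReader(resp.Body).ReadString('\n')
		done <- result{line: strings.TrimSpace(line), err: err}
	}()

	select {
	case <-b.subscribed:
	case <-time.After(timeout):
		return "", errors.New("subscription never registered")
	}
	b.events <- "hello" // publish only after the subscriber exists

	select {
	case r := <-done:
		return r.line, r.err
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for first event")
	}
}

func main() {
	line, err := firstEvent(2000)
	fmt.Println(line, err)
}
```

Calling `http.Get` on the test goroutine instead would block before any publish could happen, which is the hang the commit message describes.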

santoshkumarradha · 2026-04-08T03:50:24Z

Update: this PR is now properly rebased onto the latest main and includes the control-plane, web UI, and SDK test/coverage work we had been carrying on the related coverage branches.

The stale-base issues from v0.1.65-rc.3 are resolved here, the post-#345 / post-#359 test drift has been fixed, and the full GitHub Actions matrix is now green again, including linux-tests, control-plane-image, both functional test jobs, and the SDK CI jobs.

@AbirAbbas this should now reflect the current codebase and be in good shape for review.

Main #350 ("Chore/UI audit phase1 quick wins") deleted ~14k lines of UI components (HealthBadge, NodeDetailPage, NodesPage, AllReasonersPage, EnhancedDashboardPage, ExecutionDetailPage, RedesignedExecutionDetailPage, ObservabilityWebhookSettingsPage, EnhancedExecutionsTable, NodesVirtualList, SkillsList, ReasonersSkillsTable, CompactExecutionsTable, AgentNodesTable, LoadingSkeleton, AppLayout, EnhancedModal, ApproveWithContextDialog, EnhancedWorkflowFlow, EnhancedWorkflowHeader, EnhancedWorkflowOverview, EnhancedWorkflowEvents, EnhancedWorkflowIdentity, EnhancedWorkflowData, WorkflowsTable, CompactWorkflowsTable, etc.). 35 test files added by PR #352 and waves 1/2 import these now-deleted modules and break the build.

They're removed here because:
- The components they exercise no longer exist on main.
- main's CI is currently red on the same import errors (control-plane-image + Functional Tests both fail at tsc -b on GeneralComponents.test.tsx and NodeDetailPage.test.tsx). This commit fixes that regression as a side effect.
- Two further tests (NewSettingsPage, RunsPage) failed at the vitest level on the post-#350 main but were never reached by main's CI because tsc errored first; they're removed too.

Web UI vitest now: 80 files / 353 tests / all green. Coverage will be recovered against main's new component layout in a follow-up commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Third batch of test additions from parallel codex + gemini-2.5-pro headless workers, focused on packages affected by main's #350 UI cleanup and main's new internal/skillkit package.

Go control plane (per-package line coverage now):
- cli: 68.3 -> 82.1 (cli regressed earlier; recovered)
- handlers/ui: 71.2 -> 80.2 (target hit)
- skillkit: 0.0 -> 80.2 (new package from main #367)
- storage: 73.6 -> 79.5 (de-duplicated ptrTime helper)

Aggregate Go control plane: 78.13% -> 82.38% (>= 80%)

Web UI (vitest, against post-#350 component layout):
- Restored RunsPage and NewSettingsPage tests rewritten against the refactored sources (the original #352 versions failed against new main and were removed in commit 03dd44e).
- New tests for: AppLayout, AppSidebar, RecentActivityStream, ExecutionForm branches, RunLifecycleMenu, dropdown-menu, status-pill, ui-modals, notification, TimelineNodeCard, CompactWorkflowInputOutput, ExecutionScatterPlot, useDashboardTimeRange, use-mobile.

Aggregate Web UI lines: 69.71% -> 81.14% (>= 80%)

============================
COMBINED REPO COVERAGE: 81.60%
============================

435 / 435 vitest tests passing across 97 files. All Go packages compiling and passing go test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

santoshkumarradha requested review from a team and AbirAbbas as code owners April 7, 2026 12:41

santoshkumarradha marked this pull request as draft April 7, 2026 12:41

santoshkumarradha mentioned this pull request Apr 7, 2026

[Control Plane] Memory events SSE handler defers header flush, causing client hangs #358

Open

4 tasks

santoshkumarradha marked this pull request as ready for review April 7, 2026 13:07

santoshkumarradha added 17 commits April 8, 2026 08:48

chore: remove internal test coverage audit doc

c8f123f

Add coverage reporting workflow and fix local test entrypoints

2ac19a4

test(web-ui): expand client coverage

1b0bc50

test(sdk): raise python and typescript coverage

9737f93

test(sdk-go): raise go coverage above 80

d4e4c0f

test(control-plane): cover storage vector and config paths

980b23c

test(web-ui): cover vc service flows

a6cada3

fix(ci): stabilize coverage branch test runs

b18be57

test(web-ui): expand api service coverage

2043614

santoshkumarradha added 12 commits April 8, 2026 08:49

test(control-plane): cover dashboard helper paths

cb382a1

test(control-plane): cover execution helper paths

27b032e

test(web-ui): extract workflow dag utilities

e0ef72f

test(web-ui): extract runs page utilities

0e3c40d

test(web-ui): expand service coverage wave

cdcd517

test(control-plane): cover storage helper utilities

70f4b60

test(coverage): cover settings runs and workflow helpers

957be32

test(coverage): cover comparison page and execution records helpers

a95c7c1

test(control-plane): cover execution log handlers

3dd3d7e

test(coverage): cover agent pages and identity handlers

5049ef0

test(web-ui): cover node detail page flows

f31ee44

test(web-ui): rebase client tests onto main

572743a

santoshkumarradha force-pushed the chore/test-coverage-improvements branch from c293147 to 572743a April 8, 2026 03:35

test(web-ui): align node detail mocks with current types

7078f80

santoshkumarradha mentioned this pull request Apr 8, 2026

test(coverage): control-plane 81.1% + web UI 81.5% (supersedes #352) #368

Open

7 tasks

AbirAbbas enabled auto-merge April 8, 2026 13:47

AbirAbbas approved these changes Apr 8, 2026


AbirAbbas added this pull request to the merge queue Apr 8, 2026

Merged via the queue into main with commit cf922f9 Apr 8, 2026
35 checks passed

AbirAbbas merged 30 commits into main from chore/test-coverage-improvements
