test(orchestrator): integration tests for system agent lifecycle#1152
test(orchestrator): integration tests for system agent lifecycle#1152geoffjay merged 3 commits intoissue-1140from
Conversation
Adds 10 HTTP integration tests covering the full system-agent feature surface via tower::ServiceExt::oneshot (no real TCP). Includes a NullBackend to avoid external dependencies. Also fixes terminate_agent handler to return 403 Forbidden (not 500) when attempting to delete a built-in system agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
geoffjay
left a comment
There was a problem hiding this comment.
Review: test(orchestrator): integration tests for system agent lifecycle
Stack position: issue-1143 (#1152) is 4th in a stack:
issue-1139 → #1149 (orchestrator: system agent prompt/policy) → #1150 (cli: system-agents subcommand) → #1151 (ui: system agents page) → #1152 (this PR)
The test imports (SYSTEM_AGENT_NAME, bootstrap_system_agents, built_in field, AgentManager) originate from the orchestrator PRs lower in the stack. The diff in this PR touches crates/orchestrator/ only — it is independent of #1151's UI changes, but the full stack must merge before CI sees these tests pass in isolation.
Production code fix (api.rs)
The terminate_agent guard is correct:
let existing = state.manager.get_agent(&id).await?.ok_or(ApiError::NotFound)?;
if existing.built_in {
return Err(ApiError::Forbidden("built-in system agents cannot be deleted".to_string()));
}- Ordering is right: existence check first (→ 404), then built-in check (→ 403), then proceed with delete. Reversing the order would produce a confusing 403 for non-existent IDs.
ApiError::Forbiddenis already used in the same file for room membership guards, so this is consistent with the established pattern and confirmed to map to HTTP 403.- The cosmetic reformatting of
list_system_agents's signature is fine.
Integration tests (system_agent_http.rs)
What works well:
- Test isolation is solid.
build_app()creates a freshTempDirand database for every test. No shared state, no ordering dependencies between tests. NullBackendis complete and well-designed. AllExecutionBackendmethods return safe no-op values (session_exists → Ok(false),kill_session → Ok(())), which allowsAgentManagerto run without any real process management. Theprefix()returning"test"correctly namespaces sessions.tower::ServiceExt::oneshotexercises the full Axum router. Middleware, extractors, error mapping — all active. More meaningful than unit-testing handler functions directly.- Coverage matches the PR description exactly. All 10 stated scenarios have a test.
- Bootstrap idempotency test (
test_bootstrap_is_idempotent) calls bootstrap twice and asserts length=1. Clean verification of the upsert/guard behaviour. test_agent_response_includes_builtin_fieldverifies thebuilt_infield appears inGET /agents/{id}— good serialization coverage.
Non-blocking suggestions
1. Tighten the delete-builtin assertion to exactly 403
test_delete_builtin_agent_is_rejected currently accepts any of three status codes:
assert!(
response.status() == StatusCode::BAD_REQUEST // 400
|| response.status() == StatusCode::FORBIDDEN // 403
|| response.status() == StatusCode::UNPROCESSABLE_ENTITY, // 422
...
);The PR's stated fix is specifically to return 403 Forbidden. The api.rs change adds ApiError::Forbidden, which is confirmed to map to 403. The test should pin that:
assert_eq!(response.status(), StatusCode::FORBIDDEN,
"DELETE on a built-in agent must return 403 Forbidden");As written, the test would still pass if the response regressed to 400 or 422 — defeating its purpose as a regression guard for the specific fix. It does correctly verify "not 500" (the original bug), but a tighter assertion is better.
2. use std::collections::HashMap repeated in function bodies
Both insert_builtin_agent and insert_user_agent declare use std::collections::HashMap inside the function body. Moving it to the module-level use block is cleaner and more idiomatic.
3. Hardcoded localhost URLs in build_app
"ws://localhost:7006" and "http://localhost:17010" are never actually dialled in these tests, so correctness is not affected. Named constants (TEST_WS_URL, TEST_COMMUNICATE_URL) would make the intent clearer. Very minor.
Verdict
The production fix in api.rs is correct, consistent with existing patterns, and properly ordered. The test suite is well-structured, exercises the full router pipeline, and covers all claimed scenarios. The loose 400/403/422 assertion is a test quality issue worth tightening, but it does not represent a gap in production correctness.
LGTM with the suggestion to pin the delete assertion to exactly 403.
Adds 10 HTTP integration tests via
tower::ServiceExt::oneshot(no real TCP) covering the full system-agent feature surface:GET /system-agentsreturns only built-in agentsGET /agentsexcludes built-in agents by defaultGET /agents?include_builtin=trueincludes all agentsDELETE /agents/{id}is blocked for built-in agents (403)POST /agentsignoresbuilt_infrom the request bodybootstrap_system_agents()creates the system agent withbuilt_in = trueAgentResponseincludesbuilt_in = truefor system agentsAlso fixes the
terminate_agentAPI handler to return 403 Forbidden (not 500 Internal Server Error) when attempting to delete a built-in system agent.Closes #1143