Skip to content

test(orchestrator): integration tests for system agent lifecycle#1152

Merged
geoffjay merged 3 commits intoissue-1140from
issue-1143
Apr 17, 2026
Merged

test(orchestrator): integration tests for system agent lifecycle#1152
geoffjay merged 3 commits intoissue-1140from
issue-1143

Conversation

@geoffjay
Copy link
Copy Markdown
Owner

Adds 10 HTTP integration tests via tower::ServiceExt::oneshot (no real TCP) covering the full system-agent feature surface:

  • GET /system-agents returns only built-in agents
  • GET /agents excludes built-in agents by default
  • GET /agents?include_builtin=true includes all agents
  • DELETE /agents/{id} is blocked for built-in agents (403)
  • POST /agents ignores built_in from the request body
  • bootstrap_system_agents() creates the system agent with built_in = true
  • Bootstrap is idempotent — no duplicate agents on repeated calls
  • AgentResponse includes built_in = true for system agents

Also fixes the terminate_agent API handler to return 403 Forbidden (not 500 Internal Server Error) when attempting to delete a built-in system agent.

Closes #1143

geoffjay and others added 2 commits April 16, 2026 21:24
Adds 10 HTTP integration tests covering the full system-agent feature
surface via tower::ServiceExt::oneshot (no real TCP). Includes a
NullBackend to avoid external dependencies.

Also fixes terminate_agent handler to return 403 Forbidden (not 500)
when attempting to delete a built-in system agent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@geoffjay geoffjay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: test(orchestrator): integration tests for system agent lifecycle

Stack position: issue-1143 (#1152) is 4th in a stack:
issue-1139#1149 (orchestrator: system agent prompt/policy) → #1150 (cli: system-agents subcommand) → #1151 (ui: system agents page) → #1152 (this PR)

The test imports (SYSTEM_AGENT_NAME, bootstrap_system_agents, built_in field, AgentManager) originate from the orchestrator PRs lower in the stack. The diff in this PR touches crates/orchestrator/ only — it is independent of #1151's UI changes, but the full stack must merge before CI sees these tests pass in isolation.


Production code fix (api.rs)

The terminate_agent guard is correct:

let existing = state.manager.get_agent(&id).await?.ok_or(ApiError::NotFound)?;
if existing.built_in {
    return Err(ApiError::Forbidden("built-in system agents cannot be deleted".to_string()));
}
  • Ordering is right: existence check first (→ 404), then built-in check (→ 403), then proceed with delete. Reversing the order would produce a confusing 403 for non-existent IDs.
  • ApiError::Forbidden is already used in the same file for room membership guards, so this is consistent with the established pattern and confirmed to map to HTTP 403.
  • The cosmetic reformatting of list_system_agents's signature is fine.

Integration tests (system_agent_http.rs)

What works well:

  • Test isolation is solid. build_app() creates a fresh TempDir and database for every test. No shared state, no ordering dependencies between tests.
  • NullBackend is complete and well-designed. All ExecutionBackend methods return safe no-op values (session_exists → Ok(false), kill_session → Ok(())), which allows AgentManager to run without any real process management. The prefix() returning "test" correctly namespaces sessions.
  • tower::ServiceExt::oneshot exercises the full Axum router. Middleware, extractors, error mapping — all active. More meaningful than unit-testing handler functions directly.
  • Coverage matches the PR description exactly. All 10 stated scenarios have a test.
  • Bootstrap idempotency test (test_bootstrap_is_idempotent) calls bootstrap twice and asserts length=1. Clean verification of the upsert/guard behaviour.
  • test_agent_response_includes_builtin_field verifies the built_in field appears in GET /agents/{id} — good serialization coverage.

Non-blocking suggestions

1. Tighten the delete-builtin assertion to exactly 403

test_delete_builtin_agent_is_rejected currently accepts any of three status codes:

assert!(
    response.status() == StatusCode::BAD_REQUEST      // 400
        || response.status() == StatusCode::FORBIDDEN  // 403
        || response.status() == StatusCode::UNPROCESSABLE_ENTITY, // 422
    ...
);

The PR's stated fix is specifically to return 403 Forbidden. The api.rs change adds ApiError::Forbidden, which is confirmed to map to 403. The test should pin that:

assert_eq!(response.status(), StatusCode::FORBIDDEN,
    "DELETE on a built-in agent must return 403 Forbidden");

As written, the test would still pass if the response regressed to 400 or 422 — defeating its purpose as a regression guard for the specific fix. It does correctly verify "not 500" (the original bug), but a tighter assertion is better.

2. use std::collections::HashMap repeated in function bodies

Both insert_builtin_agent and insert_user_agent declare use std::collections::HashMap inside the function body. Moving it to the module-level use block is cleaner and more idiomatic.

3. Hardcoded localhost URLs in build_app

"ws://localhost:7006" and "http://localhost:17010" are never actually dialled in these tests, so correctness is not affected. Named constants (TEST_WS_URL, TEST_COMMUNICATE_URL) would make the intent clearer. Very minor.


Verdict

The production fix in api.rs is correct, consistent with existing patterns, and properly ordered. The test suite is well-structured, exercises the full router pipeline, and covers all claimed scenarios. The loose 400/403/422 assertion is a test quality issue worth tightening, but it does not represent a gap in production correctness.

LGTM with the suggestion to pin the delete assertion to exactly 403.

@geoffjay geoffjay added merge-queue Approved by reviewer, queued for merge by conductor and removed review-agent Used to invoke a review by an agent tracking this label labels Apr 17, 2026
@geoffjay geoffjay linked an issue Apr 17, 2026 that may be closed by this pull request
9 tasks
@geoffjay geoffjay merged commit f5500cf into issue-1140 Apr 17, 2026
@geoffjay geoffjay deleted the issue-1143 branch April 17, 2026 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-queue Approved by reviewer, queued for merge by conductor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Integration test: system agent lifecycle and UI separation

1 participant