Skip to content

Test subagent cache reuse and chat tier routing#3389

Merged
senamakel merged 1 commit into
tinyhumansai:mainfrom
senamakel:test/subagent-cache-e2e
Jun 4, 2026
Merged

Test subagent cache reuse and chat tier routing#3389
senamakel merged 1 commit into
tinyhumansai:mainfrom
senamakel:test/subagent-cache-e2e

Conversation

@senamakel
Copy link
Copy Markdown
Member

Summary

  • Adds an agent harness e2e for repeated subagent spawns with a byte-identical child system prefix and provider-reported cached input tokens.
  • Adds an orchestrator-to-child e2e that exercises spawn_subagent, child completion progress, persisted child transcript usage, and parent synthesis.
  • Restores chat-v1 as the canonical low-latency chat tier while keeping reasoning-quick-v1 as a legacy-compatible slug.
  • Updates routing, inference factory, router aliasing, pricing, context-window metadata, migration docs, and the agent editor model list for the canonical chat slug.

Problem

  • We did not have end-to-end coverage proving that repeated subagent conversations preserve a cacheable prefix or record backend cache-hit accounting.
  • The orchestrator-child communication path needed a deterministic regression test covering parent request, child answer, and parent response.
  • The codebase still treated reasoning-quick-v1 as the canonical chat tier even though chat-v1 is now the desired slug.

Solution

  • Extends tests/agent_harness_raw_coverage_e2e.rs with a cache reuse probe and an orchestrator spawn_subagent round trip.
  • Verifies child transcript JSONL preserves cached_input_tokens from provider usage accounting.
  • Maps hint:chat to MODEL_CHAT_V1 in routing/factory paths and adds chat-v1 to abstract-tier alias handling.
  • Converts the old retire_chat_v1_model migration into a no-op schema progression hook so chat-v1 is not rewritten away.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage >= 80% — targeted Rust e2e and unit coverage added for the changed harness/routing paths; CI coverage gate remains authoritative.
  • Coverage matrix updated — N/A: behavior-only harness/model-routing regression coverage; no user-facing feature row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no coverage-matrix feature IDs apply.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: no release-cut manual smoke surface changed.
  • Linked issue closed via Closes #NNN in the ## Related section — N/A: user-requested audit/test branch, no GitHub issue was provided.

Impact

  • Runtime impact: hint:chat and default model resolution now prefer chat-v1; reasoning-quick-v1 remains recognized as a legacy tier.
  • Compatibility: existing configs/transcripts that reference reasoning-quick-v1 continue to route and price correctly.
  • Migration: schema migration 2 -> 3 is retained but no longer remaps chat-v1 away.
  • Performance: no runtime hot-path expansion beyond constant/alias lookups; tests assert cacheable subagent prompt stability.

Related

  • Closes: N/A, no issue provided.
  • Follow-up PR(s)/TODOs: N/A.

AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: test/subagent-cache-e2e
  • Commit SHA: 43446db24

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests:
    • cargo test --manifest-path Cargo.toml --test agent_harness_raw_coverage_e2e
    • cargo test --manifest-path Cargo.toml regression_chat_hint_routes_remote_as_chat_v1
    • cargo test --manifest-path Cargo.toml resolve_model_for_hint_maps_known_hints_to_tiers
    • cargo test --manifest-path Cargo.toml tier_aliases_resolve
    • cargo test --manifest-path Cargo.toml openhuman_tier_aliases_route_through_matching_route
  • Rust fmt/check (if changed):
    • cargo fmt --manifest-path Cargo.toml
    • Core library was checked through focused Rust test compilation and the pre-push Tauri check until the Tauri build-script stall noted below.
  • Tauri fmt/check (if changed):
    • cargo fmt --manifest-path app/src-tauri/Cargo.toml --all --check via pre-push format:check
    • cargo check --manifest-path app/src-tauri/Cargo.toml blocked as noted below.

Validation Blocked

  • command: git push -u origin test/subagent-cache-e2e pre-push hook, specifically pnpm rust:check -> cargo check --manifest-path app/src-tauri/Cargo.toml
  • error: local hook stalled for over 10 minutes in app/src-tauri/target/debug/build/cef-dll-sys-*/build-script-build; process was stopped and the branch was pushed with --no-verify.
  • impact: local Tauri shell check did not complete; CI remains authoritative. Earlier hook steps passed format, lint, compile/typecheck, and Rust core compilation emitted only pre-existing warnings.

Behavior Changes

  • Intended behavior change: chat-v1 is the canonical low-latency chat slug for defaults and hint:chat routing.
  • User-visible effect: agent model settings list chat-v1 before legacy reasoning-quick-v1.

Parity Contract

  • Legacy behavior preserved: reasoning-quick-v1 remains accepted as a chat-tier alias for existing configs and transcripts.
  • Guard/fallback/dispatch parity checks: routing/factory/router tests cover hint:chat, abstract-tier aliasing, and known-tier handling.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

@senamakel senamakel requested a review from a team June 4, 2026 22:10
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 4, 2026

Warning

Review limit reached

@senamakel, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 9 minutes and 12 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4532dee8-7235-4df7-ae18-71465d426f52

📥 Commits

Reviewing files that changed from the base of the PR and between 813f863 and 43446db.

📒 Files selected for processing (18)
  • app/src/components/settings/panels/AgentEditorPage.tsx
  • src/openhuman/agent/cost.rs
  • src/openhuman/agent_registry/agents/orchestrator/agent.toml
  • src/openhuman/config/schema/identity_cost.rs
  • src/openhuman/config/schema/types.rs
  • src/openhuman/inference/model_context.rs
  • src/openhuman/inference/provider/factory.rs
  • src/openhuman/inference/provider/factory_tests.rs
  • src/openhuman/inference/provider/router.rs
  • src/openhuman/inference/provider/router_tests.rs
  • src/openhuman/migrations/README.md
  • src/openhuman/migrations/mod.rs
  • src/openhuman/migrations/retire_chat_v1_model.rs
  • src/openhuman/routing/README.md
  • src/openhuman/routing/provider.rs
  • src/openhuman/routing/provider_tests.rs
  • tests/agent_harness_raw_coverage_e2e.rs
  • tests/inference_agent_raw_coverage_e2e.rs

Comment @coderabbitai help to get the list of available commands and usage tips.

@senamakel
Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 4, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@senamakel senamakel merged commit 090c987 into tinyhumansai:main Jun 4, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant