feat: add proactive welcome onboarding flow (#522)
Conversation
- Introduced two new proactive agents: "Morning Briefing" and "Welcome Message".
- The Morning Briefing agent provides a daily summary of the user's calendar, tasks, and relevant context, scheduled to run automatically at 7 AM.
- The Welcome Message agent delivers a personalized greeting after onboarding, using user information for a tailored experience.
- Updated the cron job system to seed these agents upon onboarding completion, engaging users from the start.
- Added tests covering the new agents and their integration into the system.
- Added a new `ProactiveMessageSubscriber` to route proactive messages (e.g., morning briefings, welcome messages) to the user's active channel.
- Introduced a `ProactiveMessageRequested` event in the event bus to drive proactive message delivery.
- Updated the configuration schema to include a preferred channel for proactive messages.
- Seeded default proactive agent cron jobs so messages are delivered promptly after onboarding.
- Updated the startup process to register the proactive message subscriber, ensuring proactive messages are consistently routed.
- Added tests covering proactive message handling and delivery.
- Updated the welcome agent's description to clarify its role as the first point of contact for new users, guiding them through the remaining onboarding steps.
- Increased the agent's maximum iterations from 4 to 6 to allow more complex interactions.
- Introduced a new `complete_onboarding` tool to check the user's setup status and mark onboarding as complete.
- Restructured the prompt to give clearer guidance on the agent's workflow: checking setup status, greeting the user, and completing onboarding.
- Added `complete_onboarding` to the tools list so the welcome agent can manage user onboarding.
- Added documentation clarifying the agent's functionality.
- Updated the welcome agent's description to clarify its role as the first point of contact for new users, guiding them through setup and onboarding.
- Increased the agent's maximum iterations from 4 to 6 to allow more thorough interactions.
- Introduced a new `complete_onboarding` tool to check the user's setup status and mark onboarding as complete.
- Restructured the prompt to give clearer guidance on the agent's workflow: checking setup status, greeting the user, and completing onboarding.
- Added the new `complete_onboarding` tool to the tools list so it is available to the welcome agent.
- Improved the documentation for the welcome agent's prompt, emphasizing tone guidelines and what not to do during user interactions.
📝 Walkthrough

This PR adds proactive messaging: a new proactive DomainEvent, a ProactiveMessageSubscriber for web + optional external channel delivery, seeding of two built-in proactive agent cron jobs on onboarding completion, two new built-in agents (morning_briefing, welcome) with configs/prompts, a complete_onboarding agent tool, and schema + store changes to support agent-linked cron jobs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Onboarding as Onboarding UI/Agent
    participant Config as Config Service
    participant CronSeed as Cron Seeding
    participant Scheduler as Cron Scheduler
    User->>Onboarding: complete onboarding
    Onboarding->>Config: set_onboarding_completed(true)
    Config->>CronSeed: spawn seed_proactive_agents() (background)
    CronSeed->>Scheduler: create "welcome" job (one-shot +10s)
    CronSeed->>Scheduler: create "morning_briefing" job (cron 7:00)
    CronSeed-->>Config: jobs created
    Config-->>Onboarding: onboarding marked complete
```
```mermaid
sequenceDiagram
    participant Scheduler as Cron Scheduler
    participant EventBus as Event Bus
    participant Subscriber as ProactiveMessageSubscriber
    participant Web as Web Channel (Socket.IO)
    participant Ext as External Channel (if active)
    Scheduler->>EventBus: publish ProactiveMessageRequested(source, message, job_name)
    EventBus->>Subscriber: deliver event (domain="cron")
    Subscriber->>Web: publish_web_channel_event("proactive_message", payload)
    alt active_channel set and ≠ "web"
        Subscriber->>Ext: Channel::send(SendMessage)
        Ext-->>Subscriber: Ok / Err
    end
    Subscriber-->>EventBus: handled
```
```mermaid
sequenceDiagram
    participant Agent as Welcome Agent
    participant Tool as complete_onboarding Tool
    participant Config as Config Service
    participant CronSeed as Cron Seeding
    Agent->>Tool: execute(action="check_status")
    Tool->>Config: load config
    Tool-->>Agent: markdown status report
    Agent->>Tool: execute(action="complete")
    Tool->>Config: set onboarding_completed=true, save
    Tool->>CronSeed: spawn seed_proactive_agents() (background)
    CronSeed-->>Tool: jobs scheduled
    Tool-->>Agent: success
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 6
🧹 Nitpick comments (2)
src/openhuman/tools/ops.rs (1)
85-85: Add a regression assertion for `complete_onboarding` tool presence.

Line 85 adds a new default tool registration, but current tests only guard `spawn_subagent`. Add a direct assertion for `complete_onboarding` to prevent accidental removal in future refactors.

Proposed test update:
```diff
 #[test]
 fn all_tools_includes_spawn_subagent() {
 @@
     assert!(
         names.contains(&"spawn_subagent"),
         "spawn_subagent must be registered in the default tool list; got: {names:?}"
     );
+    assert!(
+        names.contains(&"complete_onboarding"),
+        "complete_onboarding must be registered in the default tool list; got: {names:?}"
+    );
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/tools/ops.rs` at line 85, add a regression assertion in the tests that explicitly checks the default tool registry contains the complete_onboarding tool to prevent accidental removals: locate the test that currently asserts spawn_subagent (referencing spawn_subagent and any DefaultTools/registry helper used there) and add a new expectation that a tool constructed by CompleteOnboardingTool::new() (or the tool id "complete_onboarding" if tests assert by name) is present in the default tools list returned by the same code path; mirror the existing assertion style used for spawn_subagent so the new check fails the test if CompleteOnboardingTool::new()/complete_onboarding is removed.

src/openhuman/agent/harness/definition_loader.rs (1)
272-275: Derive the expected registry size instead of hard-coding `13`.

This assertion now couples the override test to the current built-in count, so the next built-in addition will fail the test even if "custom overrides built-in in place" still works. Please compute the expected size from the built-ins (plus the synthetic `fork`) or compare against the pre-override registry length instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/definition_loader.rs` around lines 272 - 275, The test currently hard-codes assert_eq!(reg.len(), 13); which couples it to the current built-in count; instead compute the expected size dynamically by deriving it from the built-ins or by comparing to the registry length before the override: capture the registry length (e.g., let before = reg.len() or compute builtins_count + 1 for the synthetic "fork") prior to replacing the built-in `code_executor`, then assert that reg.len() equals that computed `expected` value after the override; reference the `reg` variable and the synthetic `fork` + built-ins collection (or use the pre-override `before` value) to form the expectation rather than using the literal 13.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/jsonrpc.rs`:
- Around line 843-851: This code registers a web-only proactive subscriber via
crate::core::event_bus::subscribe_global(Arc::new(crate::openhuman::channels::proactive::ProactiveMessageSubscriber::web_only()))
in jsonrpc.rs which duplicates registration done in the domain startup
(openhuman/channels/runtime/startup.rs) and causes duplicate
ProactiveMessageRequested deliveries; remove or guard this call so jsonrpc.rs
becomes a no-op for proactive registration and let the domain-level registrar in
bus.rs own the subscription. Concretely: delete or wrap the subscribe_global
call in jsonrpc.rs (and the similar block at the later 859-860 site) behind a
domain-owned check or feature flag and ensure only the registrar in bus.rs (the
domain-level subscriber registration function) performs
ProactiveMessageSubscriber::web_only() subscription. Ensure subscribe_global is
only invoked from the domain bus registrar to avoid dual handling.
In `@src/openhuman/agent/agents/mod.rs`:
- Around line 323-338: The test welcome_is_read_only_with_memory_recall_only is
out of sync with welcome/agent.toml: update the assertions that inspect the
ToolScope and max_iterations to match the new config returned by
find("welcome"); specifically, in the match for ToolScope::Named (from
def.tools) change the expected tools length check from 1 to 2 and assert that
both "complete_onboarding" and "memory_recall" are present (use
tools.iter().any(...) for each), and change assert_eq!(def.max_iterations, 4) to
assert_eq!(def.max_iterations, 6); keep the existing checks for
def.sandbox_mode, def.omit_memory_context, and def.omit_identity unchanged.
In `@src/openhuman/config/ops.rs`:
- Around line 566-580: In set_onboarding_completed, avoid blocking the async
runtime by replacing tokio::spawn with tokio::task::spawn_blocking when calling
the synchronous seed_proactive_agents; add tracing::debug logs at the start of
set_onboarding_completed, before the seeding branch (when value &&
!was_completed), inside the seeding branch when spawning the blocking task, and
in the noop branch (when no transition) so the false→true transition and its
outcome are traceable; keep the spawn_blocking closure to call
crate::openhuman::cron::seed::seed_proactive_agents(&seed_config) and still log
any error via tracing::warn as before.
In `@src/openhuman/config/schema/channels.rs`:
- Around line 27-35: Update the rustdoc on the proactive channel field in
channels.rs to reflect actual behavior: state that proactive messages are always
delivered to the "web" channel first and are then mirrored to the configured
external channel (if any) rather than being routed exclusively to that external
channel; reference the proactive message handler in
src/openhuman/channels/proactive.rs (the proactive delivery flow) in the comment
so future readers can find the implementation, and add a short note in the
relevant docs (e.g., AGENTS.md or architecture docs) clarifying the “web-first,
then mirror to external channel” behavior.
In `@src/openhuman/cron/scheduler.rs`:
- Around line 278-323: Normalize delivery.mode to a trimmed, lowercase string
before matching so legacy values like "Announce", "PROACTIVE", or values with
surrounding whitespace continue to work; e.g., obtain a normalized mode from
delivery.mode (using as_deref/unwrap_or, trim, and to_lowercase) and then match
on that normalized string instead of matching delivery.mode.as_str() directly,
leaving the publish_global calls (DomainEvent::ProactiveMessageRequested and
DomainEvent::CronDeliveryRequested) and the error handling for missing
channel/target unchanged.
In `@src/openhuman/cron/seed.rs`:
- Around line 72-113: The seeded cron jobs (added via add_agent_job in
seed_welcome and the morning briefing call) only persist raw prompts and never
attach an agent definition ID, so scheduler.rs still uses
Agent::from_config(...).run_single(...) with the prompt and bypasses your
morning_briefing/welcome agent implementations; update the add_agent_job calls
to include an agent-definition selector (e.g., pass the built-in agent
ID/definition name such as MORNING_BRIEFING_JOB_NAME / WELCOME_JOB_NAME into the
job record) and then modify run_agent_job (the scheduler path that currently
invokes Agent::from_config and run_single) to resolve the agent by that
definition ID (look up the agent implementation under
src/openhuman/agent/agents/{morning_briefing,welcome} and instantiate it) so the
cron path applies tool allowlists, iteration caps, and the agent prompt bodies
rather than only using the raw prompt.
---
Nitpick comments:
In `@src/openhuman/agent/harness/definition_loader.rs`:
- Around line 272-275: The test currently hard-codes assert_eq!(reg.len(), 13);
which couples it to the current built-in count; instead compute the expected
size dynamically by deriving it from the built-ins or by comparing to the
registry length before the override: capture the registry length (e.g., let
before = reg.len() or compute builtins_count + 1 for the synthetic "fork") prior
to replacing the built-in `code_executor`, then assert that reg.len() equals
that computed `expected` value after the override; reference the `reg` variable
and the synthetic `fork` + built-ins collection (or use the pre-override
`before` value) to form the expectation rather than using the literal 13.
In `@src/openhuman/tools/ops.rs`:
- Line 85: Add a regression assertion in the tests that explicitly checks the
default tool registry contains the complete_onboarding tool to prevent
accidental removals: locate the test that currently asserts spawn_subagent
(referencing spawn_subagent and any DefaultTools/registry helper used there) and
add a new expectation that a tool constructed by CompleteOnboardingTool::new()
(or the tool id "complete_onboarding" if tests assert by name) is present in the
default tools list returned by the same code path; mirror the existing assertion
style used for spawn_subagent so the new check fails the test if
CompleteOnboardingTool::new()/complete_onboarding is removed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d9a1de02-45c1-4671-8513-d8803fc96ffb
📒 Files selected for processing (20)
- src/core/event_bus/events.rs
- src/core/jsonrpc.rs
- src/openhuman/about_app/catalog.rs
- src/openhuman/agent/agents/mod.rs
- src/openhuman/agent/agents/morning_briefing/agent.toml
- src/openhuman/agent/agents/morning_briefing/prompt.md
- src/openhuman/agent/agents/welcome/agent.toml
- src/openhuman/agent/agents/welcome/prompt.md
- src/openhuman/agent/harness/definition_loader.rs
- src/openhuman/channels/mod.rs
- src/openhuman/channels/proactive.rs
- src/openhuman/channels/runtime/startup.rs
- src/openhuman/config/ops.rs
- src/openhuman/config/schema/channels.rs
- src/openhuman/cron/mod.rs
- src/openhuman/cron/scheduler.rs
- src/openhuman/cron/seed.rs
- src/openhuman/tools/impl/agent/complete_onboarding.rs
- src/openhuman/tools/impl/agent/mod.rs
- src/openhuman/tools/ops.rs
👮 Files not reviewed due to content moderation or server errors (4)
- src/openhuman/agent/agents/morning_briefing/agent.toml
- src/openhuman/agent/agents/welcome/prompt.md
- src/openhuman/channels/proactive.rs
- src/openhuman/tools/impl/agent/complete_onboarding.rs
…hance onboarding agent tests

- Replaced the direct registration of the proactive message subscriber with a new `register_web_only_proactive_subscriber` function, ensuring it is only registered once during domain startup.
- Updated the welcome agent test to reflect the addition of the `complete_onboarding` tool, verifying its presence and adjusting the expected number of tools.
- Improved logging for proactive agent seeding transitions, giving clearer insight during onboarding state changes.
- Refactored the cron job system to support agent definitions, allowing more flexible job scheduling and execution based on defined agent parameters.
- Improved documentation and comments across the affected modules to clarify the purpose of the changes.
- Streamlined the registration process for the web-only proactive message subscriber, improving clarity and maintainability.
- Consolidated logging statements in the onboarding completion function for readability and consistency.

These changes improve the overall structure and logging clarity of the proactive messaging and onboarding flows.
🧹 Nitpick comments (1)
src/openhuman/channels/proactive.rs (1)
82-87: Consider logging when the RwLock is poisoned.
`if let Ok(...)` silently ignores a poisoned lock. While lock poisoning is rare, silently failing to update the active channel could cause subtle bugs.

🔧 Proposed fix to log on poisoned lock:
```diff
 pub fn set_active_channel(&self, channel: Option<String>) {
-    if let Ok(mut guard) = self.active_channel.write() {
-        *guard = channel;
+    match self.active_channel.write() {
+        Ok(mut guard) => *guard = channel,
+        Err(e) => {
+            tracing::warn!(
+                "[proactive] failed to update active_channel — lock poisoned: {e}"
+            );
+        }
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/channels/proactive.rs` around lines 82 - 87, The set_active_channel method currently ignores a poisoned RwLock by using if let Ok(...), so update set_active_channel to handle the Err case from self.active_channel.write(): on Ok(mut guard) set *guard = channel as before, and on Err(poison_err) call the project's logger (e.g., tracing::error! or the existing logger instance) to record the poisoning and include poison_err and the attempted channel value; optionally recover by using poison_err.into_inner() to still set *guard = channel so the state is updated despite the poison.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/openhuman/channels/proactive.rs`:
- Around line 82-87: The set_active_channel method currently ignores a poisoned
RwLock by using if let Ok(...), so update set_active_channel to handle the Err
case from self.active_channel.write(): on Ok(mut guard) set *guard = channel as
before, and on Err(poison_err) call the project's logger (e.g., tracing::error!
or the existing logger instance) to record the poisoning and include poison_err
and the attempted channel value; optionally recover by using
poison_err.into_inner() to still set *guard = channel so the state is updated
despite the poison.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 15852bb6-f249-4a34-8f3c-11c17974d18b
📒 Files selected for processing (12)
- src/core/jsonrpc.rs
- src/openhuman/agent/agents/mod.rs
- src/openhuman/agent/harness/definition_loader.rs
- src/openhuman/channels/proactive.rs
- src/openhuman/config/ops.rs
- src/openhuman/config/schema/channels.rs
- src/openhuman/cron/mod.rs
- src/openhuman/cron/scheduler.rs
- src/openhuman/cron/seed.rs
- src/openhuman/cron/store.rs
- src/openhuman/cron/types.rs
- src/openhuman/tools/ops.rs
✅ Files skipped from review due to trivial changes (2)
- src/openhuman/tools/ops.rs
- src/openhuman/cron/seed.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- src/openhuman/cron/mod.rs
- src/openhuman/agent/harness/definition_loader.rs
- src/core/jsonrpc.rs
- src/openhuman/config/ops.rs
- src/openhuman/agent/agents/mod.rs
- src/openhuman/config/schema/channels.rs
…oarding flag (tinyhumansai#525)

Wires the per-agent tool scoping plumbing from commit 4a into the channel-message dispatch path. Each incoming channel message now picks the active agent — `welcome` pre-onboarding, `orchestrator` post — based on `Config::onboarding_completed`, loads the matching definition from the global `AgentDefinitionRegistry`, synthesises any `delegate_*` tools the agent declares in its `subagents` field, and passes everything through to `agent.run_turn` on the bus.

This is the half of tinyhumansai#525 that makes the welcome agent actually run for new users — the welcome definition has existed since upstream PR tinyhumansai#522 but had no caller; nothing in dispatch consulted the onboarding flag, so every channel message ran through the same generic tool loop with the full registry exposed.

Changes: src/openhuman/channels/runtime/dispatch.rs

- New `AgentScoping` struct carrying the three new `AgentTurnRequest` fields (`target_agent_id`, `visible_tool_names`, `extra_tools`) plus an `unscoped()` constructor for safe-fallback paths.
- New async `resolve_target_agent(channel)` helper:
  * fresh `Config::load_or_init().await` per turn (no cache — the loader reads from disk every call, verified at `config/schema/load.rs:409`, so the welcome→orchestrator handoff is observed on the next message after `complete_onboarding(complete)` flips the flag, with no need for an explicit handoff event);
  * picks `"welcome"` or `"orchestrator"` based on the flag and emits a structured `[dispatch::routing] selected target agent` info trace recording the choice + the flag value, satisfying the tinyhumansai#525 acceptance criterion "agent-selection logs clearly record why each agent was selected at onboarding boundaries";
  * looks up the definition in `AgentDefinitionRegistry::global()`, gracefully falling back to `AgentScoping::unscoped()` (= legacy behaviour, no filter, no extras) if the registry isn't initialised or the definition isn't found, so a routing miss never fails the user message;
  * for agents with a non-empty `subagents` field, awaits `composio::fetch_connected_integrations(&config)` and runs `orchestrator_tools::collect_orchestrator_tools` to materialise per-turn delegation tools (`research`, `plan`, `delegate_gmail`, …). Agents with empty `subagents` get an empty extras vec.
- New `build_visible_tool_set(definition, &extra_tools)` helper that returns `Some(union)` for `ToolScope::Named` agents (their named list ∪ the names of the synthesised delegation tools) and `None` for `ToolScope::Wildcard` agents to preserve the unfiltered semantics — so agents like `skills_agent` and `morning_briefing` that already work via `wildcard + category_filter` keep their existing behaviour without this layer interfering.
- `process_channel_message` calls `resolve_target_agent` once per turn, drops the placeholder defaults from commit 4a, and feeds the real `target_agent_id`/`visible_tool_names`/`extra_tools` into `AgentTurnRequest`.
- New imports: `AgentDefinition`, `AgentDefinitionRegistry`, `ToolScope`, `Config`, `fetch_connected_integrations`, `orchestrator_tools`, `Tool`, `HashSet`.

End-to-end behaviour after this commit:

1. New user, `onboarding_completed=false`: dispatch picks `welcome`, loads its 2-tool TOML scope, builds `visible_tool_names = {complete_onboarding, memory_recall}`, no extras, hands off to the bus. The bus handler applies the filter → welcome's LLM sees exactly 2 tools.
2. The welcome agent guides the user through setup and eventually calls `complete_onboarding(action="complete")` → the flag persists to disk via `config.save()`.
3. Next user message: dispatch reads the flag fresh, picks `orchestrator`, fetches connected Composio integrations, expands `subagents = ["researcher", "planner", "code_executor", "critic", "archivist", { skills = "*" }]` into delegate_research / delegate_plan / delegate_run_code / delegate_review_code / delegate_archive_session + one delegate_<toolkit> per connected integration. `visible_tool_names` is the union with the 4 direct tools from orchestrator's `[tools] named` list. The LLM sees the scoped delegation surface, not the full 1000+ Composio catalog.

tinyhumansai#526's runtime leak is now fixed end-to-end: the orchestrator's LLM prompt only contains the tools its TOML allows, and the SkillDelegationTool path narrows skills_agent to a single toolkit via the `skill_filter` propagation fix from commit 4a. No agent at any layer sees more than its definition declares.

Tests: 599/599 channel module tests pass — including `runtime_dispatch::dispatch_routes_through_agent_run_turn_bus_handler` and the telegram integration variant, which exercise the full bus roundtrip with the new fields populated. No existing assertions were modified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…itches (tinyhumansai#525)

Follow-up to tinyhumansai#522 that addresses a rough UX edge in the welcome agent prompt: users who arrive with only an API key configured (no channels, no Composio integrations, no web search / browser / local AI) were getting a "gentle suggestion" to connect something without any concrete picture of what they'd actually unlock. The welcome agent would finish onboarding, hand off to orchestrator, and leave the user staring at a functional-but-empty assistant with no roadmap for how to make it useful.

This commit rewrites the prompt to handle sparse setups explicitly. No Rust changes — this is prompt.md only, picked up at build time via the existing `include_str!` loader in `agents/mod.rs`.

Changes to `src/openhuman/agent/agents/welcome/prompt.md`:

* Step 2's "point out what's missing" sub-point is rewritten from a single "gently suggest" line into a four-case decision tree keyed off `check_status` state:
  - No API key → critical, block completion.
  - Integrations yes, channels no → note the Tauri-only reach limitation, suggest a messaging platform.
  - Channels yes, integrations no → degraded assistant, nudge toward Composio.
  - Nothing beyond the API key → the "bare install" case, gets the new Step 2.5 treatment.
* New **Step 2.5: Handling a bare install** section added after Step 2. Spells out what the user DOES have (sandboxed reasoning + coding assistant with memory), what they're MISSING (any external action), and how to structure the message: state the current capability honestly, pitch 2-3 specific integrations with concrete example prompts, point to Settings → Integrations / Channels, and leave room for the user to opt into the coding-only experience if that's what they actually want. For bare-install users the word budget stretches to 250-400 words (up from 200-350) so the concrete pitches and example prompts actually fit without cramming.
* New **Integration capability reference** section giving the LLM a menu it can draw from when pitching integrations. Each entry is a one-line "connect X → I can Y" with a concrete example prompt the user could send next:
  - Gmail: "Summarise the most important emails that came in overnight and flag anything that needs a reply today."
  - Google Calendar: "What's on my calendar tomorrow, and do I have a 30-minute gap before 2pm?"
  - GitHub: "List open issues on my main project tagged 'bug' and summarise which ones look newest or most urgent."
  - Notion: "Pull up my 'Ideas' Notion database and show me the three newest entries."
  - Slack / Discord / Linear / Jira / etc. with similar shapes.

  Plus a sub-section for messaging platforms (Telegram / Discord / Slack / iMessage / WhatsApp / Signal / web-fallback) that clarifies what each is best for, and a sub-section for the other capabilities (web search, browser automation, HTTP requests, local AI) that explains what breaks without them. The LLM is told NOT to list everything — just pick the 2-3 most likely to matter, defaulting to Gmail + GitHub + one of {Calendar, Notion} as the top-3 pitch when no profile context is available.
* Tone guidelines updated to document the stretched word budget for bare installs (200-350 for configured users, 250-400 for bare installs).
* "What NOT to do" list updated:
  - Explicitly allows product-tour-style listing ONLY in the bare-install case (Step 2.5), forbids it elsewhere.
  - Clarifies that describing what WOULD unlock with integration X is fine and encouraged; claiming a capability the user doesn't have is still forbidden.
  - Adds a new "Don't gloss over a bare install" entry that pins the rule: API-key-only users get concrete pitches and example prompts, not vague suggestions.

Scope note: this commit does NOT change the completion logic. `complete_onboarding(complete)` still accepts API-key-only as the minimum bar — that's a separate design question for the maintainer about whether zero-integration users should be gatekept. This change improves what the welcome agent SAYS to those users, not whether they're allowed to proceed.

Tests: all 14 `agent::agents::tests` pass (including `welcome_has_onboarding_and_memory_tools`, which validates that the welcome agent's declarative shape is unchanged). The prompt.md edit is pure content — no schema changes, no tool additions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_key (tinyhumansai#525) Discovered during end-to-end Tauri testing that the welcome agent refused to call `complete_onboarding(complete)` for a fully authenticated user because `check_status` reported "API key: **missing**" — even though the user had completed the desktop OAuth flow and a valid encrypted JWT was sitting in `auth-profiles.json`. ## Root cause `check_status` was reading only `config.api_key` (an `Option<String>` on the Config struct), which is a **legacy free-form provider key field**. It is not where the desktop OAuth flow stores its credentials. The actual openhuman backend session JWT lives in `<openhuman_dir>/auth-profiles.json` under the `app-session:default` profile, encrypted via the SecretStore using `<openhuman_dir>/.secret_key`, and is read at every inference RPC via `crate::api::jwt::get_session_token(config)` (which delegates to `AuthService::from_config(config).get_profile(APP_SESSION_PROVIDER, None)`). So a user who: 1. Completed the desktop deep-link OAuth flow (the only supported way to get an account), and 2. Has a fully populated `auth-profiles.json` with a valid encrypted `app-session:default` token, was reported by `check_status` as "API key missing" because their JWT lives in the auth profile store, not in `config.api_key`. The welcome agent then dutifully refused to call `complete_onboarding( complete)` per its own prompt rule ("If critical setup is missing, do not complete onboarding"), and re-ran on every chat turn forever even though there was nothing to fix. This is a **pre-existing upstream bug** dating from PR tinyhumansai#522 (the welcome agent's introduction). It was written before, or in parallel with, the auth-profile refactor that moved session credentials out of `config.api_key` into the dedicated profile store. The check was never updated to reflect the new auth model. 
## Fix `check_status` now checks BOTH sources: ```rust let has_legacy_api_key = config.api_key.as_ref().map_or(false, |k| !k.is_empty()); let has_session_jwt = crate::api::jwt::get_session_token(&config) .ok() .flatten() .is_some_and(|t| !t.is_empty()); let is_authenticated = has_legacy_api_key || has_session_jwt; ``` The status report now shows: * `Authentication: configured ✓ (session token from desktop login)` — when the user has gone through the OAuth flow (the typical case); * `Authentication: configured ✓ (legacy api_key)` — when the user has set the legacy `config.api_key` field directly (CI / dev setups); * `Authentication: **missing** — log in via the desktop app or set `api_key` in config to enable inference` — when neither source has a credential. The line label changed from "API key" to "Authentication" because "API key" was misleading for both states (it isn't really the user's API key; it's the openhuman backend session JWT). ## What this unblocks After this fix, a Tauri user who: 1. Logs in once via the desktop OAuth flow → JWT lands in `auth-profiles.json`, 2. Types `hi` in the chat pane → welcome agent runs, 3. Welcome calls `check_status` → reports "Authentication: configured ✓", 4. Welcome calls `complete_onboarding(complete)` → flips `chat_onboarding_completed` from false to true, 5. Next chat turn → routes to orchestrator. Without this fix, step 4 never happens because welcome's prompt explicitly forbids completing without an API key, so the user gets the welcome message on every single turn forever. ## Files touched `src/openhuman/tools/impl/agent/complete_onboarding.rs` * `check_status` reads both `config.api_key` and `crate::api::jwt::get_session_token(&config)`. * Status line renamed from "API key" to "Authentication" with a more informative message. * Long inline comment documenting the two auth sources, why the check needs to consult both, and what bug was being fixed. ## Tests 360/361 tools tests pass. 
The only failure is the pre-existing Windows browser-screenshot bug (`tools::impl::browser::screenshot::screenshot_command_contains_output_path`), which fails identically on upstream/main and is unrelated to this PR. The `complete_onboarding::tool_metadata` test still passes — it only checks the tool schema, not the runtime behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…525, #526) (#544)

* feat(agent): add subagents + delegate_name fields to AgentDefinition (#525, #526)

Introduces the schema change needed to make agent definitions the single source of truth for both direct tools and delegation targets:

- `subagents: Vec<SubagentEntry>` — declarative list of agents this agent can spawn, expanding at build time into synthesised delegate_* tools on the LLM's function-calling surface. Supports two TOML shapes via `#[serde(untagged)]`:
  * Bare string (`"researcher"`) → `SubagentEntry::AgentId`
  * Inline table (`{ skills = "*" }`) → `SubagentEntry::Skills(SkillsWildcard)`

  The `Skills` variant expands dynamically to one delegate_{toolkit} tool per connected Composio toolkit at runtime.
- `delegate_name: Option<String>` — optional override for the tool name this agent is exposed as when another agent lists it in `subagents`. Defaults to `delegate_{id}` when absent; lets the researcher agent be exposed as `research`, code_executor as `run_code`, etc.

Schema only — no runtime behavior change yet. Follow-up commits wire the field into `collect_orchestrator_tools`, the dispatch path, and the debug dump.

TOML placement note: `subagents = [...]` must appear before the `[tools]` table header in agent TOMLs. Once a table section opens, every subsequent top-level key is consumed by that table, so placing `subagents` after `[tools]` parses it as `tools.subagents` and fails deserializing ToolScope. The test doc-comment records this constraint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(agent): drive orchestrator delegation tools from TOML subagents (#525, #526)

Completes the half-built migration from a hardcoded ARCHETYPE_TOOLS table + orphan skill_delegation.rs to a TOML-driven delegation surface where each agent declares its subagents and the tool list is synthesised from the registry at agent-build time.
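To make the two TOML shapes and the placement constraint above concrete, a hypothetical agent file might look like this (field names follow the commit messages; the id and values are purely illustrative):

```toml
id = "orchestrator"

# Optional override for the tool name this agent is exposed as when another
# agent lists it in `subagents`; defaults to delegate_{id} when absent.
# delegate_name = "orchestrate"

# Must come BEFORE the [tools] table header: once a table section opens,
# every later top-level key would be parsed as tools.subagents and fail
# deserializing ToolScope.
subagents = [
    "researcher",        # bare string  -> SubagentEntry::AgentId
    { skills = "*" },    # inline table -> SubagentEntry::Skills(SkillsWildcard)
]

[tools]
named = ["spawn_subagent"]
```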
Rewrites `collect_orchestrator_tools` to take the orchestrator's AgentDefinition, the global AgentDefinitionRegistry, and the slice of connected Composio integrations, then iterates the definition's `subagents` field:

* `SubagentEntry::AgentId` → one ArchetypeDelegationTool. The tool's `name()` comes from the target agent's `delegate_name` override (or `delegate_{id}` fallback) and its `description()` is the target's `when_to_use`. Editing an agent's TOML `when_to_use` now immediately updates the tool schema the orchestrator LLM sees — zero drift, no hardcoded description strings to keep in sync.
* `SubagentEntry::Skills { skills = "*" }` → one SkillDelegationTool per connected Composio toolkit. Each routes to the generic skills_agent with `skill_filter` pre-populated. An empty integrations list (CLI path, or user not yet connected) produces zero tools rather than phantom `delegate_*` entries for unconnected toolkits.

Also in this commit:

- Wire the orphan `skill_delegation.rs` into `impl/agent/mod.rs` via `mod skill_delegation;` + `pub use SkillDelegationTool;`. The file has existed since earlier work but was never declared in the module tree, so it compiled as dead code.
- Delete the legacy `MAIN_AGENT_TOOL_ALLOWLIST` constant and the `main_agent_tools` filter in `tools/ops.rs`. They were documented as "no longer the primary source of truth" since the from_config builder switched to `collect_orchestrator_tools`, and grep confirms no external callers remain. Clean deletion.
- Delete the hardcoded `ARCHETYPE_TOOLS` const in `tools/impl/agent/mod.rs`. The 4-entry table has been replaced by the orchestrator TOML's `subagents` list (which covers those 4 plus archivist plus the skills wildcard), and the re-export in `tools/mod.rs` is removed accordingly.
- Update `agents/orchestrator/agent.toml`: add the `subagents` field listing researcher / planner / code_executor / critic / archivist / { skills = "*" }.
  Keep `spawn_subagent` in `[tools] named` as an advanced fallback so power users can still spawn custom workspace-override agent ids that aren't in the declarative subagents list.
- Add `delegate_name = "..."` to the 5 archetype TOMLs so the orchestrator LLM sees natural tool names (`research`, `plan`, `run_code`, `review_code`, `archive_session`) rather than the `delegate_<agent_id>` fallback.
- Update `agent/harness/session/builder.rs` (line ~461) to call the new `collect_orchestrator_tools` signature. Looks up the orchestrator definition from the global registry; passes an empty integrations slice because the builder is synchronous and cannot await Composio's async fetch. The channel-dispatch path will populate integrations in a later commit — the CLI/REPL path ships without per-toolkit delegation tools, which is an acceptable regression since CLI users still reach Composio via `composio_execute` and the retained `spawn_subagent` fallback.

Tests:

* 5 new unit tests in `orchestrator_tools.rs` cover the baseline AgentId + Skills wildcard expansion, the empty-integrations edge case, the unknown-id graceful skip, a non-delegating agent with empty subagents, and the slug sanitiser for tool-name-safe Composio toolkit names.
* Runs clean alongside all existing agent-module tests (323 pass; one pre-existing Windows-path failure in `self_healing::tests::tool_maker_prompt_includes_command` is unrelated to this PR and fails identically on the upstream baseline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(agent): plumb per-agent tool scoping through bus + tool loop (#525, #526)

Adds the parameter plumbing for agent-aware tool filtering without changing any runtime behaviour. Every existing call site continues to pass `None` / empty extras, so the LLM still sees the full unfiltered registry — the actual routing logic that populates these fields lands in commit 4b (dispatch.rs onboarding-flag → target_agent_id).
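As a rough sketch of what the two new call-site parameters ultimately compute (simplified to plain strings — the real registry holds boxed `Tool` trait objects, and the function name here is illustrative):

```rust
use std::collections::HashSet;

/// Builds the set of tools the model may see and call for one turn:
/// the global registry chained with per-turn extras, optionally narrowed
/// by a whitelist. `None` preserves the legacy "everything visible" path.
fn visible_tools<'a>(
    registry: &'a [String],
    extra_tools: &'a [String],
    visible_tool_names: Option<&HashSet<String>>,
) -> Vec<&'a String> {
    registry
        .iter()
        .chain(extra_tools.iter())
        .filter(|name| visible_tool_names.map_or(true, |set| set.contains(*name)))
        .collect()
}
```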
Why two parameters and not one filter: tools in this codebase are `Box<dyn Tool>` — owned trait objects with no Clone impl, stored in a shared `Arc<Vec<Box<dyn Tool>>>`. We can't cheaply build a per-turn filtered subset of the global registry, and we can't mutate the Arc to remove entries. Two new parameters work around this without touching the global registry's lifetime model:

* `visible_tool_names: Option<&HashSet<String>>` — whitelist filter applied at the iteration site inside `run_tool_call_loop`. When `Some(set)`, only tools whose `name()` is in the set contribute to the function-calling schema and are eligible for execution; every other tool in the combined registry is hidden from the model and rejected if the model emits a call for it. `None` preserves the legacy "everything visible" behaviour.
* `extra_tools: &[Box<dyn Tool>]` — per-turn synthesised tools spliced alongside `tools_registry`. The dispatch path will use this to surface delegation tools (`research`, `delegate_gmail`, …) that are built fresh each turn from the active agent's `subagents` field and the current Composio integration list — tools that don't exist in the global startup-time registry because they depend on per-user runtime state. Empty slice for agents that don't delegate.

Inside the loop, `tool_specs` is built from `tools_registry.iter().chain(extra_tools.iter()).filter(is_visible)`, and the tool-execution lookup uses the same chain + filter so the function-calling schema and the execution surface stay in sync.

Files touched in this commit:

src/openhuman/agent/harness/tool_loop.rs

- Add `visible_tool_names` and `extra_tools` parameters to `run_tool_call_loop`. Build `tool_specs` from chained iteration with the visibility filter applied. Replace the `find_tool` call at the execution site with an inline chain+filter lookup so hallucinated calls to filtered-out tools surface as "unknown tool" errors. Drop the now-unused `find_tool` import.
- Update the legacy `agent_turn` wrapper to pass `None, &[]`, preserving its existing unfiltered behaviour.
- Update all 9 in-file test sites to pass `None, &[]`.

src/openhuman/agent/harness/tests.rs

- Update all 3 `run_tool_call_loop` test sites to pass `None, &[]`.

src/openhuman/agent/bus.rs

- Add `target_agent_id: Option<String>`, `visible_tool_names: Option<HashSet<String>>`, and `extra_tools: Vec<Box<dyn Tool>>` fields to `AgentTurnRequest`, with rustdoc explaining each.
- Destructure the new fields in the `agent.run_turn` handler; thread `visible_tool_names.as_ref()` and `&extra_tools` through to `run_tool_call_loop`. Augment the dispatch trace with target_agent / extra_tool_count / visible_tool_count / filter_active so production logs show whether scoping is active.
- Update the in-test `test_request()` helper to populate the new fields with safe defaults.

src/openhuman/agent/triage/evaluator.rs

- Update the triage `AgentTurnRequest` initializer to set `target_agent_id = Some("trigger_triage")` (for tracing) with `visible_tool_names: None` + `extra_tools: Vec::new()`, because the classifier intentionally runs against an empty registry and emits a structured JSON decision rather than calling tools.

src/openhuman/channels/runtime/dispatch.rs

- Update the channel-message `AgentTurnRequest` initializer to set the three new fields to safe defaults (`None` / `None` / empty vec). Commit 4b will replace these with the real onboarding-flag based routing.

src/openhuman/tools/impl/agent/mod.rs

- Bug fix: `dispatch_subagent` previously took `_skill_filter: Option<&str>` but discarded the value, hardcoding `SubagentRunOptions::skill_filter_override = None`. That meant `SkillDelegationTool::execute()` synthesising `dispatch_subagent("skills_agent", ..., Some("gmail"))` never actually narrowed `skills_agent`'s tool list — so even with the orchestrator's view scoped, the spawned `skills_agent` subagent would still see the full Composio catalog.
  Drop the underscore, propagate `skill_filter` into `skill_filter_override`, and add a tracing log line to make this path observable. This is the downstream half of the #526 leak that commit 3's orchestrator-side scoping alone wouldn't have caught.

Tests: 8/8 `tool_loop` tests pass, 3/3 harness `tests.rs` cases pass, 323/324 agent module tests pass overall (the one failure is the same pre-existing Windows-path bug in `self_healing::tool_maker_prompt_includes_command` that fails identically on the upstream baseline). No existing test expectations were changed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(dispatch): route channel messages to welcome/orchestrator by onboarding flag (#525)

Wires the per-agent tool scoping plumbing from commit 4a into the channel-message dispatch path. Each incoming channel message now picks the active agent — `welcome` pre-onboarding, `orchestrator` post — based on `Config::onboarding_completed`, loads the matching definition from the global `AgentDefinitionRegistry`, synthesises any `delegate_*` tools the agent declares in its `subagents` field, and passes everything through to `agent.run_turn` on the bus.

This is the half of #525 that makes the welcome agent actually run for new users — the welcome definition has existed since upstream PR #522 but had no caller; nothing in dispatch consulted the onboarding flag, so every channel message ran through the same generic tool loop with the full registry exposed.

Changes:

src/openhuman/channels/runtime/dispatch.rs

- New `AgentScoping` struct carrying the three new `AgentTurnRequest` fields (`target_agent_id`, `visible_tool_names`, `extra_tools`) plus an `unscoped()` constructor for safe-fallback paths.
- New async `resolve_target_agent(channel)` helper:
  * fresh `Config::load_or_init().await` per turn (no cache — the loader reads from disk every call, verified at `config/schema/load.rs:409`), so the welcome→orchestrator handoff is observed on the next message after `complete_onboarding(complete)` flips the flag, with no need for an explicit handoff event;
  * picks `"welcome"` or `"orchestrator"` based on the flag and emits a structured `[dispatch::routing] selected target agent` info trace recording the choice + the flag value, satisfying the #525 acceptance criterion "agent-selection logs clearly record why each agent was selected at onboarding boundaries";
  * looks up the definition in `AgentDefinitionRegistry::global()`, gracefully falling back to `AgentScoping::unscoped()` (= legacy behaviour, no filter, no extras) if the registry isn't initialised or the definition isn't found, so a routing miss never fails the user message;
  * for agents with a non-empty `subagents` field, awaits `composio::fetch_connected_integrations(&config)` and runs `orchestrator_tools::collect_orchestrator_tools` to materialise per-turn delegation tools (`research`, `plan`, `delegate_gmail`, …). Agents with empty `subagents` get an empty extras vec.
- New `build_visible_tool_set(definition, &extra_tools)` helper that returns `Some(union)` for `ToolScope::Named` agents (their named list ∪ the names of the synthesised delegation tools) and `None` for `ToolScope::Wildcard` agents to preserve the unfiltered semantics — so agents like `skills_agent` and `morning_briefing` that already work via `wildcard + category_filter` keep their existing behaviour without this layer interfering.
- `process_channel_message` calls `resolve_target_agent` once per turn, drops the placeholder defaults from commit 4a, and feeds the real `target_agent_id`/`visible_tool_names`/`extra_tools` into `AgentTurnRequest`.
- New imports: `AgentDefinition`, `AgentDefinitionRegistry`, `ToolScope`, `Config`, `fetch_connected_integrations`, `orchestrator_tools`, `Tool`, `HashSet`.

End-to-end behaviour after this commit:

1. New user, `onboarding_completed=false`: dispatch picks `welcome`, loads its 2-tool TOML scope, builds `visible_tool_names = {complete_onboarding, memory_recall}`, no extras, hands off to the bus. The bus handler applies the filter → welcome's LLM sees exactly 2 tools.
2. Welcome agent guides the user through setup, eventually calls `complete_onboarding(action="complete")` → flag persists to disk via `config.save()`.
3. Next user message: dispatch reads the flag fresh, picks `orchestrator`, fetches connected Composio integrations, and expands `subagents = ["researcher", "planner", "code_executor", "critic", "archivist", { skills = "*" }]` into research / plan / run_code / review_code / archive_session (via the `delegate_name` overrides) + one delegate_<toolkit> per connected integration. `visible_tool_names` is the union with the 4 direct tools from orchestrator's `[tools] named` list. The LLM sees the scoped delegation surface, not the full 1000+ Composio catalog.

#526's runtime leak is now fixed end-to-end: the orchestrator's LLM prompt only contains the tools its TOML allows, and the SkillDelegationTool path narrows skills_agent to a single toolkit via the `skill_filter` propagation fix from commit 4a. No agent at any layer sees more than its definition declares.

Tests: 599/599 channel module tests pass — including `runtime_dispatch::dispatch_routes_through_agent_run_turn_bus_handler` and the telegram integration variant, which exercise the full bus roundtrip with the new fields populated. No existing assertions were modified.
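The core of the per-turn routing decision can be sketched in a few lines (the `Config` stand-in and the function name are assumptions; the real resolver is async and does registry lookup and delegation-tool synthesis on top of this):

```rust
// Simplified stand-in for the on-disk config.
struct Config {
    onboarding_completed: bool,
}

/// Pick the active agent for one channel turn. The real code re-reads the
/// flag from disk every turn, so the welcome → orchestrator handoff is
/// observed on the very next message after the flag flips.
fn select_target_agent(config: &Config) -> &'static str {
    if config.onboarding_completed {
        "orchestrator"
    } else {
        "welcome"
    }
}
```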
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(debug_dump): apply orchestrator definition filter to main dump (#526)

Replaces the explicit `empty_filter: HashSet::new()` in `render_main_agent_dump` with a real visibility whitelist derived from the orchestrator's `AgentDefinition`. The "main" dump path now mirrors the runtime dispatch path from commits 4a/4b — same definition, same delegation tool synthesis, same filter — so `openhuman agent dump-prompt main` shows what the LLM actually sees in production instead of the unfiltered global registry.

Before this commit, `dump-prompt main` always rendered every tool in the registry regardless of which agent was supposed to run, which is exactly the symptom #526 reports: "agent prompt tool scopes leak full GitHub tool catalog". The runtime fix in commit 4b stops the leak in production, but the dump path was the user's primary observability tool for inspecting prompts, so leaving it unscoped would mask future regressions and confuse debug sessions.

Changes in `src/openhuman/context/debug_dump.rs`:

* `dump_agent_prompt` now loads `AgentDefinitionRegistry` once at the top (previously only loaded inside the sub-agent branch). When the request is for "main" or "orchestrator_main", it looks up the "orchestrator" definition from the registry and passes it into `render_main_agent_dump` along with the registry itself. Missing orchestrator entry → structured error listing known agents instead of silently rendering an unfiltered prompt.
* `render_main_agent_dump` signature gains `registry: &AgentDefinitionRegistry` and `orchestrator_def: &AgentDefinition`.
  Inside, it:

  - calls `collect_orchestrator_tools(orchestrator_def, registry, connected_integrations)` to synthesise the same per-turn delegation tools (`research`, `plan`, `delegate_<toolkit>`, …) that dispatch generates;
  - extends `prompt_tools` with the synthesised extras so they contribute to the rendered tool catalogue;
  - builds `visible_filter: HashSet<String>` from `orchestrator_def.tools` (the `[tools] named` list) ∪ the names of the synthesised extras, falling back to an empty HashSet when the orchestrator definition uses `ToolScope::Wildcard` (which the prompt builder treats as "no filter, every tool visible") so dump consumers that supply a wildcard orchestrator (custom workspace overrides, tests) retain the legacy unscoped behaviour;
  - replaces `visible_tool_names: &empty_filter` in the `PromptContext` with `&visible_filter`;
  - filters the returned `tool_names` and `skill_tool_count` by the same predicate so the `DumpedPrompt` summary fields match what the prompt text actually contains.

Tests:

* Replaces the previous `render_main_agent_dump_includes_tool_instructions_and_skill_count` test with two more focused cases:

  1. `render_main_agent_dump_wildcard_scope_shows_full_tool_set` — regression guard for the legacy wildcard path. Builds a wildcard-scoped orchestrator definition, asserts every tool from `tools_vec` survives, and checks the standard system-prompt skeleton (Tools section, Tool Use Protocol, cache boundary) still renders.
  2. `render_main_agent_dump_named_scope_filters_to_whitelist` — the #526 regression guard. Builds an orchestrator with `ToolScope::Named(["query_memory", "ask_user_clarification"])` and a `tools_vec` containing `shell`, `query_memory`, and `GMAIL_SEND_EMAIL`. Asserts the dump's `tool_names` is exactly `["query_memory"]` — `shell` and `GMAIL_SEND_EMAIL` are in the global registry but NOT in the whitelist, so they MUST be excluded. If a future change reintroduces the unfiltered behaviour this test fails immediately.
* Adds two test helpers: `wildcard_orchestrator_def()` builds a minimal orchestrator definition with all `omit_*` flags set and `ToolScope::Wildcard`, and `registry_with_orchestrator(orch)` wraps it in an `AgentDefinitionRegistry` so the tests can call `render_main_agent_dump` without going through the full TOML loader.

11/11 debug_dump tests pass. The two new guards plus the 9 existing sub-agent / filter / composio-stub tests all run clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(dispatch): unit tests for build_visible_tool_set + cargo fmt cleanup (#525, #526)

Adds focused unit tests for the per-agent scoping helper landed in commit 4b and runs `cargo fmt` across the files touched by this PR.

Why scoping unit tests, not full integration tests: `resolve_target_agent` is async and reads `Config::load_or_init().await`, which does a real disk read every call (no cache, verified at `config/schema/load.rs:409`). Mocking that requires either spinning up a full workspace under a temp dir with a config.toml containing the right `onboarding_completed` value, or adding a test-only injection point on the public Config API. Both are tractable but invasive enough to belong in their own follow-up PR. The end-to-end dispatch path is already covered by the existing channel integration tests (`dispatch_routes_through_agent_run_turn_bus_handler` etc.), which exercise the full bus roundtrip with the new fields populated, and which still pass after the new resolver landed (it gracefully falls back to `AgentScoping::unscoped()` when no orchestrator definition is registered in the test environment).

Pure-function unit tests for `build_visible_tool_set` cover the branching logic that does the actual scoping work: how the named whitelist + extras union is built, how Wildcard scope is preserved, how duplicates are de-duplicated, etc. That's the part most likely to drift in future changes, so it's the part most worth fencing with focused tests.
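The branching logic under test can be sketched as a pure function (a simplification — `ToolScope` mirrors the two variants named in the commits, but this signature is an assumption; the real helper takes the full `AgentDefinition` and boxed tools):

```rust
use std::collections::HashSet;

enum ToolScope {
    Wildcard,
    Named(Vec<String>),
}

/// `None` = filter inactive, everything visible (wildcard agents).
/// `Some(set)` = whitelist of named tools ∪ synthesised delegation-tool
/// names; the HashSet de-duplicates collisions between the two lists.
fn build_visible_tool_set(
    scope: &ToolScope,
    extra_tool_names: &[String],
) -> Option<HashSet<String>> {
    match scope {
        ToolScope::Wildcard => None,
        ToolScope::Named(named) => {
            Some(named.iter().chain(extra_tool_names).cloned().collect())
        }
    }
}
```

Note the distinction the tests pin down: `Some(HashSet::new())` (filter active, nothing matches) is not the same as `None` (no filter at all).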
Tests added (all in `src/openhuman/channels/runtime/dispatch.rs` under the new `scoping_tests` module):

* `wildcard_scope_yields_none_filter` — `ToolScope::Wildcard` must produce `None` regardless of whether extras are present, so skills_agent / morning_briefing keep their full skill-category catalogue.
* `named_scope_without_extras_returns_named_only` — the welcome agent's path: 2 named tools, no delegation, exactly 2 entries in the visibility whitelist.
* `named_scope_with_extras_returns_union` — the orchestrator's path: 3 direct named tools + 3 synthesised extras (research, delegate_gmail, delegate_github) → 6 entries.
* `empty_named_with_extras_returns_extras_only` — guards a future "delegation-only" agent layout where the agent has no direct tools of its own, just spawns subagents.
* `empty_named_with_no_extras_returns_empty_set` — guards the distinction between `None` (no filter, all visible) and `Some(empty)` (filter active, nothing matches). Important because the prompt loop's `is_visible` check treats them differently.
* `duplicate_names_across_named_and_extras_are_deduplicated` — the HashSet handles collisions automatically, but the test pins that behaviour so a future migration to `Vec<String>` (which would silently double-count) gets caught.
* `agent_scoping_unscoped_has_no_filter_or_extras` — pins the safe-fallback constructor's contract. Used when the registry is uninitialised or the target agent is missing — every field must default to "no scoping" so the channel turn falls back to legacy unfiltered behaviour rather than crashing.

Plus a `cargo fmt` run across the 6 files modified by this PR. No behavioural changes.

Final test status across all commits 2-6 in this PR:

* agent::harness::definition: 10/10 ✅ (4 new for Subagents schema)
* agent::harness::tool_loop: 8/8 ✅
* agent::harness::tests: 3/3 ✅
* tools::orchestrator_tools: 5/5 ✅ (5 new)
* channels::*: 599/599 ✅ (incl. dispatch integration)
* channels::runtime::dispatch::scoping_tests: 7/7 ✅ (7 new)
* context::debug_dump: 11/11 ✅ (1 replaced + 1 new)
* Total agent module: 323/324 (one pre-existing Windows path failure in `self_healing::tool_maker_prompt_includes_command`, confirmed identical against the upstream/main baseline)

Pre-existing Windows-environment test failures NOT caused by this PR and out of scope (all confirmed identical on the upstream baseline; CI on Linux is unaffected):

* self_healing::tool_maker_prompt_includes_command (PathBuf separator)
* cron::scheduler::run_job_command_success / _failure (Unix shell)
* composio::trigger_history::archives_triggers_in_daily_jsonl... (path)
* local_ai::paths::target_paths_preserve_absolute_overrides (path)
* security::policy::checklist_root_path_blocked (POSIX absolute)
* security::policy::checklist_workspace_only_blocks_all_absolute (POSIX)
* tools::implementations::browser::screenshot::screenshot_command_contains_output_path (browser binary lookup)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(welcome): upgrade bare-install nudge with concrete integration pitches (#525)

Follow-up to #522 that addresses a rough UX edge in the welcome agent prompt: users who arrive with only an API key configured (no channels, no Composio integrations, no web search / browser / local AI) were getting a "gentle suggestion" to connect something without any concrete picture of what they'd actually unlock. The welcome agent would finish onboarding, hand off to the orchestrator, and leave the user staring at a functional-but-empty assistant with no roadmap for how to make it useful.

This commit rewrites the prompt to handle sparse setups explicitly. No Rust changes — this is prompt.md only, picked up at build time via the existing `include_str!` loader in `agents/mod.rs`.
Changes to `src/openhuman/agent/agents/welcome/prompt.md`:

* Step 2's "point out what's missing" sub-point is rewritten from a single "gently suggest" line into a four-case decision tree keyed off `check_status` state:
  - No API key → critical, block completion.
  - Integrations yes, channels no → note the Tauri-only reach limitation, suggest a messaging platform.
  - Channels yes, integrations no → degraded assistant, nudge toward Composio.
  - Nothing beyond the API key → the "bare install" case, gets the new Step 2.5 treatment.
* New **Step 2.5: Handling a bare install** section added after Step 2. Spells out what the user DOES have (sandboxed reasoning + coding assistant with memory), what they're MISSING (any external action), and how to structure the message: state the current capability honestly, pitch 2-3 specific integrations with concrete example prompts, point to Settings → Integrations / Channels, and leave room for the user to opt into the coding-only experience if that's what they actually want. For bare-install users the word budget stretches to 250-400 words (up from 200-350) so the concrete pitches and example prompts actually fit without cramming.
* New **Integration capability reference** section giving the LLM a menu it can draw from when pitching integrations. Each entry is a one-line "connect X → I can Y" with a concrete example prompt the user could send next:
  - Gmail: "Summarise the most important emails that came in overnight and flag anything that needs a reply today."
  - Google Calendar: "What's on my calendar tomorrow, and do I have a 30-minute gap before 2pm?"
  - GitHub: "List open issues on my main project tagged 'bug' and summarise which ones look newest or most urgent."
  - Notion: "Pull up my 'Ideas' Notion database and show me the three newest entries."
  - Slack / Discord / Linear / Jira / etc. with similar shapes.
  Plus a sub-section for messaging platforms (Telegram / Discord / Slack / iMessage / WhatsApp / Signal / web-fallback) that clarifies what each is best for, and a sub-section for the other capabilities (web search, browser automation, HTTP requests, local AI) that explains what breaks without them. The LLM is told NOT to list everything — just pick the 2-3 most likely to matter, defaulting to Gmail + GitHub + one of {Calendar, Notion} as the top-3 pitch when no profile context is available.
* Tone guidelines updated to document the stretched word budget for bare installs (200-350 for configured users, 250-400 for bare installs).
* "What NOT to do" list updated:
  - Explicitly allows product-tour-style listing ONLY in the bare-install case (Step 2.5), forbids it elsewhere.
  - Clarifies that describing what WOULD unlock with integration X is fine and encouraged; claiming a capability the user doesn't have is still forbidden.
  - Adds a new "Don't gloss over a bare install" entry that pins the rule: API-key-only users get concrete pitches and example prompts, not vague suggestions.

Scope note: this commit does NOT change the completion logic. `complete_onboarding(complete)` still accepts API-key-only as the minimum bar — that's a separate design question for the maintainer about whether zero-integration users should be gatekept. This change improves what the welcome agent SAYS to those users, not whether they're allowed to proceed.

Tests: all 14 `agent::agents::tests` pass (including `welcome_has_onboarding_and_memory_tools`, which validates that the welcome agent's declarative shape is unchanged). The prompt.md edit is pure content — no schema changes, no tool additions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(web-channel): route Tauri in-app chat to welcome/orchestrator + Windows fsync fix (#525)

Closes the web-channel-side half of the #525 welcome agent routing.
Also bundles a small pre-existing Windows compatibility fix for `app_state::ops::sync_parent_dir` that was blocking end-to-end testing of this branch on Windows.

## Web channel routing (primary change)

Commit 4b (274b50ed) wired the welcome-vs-orchestrator routing into `channels/runtime/dispatch.rs::process_channel_message` — the handler for inbound messages on external channels (Telegram, Discord, Slack, iMessage, etc.). That path reads `config.onboarding_completed` and selects the right agent via the bus.

End-to-end testing on the Tauri desktop app revealed a gap: the in-app chat window does NOT route through `process_channel_message`. It uses its own JSON-RPC method (`openhuman.channel_web_chat` → `start_chat` → `run_chat_task` → `build_session_agent` → `Agent::from_config`), which was hardcoded to build a generic main agent every turn. So a new user typing in the Tauri app saw the orchestrator immediately, never the welcome agent.

This commit closes that gap without adding a third code path:

### `src/openhuman/agent/harness/session/builder.rs`

* Split the existing `Agent::from_config` into a thin wrapper + a new `Agent::from_config_for_agent(config, agent_id)`. The wrapper calls the new function with `agent_id = "orchestrator"`, preserving the legacy behaviour byte-identically for all existing callers (CLI, REPL, tests, every call site that doesn't care which agent they get).
* `from_config_for_agent` looks up `agent_id` in `AgentDefinitionRegistry::global()` up front; unknown ids fail fast with a clear error. A missing orchestrator is tolerated as a legacy / pre-startup fallback so existing tests that run without a populated registry continue to work.
* When a target definition is resolved:
  - `prompt_builder` uses `SystemPromptBuilder::for_subagent` with the agent's `system_prompt` body (read from the registry, which already has it inlined via `include_str!` from `agents/*/prompt.md` at crate-build time) and the three `omit_*` flags from its TOML.
  - `visible_tool_names` is built from the agent's `ToolScope::Named` list, unioned with the names of any synthesised delegation tools produced by `collect_orchestrator_tools` (for agents that declare `subagents = [...]`).
  - `ToolScope::Wildcard` yields an empty filter, matching the legacy "no filter, all visible" semantics.
  - `temperature` comes from the agent's TOML (e.g. welcome is 0.7, orchestrator is 0.4) instead of `config.default_temperature`.
* Legacy behaviour for plain `from_config(config)` — which passes `agent_id = "orchestrator"` — is preserved: the prompt builder continues to use `SystemPromptBuilder::with_defaults()`, temperature comes from `config.default_temperature`, and the orchestrator's `subagents` list drives delegation tool synthesis exactly as before. No test expectations changed.

### `src/openhuman/channels/providers/web.rs`

* `build_session_agent` now reads `config.onboarding_completed` and picks `"welcome"` or `"orchestrator"` as the target agent id, then calls `Agent::from_config_for_agent`. A structured info log records the routing decision + the flag state for observability, matching the `[dispatch::routing]` trace pattern from commit 4b.
* `config.onboarding_completed` is read fresh every turn because `run_chat_task` loads the config via `config_rpc::load_config_with_timeout`, which reads from disk on every call (no in-process cache). The welcome → orchestrator handoff therefore happens automatically on the next chat message after `complete_onboarding(complete)` flips the flag, with no explicit event or notification plumbing.
* The `set_event_context` call is unchanged — the event context is a (session_id, channel_name) pair for telemetry, not the agent id.

## Windows fsync fix (bundled)

### `src/openhuman/app_state/ops.rs`

`sync_parent_dir` was calling `File::open(parent).and_then(|dir| dir.sync_all())` on the save path for `app_state.json`.
On Windows, opening a directory as a regular file requires `FILE_FLAG_BACKUP_SEMANTICS`, which `std::fs::File::open` does not set, so the call fails with "Access is denied. (os error 5)".

Mirrored the existing `#[cfg(unix)]` guard already in `config/schema/load.rs::sync_directory`. On non-Unix the function is a no-op and returns `Ok(())`. Durability on Windows is provided by `NamedTempFile::persist`'s atomic `MoveFileEx`, which is sufficient for config files.

This is a pre-existing upstream bug, not caused by this PR, but it blocks end-to-end testing of any code path that touches `app_state::save_stored_app_state_unlocked` on Windows — which includes the web channel's agent_server_status RPC. Fixing it here lets the #525 routing actually be tested on the developer's machine, and the fix is tiny and self-contained. The `File` import on line 1 is also gated behind `#[cfg(unix)]` to avoid an unused-import warning on Windows.

## Tests

All 35 `agent::harness::session` tests pass, including the transcript round-trip + session queue + runtime tests that exercise the builder path most heavily. No test expectations were modified; the refactor is a pure delegation chain (`from_config` → new inner method → existing builder).

CLI dump-prompt testing from the earlier session still validates:

* `openhuman agent dump-prompt welcome` → 2 tools, Step 2.5 bare-install prompt embedded
* `openhuman agent dump-prompt main` → 6 tools, zero skill leakage
* `openhuman agent dump-prompt main --stub-composio` → still 6 tools, composio meta-tools correctly filtered out

End-to-end Tauri testing (next step after this commit) will exercise the new `build_session_agent` branch live — user types `hi`, web channel routes to welcome, welcome replies with the updated bare-install prompt from commit 990a7264.
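For reference, the cfg-gated pattern described in the fsync fix can be sketched as follows. This is a minimal, illustrative shape, not the exact upstream code: the function name matches the commit message, but the body is assumed.

```rust
use std::io;
use std::path::Path;

// Sketch of the cfg-gated directory fsync described above. On Unix, open the
// parent directory and fsync it so a just-renamed file is durable; on Windows,
// File::open on a directory fails (it would need FILE_FLAG_BACKUP_SEMANTICS),
// so the function is a no-op and atomic-rename durability is relied on instead.
#[cfg(unix)]
fn sync_parent_dir(path: &Path) -> io::Result<()> {
    use std::fs::File;
    if let Some(parent) = path.parent() {
        File::open(parent)?.sync_all()?;
    }
    Ok(())
}

#[cfg(not(unix))]
fn sync_parent_dir(_path: &Path) -> io::Result<()> {
    // Durability comes from the atomic rename (MoveFileEx via NamedTempFile::persist).
    Ok(())
}

fn main() -> io::Result<()> {
    // The file itself need not exist; only the parent directory is opened.
    let p = std::env::temp_dir().join("app_state.json");
    sync_parent_dir(&p)?;
    println!("ok");
    Ok(())
}
```

The design point is that the platform difference lives entirely inside the function, so call sites stay unconditional.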
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(config): split chat_onboarding_completed from React UI onboarding flag (#525)

End-to-end testing on the Tauri desktop app revealed that the welcome agent could never run from the in-app chat pane — even with all of Commits 4b/8 in place — because the React layer's `OnboardingOverlay.tsx` renders a full-screen wizard whenever `onboarding_completed = false` and gates the chat pane behind it. By the time a user can type a chat message, the React wizard has already flipped `onboarding_completed = true` (via `OnboardingOverlay::handleDone` or the `Onboarding.tsx` wizard's completion handler), so the welcome agent's routing condition is never satisfied on the Tauri surface.

This commit fixes the architectural conflict by splitting the single `onboarding_completed` flag into two orthogonal flags:

* **`onboarding_completed`** — unchanged semantics. Tracks whether the React UI wizard has been completed/dismissed. Continues to be set by `OnboardingOverlay.tsx::handleDone` and `Onboarding.tsx` via the existing `config.set_onboarding_completed` JSON-RPC method. Used exclusively to gate whether the React layer renders the wizard.
* **`chat_onboarding_completed`** — NEW. Tracks whether the welcome agent's chat-based greeting flow has run. Set exclusively by the welcome agent itself via `complete_onboarding(action="complete")`. Used by both the Tauri web channel (`channels::providers::web::build_session_agent`) and the external channel dispatch path (`channels::runtime::dispatch::resolve_target_agent`) to decide whether to route to welcome or orchestrator.
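Under the assumption of these two boolean fields, the split can be modelled with a tiny self-contained sketch. The struct and helper below are illustrative, not the real `Config` type; only the field names come from the commit.

```rust
// Illustrative model of the two orthogonal flags and the routing decision.
#[derive(Default, Debug)]
struct OnboardingFlags {
    onboarding_completed: bool,      // React UI wizard state (gates the overlay)
    chat_onboarding_completed: bool, // chat welcome flow state (gates routing)
}

// Routing looks ONLY at the chat flag; the wizard flag is irrelevant here.
fn route(flags: &OnboardingFlags) -> &'static str {
    if flags.chat_onboarding_completed { "orchestrator" } else { "welcome" }
}

fn main() {
    // Fresh config, or an old config.toml missing the new field: both false.
    let mut flags = OnboardingFlags::default();
    assert_eq!(route(&flags), "welcome");

    // React wizard dismissed: routing is unaffected.
    flags.onboarding_completed = true;
    assert_eq!(route(&flags), "welcome");

    // Welcome agent called complete_onboarding(action="complete").
    flags.chat_onboarding_completed = true;
    assert_eq!(route(&flags), "orchestrator");
    println!("ok");
}
```

The point of the model: a Tauri user who finished the wizard still routes to welcome on their first chat turn, exactly as the scenarios below describe.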
The two flags are intentionally orthogonal:

* A Tauri desktop user completes the React wizard → only `onboarding_completed` flips → wizard disappears → user types `hi` → welcome agent runs (because `chat_onboarding_completed` is still false) → welcome calls `complete_onboarding(complete)` → `chat_onboarding_completed` flips → next chat turn routes to orchestrator.
* A Telegram/Discord user (no React wizard exists) sends a message → external channel dispatch checks `chat_onboarding_completed` → routes to welcome → welcome runs → flips the flag → next inbound message routes to orchestrator.

Both paths give every user the chat welcome experience, regardless of which surface they came in through, and without requiring the React wizard to be removed or restructured.

## Files changed

### `src/openhuman/config/schema/types.rs`

* Add `chat_onboarding_completed: bool` field with `#[serde(default)]` for backward compat — existing `config.toml` files that don't have the field default to `false`, which means existing users will see the welcome agent on their next chat turn (correct behaviour; the welcome agent is idempotent).
* The `Default` impl initializes the new field to `false`.
* Extensive rustdoc on both flags explaining the orthogonal split, why two flags exist, and which code paths gate on which.

### `src/openhuman/tools/impl/agent/complete_onboarding.rs`

* `check_status` now reports BOTH flags side-by-side so the welcome agent's LLM can see whether the React wizard has run AND whether the chat welcome itself has run. The old single "Onboarding completed" line is replaced with two lines: "UI onboarding wizard completed" and "Chat welcome flow completed".
* `complete()` now flips `chat_onboarding_completed`, NOT `onboarding_completed`. The React UI's flag is left untouched — that's owned by the React layer. The idempotency guard is updated to check the chat flag.
* Rustdoc on `complete()` explains the orthogonal-flags rationale for future readers.
### `src/openhuman/channels/providers/web.rs::build_session_agent`

* Reads `effective.chat_onboarding_completed` instead of `effective.onboarding_completed` for the welcome-vs-orchestrator decision.
* The log line now includes BOTH flags so observability captures the full picture (e.g. `chat_onboarding_completed=false, ui_onboarding_completed=true` is the expected steady state for a Tauri user who completed the React wizard but hasn't typed yet).

### `src/openhuman/channels/runtime/dispatch.rs::resolve_target_agent`

* Same flag swap for the external-channel path.
* The `[dispatch::routing] selected target agent` info trace also reports both flags.
* Function-level rustdoc updated with the new semantics.

## Backward compatibility

* Existing `config.toml` files: `chat_onboarding_completed` defaults to `false` via `#[serde(default)]`. This means existing users get a welcome message on their next chat turn — the desired behaviour, not a regression.
* React layer: untouched. The `config.set_onboarding_completed` JSON-RPC method continues to write the same field; the new flag is not exposed to React at all.
* External callers of `complete_onboarding(complete)`: they now flip a different flag. This could matter for callers that were depending on the flag-flip side effect; a quick grep shows `tools/impl/agent/complete_onboarding.rs` is the only caller, and the rest of the codebase reads `onboarding_completed` for UI purposes (snapshots, app state) and `chat_onboarding_completed` for routing — the split is clean.

## Tests

399/400 tools tests pass. The single failure (`tools::impl::browser::screenshot::screenshot_command_contains_output_path`) is a pre-existing Windows-environment bug that fails identically on the `upstream/main` baseline and is unrelated to this PR. No test expectations were modified.

## Next step

Restage the sidecar, relaunch Tauri, click through the React wizard once to dismiss it, then type `hi` in the chat pane.
This time `build_session_agent` should read `chat_onboarding_completed=false` and route to welcome, even though `onboarding_completed=true` was set by the React layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(welcome): force tool-call-first iteration to stop greeting fallback (#525)

End-to-end testing on the Tauri desktop app revealed that the welcome agent — even with all routing/scoping commits in place and the correct prompt embedded — was producing 1-line greetings like "Hey! What's up?" (15 chars, single iteration, zero tool calls) instead of running its workflow. The user typed `hihi`, the LLM saw the welcome agent's full ~14KB persona prompt, and chose to ignore every workflow step in favour of the chatbot fallback behaviour.

Diagnosis: the previous prompt was descriptive ("Call check_status to get a snapshot...") rather than imperative. Combined with a "Concise" tone guideline and a low-information user input ("hihi"), the model — under tension between "be warm and concise" and "follow the workflow" — collapsed to "tiny greeting", which is the chatbot training-data default for short user messages.

This commit makes the workflow non-negotiable by:

## 1. New "MANDATORY FIRST ACTION" preamble at the top of the prompt

Inserted as a blockquote between the role description and "Your workflow" so the model encounters it before any workflow steps. The preamble:

* States explicitly that the **first thing emitted on every turn must be a tool call** to `complete_onboarding(check_status)` — before any user-facing text, before any greeting, before any thinking out loud.
* States that the user's input is **irrelevant** to this rule: "hi", "hello", emoji, nothing — the first iteration always emits the same thing, a check_status tool call.
* Explains *why* the rule exists (without a status snapshot the agent has nothing to personalise on, and any blind greeting defeats the welcome agent's one job).
* Shows three concrete ❌ wrong examples (generic chitchat reply, greeting-then-tool, refusing the tool because the user said hi) and one ✅ correct example (the first iteration emits ONLY the tool call; the message comes in iteration 2).
* Closes with a stop-and-correct rule: "If you find yourself about to write any text in your first iteration, STOP. Emit the tool call instead."

## 2. Strengthened "Your workflow / Step 1" heading

Renamed to "Step 1: Check setup status (ALWAYS — see Mandatory First Action above)" so the cross-reference is unmissable. The body explicitly says "In your **first iteration**, ... No text. No greeting. Just the tool call. ... You will use this report to write the actual welcome message in your second iteration."

## 3. Length is non-negotiable (rewritten "Concise" tone guideline)

The previous "Concise — but scale with the situation" framing was the exact phrase the model latched onto when producing 15 chars. Rewritten to "Length is non-negotiable", adding:

* "A 1-2 sentence greeting is a failure, not a 'concise' success."
* An explicit "if you ever produce a message under 100 words, stop and try again — you've almost certainly skipped a required element."
* A "concise vs. terse" callout distinguishing "no wasted words in a message that does its full job" from "skip the job entirely."

## 4. New "What NOT to do" entries

Three new bullets target the exact failure modes observed:

* The existing "Don't skip check_status" entry is upgraded with "**This is the single most common failure mode**" and a pointer back to the mandatory-first-action preamble.
* A new "Don't reply with a 1-line greeting" entry forbids the chatbot fallback explicitly and sets a 100-word floor.
* A new "Don't treat 'hi' / 'hello' / short greetings as a signal to be brief" entry explicitly disconnects user input length from agent output length, since "hi" is the most common opening and the user typing it needs the FULL welcome experience.

## What this does NOT change

* No Rust code changes.
  `prompt.md` is `include_str!`'d into the binary at crate-build time via `agents/mod.rs`, so the next sidecar rebuild picks up the new content automatically.
* No agent definition changes (`agent.toml` untouched). The welcome agent still has 2 tools (`complete_onboarding` + `memory_recall`), `temperature = 0.7`, `max_iterations = 6`, and unchanged `omit_*` flags.
* No model hint change. Still `"agentic"`. If the new prompt language alone isn't enough to push the model over the workflow-following threshold, a follow-up commit can bump to `"reasoning"` for stronger instruction-following.
* The integration capability reference, Step 2.5 bare-install handling, subscription/referral flow, and handoff sections are unchanged from commit 990a7264. Those were never the problem — the problem was the model never reaching them because it skipped Step 1 entirely.

## Tests

14/14 `agent::agents::tests` pass, including `welcome_has_onboarding_and_memory_tools` and `every_builtin_has_a_prompt_body`. The structural shape of the welcome agent definition is byte-identical; only the embedded prompt body changed.

## Verification plan

After restaging the sidecar and relaunching Tauri, type `hi` in the chat pane. Expected behaviour:

1. `[web-channel] routing chat turn to 'welcome'` (already validated in commit 56d95e2c).
2. `[agent::builder] building session agent id=welcome` (already validated).
3. **`[agent_loop] iteration start i=1`** with a `check_status` tool call as the first emission — this is the new behaviour we're verifying.
4. `[agent_loop] iteration start i=2` after the tool returns, with the actual welcome message — should be 100-400 words covering all required elements.
5. If everything's in place, a `complete_onboarding(complete)` call in the same turn or a later iteration → `chat_onboarding_completed` flips to `true` → next chat turn routes to orchestrator.
If iteration 1 still produces text instead of a tool call, the next escalation is bumping the model hint from `"agentic"` to `"reasoning"` in `agent.toml`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(complete_onboarding): check session JWT in addition to legacy api_key (#525)

Discovered during end-to-end Tauri testing that the welcome agent refused to call `complete_onboarding(complete)` for a fully authenticated user because `check_status` reported "API key: **missing**" — even though the user had completed the desktop OAuth flow and a valid encrypted JWT was sitting in `auth-profiles.json`.

## Root cause

`check_status` was reading only `config.api_key` (an `Option<String>` on the Config struct), which is a **legacy free-form provider key field**. It is not where the desktop OAuth flow stores its credentials. The actual openhuman backend session JWT lives in `<openhuman_dir>/auth-profiles.json` under the `app-session:default` profile, encrypted via the SecretStore using `<openhuman_dir>/.secret_key`, and is read at every inference RPC via `crate::api::jwt::get_session_token(config)` (which delegates to `AuthService::from_config(config).get_profile(APP_SESSION_PROVIDER, None)`).

So a user who:

1. completed the desktop deep-link OAuth flow (the only supported way to get an account), and
2. has a fully populated `auth-profiles.json` with a valid encrypted `app-session:default` token,

was reported by `check_status` as "API key missing" because their JWT lives in the auth profile store, not in `config.api_key`. The welcome agent then dutifully refused to call `complete_onboarding(complete)` per its own prompt rule ("If critical setup is missing, do not complete onboarding"), and re-ran on every chat turn forever even though there was nothing to fix.

This is a **pre-existing upstream bug** dating from PR #522 (the welcome agent's introduction).
It was written before, or in parallel with, the auth-profile refactor that moved session credentials out of `config.api_key` into the dedicated profile store. The check was never updated to reflect the new auth model.

## Fix

`check_status` now checks BOTH sources:

```rust
let has_legacy_api_key = config.api_key.as_ref().map_or(false, |k| !k.is_empty());
let has_session_jwt = crate::api::jwt::get_session_token(&config)
    .ok()
    .flatten()
    .is_some_and(|t| !t.is_empty());
let is_authenticated = has_legacy_api_key || has_session_jwt;
```

The status report now shows:

* `Authentication: configured ✓ (session token from desktop login)` — when the user has gone through the OAuth flow (the typical case);
* `Authentication: configured ✓ (legacy api_key)` — when the user has set the legacy `config.api_key` field directly (CI / dev setups);
* `Authentication: **missing** — log in via the desktop app or set api_key in config to enable inference` — when neither source has a credential.

The line label changed from "API key" to "Authentication" because "API key" was misleading for both states (it isn't really the user's API key; it's the openhuman backend session JWT).

## What this unblocks

After this fix, a Tauri user who:

1. logs in once via the desktop OAuth flow → JWT lands in `auth-profiles.json`,
2. types `hi` in the chat pane → welcome agent runs,
3. welcome calls `check_status` → reports "Authentication: configured ✓",
4. welcome calls `complete_onboarding(complete)` → flips `chat_onboarding_completed` from false to true,
5. next chat turn → routes to orchestrator.

Without this fix, step 4 never happens because welcome's prompt explicitly forbids completing without an API key, so the user gets the welcome message on every single turn forever.

## Files touched

### `src/openhuman/tools/impl/agent/complete_onboarding.rs`

* `check_status` reads both `config.api_key` and `crate::api::jwt::get_session_token(&config)`.
* Status line renamed from "API key" to "Authentication" with a more informative message.
* A long inline comment documents the two auth sources, why the check needs to consult both, and what bug was being fixed.

## Tests

360/361 tools tests pass. The only failure is the pre-existing Windows browser-screenshot bug (`tools::impl::browser::screenshot::screenshot_command_contains_output_path`), which fails identically on upstream/main and is unrelated to this PR. The `complete_onboarding::tool_metadata` test still passes — it only checks the tool schema, not the runtime behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(welcome): force complete_onboarding(complete) in iteration 2 (#525)

A live Tauri test after Commit 11 (auth-aware check_status) showed the welcome agent completing iteration 1 with the correct check_status tool call AND iteration 2 with a beautiful 1586-char welcome message — but NOT calling complete_onboarding(action="complete") to flip chat_onboarding_completed. So the user gets the welcome message and then routes to welcome AGAIN on every subsequent chat turn forever, because the prompt's Step 3 is descriptive rather than imperative.

Same failure pattern as the iteration-1 chitchat fallback that Commit 10 fixed — the LLM follows the path of least resistance. If the prompt says "call X to finalize" without making it mandatory, the model will write the welcome text, see that it's done its job, and stop without emitting the second tool call.

Fix: rewrite Step 3 with the same "MANDATORY" preamble pattern that worked for the iteration-1 fix. Specifically:

* Renamed Step 3 to "Complete onboarding — MANDATORY in iteration 2 (when authenticated)" so the cross-reference is unmissable.
* Added a blockquoted "MANDATORY SECOND TOOL CALL" preamble at the top of Step 3 that:
  - States the rule: when check_status reports "Authentication: configured ✓", iteration 2 MUST contain BOTH the welcome message text AND a complete_onboarding(action="complete") tool call.
  - Explains the consequence: without the tool call, the user is routed to welcome forever, which is a hard failure.
  - Defines the iteration-2 output structure as a 2-element list: welcome text + tool call.
  - Shows ❌/✅ examples (welcome message without tool call vs. with tool call) so the model has a concrete pattern to match.
  - Documents the single exception: only when authentication is missing should complete() be skipped, and explicitly says "missing channels, missing Composio integrations, missing local AI — none of those block completion." Auth is the only blocker.
* Replaced the descriptive bullet list with a stricter "Decision rule for iteration 2" that pairs each check_status outcome with exactly what the agent must emit.

## What stays the same

* No code changes — the existing complete_onboarding tool already handles the action="complete" call correctly (Commit 11 made it flip the right flag and check the right auth source).
* No agent.toml changes — welcome still has 2 tools, max_iterations=6, temperature=0.7. Plenty of headroom for the 2-iteration flow.
* The decision logic itself is unchanged — auth → complete, no auth → no complete. Only the framing changed.

## Test plan

Restage the sidecar, relaunch Tauri, type `hi`. Expected:

```
i=1: complete_onboarding(check_status) → "Authentication: configured ✓"
i=2: 1500-2000 char welcome message + complete_onboarding(action="complete")
     ↓
[complete_onboarding] chat welcome flow marked complete, proactive agents seeded
     ↓
chat_onboarding_completed flips false → true
```

Then type a follow-up message and expect:

```
[web-channel] route → orchestrator (chat_onboarding_completed=true)
```

That's the welcome → orchestrator handoff — the actual goal of #525.
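The auth-only gating that the rewritten Step 3 encodes can be sketched in code. The enum and function below are purely hypothetical (the real "decision rule" lives in prompt text, not Rust); the sketch only pins down the logic: authentication is the sole blocker for completion.

```rust
// Hypothetical model of the iteration-2 decision rule: if authenticated, the
// turn must emit the welcome text AND the complete() tool call; if not, emit
// welcome text with login steps and skip complete(). Missing channels,
// integrations, or local AI never block completion.
#[derive(Debug, PartialEq)]
enum Iter2Output {
    WelcomeAndComplete,    // welcome message + complete_onboarding(action="complete")
    WelcomeWithLoginSteps, // welcome message only; onboarding stays incomplete
}

fn iteration2(authenticated: bool, _channels_missing: bool, _integrations_missing: bool) -> Iter2Output {
    // Only the auth bit matters; the other inputs are deliberately ignored.
    if authenticated {
        Iter2Output::WelcomeAndComplete
    } else {
        Iter2Output::WelcomeWithLoginSteps
    }
}

fn main() {
    // Everything else missing, but authenticated: still complete.
    assert_eq!(iteration2(true, true, true), Iter2Output::WelcomeAndComplete);
    // Everything else present, but unauthenticated: never complete.
    assert_eq!(iteration2(false, false, false), Iter2Output::WelcomeWithLoginSteps);
    println!("ok");
}
```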
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(web-channel): include target_agent_id in THREAD_SESSIONS cache key (#525)

Final bug discovered during E2E testing: after Commit 12 made the welcome agent successfully call complete_onboarding(complete) and flip chat_onboarding_completed to true, the user typed a follow-up message — but the next turn STILL routed through the welcome agent instead of orchestrator.

## Root cause

`run_chat_task` caches the built `Agent` in a `THREAD_SESSIONS` HashMap keyed by `(client_id, thread_id)`. The cache-hit predicate checked `model_override` and `temperature` for invalidation but not the routing target — so when the routing decision flipped (chat_onboarding_completed: false → true) between turns, the cache happily returned the stale welcome agent and `build_session_agent` was never invoked.

This was a real bug introduced by Commit 8 — when I added agent-aware routing to `build_session_agent`, I should have extended the cache key (or hit predicate) with the target agent id. Without that, the routing fix only worked on the FIRST turn of any thread; every subsequent turn served the cached agent regardless of flag changes.

Symptoms in the live test:

* Turn 1: routes to welcome ✓, welcome runs 3 iterations, calls complete(), flag flips on disk → cached as welcome agent.
* Turn 2: cache hits on (client_id, thread_id) and returns the cached welcome agent. NO `[web-channel]` routing trace, NO `[agent::builder]` trace. Welcome runs again with history_len=10.
* Result: the orchestrator handoff never happens. The user is stuck in welcome forever.

## Fix

Three small changes in `web.rs`:

### 1. `SessionEntry` gains a `target_agent_id` field

```rust
struct SessionEntry {
    agent: Agent,
    model_override: Option<String>,
    temperature: Option<f64>,
    target_agent_id: String, // NEW
}
```

This documents which agent definition was used to build the cached `Agent`, so the next turn's cache lookup can compare against the current routing decision.

### 2. New `pick_target_agent_id(config)` helper

```rust
fn pick_target_agent_id(config: &Config) -> &'static str {
    if config.chat_onboarding_completed {
        "orchestrator"
    } else {
        "welcome"
    }
}
```

This mirrors the routing decision inside `build_session_agent` so `run_chat_task` can compute it once up front and use it as a cache key component. Since `Config::load_or_init` reads from disk every call, the value reflects the freshly persisted state — meaning the moment the welcome agent flips the flag, the next turn's `pick_target_agent_id` returns the new value, the cache-hit predicate falls through, and `build_session_agent` is invoked.

### 3. Cache-hit predicate extended

```rust
let mut agent = match prior {
    Some(entry)
        if entry.model_override == model_override
            && entry.temperature == temperature
            && entry.target_agent_id == target_agent_id =>
    {
        log::info!("[web-channel] reusing cached session agent id={} ...");
        entry.agent
    }
    Some(prior_entry) => {
        log::info!(
            "[web-channel] cache miss — rebuilding session agent \
             (was id={}, now id={}) ...",
            prior_entry.target_agent_id,
            target_agent_id
        );
        build_session_agent(...)?
    }
    None => build_session_agent(...)?,
};
```

The `Some(prior_entry)` arm now distinguishes stale-cache hits (rebuild + log the transition) from cold misses (rebuild silently). The transition log line gives observability for the welcome → orchestrator handoff: whenever you see "cache miss — rebuilding session agent (was id=welcome, now id=orchestrator)", that's the moment of the handoff in production.

## Tests

The library compiles. The cache-key change is small and the logic is straightforward; no test changes were needed (existing tests don't exercise the `THREAD_SESSIONS` cache flow, and adding a unit test would require mocking the global config + memory backend, which is significantly more setup than the change itself warrants).

Live verification: type `hi` (welcome), wait for the "all set" message + flag flip, then type a follow-up.
Expected logs:

```
[web-channel] routing chat turn to 'welcome'   (turn 1)
[agent::builder] building session agent id=welcome
... welcome runs, calls complete(), flag flips ...
[web-channel] routing chat turn to 'orchestrator'   (turn 2)
[web-channel] cache miss — rebuilding session agent (was id=welcome, now id=orchestrator)
[agent::builder] building session agent id=orchestrator
```

That's the complete welcome → orchestrator handoff that has been the goal of #525 since day one.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(complete_onboarding): document tool semantics in schema, terse success result (#525)

Stopping the prompt-bloat cycle on the welcome agent. Previous commits (10, 12, and a half-applied iteration of 14) progressively added MANDATORY blockquotes to welcome's prompt.md to fix three distinct failure modes: chitchat fallback in iter 1, missing complete() in iter 2, and prose leak in iter 3. Each fix worked, but the welcome prompt is now ~250 lines of imperatives encoding what should be tool semantics, not agent persona.

The right home for tool semantics is the tool's own schema — specifically `Tool::description()` and `parameters_schema()`. Those fields are what the LLM sees when deciding when and how to call the tool, and they apply to every agent that uses the tool, not just welcome. Encoding the contract there lets the welcome prompt stay about persona / workflow / tone, not implementation details.
## Changes

### `tools/impl/agent/complete_onboarding.rs`

#### `Tool::description()` — full rewrite

Replaces the previous 5-line description with a structured ~30-line contract documenting BOTH actions:

* **`check_status`** — an explicit list of what the report contains (auth, default model, channels, integrations, memory, both onboarding flags), an explicit "side effects: NONE (read-only)" declaration, and explicit framing as "intended for an LLM agent to read and use as the basis for a personalized welcome message" so consumers know how to use the result.
* **`complete`** — explicit semantics: flips `chat_onboarding_completed`, triggers the welcome → orchestrator handoff, seeds proactive cron jobs, is idempotent, writes config.toml, and has a pre-condition (the user must be authenticated). And critically:

> The complete action returns the literal token "ok" on success.
> **This return value is a machine-readable success marker, not
> user-facing prose.** Do not paraphrase it, summarize it, or
> acknowledge it back to the user — the actual user-facing welcome
> text should have been emitted alongside the tool call in the
> same iteration. The chat layer extracts the LAST iteration's
> text as the user-visible reply, so any prose written after this
> tool returns will overwrite the welcome message in the chat pane.

This is the contract that prevents the iter-3 paraphrase leak, documented at the source of the tool result instead of in every consumer's prompt.

#### `parameters_schema()` — extended action description

The `action` enum's description now mirrors the contract:

> "check_status" → read-only inspection ... "complete" → finalize
> the chat welcome flow, flips chat_onboarding_completed to true,
> returns the literal token "ok" (NOT a user-facing message — do
> not paraphrase the result back to the user).

JSON-schema descriptions are seen by the LLM at function-calling decision time, so the warning lands in the same place the LLM is deciding whether to call the tool.
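For illustration, the extended `action` property might look roughly like the fragment below in JSON Schema. The exact field layout and surrounding object are assumptions; only the enum values and the description's contract language come from the commit.

```json
{
  "type": "object",
  "properties": {
    "action": {
      "type": "string",
      "enum": ["check_status", "complete"],
      "description": "\"check_status\" → read-only inspection of setup state (no side effects). \"complete\" → finalize the chat welcome flow; flips chat_onboarding_completed to true and returns the literal token \"ok\" (NOT a user-facing message — do not paraphrase the result back to the user)."
    }
  },
  "required": ["action"]
}
```

Putting the warning in the parameter description keeps it visible at the exact moment the model is choosing arguments, which is the design point made above.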
#### `complete()` return value

Changed from a 118-char chatty success string ("Chat welcome flow marked as complete. Morning briefing and proactive agent jobs have been set up. The user is all set!") to the literal 2-char `"ok"`.

The chatty string was the source of the iter-3 paraphrase leak — the LLM kept treating it as something to summarize back to the user in iteration 3, which overwrote the iter-2 welcome message in the chat pane (because the chat layer extracts the last iteration's text). With "ok" there's nothing to paraphrase. The inline comment block on the return statement records the bug history, so a future maintainer who sees `Ok(ToolResult::success("ok"))` and is tempted to make it more "informative" understands why it's deliberately terse.

### `agents/welcome/prompt.md`

Reverts the half-applied "Step 6: STOP after iteration 2" blockquote that I started adding in the previous commit attempt. Replaces it with a single one-line pointer at the end of Step 5:

> "(See the `complete_onboarding` tool's own description for what
> its `"ok"` return value means and why you should not paraphrase it
> back to the user.)"

That's enough context for the welcome agent to understand the return-value contract without duplicating the contract text. The contract lives in the tool, not in the agent that consumes it.

## What this DOESN'T touch

* Commits 10 (mandatory first action) and 12 (mandatory second tool call) stay as-is. They're still arguably too forceful, but they encode workflow steps specific to the welcome agent's persona, not general tool semantics. Trimming them is a separate prompt-audit follow-up.
* No code changes outside `complete_onboarding.rs`. No agent.toml changes, no schema changes, no other tool changes. Pure documentation + a return-value tweak.

## Tests

`tool_metadata` still passes (it pins name, scope, permission_level, and schema shape — not description text, which is intentionally allowed to drift). 1/1 complete_onboarding tests pass.
The library compiles clean.

## Test plan

Restage the sidecar, relaunch Tauri, type `hi`. Expected:

* iter 1: check_status → 600-char status report
* iter 2: 1500-char welcome message + complete_onboarding(complete)
* iter 3 (if it fires): no prose, or trivial prose ≪ iter 2 chars
* The chat pane shows the iter-2 welcome message, NOT a confirmation paraphrase.

Co-Authored-By: Claude Opu…
## Summary

Add the `complete_onboarding` agent tool and update the capability catalog to reflect the new onboarding behavior.