Fix agent provisioning flow and restore agent owner index#482
Fix agent provisioning flow and restore agent owner index#482justinmoon merged 8 commits intosledtools:masterfrom
Conversation
Navigate immediately to a loading screen when user taps "New Agent" instead of sitting on the chat list for ~40s. The provisioning screen shows phase, elapsed time, poll progress, and error state with retry. - Add AgentProvisioningPhase/State types with UniFFI bindings - Report progress from run_agent_flow via InternalEvent::AgentFlowProgress - Push Screen::AgentProvisioning on ensure_agent, clear on completion - Add iOS AgentProvisioningView and Android placeholder - Handle back-swipe cancellation and session logout cleanup - 5 new Rust tests for the provisioning state machine Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add max_agents column to agent_allowlist (default 1, NULL = unlimited). Drop the hard one-active-agent-per-owner unique index in favor of application-level enforcement that respects the per-user limit. - New migration: add max_agents column, drop unique index - ensure_agent checks count vs max_agents from allowlist - Admin dashboard shows and accepts max_agents field - Set max_agents to NULL via admin UI for unlimited agents Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Cargo.lock has mdk-core at 0.7.1 but the flake.nix outputHashes key still referenced 0.6.0, causing nix build to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wrap config attributes in explicit `config = { ... }` block since
the module defines `options`, which requires the split format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- High: Replace racy count-then-insert in ensure_agent with atomic INSERT...SELECT WHERE count < limit to prevent concurrent requests from exceeding max_agents. - High: Fix down migration to mark excess active rows as error before recreating the unique index, preventing rollback failure. - Medium: Prevent retry button from pushing duplicate AgentProvisioning screens by checking the stack before pushing. - Medium: Keep agent_provisioning state alive through CreateChat so the UI shows accurate PublishingKeyPackage/CreatingChat phases instead of falling back to generic "Starting agent..." text. - Medium: Cancel pending direct-chat creation on back-swipe so the chat doesn't finish opening after the user has navigated away. - Medium: Remove hardcoded value="1" from admin max_agents input so editing an entry doesn't silently clamp unlimited users back to 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds a cross-platform Agent Provisioning feature: new AgentProvisioning state, phase enum, FFI and UI wiring (iOS/Android), Rust core progress events and token-based flow control, server-side API and DB schema changes for per-user agent limits, plus migrations and admin UI updates. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant App as Mobile App (iOS/Android)
participant Core as Rust Core
participant Server as Server API
participant DB as Database
User->>App: Tap "Ensure Agent"
App->>Core: dispatch(EnsureAgent)
Core->>Core: set Screen=AgentProvisioning<br/>agent_flow_start = now
Core->>Server: run_agent_flow(tx, flow_token)
loop Polling / Progress
Server->>Core: AgentFlowProgress(flow_token, phase, agent_npub?, poll_attempt?)
Core->>Core: handle_agent_flow_progress(update state)
Core->>App: emit AppState (with agentProvisioning)
App->>App: render provisioning UI
end
alt Success
Server->>Core: run_agent_flow returns agent_npub
Core->>Core: advance phase -> PublishingKeyPackage / CreatingChat
Core->>DB: create direct chat / open chat
Core->>App: clear agentProvisioning, push Chat screen
App->>App: show Chat view
else Error
Server->>Core: AgentFlowProgress(..., Error, msg)
Core->>Core: set provisioning error state
Core->>App: emit AppState (phase=Error)
App->>User: show Try Again / Back
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
| send_progress( | ||
| &tx, | ||
| flow_token, | ||
| crate::state::AgentProvisioningPhase::Provisioning, | ||
| None, | ||
| Some(0), | ||
| ); |
There was a problem hiding this comment.
🟡 Initial progress sends poll_attempt=0, producing nonsensical "attempt 0/45" display
After ensure_agent succeeds in run_agent_flow, send_progress is called with poll_attempt: Some(0). The provisioning_status_message function formats this as "Starting microVM... (attempt 0/45)". Since the polling loop starts at attempt = 1 (rust/src/core/agent.rs:313), this "attempt 0" message is displayed to the user in the gap between the ensure call completing and the first actual poll — potentially several seconds — creating a confusing UX that suggests zero attempts have been made.
Status message formatting logic
In provisioning_status_message at rust/src/core/agent.rs:373-379:
if let Some(attempt) = poll_attempt {
format!("Starting microVM... (attempt {}/{})", attempt, AGENT_POLL_MAX_ATTEMPTS)
}When attempt is 0, this produces "Starting microVM... (attempt 0/45)".
| send_progress( | |
| &tx, | |
| flow_token, | |
| crate::state::AgentProvisioningPhase::Provisioning, | |
| None, | |
| Some(0), | |
| ); | |
| send_progress( | |
| &tx, | |
| flow_token, | |
| crate::state::AgentProvisioningPhase::Provisioning, | |
| None, | |
| None, | |
| ); |
Was this helpful? React with 👍 or 👎 to provide feedback.
| if max_agents != MAX_SUPPORTED_AGENTS { | ||
| return Err(( | ||
| StatusCode::BAD_REQUEST, | ||
| format!( | ||
| "max_agents > {MAX_SUPPORTED_AGENTS} is not supported until the API/client add multi-agent selection" | ||
| ), | ||
| )); | ||
| } |
There was a problem hiding this comment.
🟡 Misleading error message rejects max_agents < 1 with "> 1 is not supported" text
The max_agents validation rejects any value != MAX_SUPPORTED_AGENTS (i.e., != 1), but the error message only mentions > 1. If an admin submits max_agents=0 or a negative value (e.g. via curl, bypassing the HTML min="1" max="1" constraint at crates/pika-server/templates/admin/dashboard.html:19), they receive "max_agents > 1 is not supported..." which is factually incorrect for their input.
| if max_agents != MAX_SUPPORTED_AGENTS { | |
| return Err(( | |
| StatusCode::BAD_REQUEST, | |
| format!( | |
| "max_agents > {MAX_SUPPORTED_AGENTS} is not supported until the API/client add multi-agent selection" | |
| ), | |
| )); | |
| } | |
| if max_agents != MAX_SUPPORTED_AGENTS { | |
| return Err(( | |
| StatusCode::BAD_REQUEST, | |
| format!( | |
| "max_agents must be exactly {MAX_SUPPORTED_AGENTS} until the API/client add multi-agent selection" | |
| ), | |
| )); | |
| } |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust/src/core/mod.rs`:
- Around line 2910-2912: The call to invalidate_key_package_publish() here
clears local_key_package_published incorrectly; instead, separate the two
behaviors (invalidate stale publish callbacks vs mark key package unpublished)
and call only the callback-invalidation variant from this
navigation/cancellation path. Add a new method (e.g.
invalidate_key_package_publish_callbacks or extend
invalidate_key_package_publish with a parameter) that performs only the callback
cleanup without flipping local_key_package_published, update the place where
self.invalidate_key_package_publish() is currently called (alongside
self.invalidate_agent_flow() and self.invalidate_direct_chat_creation()) to call
the new callback-only function, and leave the existing full
invalidate_key_package_publish() (or call it elsewhere) for cases that must mark
the local key package as unpublished.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a7a7863-157a-4c84-b2e4-bb27cb5f160b
📒 Files selected for processing (6)
crates/pika-server/migrations/2026-03-07-020000_restore_agent_owner_active_index/up.sqlcrates/pika-server/src/admin.rscrates/pika-server/src/agent_api.rsrust/src/core/agent.rsrust/src/core/mod.rstodos/agent-provisioning-screen.md
🚧 Files skipped from review as they are similar to previous changes (4)
- crates/pika-server/migrations/2026-03-07-020000_restore_agent_owner_active_index/up.sql
- rust/src/core/agent.rs
- crates/pika-server/src/admin.rs
- todos/agent-provisioning-screen.md
| self.invalidate_agent_flow(); | ||
| self.invalidate_key_package_publish(); | ||
| self.invalidate_direct_chat_creation(); |
There was a problem hiding this comment.
Don't clear the published-key-package state when just canceling stale callbacks.
invalidate_key_package_publish() also flips local_key_package_published back to false. Calling it here on the successful handoff out of provisioning means the next CreateChat will re-enter the publish path, and can now fail on relay readiness even though our key package was already published. Please split "invalidate stale publish callbacks" from "mark local key package unpublished", and use only the former in this navigation/cancellation path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust/src/core/mod.rs` around lines 2910 - 2912, The call to
invalidate_key_package_publish() here clears local_key_package_published
incorrectly; instead, separate the two behaviors (invalidate stale publish
callbacks vs mark key package unpublished) and call only the
callback-invalidation variant from this navigation/cancellation path. Add a new
method (e.g. invalidate_key_package_publish_callbacks or extend
invalidate_key_package_publish with a parameter) that performs only the callback
cleanup without flipping local_key_package_published, update the place where
self.invalidate_key_package_publish() is currently called (alongside
self.invalidate_agent_flow() and self.invalidate_direct_chat_creation()) to call
the new callback-only function, and leave the existing full
invalidate_key_package_publish() (or call it elsewhere) for cases that must mark
the local key package as unpublished.
| return; | ||
| } | ||
| self.agent_flow_task = None; | ||
| self.agent_flow_start = None; |
There was a problem hiding this comment.
🟡 agent_flow_start cleared before final progress update, resetting elapsed_secs to 0
In handle_agent_flow_completed, self.agent_flow_start is set to None at rust/src/core/agent.rs:603 before handle_agent_flow_progress is called at line 613. Inside handle_agent_flow_progress (rust/src/core/agent.rs:576-577), elapsed_secs is computed from agent_flow_start, which is now None, so elapsed_secs becomes 0. This causes the provisioning UI to briefly flash 0s elapsed when transitioning to the CreatingChat phase, after the user has been watching elapsed time steadily increase throughout the Provisioning/Recovering phases.
| self.agent_flow_start = None; | |
| self.agent_flow_task = None; |
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Verification
Summary by CodeRabbit
New Features
Chores