
Fix agent provisioning flow and restore agent owner index#482

Merged
justinmoon merged 8 commits into sledtools:master from justinmoon:new-agent-ux
Mar 7, 2026

@justinmoon justinmoon (Collaborator) commented Mar 7, 2026

Summary

  • fix agent provisioning cancellation/error handling so stale async callbacks do not complete after the user backs out
  • restore single-active-agent enforcement with a corrective migration for databases that already dropped the owner-active partial index
  • add Android provisioning error UI and keep iOS/Android bindings regenerated after the Rust changes
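The first bullet (stale async callbacks must not complete after the user backs out) is commonly handled with a generation token. The following is a minimal sketch of that idea, not the PR's actual code: `FlowTracker`, its fields, and `on_flow_completed` are hypothetical names; only the notion of a flow token and of invalidating the agent flow comes from this conversation.

```rust
/// Hypothetical stand-in for the core's bookkeeping of one in-flight agent flow.
#[derive(Default)]
pub struct FlowTracker {
    /// Bumped on every start or cancel; only the newest token is live.
    current_token: u64,
    /// Set when a live flow reports completion.
    pub completed_npub: Option<String>,
}

impl FlowTracker {
    /// Start a new flow and return the token its async callbacks must echo back.
    pub fn start_flow(&mut self) -> u64 {
        self.current_token += 1;
        self.current_token
    }

    /// Called when the user backs out: bumping the token orphans pending callbacks.
    pub fn invalidate_flow(&mut self) {
        self.current_token += 1;
    }

    /// Apply a completion callback only if it belongs to the live flow.
    /// Returns true when the callback was accepted.
    pub fn on_flow_completed(&mut self, token: u64, agent_npub: &str) -> bool {
        if token != self.current_token {
            return false; // stale: the flow was cancelled or restarted
        }
        self.completed_npub = Some(agent_npub.to_string());
        true
    }
}
```

With this shape, a callback that arrives after `invalidate_flow()` carries an old token and is dropped without touching any state.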

Verification

  • cargo fmt
  • cargo test -p pika-server prepare_agent_for_reprovision
  • cargo test -p pika-server authenticated_requester_npub_accepts_valid_nip98_header
  • cargo test -p pika_core group_management_validation
  • cargo test -p pika_core agent_provisioning_tests
  • just ios-gen-swift
  • just gen-kotlin
  • cd android && ./gradlew app:compileDebugKotlin


Summary by CodeRabbit

  • New Features

    • Added a dedicated agent provisioning screen with live progress, status messaging, elapsed time, and masked agent identifier; includes retry on error.
    • Provisioning UI integrated across platforms (Android, iOS) and wired into chat-creation flow.
  • Chores

    • Admin dashboard: add "Max Agents" field per allowlist entry to configure per-user agent limits; corresponding server migrations applied.

justinmoon and others added 7 commits March 6, 2026 23:03
Navigate immediately to a loading screen when user taps "New Agent"
instead of sitting on the chat list for ~40s. The provisioning screen
shows phase, elapsed time, poll progress, and error state with retry.

- Add AgentProvisioningPhase/State types with UniFFI bindings
- Report progress from run_agent_flow via InternalEvent::AgentFlowProgress
- Push Screen::AgentProvisioning on ensure_agent, clear on completion
- Add iOS AgentProvisioningView and Android placeholder
- Handle back-swipe cancellation and session logout cleanup
- 5 new Rust tests for the provisioning state machine

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add max_agents column to agent_allowlist (default 1, NULL = unlimited).
Drop the hard one-active-agent-per-owner unique index in favor of
application-level enforcement that respects the per-user limit.

- New migration: add max_agents column, drop unique index
- ensure_agent checks count vs max_agents from allowlist
- Admin dashboard shows and accepts max_agents field
- Set max_agents to NULL via admin UI for unlimited agents

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The Cargo.lock has mdk-core at 0.7.1 but the flake.nix outputHashes
key still referenced 0.6.0, causing nix build to fail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Wrap config attributes in explicit `config = { ... }` block since
the module defines `options`, which requires the split format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- High: Replace racy count-then-insert in ensure_agent with atomic
  INSERT...SELECT WHERE count < limit to prevent concurrent requests
  from exceeding max_agents.
- High: Fix down migration to mark excess active rows as error before
  recreating the unique index, preventing rollback failure.
- Medium: Prevent retry button from pushing duplicate AgentProvisioning
  screens by checking the stack before pushing.
- Medium: Keep agent_provisioning state alive through CreateChat so the
  UI shows accurate PublishingKeyPackage/CreatingChat phases instead of
  falling back to generic "Starting agent..." text.
- Medium: Cancel pending direct-chat creation on back-swipe so the chat
  doesn't finish opening after the user has navigated away.
- Medium: Remove hardcoded value="1" from admin max_agents input so
  editing an entry doesn't silently clamp unlimited users back to 1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
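
The first "High" item above replaces a count-then-insert sequence with a single conditional insert. The sketch below is an in-memory analogy in Rust, not the server's SQL: `AgentStore` and `try_insert_agent` are hypothetical, and a mutex stands in for whatever atomicity the database statement provides (which depends on the database and its isolation level). `max_agents = None` mirrors the `NULL = unlimited` rule from the commit notes.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// In-memory stand-in for the agent table: owner npub -> active agent count.
pub struct AgentStore {
    active: Mutex<HashMap<String, u32>>,
}

impl AgentStore {
    pub fn new() -> Self {
        Self { active: Mutex::new(HashMap::new()) }
    }

    /// Check the limit and insert inside one critical section, so two
    /// concurrent callers cannot both observe "count < limit" and both insert.
    /// `max_agents = None` means unlimited.
    pub fn try_insert_agent(&self, owner: &str, max_agents: Option<u32>) -> bool {
        let mut active = self.active.lock().unwrap();
        let count = active.entry(owner.to_string()).or_insert(0);
        if let Some(limit) = max_agents {
            if *count >= limit {
                return false; // at the per-user limit: reject
            }
        }
        *count += 1;
        true
    }

    /// Number of active agents currently recorded for `owner`.
    pub fn active_count(&self, owner: &str) -> u32 {
        self.active.lock().unwrap().get(owner).copied().unwrap_or(0)
    }
}
```

The racy variant would read the count, release the lock, and insert afterwards; keeping both steps under one guard is what closes the window.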

coderabbitai bot commented Mar 7, 2026

📝 Walkthrough


Adds a cross-platform Agent Provisioning feature: new AgentProvisioning state, phase enum, FFI and UI wiring (iOS/Android), Rust core progress events and token-based flow control, server-side API and DB schema changes for per-user agent limits, plus migrations and admin UI updates.
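
A rough sketch of how the progress events described in this walkthrough might be shaped. The phase names (`Provisioning`, `Recovering`, `PublishingKeyPackage`, `CreatingChat`, `Error`) and the five-argument `send_progress` call appear elsewhere in this conversation; the struct layouts, the `std::sync::mpsc` channel, and the signature of `handle_agent_flow_progress` are assumptions made for illustration.

```rust
use std::sync::mpsc::Sender;

#[derive(Debug, Clone, PartialEq)]
pub enum AgentProvisioningPhase {
    Provisioning,
    Recovering,
    PublishingKeyPackage,
    CreatingChat,
    Error,
}

/// Assumed shape of the state exposed to the UI.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentProvisioningState {
    pub phase: AgentProvisioningPhase,
    pub agent_npub: Option<String>,
    pub poll_attempt: Option<u32>,
}

/// Assumed payload of the progress event.
#[derive(Debug)]
pub struct AgentFlowProgress {
    pub flow_token: u64,
    pub phase: AgentProvisioningPhase,
    pub agent_npub: Option<String>,
    pub poll_attempt: Option<u32>,
}

/// Report progress from the background flow; a closed channel is ignored.
pub fn send_progress(
    tx: &Sender<AgentFlowProgress>,
    flow_token: u64,
    phase: AgentProvisioningPhase,
    agent_npub: Option<String>,
    poll_attempt: Option<u32>,
) {
    let _ = tx.send(AgentFlowProgress { flow_token, phase, agent_npub, poll_attempt });
}

/// Fold one event into the UI state, dropping events from superseded flows.
pub fn handle_agent_flow_progress(
    state: &mut Option<AgentProvisioningState>,
    live_token: u64,
    ev: AgentFlowProgress,
) {
    if ev.flow_token != live_token {
        return;
    }
    *state = Some(AgentProvisioningState {
        phase: ev.phase,
        agent_npub: ev.agent_npub,
        poll_attempt: ev.poll_attempt,
    });
}
```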

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Rust core & state | rust/src/state.rs, rust/src/updates.rs, rust/src/core/mod.rs, rust/src/core/agent.rs, rust/src/core/session.rs | Introduce AgentProvisioningPhase and AgentProvisioningState, add Screen::AgentProvisioning, emit/handle InternalEvent::AgentFlowProgress, add token-based direct-chat flow control, progress handling, and cleanup helpers; adjust signatures and tests. |
| FFI / Android state | android/app/src/main/java/com/pika/app/rust/pika_core.kt, android/app/src/main/java/com/pika/app/AppManager.kt | Add AgentProvisioning Kotlin data class and enum, FFI converters for the new types, extend AppState with optional agentProvisioning and initialize it in AppManager. |
| Android UI | android/app/src/main/java/com/pika/app/ui/PikaApp.kt | Add AgentProvisioningScreen composable and route handling (rendering progress, error UI, retry/back actions). |
| iOS UI | ios/Sources/Views/AgentProvisioningView.swift, ios/Sources/ContentView.swift, ios/Sources/PreviewData.swift | New AgentProvisioningView and ContentView integration; preview state includes agentProvisioning = nil; retry wiring to manager.ensureAgent(). |
| Server API & models | crates/pika-server/src/agent_api.rs, crates/pika-server/src/models/agent_allowlist.rs, crates/pika-server/src/models/agent_instance.rs, crates/pika-server/src/models/mod.rs, crates/pika-server/src/models/schema.rs | Refactor agent endpoints with json_response and reprovision helpers, add max_agents column to allowlist (Option), extend upsert signature to accept max_agents, add get() helper, and derive QueryableByName for AgentInstance. |
| Migrations & Admin | crates/pika-server/migrations/...allowlist_max_agents/...sql, crates/pika-server/migrations/...restore_agent_owner_active_index/...sql, crates/pika-server/src/admin.rs, crates/pika-server/templates/admin/dashboard.html | Add migration for max_agents, up/down scripts to manage the unique index, an admin form field and display column for Max Agents, and validation/parsing in the upsert handler. |
| Build / infra | flake.nix, infra/nix/modules/microvm-host.nix | Update the cargoLock outputHashes key for mdk-core, and wrap microvm-host config attributes in an explicit config = { ... } block. |
| Tests & previews | ios/Tests/AppManagerTests.swift, crates/pika-server/src/models/mod.rs | Initialize agentProvisioning = nil in test preview/state builders; update upsert call sites to provide the additional parameter (None). |
| Docs / TODOs | todos/agent-provisioning-screen.md | Specification and implementation notes for the provisioning UX, progress events, token handling, and cross-platform wiring. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant App as Mobile App (iOS/Android)
    participant Core as Rust Core
    participant Server as Server API
    participant DB as Database

    User->>App: Tap "Ensure Agent"
    App->>Core: dispatch(EnsureAgent)

    Core->>Core: set Screen=AgentProvisioning<br/>agent_flow_start = now
    Core->>Server: run_agent_flow(tx, flow_token)

    loop Polling / Progress
        Server->>Core: AgentFlowProgress(flow_token, phase, agent_npub?, poll_attempt?)
        Core->>Core: handle_agent_flow_progress(update state)
        Core->>App: emit AppState (with agentProvisioning)
        App->>App: render provisioning UI
    end

    alt Success
        Server->>Core: run_agent_flow returns agent_npub
        Core->>Core: advance phase -> PublishingKeyPackage / CreatingChat
        Core->>DB: create direct chat / open chat
        Core->>App: clear agentProvisioning, push Chat screen
        App->>App: show Chat view
    else Error
        Server->>Core: AgentFlowProgress(..., Error, msg)
        Core->>Core: set provisioning error state
        Core->>App: emit AppState (phase=Error)
        App->>User: show Try Again / Back
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


Poem

🐰 I hopped a trail of progress beams,

Tokens, polls, and hopeful dreams;
Screens that wait and updates sing,
Agents rise—what joy they bring!
Retry, back, the rabbit grins, and springs.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Fix agent provisioning flow and restore agent owner index' accurately summarizes the two main objectives of this pull request: fixing the agent provisioning flow and restoring the agent owner index enforcement. |


coderabbitai[bot]

This comment was marked as resolved.


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 potential issues.

View 8 additional findings in Devin Review.


Comment on lines +303 to +309
send_progress(
    &tx,
    flow_token,
    crate::state::AgentProvisioningPhase::Provisioning,
    None,
    Some(0),
);

🟡 Initial progress sends poll_attempt=0, producing nonsensical "attempt 0/45" display

After ensure_agent succeeds in run_agent_flow, send_progress is called with poll_attempt: Some(0). The provisioning_status_message function formats this as "Starting microVM... (attempt 0/45)". Since the polling loop starts at attempt = 1 (rust/src/core/agent.rs:313), this "attempt 0" message is displayed to the user in the gap between the ensure call completing and the first actual poll — potentially several seconds — creating a confusing UX that suggests zero attempts have been made.

Status message formatting logic

In provisioning_status_message at rust/src/core/agent.rs:373-379:

if let Some(attempt) = poll_attempt {
    format!("Starting microVM... (attempt {}/{})", attempt, AGENT_POLL_MAX_ATTEMPTS)
}

When attempt is 0, this produces "Starting microVM... (attempt 0/45)".

Suggested change
- send_progress(
-     &tx,
-     flow_token,
-     crate::state::AgentProvisioningPhase::Provisioning,
-     None,
-     Some(0),
- );
+ send_progress(
+     &tx,
+     flow_token,
+     crate::state::AgentProvisioningPhase::Provisioning,
+     None,
+     None,
+ );
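
To illustrate the finding, here is a hedged sketch of a status formatter that shows the counter only once a real poll has started. The function name and message text follow the excerpt quoted above, and the cap of 45 is inferred from the "attempt 0/45" display; the `Option<u32>` parameter type and the fallback text for the no-attempt case are assumptions, not the repository's code.

```rust
/// Cap inferred from the "attempt 0/45" message quoted in the review.
pub const AGENT_POLL_MAX_ATTEMPTS: u32 = 45;

/// Format the provisioning status line.
/// The counter appears only for attempts >= 1; `None` (or a stray 0)
/// falls back to a plain message. The fallback wording is an assumption.
pub fn provisioning_status_message(poll_attempt: Option<u32>) -> String {
    match poll_attempt {
        Some(attempt) if attempt >= 1 => format!(
            "Starting microVM... (attempt {}/{})",
            attempt, AGENT_POLL_MAX_ATTEMPTS
        ),
        _ => "Starting microVM...".to_string(),
    }
}
```

Guarding against `Some(0)` in the formatter is a belt-and-braces choice; the suggested change above fixes the call site instead by passing `None`.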

Comment on lines +481 to +488
if max_agents != MAX_SUPPORTED_AGENTS {
    return Err((
        StatusCode::BAD_REQUEST,
        format!(
            "max_agents > {MAX_SUPPORTED_AGENTS} is not supported until the API/client add multi-agent selection"
        ),
    ));
}

🟡 Misleading error message rejects max_agents < 1 with "> 1 is not supported" text

The max_agents validation rejects any value != MAX_SUPPORTED_AGENTS (i.e., != 1), but the error message only mentions > 1. If an admin submits max_agents=0 or a negative value (e.g. via curl, bypassing the HTML min="1" max="1" constraint at crates/pika-server/templates/admin/dashboard.html:19), they receive "max_agents > 1 is not supported..." which is factually incorrect for their input.

Suggested change
- if max_agents != MAX_SUPPORTED_AGENTS {
-     return Err((
-         StatusCode::BAD_REQUEST,
-         format!(
-             "max_agents > {MAX_SUPPORTED_AGENTS} is not supported until the API/client add multi-agent selection"
-         ),
-     ));
- }
+ if max_agents != MAX_SUPPORTED_AGENTS {
+     return Err((
+         StatusCode::BAD_REQUEST,
+         format!(
+             "max_agents must be exactly {MAX_SUPPORTED_AGENTS} until the API/client add multi-agent selection"
+         ),
+     ));
+ }
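
A standalone sketch of the validation with the corrected wording, so the message stays accurate for 0 and negative inputs as well as for values above 1. `validate_max_agents` is a hypothetical helper, the `i32` type is an assumption, and a plain `u16` status code stands in for the handler's `StatusCode::BAD_REQUEST` to keep the example dependency-free.

```rust
/// The review notes that only the value 1 is currently accepted.
pub const MAX_SUPPORTED_AGENTS: i32 = 1;

/// Hypothetical helper: accept only the supported value and describe the
/// rule in a way that is true for every rejected input.
pub fn validate_max_agents(max_agents: i32) -> Result<i32, (u16, String)> {
    if max_agents != MAX_SUPPORTED_AGENTS {
        return Err((
            400, // stand-in for StatusCode::BAD_REQUEST
            format!(
                "max_agents must be exactly {MAX_SUPPORTED_AGENTS} until the API/client add multi-agent selection"
            ),
        ));
    }
    Ok(max_agents)
}
```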

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust/src/core/mod.rs`:
- Around line 2910-2912: The call to invalidate_key_package_publish() here
clears local_key_package_published incorrectly; instead, separate the two
behaviors (invalidate stale publish callbacks vs mark key package unpublished)
and call only the callback-invalidation variant from this
navigation/cancellation path. Add a new method (e.g.
invalidate_key_package_publish_callbacks or extend
invalidate_key_package_publish with a parameter) that performs only the callback
cleanup without flipping local_key_package_published, update the place where
self.invalidate_key_package_publish() is currently called (alongside
self.invalidate_agent_flow() and self.invalidate_direct_chat_creation()) to call
the new callback-only function, and leave the existing full
invalidate_key_package_publish() (or call it elsewhere) for cases that must mark
the local key package as unpublished.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a7a7863-157a-4c84-b2e4-bb27cb5f160b

📥 Commits

Reviewing files that changed from the base of the PR and between 985b7c6 and 656027e.

📒 Files selected for processing (6)
  • crates/pika-server/migrations/2026-03-07-020000_restore_agent_owner_active_index/up.sql
  • crates/pika-server/src/admin.rs
  • crates/pika-server/src/agent_api.rs
  • rust/src/core/agent.rs
  • rust/src/core/mod.rs
  • todos/agent-provisioning-screen.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/pika-server/migrations/2026-03-07-020000_restore_agent_owner_active_index/up.sql
  • rust/src/core/agent.rs
  • crates/pika-server/src/admin.rs
  • todos/agent-provisioning-screen.md

Comment on lines +2910 to +2912
self.invalidate_agent_flow();
self.invalidate_key_package_publish();
self.invalidate_direct_chat_creation();

⚠️ Potential issue | 🟠 Major

Don't clear the published-key-package state when just canceling stale callbacks.

invalidate_key_package_publish() also flips local_key_package_published back to false. Calling it here on the successful handoff out of provisioning means the next CreateChat will re-enter the publish path, and can now fail on relay readiness even though our key package was already published. Please split "invalidate stale publish callbacks" from "mark local key package unpublished", and use only the former in this navigation/cancellation path.
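
One way the requested split could look, as a sketch under stated assumptions: the `KeyPackageState` struct, its `publish_token` field, and the `begin_publish`/`on_publish_result` helpers are hypothetical, and `invalidate_key_package_publish_callbacks` is only the example name proposed in this review, not a confirmed method in the codebase.

```rust
/// Hypothetical slice of core state concerned with key-package publishing.
#[derive(Default)]
pub struct KeyPackageState {
    /// Token that pending publish callbacks must match to be applied.
    publish_token: u64,
    pub local_key_package_published: bool,
}

impl KeyPackageState {
    /// Start a publish attempt and return the token for its callback.
    pub fn begin_publish(&mut self) -> u64 {
        self.publish_token += 1;
        self.publish_token
    }

    /// Apply a publish result only if it comes from the live attempt.
    pub fn on_publish_result(&mut self, token: u64, ok: bool) {
        if token == self.publish_token {
            self.local_key_package_published = ok;
        }
    }

    /// Navigation/cancellation path: drop stale callbacks, keep the flag.
    pub fn invalidate_key_package_publish_callbacks(&mut self) {
        self.publish_token += 1;
    }

    /// Full reset for cases that must mark the key package unpublished.
    pub fn invalidate_key_package_publish(&mut self) {
        self.invalidate_key_package_publish_callbacks();
        self.local_key_package_published = false;
    }
}
```

Under this split, leaving the provisioning screen would call only the callback variant, so a later CreateChat does not re-enter the publish path for a key package that was already published.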



@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.


        return;
    }
    self.agent_flow_task = None;
    self.agent_flow_start = None;

🟡 agent_flow_start cleared before final progress update, resetting elapsed_secs to 0

In handle_agent_flow_completed, self.agent_flow_start is set to None at rust/src/core/agent.rs:603 before handle_agent_flow_progress is called at line 613. Inside handle_agent_flow_progress (rust/src/core/agent.rs:576-577), elapsed_secs is computed from agent_flow_start, which is now None, so elapsed_secs becomes 0. This causes the provisioning UI to briefly flash 0s elapsed when transitioning to the CreatingChat phase, after the user has been watching elapsed time steadily increase throughout the Provisioning/Recovering phases.

Suggested change
- self.agent_flow_start = None;
+ self.agent_flow_task = None;
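
The ordering problem can be reproduced in isolation. In this sketch `FlowClock` and `finish` are hypothetical, and passing `now` explicitly is a choice made only to keep the example deterministic; the point is that elapsed time is read before the start instant is cleared. It illustrates the reported issue and is not necessarily the fix that was merged.

```rust
use std::time::Instant;

/// Hypothetical holder for the flow's start time.
pub struct FlowClock {
    pub agent_flow_start: Option<Instant>,
}

impl FlowClock {
    /// Seconds since the flow started, or 0 when no flow is running.
    pub fn elapsed_secs(&self, now: Instant) -> u64 {
        self.agent_flow_start
            .map(|start| now.saturating_duration_since(start).as_secs())
            .unwrap_or(0)
    }

    /// Capture the elapsed time for the final progress update first,
    /// then clear the start instant.
    pub fn finish(&mut self, now: Instant) -> u64 {
        let elapsed = self.elapsed_secs(now);
        self.agent_flow_start = None;
        elapsed
    }
}
```

Reversing the two statements in `finish` would make it return 0, which corresponds to the brief reset to 0s described above.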

@justinmoon justinmoon merged commit 1194bcf into sledtools:master Mar 7, 2026
18 checks passed