perf: caching, batched DM resolution, bounded audit, global kind index #367
Merged
tlongwell-block merged 2 commits into main on Apr 20, 2026
Performance improvements from staging profiling (2026-04-17). Projects GET /api/channels from 329ms to ~35ms, reduces per-request DB queries from ~43 to ~5, and cuts channels seq scans >50%.

Caching (#1, #2)
- Wire membership_cache (moka, 10s TTL, 10k cap) into all 10 is_member() call sites with cache-aside pattern.
- Add accessible_channels_cache for get_accessible_channel_ids() at 3 call sites (REQ handler, /api/feed, /api/search).
- Invalidate on all mutation paths: add_member, remove_member, channel create/delete, DM create/expand, compensation delete, audio auto-join. Multi-pod relies on TTL expiry (documented).

Batch DM resolution (#3)
- Add get_members_bulk(channel_ids) using WHERE channel_id = ANY($1).
- Rewrite /api/channels to resolve all DM participants in 2 queries (one get_members_bulk + one get_users_bulk) instead of 2xN_DMs.
- Remove per-DM resolve_dm_participants() function.

Bounded audit (#7)
- Replace unbounded tokio::spawn(audit.log()) with bounded mpsc channel (capacity 1000) + single worker task in AppState.
- Uses .send().await for backpressure: audit entries must not be silently dropped (SOX-grade tamper-evident chain).
- Migrate media upload audit from unbounded spawn to audit_tx.
- Add sprout_audit_log_errors_total counter for DB write failures.

Global kind index (#6)
- Add global_kind_index and global_wildcard_index to SubscriptionRegistry for sub-linear fan-out on global events.
- Fan-out goes from O(all_subs) to O(matching_kind_subs).
- Preserves channel/global scoping invariant (no behavior change).
- Add 4 tests: kind routing, wildcard routing, removal cleanup, channel/global isolation.

Pool sizing (#8)
- Main pool: max 50->20, min 5->2. Audit pool: max=5, min=1.
- Frees connections for multi-pod (4 pods x 25 (20 main + 5 audit) = 100 <= PG limit).

Observability (#9)
- Wire sprout_fanout_recipients histogram at all 4 fan_out() sites.
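The global kind index above can be illustrated with a minimal std-only sketch. This is not the PR's code: `GlobalIndex`, `SubId`, `kinds_by_sub`, and the method names are hypothetical, and only the field names `global_kind_index` and `global_wildcard_index` come from the commit message. It assumes a subscription with no kind constraint belongs in the wildcard set, and it shows why a lookup touches only the matching kind bucket plus the wildcards rather than every subscription.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical subscription id; the real registry's key type is not shown in the PR.
type SubId = u64;

/// Hypothetical stand-in for the global half of `SubscriptionRegistry`.
#[derive(Default)]
struct GlobalIndex {
    /// kind -> subscriptions whose filter names that kind.
    global_kind_index: HashMap<u32, HashSet<SubId>>,
    /// Subscriptions whose filter has no kind constraint.
    global_wildcard_index: HashSet<SubId>,
    /// Reverse map so removal does not have to scan every kind bucket.
    kinds_by_sub: HashMap<SubId, Vec<u32>>,
}

impl GlobalIndex {
    /// Register a subscription; an empty `kinds` slice means "all kinds".
    fn add(&mut self, sub: SubId, kinds: &[u32]) {
        self.remove(sub); // re-registering replaces any earlier entry
        if kinds.is_empty() {
            self.global_wildcard_index.insert(sub);
        } else {
            for &k in kinds {
                self.global_kind_index.entry(k).or_default().insert(sub);
            }
        }
        self.kinds_by_sub.insert(sub, kinds.to_vec());
    }

    /// Remove a subscription and drop kind buckets that become empty.
    fn remove(&mut self, sub: SubId) {
        self.global_wildcard_index.remove(&sub);
        if let Some(kinds) = self.kinds_by_sub.remove(&sub) {
            for k in kinds {
                let now_empty = match self.global_kind_index.get_mut(&k) {
                    Some(set) => {
                        set.remove(&sub);
                        set.is_empty()
                    }
                    None => false,
                };
                if now_empty {
                    self.global_kind_index.remove(&k);
                }
            }
        }
    }

    /// Candidate subscriptions for an event of `kind`, in sorted order.
    /// Only the matching bucket and the wildcard set are visited.
    fn candidates(&self, kind: u32) -> Vec<SubId> {
        let mut out: Vec<SubId> = self
            .global_kind_index
            .get(&kind)
            .into_iter()
            .flatten()
            .chain(self.global_wildcard_index.iter())
            .copied()
            .collect();
        out.sort_unstable();
        out.dedup();
        out
    }
}
```

In a real registry the candidates would still go through full filter matching; the index only narrows the set that has to be checked.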
AppState::new() now returns (Self, AuditShutdownHandle). The handle owns a CancellationToken that signals the audit worker to stop accepting new entries, close the receiver, drain buffered entries via recv().await, and exit. This is independent of Arc<AppState> lifetime: it works correctly even when background tasks (reaper, pubsub, health server) still hold state clones after axum's graceful shutdown completes.

Closing the receiver (audit_rx.close()) rejects future sends atomically, so no entries are lost between the cancel signal and the drain loop.

Sequence: SIGTERM → readiness 503 → axum drains connections → main() calls audit_shutdown.drain(5s) → cancel token fires → worker closes receiver → drains buffered entries → exit. The 5-second timeout prevents hanging on a stuck audit DB.
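The close-then-drain ordering described above can be modeled with the standard library alone. The sketch below is an illustration, not the PR's implementation, which uses a tokio mpsc receiver and a CancellationToken. `ClosableQueue` and `run_worker` are hypothetical names. The queue is unbounded, the cancel signal and the close are merged into one `close()` call, and the 5-second drain timeout is not modeled. What it does show is the key property: the closed check and the push happen under one lock, so an entry is either rejected or guaranteed to be seen by the drain.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical queue modeling "close rejects new sends, buffered items stay readable".
struct ClosableQueue<T> {
    state: Mutex<(VecDeque<T>, bool)>, // (buffer, closed flag)
    ready: Condvar,
}

impl<T> ClosableQueue<T> {
    fn new() -> Self {
        Self {
            state: Mutex::new((VecDeque::new(), false)),
            ready: Condvar::new(),
        }
    }

    /// Enqueue unless closed; a rejected item is handed back to the caller.
    fn send(&self, item: T) -> Result<(), T> {
        let mut s = self.state.lock().unwrap();
        if s.1 {
            return Err(item);
        }
        s.0.push_back(item);
        self.ready.notify_one();
        Ok(())
    }

    /// Reject all future sends; already-buffered items remain readable.
    fn close(&self) {
        let mut s = self.state.lock().unwrap();
        s.1 = true;
        self.ready.notify_all();
    }

    /// Block for the next item; returns None once closed and empty.
    fn recv(&self) -> Option<T> {
        let mut s = self.state.lock().unwrap();
        loop {
            if let Some(item) = s.0.pop_front() {
                return Some(item);
            }
            if s.1 {
                return None;
            }
            s = self.ready.wait(s).unwrap();
        }
    }
}

/// Worker: "writes" entries until the queue is closed and fully drained.
fn run_worker(q: Arc<ClosableQueue<String>>) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        let mut written = Vec::new();
        while let Some(entry) = q.recv() {
            written.push(entry); // stand-in for the audit DB write
        }
        written
    })
}
```

Joining the handle returned by `run_worker` after calling `close()` plays the role of the drain step: it returns only once every entry accepted before the close has been processed.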
fsola-sq added a commit that referenced this pull request on Apr 20, 2026

…-binding

* origin/main:
  fix(desktop): eliminate agent startup beachball (#374)
  fix(desktop): resolve agent command path for DMG builds (#372)
  fix(desktop): remove stale sprout-admin prereq, add sidecar tooling (#371)
  Add server cross-compile and macOS desktop build CI jobs (#369)
  Fix forum post card bugs on desktop and mobile (#370)
  fix(desktop): kill WebSocket flood and fix Markdown <p><div> nesting (#368)
  perf: caching, batched DM resolution, bounded audit, global kind index (#367)
  fix: staging to generate stubs as needed (#366)
  chore(deps): update rust crate axum to v0.8.9 (#365)
  chore(deps): update dependency @tanstack/react-router to v1.168.22 (#364)
  feat(desktop): autoscroll thread sidebar for new replies (#363)
  fix(desktop): eliminate 10+ second UI freeze on startup (#361)
  feat(desktop): bundle sprout-acp and sprout-mcp-server as Tauri sidecars (#362)
  Remove release pipeline from public repo (#360)

Amp-Thread-ID: https://ampcode.com/threads/T-019dab7a-5979-7401-83a1-509b9adfe4a0
Co-authored-by: Amp <amp@ampcode.com>

# Conflicts:
#   crates/sprout-relay/src/state.rs
Summary
Performance improvements from staging profiling (2026-04-17). All findings from the profiling plan are addressed, Phases 1 through 3.

Projected impact:
- GET /api/channels: 329ms → ~35ms (batch DM resolution)
- Per-request DB queries: ~43 → ~5
- channels seq scans: >50% reduction

Changes

Caching (#1, #2)
- Wire membership_cache (moka, 10s TTL, 10k cap) into all 10 is_member() call sites with cache-aside pattern.
- Add accessible_channels_cache for get_accessible_channel_ids() at 3 call sites (REQ handler, /api/feed, /api/search).
- Invalidate on all mutation paths: add_member, remove_member, channel create/delete, DM create/expand, compensation delete, audio auto-join.

Batch DM resolution (#3)
- Add get_members_bulk(channel_ids) using WHERE channel_id = ANY($1).
- Rewrite /api/channels to resolve all DM participants in 2 queries (one get_members_bulk + one get_users_bulk) instead of 2×N_DMs.
- Remove per-DM resolve_dm_participants() function.

Bounded audit (#7)
- Replace unbounded tokio::spawn(audit.log()) with bounded mpsc channel (capacity 1000) + single worker task.
- Uses .send().await for backpressure: audit entries must not be silently dropped (SOX-grade tamper-evident chain).
- Migrate media upload audit to audit_tx too.
- Add sprout_audit_log_errors_total counter for DB write failures.

Graceful audit drain on shutdown
- AppState::new() returns (Self, AuditShutdownHandle) so the caller can drain the audit queue during graceful shutdown.
- AuditShutdownHandle owns a CancellationToken + JoinHandle. On drain: cancel fires → worker calls audit_rx.close() (atomically rejects future sends) → drains buffered entries via recv().await → exits.
- Independent of Arc<AppState> lifetime: works even when background tasks (reaper, pubsub, health server) still hold state clones.
- log_audit_entry() helper shared by normal loop and drain loop.

Global kind index (#6)
- Add global_kind_index and global_wildcard_index to SubscriptionRegistry for sub-linear fan-out on global events.

Pool sizing (#8)
- Main pool: max 50->20, min 5->2. Audit pool: max=5, min=1.

Observability (#9)
- Wire sprout_fanout_recipients histogram at all 4 fan_out() sites.

Testing
- cargo clippy clean, cargo fmt clean

What's NOT in this PR