- Added a new function `_warm_agent_jit_caches` to pre-warm agent caches at startup, reducing cold invocation costs.
- Updated the `SurfSenseContextSchema` to include per-invocation fields for better state management during agent execution.
- Introduced caching mechanisms in various tools to ensure fresh database sessions are used, improving performance and reliability.
- Enhanced middleware to support new context features and improve error handling during connector and document type discovery.
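The "fresh database sessions" point can be illustrated with a minimal sketch. It assumes the tool factory captures only plain identifiers and opens a session inside each call; `FakeSession`, `session_maker`, and `make_read_page_tool` are hypothetical stand-ins for illustration, not names from this PR.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for an async DB session; counts sessions opened."""

    opened = 0

    def __init__(self) -> None:
        FakeSession.opened += 1
        self.closed = False

    async def fetch_title(self, page_id: str) -> str:
        if self.closed:
            raise RuntimeError("session used after close")
        return f"title-for-{page_id}"


@asynccontextmanager
async def session_maker():
    """Hypothetical stand-in for the project's async session factory."""
    session = FakeSession()
    try:
        yield session
    finally:
        session.closed = True


def make_read_page_tool(search_space_id: int, user_id: str):
    """Build a tool closure that captures identifiers only, never a session."""

    async def read_page(page_id: str) -> dict:
        # A new session is opened and closed on every invocation, so the
        # closure stays safe to reuse from a cached agent graph.
        async with session_maker() as session:
            title = await session.fetch_title(page_id)
        return {"search_space_id": search_space_id, "user_id": user_id, "title": title}

    return read_page


async def demo() -> list:
    tool = make_read_page_tool(search_space_id=1, user_id="u-1")
    return [await tool("a"), await tool("b")]


results = asyncio.run(demo())
```

Because nothing request-scoped lives in the closure, the same tool object can in principle be shared across turns without leaking a stale session.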
📝 Walkthrough
This PR introduces compiled-agent caching with per-call session management across tools to enable safe graph reuse, adds system-message flattening middleware for provider compatibility, refactors connector discovery caching, and implements startup JIT warmup for LangChain schema compilation.
Changes
Agent Caching & Context Refactoring
UI Cleanup
Sequence Diagram(s)
sequenceDiagram
participant Client
participant StreamNewChat
participant DeepAgent as Deep Agent<br/>(Compiled)
participant AgentCache
participant Tools
participant DB as Per-Call<br/>DB Session
Client->>StreamNewChat: stream_new_chat(request)
Note over StreamNewChat: Phase 1: Build context & check model
StreamNewChat->>StreamNewChat: Create SurfSenseContextSchema<br/>(mentioned_document_ids, turn_id, etc.)
Note over StreamNewChat: Phase 2: Parallel preflight & speculative agent
par Preflight LLM
StreamNewChat->>StreamNewChat: Preflight ping (concurrent)
and Speculative Build
StreamNewChat->>DeepAgent: Speculative create_surfsense_deep_agent()
DeepAgent->>AgentCache: Compute stable_hash(config, flags, tools, ...)
DeepAgent->>AgentCache: get_or_build(cache_key, builder)
alt Cache Hit
AgentCache->>DeepAgent: Return cached compiled graph
else Cache Miss
AgentCache->>DeepAgent: Run builder in asyncio.to_thread
DeepAgent->>DeepAgent: Compile middleware stack<br/>(+ FlattenSystemMessageMiddleware)
DeepAgent->>AgentCache: Store in cache with TTL
end
end
Note over StreamNewChat: Phase 3: Stream agent events
StreamNewChat->>DeepAgent: agent.astream_events(...,<br/>context=runtime_context)
loop For each tool invocation
DeepAgent->>Tools: Invoke tool(search_space_id, user_id, ...)
Tools->>DB: async_session_maker() → new AsyncSession
Tools->>DB: Query/mutate using per-call session
DB->>Tools: Result
Tools->>DeepAgent: Return
end
DeepAgent->>StreamNewChat: Agent events + final response
StreamNewChat->>Client: Stream response chunks
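The cache-hit and cache-miss branches in the diagram can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's implementation: `TTLLRUCache` and its parameters are hypothetical, `stable_hash` is assumed to be a deterministic hash over JSON-serializable parts, and holding the lock during the build is a simplification that serializes all builds.

```python
import asyncio
import hashlib
import json
import time
from collections import OrderedDict
from typing import Any, Callable


def stable_hash(**parts: Any) -> str:
    """Hash keyword parts deterministically (sorted keys, JSON-encoded)."""
    payload = json.dumps(parts, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLLRUCache:
    """Hypothetical TTL + LRU cache for expensive-to-build objects."""

    def __init__(self, maxsize: int = 8, ttl_seconds: float = 300.0) -> None:
        self._maxsize = maxsize
        self._ttl = ttl_seconds
        self._entries = OrderedDict()  # key -> (stored_at, value)
        self._lock = asyncio.Lock()

    async def get_or_build(self, key: str, builder: Callable[[], Any]) -> Any:
        async with self._lock:
            hit = self._entries.get(key)
            if hit is not None and time.monotonic() - hit[0] < self._ttl:
                self._entries.move_to_end(key)  # mark as most recently used
                return hit[1]
            # Miss or expired: run the blocking builder off the event loop.
            value = await asyncio.to_thread(builder)
            self._entries[key] = (time.monotonic(), value)
            self._entries.move_to_end(key)
            while len(self._entries) > self._maxsize:
                self._entries.popitem(last=False)  # evict least recently used
            return value


async def demo():
    cache = TTLLRUCache(maxsize=2, ttl_seconds=60.0)
    builds = 0

    def builder() -> dict:
        nonlocal builds
        builds += 1
        return {"graph": "compiled"}

    key = stable_hash(model="m", flags={"a": True}, tools=["x", "y"])
    first = await cache.get_or_build(key, builder)
    second = await cache.get_or_build(key, builder)
    return first, second, builds


first, second, builds = asyncio.run(demo())
```

In this sketch the second call returns the same object without invoking the builder again, which is the property that makes per-call sessions in tools necessary: anything captured at build time is shared by every later turn that hits the same key.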
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
High-level PR Summary
This PR implements a performance optimization for the agent system through multi-phase caching and session management improvements. The core changes are:
- Introducing a TTL-LRU compiled-agent cache that reuses graph instances across turns (reducing cold invocation cost from 4-5s to <50µs on cache hits).
- Refactoring all connector tools to use per-call database sessions instead of cached closures, so the cache can be shared safely.
- Implementing a connector discovery TTL cache to reduce repeated database queries.
- Fixing overflow of Anthropic's 4-cache-control-block limit by flattening multi-block system messages.
- Switching prompt cache injection from `role: system` to `index: 0` to avoid overflow.
- Parallelizing the agent build with LLM preflight checks.
- Adding JIT warmup at startup to pre-pay compilation costs.
- Converting `SurfSenseContextSchema` to a dataclass for better runtime context management.
These changes collectively improve both cold-start and warm-path performance while maintaining backward compatibility through feature flags.
⏱️ Estimated Review Time: 3+ hours
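The system-message flattening step can be sketched as a pure function over message content. This is an assumption-laden illustration: `flatten_system_content` is a hypothetical helper, and the block shape (`{"type": "text", "text": ...}` with an optional `cache_control` key) is assumed from the common Anthropic-style content format rather than taken from the PR's middleware.

```python
from typing import Union


def flatten_system_content(content: Union[str, list], separator: str = "\n\n") -> str:
    """Collapse a multi-block system message into one plain string.

    Per-block markers such as `cache_control` are dropped with the blocks,
    so the flattened message contributes at most one cacheable unit.
    """
    if isinstance(content, str):
        return content
    texts = []
    for block in content:
        if isinstance(block, str):
            texts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            texts.append(block.get("text", ""))
    # Skip empty fragments so the separator never doubles up.
    return separator.join(t for t in texts if t)


blocks = [
    {"type": "text", "text": "You are a helpful agent.", "cache_control": {"type": "ephemeral"}},
    {"type": "text", "text": "Tools: search, memory.", "cache_control": {"type": "ephemeral"}},
]
flat = flatten_system_content(blocks)
```

Once the content is a single string, a single cache marker can be attached by position rather than once per block, which is one plausible reading of why the injection target moved to `index: 0`.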
💡 Review Order Suggestion
- .env.example
- app/agents/new_chat/feature_flags.py
- app/agents/new_chat/context.py
- app/agents/new_chat/agent_cache.py
- app/agents/new_chat/middleware/flatten_system.py
- app/agents/new_chat/prompt_caching.py
- app/services/connector_service.py
- app/agents/new_chat/tools/registry.py
- app/agents/new_chat/chat_deepagent.py
- app/agents/new_chat/middleware/knowledge_search.py
- app/agents/new_chat/tools/search_surfsense_docs.py
- app/agents/new_chat/tools/update_memory.py
- app/agents/new_chat/tools/connected_accounts.py
- app/agents/new_chat/tools/notion/create_page.py
- app/agents/new_chat/tools/notion/update_page.py
- app/agents/new_chat/tools/notion/delete_page.py
- app/agents/new_chat/tools/confluence/create_page.py
- app/agents/new_chat/tools/confluence/update_page.py
- app/agents/new_chat/tools/confluence/delete_page.py
- app/agents/new_chat/tools/gmail/create_draft.py
- app/agents/new_chat/tools/gmail/send_email.py
- app/agents/new_chat/tools/gmail/trash_email.py
- app/agents/new_chat/tools/gmail/update_draft.py
- app/agents/new_chat/tools/gmail/read_email.py
- app/agents/new_chat/tools/gmail/search_emails.py
- app/agents/new_chat/tools/google_drive/create_file.py
- app/agents/new_chat/tools/google_drive/trash_file.py
- app/agents/new_chat/tools/dropbox/create_file.py
- app/agents/new_chat/tools/dropbox/trash_file.py
- app/agents/new_chat/tools/onedrive/create_file.py
- app/agents/new_chat/tools/onedrive/trash_file.py
- app/agents/new_chat/tools/google_calendar/create_event.py
- app/agents/new_chat/tools/google_calendar/update_event.py
- app/agents/new_chat/tools/google_calendar/delete_event.py
- app/agents/new_chat/tools/google_calendar/search_events.py
- app/agents/new_chat/tools/jira/create_issue.py
- app/agents/new_chat/tools/jira/update_issue.py
- app/agents/new_chat/tools/jira/delete_issue.py
- app/agents/new_chat/tools/linear/create_issue.py
- app/agents/new_chat/tools/linear/update_issue.py
- app/agents/new_chat/tools/linear/delete_issue.py
- app/agents/new_chat/tools/discord/list_channels.py
- app/agents/new_chat/tools/discord/read_messages.py
- app/agents/new_chat/tools/discord/send_message.py
- app/agents/new_chat/tools/teams/list_channels.py
- app/agents/new_chat/tools/teams/read_messages.py
- app/agents/new_chat/tools/teams/send_message.py
- app/agents/new_chat/tools/luma/create_event.py
- app/agents/new_chat/tools/luma/list_events.py
- app/agents/new_chat/tools/luma/read_event.py
- app/agents/new_chat/middleware/__init__.py
- app/tasks/chat/stream_new_chat.py
- app/app.py
- tests/unit/agents/new_chat/test_agent_cache.py
- tests/unit/agents/new_chat/test_feature_flags.py
- tests/unit/agents/new_chat/test_flatten_system.py
- tests/unit/agents/new_chat/test_prompt_caching.py
- tests/unit/middleware/test_knowledge_search.py
- tests/unit/test_stream_new_chat_contract.py
- surfsense_web/components/pricing/pricing-section.tsx
Summary by CodeRabbit
New Features
Improvements
Bug Fixes