…onnector forms and hooks
…rt-document-panel
…ts for memory limit handling
…ditor, chat, dashboard and settings
…ool-ui generators
…update UI components
…r and search space settings
…the editor panel, and remove deprecated memory hook
…t LLM Adds an optional planner LLM role wired through KnowledgePriorityMiddleware so KB query rewriting, date extraction, and recency classification run on a cheap model (e.g. gpt-4o-mini, Haiku, Azure nano) instead of the user's chat LLM. Operators opt in by setting is_planner: true on exactly one global config; without it, behavior is unchanged.
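A minimal sketch of the opt-in rule described above, not the project's actual code. It assumes global configs are plain dicts; the helper names `pick_planner_config` and `resolve_utility_llm` are hypothetical, and raising on more than one `is_planner` entry is an assumed policy since the commit message only says "exactly one".

```python
from typing import Optional


def pick_planner_config(global_configs: list) -> Optional[dict]:
    """Return the config flagged is_planner, or None when no config opts in."""
    planners = [cfg for cfg in global_configs if cfg.get("is_planner") is True]
    if len(planners) > 1:
        # Assumed policy: the source only states the flag belongs on exactly one config.
        raise ValueError("is_planner must be set on exactly one global config")
    return planners[0] if planners else None


def resolve_utility_llm(global_configs: list, chat_llm_config: dict) -> dict:
    """Use the planner for utility calls when present; otherwise keep the chat LLM."""
    planner = pick_planner_config(global_configs)
    return planner if planner is not None else chat_llm_config


configs = [
    {"id": -1, "model_name": "gpt-4o"},
    {"id": -2, "model_name": "gpt-4o-mini", "is_planner": True},
]
print(resolve_utility_llm(configs, configs[0])["model_name"])
```

With no `is_planner` entry the function returns the chat config, which matches the "behavior is unchanged" default.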
Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated.
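The per-param split can be pictured roughly as follows. This is a sketch under assumptions: the `OPENAI` provider label, the predicate names, `build_cache_params`, and the `"24h"` retention value are illustrative and not taken from the PR; only `AZURE`, `AZURE_OPENAI`, `prompt_cache_key`, and `prompt_cache_retention` come from the commit message.

```python
# Provider labels: AZURE and AZURE_OPENAI appear in the commit message;
# OPENAI is an assumed label for the non-Azure member of the family.
OPENAI_FAMILY = {"OPENAI", "AZURE", "AZURE_OPENAI"}
AZURE_FAMILY = {"AZURE", "AZURE_OPENAI"}


def supports_prompt_cache_key(provider: str) -> bool:
    """The whole family, Azure included, receives the routing-affinity key."""
    return provider.upper() in OPENAI_FAMILY


def supports_prompt_cache_retention(provider: str) -> bool:
    """Azure stays opted out of retention while its transformer would drop it."""
    name = provider.upper()
    return name in OPENAI_FAMILY and name not in AZURE_FAMILY


def build_cache_params(provider: str, cache_key: str, retention: str = "24h") -> dict:
    """Collect only the cache-related params the provider is gated in for."""
    params = {}
    if supports_prompt_cache_key(provider):
        params["prompt_cache_key"] = cache_key
    if supports_prompt_cache_retention(provider):
        params["prompt_cache_retention"] = retention
    return params


print(build_cache_params("AZURE_OPENAI", "thread-123"))
```

Keeping one predicate per parameter means the Azure opt-out for retention can later be lifted by editing a single function.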
Skip the ~1-3s MCP initialize + list_tools handshake on every cache miss by reading tool definitions from the connector row we already load. Lazy populate on first miss, self-heal on corrupt cache, zero schema migration.
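A simplified, synchronous sketch of the lazy-populate and self-heal behavior described above. It assumes the cached definitions live under a `cached_tools` key inside the connector's existing JSON config (one reading of "zero schema migration"); `load_tool_definitions`, `_is_valid`, and the `discover_live` callback are hypothetical names, and the real code path is presumably async.

```python
from typing import Callable


def _is_valid(cached) -> bool:
    """A usable cache is a list of dicts that each carry a string tool name."""
    return isinstance(cached, list) and all(
        isinstance(tool, dict) and isinstance(tool.get("name"), str) for tool in cached
    )


def load_tool_definitions(connector_config: dict, discover_live: Callable[[], list]) -> list:
    """Serve tool definitions from the connector row, hitting the network only on a miss."""
    cached = connector_config.get("cached_tools")
    if _is_valid(cached):
        return cached  # fast path: no MCP handshake
    # Missing or corrupt cache: run live discovery once and rewrite the cached value.
    tools = discover_live()
    connector_config["cached_tools"] = tools
    return tools


calls = []


def fake_discover() -> list:
    # Stand-in for the MCP initialize + list_tools handshake.
    calls.append(1)
    return [{"name": "search", "description": "demo tool"}]


config = {}
load_tool_definitions(config, fake_discover)  # miss: populates the cache
load_tool_definitions(config, fake_discover)  # hit: no second discovery
print(len(calls))
```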
…ection and job dependencies
…n job and simplifying test conditions
Collapse the invalidate + warmup pair into a single refresh_mcp_tools_cache_for_connector(connector_id, search_space_id) helper and scope live discovery to the one connector that changed instead of the whole search space.
- new mcp_tool.discover_single_mcp_connector: load one connector, refresh OAuth if needed, force live MCP discovery so its cached_tools row is rewritten; returned wrappers are discarded since the in-process LRU is rebuilt lazily on the next user query
- mcp_tools_cache.refresh_mcp_tools_cache_for_connector: synchronously evicts the per-space LRU (LRU keys cannot scope finer) and schedules the per-connector prefetch via loop.create_task
- routes (OAuth callback, MCP POST, MCP PUT) collapse their two back-to-back calls into a single refresh call; DELETE handlers keep using bare invalidate_mcp_tools_cache (nothing to prefetch)
No new automated tests: the new functions are I/O glue (DB + network) where mocked unit tests would test implementation rather than behavior. The existing 9 unit tests for the cached_tools data shape are unchanged.
The probe answered its question (informing the cached_tools persistence design). Future MCP session-pooling work, if revived, can recreate it.
…cument-panel feat: improve memory extraction & add document-panel memory editing
refactor(env): replace inline process.env reads with BACKEND_URL in lib/
…forms refactor(env): replace inline process.env reads with BACKEND_URL in connector forms and hooks
…t-dashboard refactor(env): replace inline process.env reads with BACKEND_URL in editor, chat, dashboard and settings
…nerators refactor(env): replace inline process.env reads with BACKEND_URL in tool-ui generators
fix: Update CI workflow versions and scoped test triggers
Resolves:
- surfsense_backend/app/agents/new_chat/middleware/memory_injection.py: took both imports. Upstream moved MEMORY_HARD_LIMIT/SOFT_LIMIT to app.services.memory; kept our perf-logger import for timing.
Pulls in upstream changes:
- Memory document feature (services/memory refactor, removal of app.agents.new_chat.memory_extraction and background extraction in stream_new_chat; the agent now drives memory via the update_memory tool).
- BACKEND_URL env refactor across web tool-ui/editor/chat/dashboard/lib.
- GitHub Actions backend test workflow + pre-commit biome bump.
- Token-display polish in MessageInfoDropdown; save_memory no-update sentinel.
Verified: 1723 unit tests pass, ruff clean. No semantic regression in stream_new_chat (their memory-extraction deletion and our preflight removal touch different functions).
[Improvement] Agent: faster turns and lower LLM cost
…DocumentTabContent; update connector-status-config for Composio Google Drive connector maintenance
Walkthrough
Adds a canonical memory service and routes, removes legacy edit flows, updates agents/middleware with a planner LLM, prompt caching, and perf logs, introduces an MCP tool discovery cache/refresh, offloads embeddings to threads, and aligns the web UI to BACKEND_URL with a new memory editor mode.
Changes
Unified Memory and MCP Cache
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Description
Motivation and Context
FIX #
Screenshots
API Changes
Change Type
Testing Performed
Checklist
High-level PR Summary
This PR delivers agent performance improvements and an overhaul of memory storage and citations.

On the speed front, it introduces a dedicated planner LLM for internal utility calls (query rewriting, date extraction). Routing those short classification tasks to a small, fast model instead of the user's chat LLM yields measurable per-turn latency savings. The KB planner runnable is compiled once rather than on every turn, and background LLM calls have been stripped from the critical chat stream path to eliminate network round-trips before the user sees output. MCP connector tool discovery is now cached in the database so repeated connector list calls hit a sub-second disk read instead of a 5+ second network round-trip, and subagent compilation timings are logged so slow middleware or tool loads surface in production observability.

On the memory and citations side, the PR replaces brittle marker-based memory storage with a heading-based markdown document model that can parse and normalize both legacy `(YYYY-MM-DD) [fact]` bullets and new `- YYYY-MM-DD: text` entries, exposing user and team memory as first-class `MEMORY.md` and `TEAM_MEMORY.md` documents in the file tree that open in the same editor panel used for KB docs. The Knowledge Base subagent specialist is taught to emit `[citation:chunk_id]` markers in its prose so the main agent can relay them verbatim to the UI, and the citation prompt now handles both direct chunk-block channels and specialist-relayed markers with strict copy-digit-for-digit rules to prevent LLM-corrupted IDs from silently breaking the citation UI.

Connector subagent system prompts are updated to cap `evidence.items` at a structured `{total: N}` count and surface matched entries in condensed one-line-per-entry `action_summary` prose instead of dumping raw arrays, eliminating the multi-turn context bloat that was slowing shared-thread workflows. Azure OpenAI's automatic prompt caching support is wired in for `prompt_cache_key` (routing affinity) while its `prompt_cache_retention` param is deferred until LiteLLM ships it in the Azure transformer. Token tracking logs now expose per-call cache hit ratios and wall-clock latencies so cost and speed regressions surface immediately in production telemetry, and the in-process MCP LRU eviction is replaced with a two-tier system (DB-backed `cached_tools` + background prefetch) so connector lifecycle events no longer stall the HTTP response.

⏱️ Estimated Review Time: 1-3 hours
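The memory-entry normalization and citation-ID check mentioned in the summary could be sketched as follows. The exact legacy syntax is an assumption (a bullet such as `- (2024-05-01) [fact] text`, with a bracketed marker word after the date), and `normalize_entry`, `unknown_citations`, and both regexes are hypothetical illustrations rather than the PR's parser.

```python
import re
from typing import Optional

# Assumed legacy shape: "- (YYYY-MM-DD) [marker] text"
LEGACY_ENTRY = re.compile(r"^\s*[-*]\s*\((\d{4}-\d{2}-\d{2})\)\s*\[(\w+)\]\s*(.+?)\s*$")
# New shape from the summary: "- YYYY-MM-DD: text"
NEW_ENTRY = re.compile(r"^\s*[-*]\s*(\d{4}-\d{2}-\d{2}):\s*(.+?)\s*$")
# Citation markers as relayed by the Knowledge Base specialist.
CITATION = re.compile(r"\[citation:([^\]\s]+)\]")


def normalize_entry(line: str) -> Optional[str]:
    """Rewrite either entry style to '- YYYY-MM-DD: text'; return None for other lines."""
    match = LEGACY_ENTRY.match(line)
    if match:
        date, _marker, text = match.groups()
        return f"- {date}: {text}"
    match = NEW_ENTRY.match(line)
    if match:
        date, text = match.groups()
        return f"- {date}: {text}"
    return None  # headings, blank lines, free prose


def unknown_citations(prose: str, known_chunk_ids: set) -> list:
    """List cited IDs that match no known chunk, e.g. digits altered while being relayed."""
    return [cid for cid in CITATION.findall(prose) if cid not in known_chunk_ids]


print(normalize_entry("- (2024-05-01) [fact] Prefers dark mode"))
print(unknown_citations("See [citation:1024] and [citation:1042].", {"1024"}))
```

A check like `unknown_citations` is one way to surface a corrupted ID instead of letting it silently break the citation UI; whether the PR performs such validation is not stated in the summary.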
💡 Review Order Suggestion
VERSION
surfsense_backend/pyproject.toml
surfsense_web/package.json
surfsense_browser_extension/package.json
surfsense_desktop/package.json
.github/workflows/backend-tests.yml
.github/workflows/code-quality.yml
.pre-commit-config.yaml
surfsense_backend/app/services/memory/__init__.py
surfsense_backend/app/services/memory/service.py
surfsense_backend/app/services/memory/validation.py
surfsense_backend/app/services/memory/document.py
surfsense_backend/app/services/memory/rewrite.py
surfsense_backend/app/services/memory/prompts.py
surfsense_backend/app/services/memory/schemas.py
surfsense_backend/app/agents/new_chat/tools/update_memory.py
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/tools/update_memory.py
surfsense_backend/app/routes/memory_routes.py
surfsense_backend/app/routes/team_memory_routes.py
surfsense_backend/app/schemas/search_space.py
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/citations/on.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/knowledge_base/system_prompt_cloud.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/knowledge_base/system_prompt_readonly_cloud.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/knowledge_base/description_readonly.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/knowledge_base/system_prompt_desktop.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/knowledge_base/system_prompt_readonly_desktop.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/private.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/team.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/description.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/example.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/description.md
surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/example.md
surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_private.md
surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_team.md
surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_private.md
surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_team.md
surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_private.md
surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_team.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/system_prompt.md
surfsense_backend/app/services/llm_service.py
surfsense_backend/app/config/__init__.py
surfsense_backend/app/config/global_llm_config.example.yaml
surfsense_backend/app/agents/new_chat/middleware/knowledge_search.py
surfsense_backend/app/agents/multi_agent_chat/middleware/main_agent/knowledge_priority.py
surfsense_backend/app/agents/new_chat/chat_deepagent.py
surfsense_backend/app/agents/new_chat/tools/mcp_tools_cache.py
surfsense_backend/app/agents/new_chat/tools/mcp_tool.py
surfsense_backend/app/routes/mcp_oauth_route.py
surfsense_backend/app/routes/search_source_connectors_routes.py
surfsense_backend/app/agents/multi_agent_chat/middleware/main_agent/checkpointed_subagent_middleware/middleware.py
surfsense_backend/app/agents/multi_agent_chat/middleware/main_agent/checkpointed_subagent_middleware/task_tool.py
surfsense_backend/app/agents/multi_agent_chat/middleware/shared/kb_context_projection.py
surfsense_backend/app/agents/new_chat/middleware/knowledge_tree.py
surfsense_backend/app/agents/new_chat/middleware/memory_injection.py
surfsense_backend/app/agents/new_chat/prompt_caching.py
surfsense_backend/app/services/token_tracking_service.py
surfsense_web/lib/env-config.ts
surfsense_web/atoms/editor/editor-panel.atom.ts
surfsense_web/components/editor-panel/memory.ts
surfsense_web/components/editor-panel/editor-panel.tsx
surfsense_web/components/documents/DocumentNode.tsx
surfsense_web/components/documents/FolderTreeView.tsx
surfsense_web/components/layout/ui/sidebar/DocumentsSidebar.tsx
surfsense_web/contracts/types/document.types.ts
surfsense_web/contracts/enums/connectorIcons.tsx
surfsense_web/app/dashboard/[search_space_id]/search-space-settings/layout-shell.tsx
surfsense_web/app/dashboard/[search_space_id]/user-settings/layout-shell.tsx
surfsense_web/contracts/types/search-space.types.ts
surfsense_backend/app/routes/search_spaces_routes.py
surfsense_backend/app/routes/__init__.py
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/airtable/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/calendar/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/clickup/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/discord/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/gmail/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/jira/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/linear/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/luma/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/slack/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/connectors/teams/system_prompt.md
surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/research/system_prompt.md
surfsense_backend/tests/unit/agents/new_chat/test_memory_response_content.py
surfsense_backend/tests/unit/agents/new_chat/tools/test_update_memory_scope.py
surfsense_backend/tests/unit/services/test_memory_service.py
surfsense_backend/tests/unit/agents/new_chat/tools/test_mcp_tools_cache.py
surfsense_backend/tests/unit/agents/new_chat/test_prompt_caching.py
surfsense_backend/app/tasks/chat/stream_new_chat.py
surfsense_backend/app/utils/document_converters.py
surfsense_backend/app/agents/new_chat/middleware/kb_persistence.py
surfsense_backend/app/services/gmail/kb_sync_service.py
surfsense_backend/app/services/google_calendar/kb_sync_service.py
surfsense_backend/app/services/jira/kb_sync_service.py
surfsense_backend/app/services/onedrive/kb_sync_service.py
surfsense_backend/app/services/revert_service.py
surfsense_backend/app/tasks/connector_indexers/discord_indexer.py
surfsense_backend/app/tasks/connector_indexers/luma_indexer.py
surfsense_backend/app/tasks/connector_indexers/teams_indexer.py
surfsense_backend/app/tasks/document_processors/_save.py
Summary by CodeRabbit
New Features
Improvements
Removed Features