Skip to content

refactor: context management and sub-agents improvements#185

Merged
larryro merged 10 commits into
mainfrom
refactor/context-management-and-sub-agents
Jan 12, 2026
Merged

refactor: context management and sub-agents improvements#185
larryro merged 10 commits into
mainfrom
refactor/context-management-and-sub-agents

Conversation

@larryro
Copy link
Copy Markdown
Collaborator

@larryro larryro commented Jan 12, 2026

Summary

  • New CRM Assistant Sub-Agent: Added dedicated CRM assistant to handle internal data (customers, products), improving context isolation and tool routing
  • Context Management Refactor: Reorganized context management structure and moved dynamic timestamps to end of context for better LLM cache hits
  • Sub-Agent Improvements: Added action-first principle to sub-agent instructions, standardized context limits, and improved integration routing with domain-aware context

Test plan

  • Verify CRM assistant properly handles customer and product queries
  • Test integration routing with various domain contexts
  • Confirm sub-agents follow action-first principle (search before asking)
  • Validate LLM cache improvements with dynamic timestamp positioning

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added specialized CRM Assistant for managing internal customer and product data requests.
    • Integration introspection now supports detailed operation information on demand.
  • Improvements

    • Enhanced assistant behavior with action-first principles—less asking, more direct execution.
    • Clarified scope limitations for CRM and product tools (internal systems only).
    • Improved search engine prioritization with weighted results (Google prioritized highest).
    • Streamlined approval verification with structured responses.
  • Documentation

    • Updated tool descriptions with clearer guidance on operation scope and external system handling.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 12, 2026

📝 Walkthrough

Walkthrough

This PR consolidates and refactors agent tooling and context management infrastructure. Key changes include: expanding documentation for internal CRM tools (customer_read, product_read) to clarify scope limitations; introducing a new CRM assistant sub-agent tool with prompt building and context handling; extracting integration tool descriptions for conciseness; refactoring integration introspection to support both summary and detailed operation modes; implementing a comprehensive context management module with token budgeting, prioritization, and per-agent configuration; restructuring sub-agent tools (web, document, integration, workflow) to use shared prompt building and context handlers; updating the chat agent to delegate CRM queries to the new sub-agent and use fast model by default; and adjusting search engine weights in configuration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
services/platform/convex/model/threads/get_or_create_sub_thread.ts (1)

19-34: Either import type definitions from helpers/types.ts or add sync comments documenting the intentional duplication.

SubAgentType, SubThreadsMap, and ThreadSummaryWithSubThreads are duplicated in both get_or_create_sub_thread.ts and helpers/types.ts. No circular dependencies block consolidation—helpers/types.ts is a pure types file with no imports, and get_or_create_sub_thread.ts only imports from _generated/ modules.

The helper version uses Partial<Record<SubAgentType, string>>, which automatically includes new sub-agent types, while this file requires manual updates to the explicit interface. Recommend importing from helpers/types.ts to maintain a single source of truth. If duplication is intentional for isolation, add a sync comment documenting the relationship.

services/platform/convex/agent_tools/integrations/integration_tool.ts (1)

99-122: Minor: Redundant variable assignment.

Line 100 assigns result to approvalResult after the type guard has already narrowed result to ApprovalResult. You could use result directly within the narrowed block without the extra variable.

That said, the explicit naming improves readability, so this is stylistic.

services/platform/convex/lib/create_agent_config.ts (1)

90-110: Improve error message to reflect useFastModel context.

When useFastModel is true but OPENAI_FAST_MODEL is unset, the code silently falls back to OPENAI_MODEL. If both are unset, the error message only mentions OPENAI_MODEL, which could be confusing.

Consider updating the error message to indicate which variable was expected:

♻️ Suggested improvement
  if (!model) {
+   const expectedVar = opts.useFastModel ? 'OPENAI_FAST_MODEL (or fallback OPENAI_MODEL)' : 'OPENAI_MODEL';
    throw new Error(
-     'OPENAI_MODEL environment variable is required for Agent configuration but is not set',
+     `${expectedVar} environment variable is required for Agent configuration but is not set`,
    );
  }
services/platform/convex/lib/create_web_agent.ts (1)

29-35: Add OPENAI_FAST_MODEL to .env.example documentation.

The code change to enable useFastModel: true for the web agent is appropriate and consistent with existing chat and CRM agents. However, OPENAI_FAST_MODEL is referenced in create_agent_config.ts but is missing from .env.example. Since other OPENAI_* variables (MODEL, CODING_MODEL, EMBEDDING_MODEL, VISION_MODEL) are documented there, add OPENAI_FAST_MODEL with a description of its purpose (fast model for lower-complexity operations).

🤖 Fix all issues with AI agents
In @services/platform/convex/agent_tools/integrations/integration_tool.ts:
- Around line 84-97: Extract the inline ApprovalResult interface and its type
guard isApprovalResult into a shared types module (e.g., types.ts) and replace
the local declarations with imports; specifically, move the ApprovalResult
definition and the isApprovalResult logic to the shared file, export them, then
update integration_tool.ts to import ApprovalResult and isApprovalResult, and
also update other code that mirrors this shape (such as verify_approval_tool.ts)
to import the shared types instead of redefining them to ensure consistency and
reusability.

In @services/platform/convex/lib/context_management/agent_context_manager.ts:
- Around line 210-213: Add a TODO and improve observability around the
fire-and-forget summarization call in AgentContextManager: where
this.config.onSummarizationNeeded(ctx, this.config.threadId) is invoked, replace
the bare console.error with a structured log/metrics emission and add a
retry/backoff placeholder; specifically, add a TODO noting to emit metrics
(success/failure, latency), integrate with the project logger instead of
console, and implement retry/backoff logic around onSummarizationNeeded(ctx,
this.config.threadId) in future iterations for reliability and observability.
- Around line 277-311: The createSubAgentContext function omits adding current
time context; update it to call addCurrentTime() on the AgentContextManager
instance (created in createSubAgentContext) after addSystemInfo() and before
addTaskContext() so sub-agents receive timestamp info; locate the
AgentContextManager instance in createSubAgentContext and insert
manager.addCurrentTime() in the same sequence used elsewhere (e.g.,
build_sub_agent_prompt flow) to ensure consistent time context for sub-agents.

In @services/platform/convex/lib/context_management/context_priority.ts:
- Around line 125-140: The loop that always pushes mandatory contexts (when
ctx.canTrim === false) can let totalTokens exceed tokenBudget; add a warning log
when mandatory items alone push totalTokens past tokenBudget by checking after
adding a mandatory ctx (if totalTokens + ctx.tokens > tokenBudget then log a
warning including ctx.id or ctx.name and the new totalTokens vs tokenBudget),
and ensure you only emit this warning once (use a local boolean like
sawMandatoryOverBudget to avoid spamming). Keep using the existing variables
sorted, ctx, canTrim, kept, totalTokens, and tokenBudget so the change is
localized to this loop.

Comment thread services/platform/convex/lib/context_management/context_priority.ts
@larryro larryro force-pushed the refactor/context-management-and-sub-agents branch from 24ec7cb to f0c3996 Compare January 12, 2026 07:00
larryro and others added 10 commits January 12, 2026 15:01
- Move dynamic timestamp to end of context for better LLM cache hits
- Reorganize context management structure
- Add new crm_assistant sub-agent to handle internal CRM data (customers, products)
- Move customer_read and product_read tools from chat agent to crm_assistant for context isolation
- Add action-first principle to sub-agent instructions (search/act before asking)
- Improve integration routing with domain-aware context in format_integrations
- Add scope limitation warnings to customer_read and product_read tools
- Standardize sub-agent context limits to use DEFAULT_MODEL_CONTEXT_LIMIT
- Add useFastModel option to createAgentConfig for OPENAI_FAST_MODEL support
- Enable fast model for chat and web agents
- Add search engine weights to SearXNG configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added addCurrentTime() method to AgentContextManager class and
updated createSubAgentContext to call it for consistency with
build_sub_agent_prompt.ts. Sub-agents need timestamp context for
time-based operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added TODO comment noting the need for structured logging/metrics
and retry/backoff logic for the fire-and-forget summarization call.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added debug warning log when mandatory context items (canTrim=false)
alone exceed the token budget. This helps identify misconfigured
context limits. The warning is emitted once per trim operation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added JSDoc comment documenting the expected input format for
splitRagByRelevance function, noting the upstream RAG formatter
dependency.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaced duplicated type definitions with imports from the shared
helpers/types.ts module. The shared SubThreadsMap uses Partial<Record>
which automatically includes new sub-agent types, improving maintainability.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated error message to indicate OPENAI_FAST_MODEL when useFastModel
is true, making it clearer which environment variable was expected.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Documented OPENAI_FAST_MODEL environment variable used for
lower-complexity sub-agent operations (web, integration, CRM).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant