
Feat: ai web search via exa #157

Merged
XeIris merged 2 commits into master from AiWebSearchViaExa
May 8, 2026
Conversation

@XeIris
Collaborator

@XeIris XeIris commented May 8, 2026

@coderabbitai review
@claude review

Summary by CodeRabbit

Release Notes

  • New Features

    • Web search functionality now enabled for Grok, GPT, and Silverwolf personas
    • AI responses now indicate when web searches have been performed
    • Integrated search tool capabilities for enhanced responses
  • Improvements

    • Enhanced token management for web search scenarios
    • Improved graceful shutdown handling
    • Better tool call history tracking
  • Chores

    • Added Model Context Protocol SDK dependency

@XeIris XeIris self-assigned this May 8, 2026
@XeIris XeIris added the enhancement (New feature or request) and ai labels May 8, 2026
@coderabbitai

coderabbitai Bot commented May 8, 2026

Review Change Stack

Warning

Rate limit exceeded

@XeIris has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c56e1c20-6bb9-4fcf-9df9-79441b471d8e

📥 Commits

Reviewing files that changed from the base of the PR and between be4ca8d and 208fbe2.

📒 Files selected for processing (4)
  • classes/handlers/keywordsBehaviorHandler.ts
  • utils/ai.ts
  • utils/mcp.ts
  • utils/tokenizer.ts
📝 Walkthrough

Walkthrough

This PR integrates Model Context Protocol (MCP) support for web search and tool calling into the AI response generation pipeline. Personas gain optional webSearchEnabled flags; the database schema and models are extended to persist tool messages; a new MCP client utility handles tool listing and execution; the AI generation engine implements bounded tool-iteration loops for OpenRouter and Gemini with result wrapping; token budgets increase when tools are active; and the grok handler integrates tool calling, persists exchanges, and displays a web-search notice.
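The bounded tool-iteration loop described here can be sketched as below. This is an illustrative sketch only: `callModel`, `executeTool`, the `ToolCallRecord` shape, and the fallback text are assumptions, not the PR's exact implementation (the real loops live in utils/ai.ts and differ per provider).

```typescript
// Hypothetical sketch of a bounded tool-iteration loop with result injection.
type ToolCallRecord = { name: string; args: unknown; result: string };

interface ModelTurn {
  text: string;
  toolCalls: { name: string; args: unknown }[];
}

const MAX_TOOL_ITERATIONS = 5;

async function runToolLoop(
  callModel: (history: string[]) => Promise<ModelTurn>,
  executeTool: (name: string, args: unknown) => Promise<string>,
): Promise<{ text: string; toolCalls: ToolCallRecord[] }> {
  const history: string[] = [];
  const toolCalls: ToolCallRecord[] = [];
  for (let iter = 0; iter < MAX_TOOL_ITERATIONS; iter += 1) {
    const turn = await callModel(history);
    if (turn.toolCalls.length === 0) {
      // Model produced a final answer: stop looping.
      return { text: turn.text, toolCalls };
    }
    for (const tc of turn.toolCalls) {
      const result = await executeTool(tc.name, tc.args);
      toolCalls.push({ name: tc.name, args: tc.args, result });
      // Inject the wrapped result so the next model turn can read it.
      history.push(`<<MCP_TOOL_RESULT>>${result}`);
    }
  }
  // Budget exhausted with calls still pending: return a clear fallback
  // instead of an empty reply.
  return {
    text: 'Tool budget exhausted - could not complete the request.',
    toolCalls,
  };
}
```

The key properties are that the loop is bounded, every executed call is recorded for later persistence, and exhaustion produces explicit text rather than a silent failure.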

Changes

Web Search & Tool-Calling Support

Layer / File(s) Summary
Data Schema & Contracts
database/tables/aiChatHistoryTable.ts, data/aiPersonas.json
AiChatHistoryRow.role adds 'tool'; personas Grok, GPT, Silverwolf gain webSearchEnabled: true.
Database Migration & Model
database/Database.ts, database/models/AiChatModel.ts
Database.init() idempotently migrates AiChatHistory constraint to allow 'tool' role; AiChatModel.addHistory() accepts 'tool' messages.
MCP Client Implementation
utils/mcp.ts
New utility module launches MCP server, lists tools (standard and Gemini formats), executes tool calls with timeout, sanitizes tool names, maintains connection state, and provides shutdownMcp().
AI Generation Types & Interfaces
utils/ai.ts (lines 4–7, 37–48, 60–73)
Adds ToolCallRecord type, Persona.webSearchEnabled, and expands GenerateContentOptions/GenerateContentResult to include webSearchEnabled and toolCalls.
AI Engine: OpenRouter Tool Loop
utils/ai.ts (lines 113–247)
OpenRouter path conditionally enables tool calling; implements bounded loop that executes MCP tool calls, injects results with <<MCP_TOOL_RESULT>> markers, and retries until completion or max iterations.
AI Engine: Gemini Tool Loop
utils/ai.ts (lines 249–422)
Gemini path lists tools only for non-image models; non-image path runs function-call loop translating calls into Gemini functionResponse messages with result wrapping; image path returns empty toolCalls.
Token Budget & Web Search Mode
utils/tokenizer.ts
computeReservedTokens() uses dynamic floor (1,024 or 12,288 based on webSearchEnabled); trimHistoryToFit() accepts webSearchEnabled parameter to adjust history budget.
Handler Integration & Tool Persistence
classes/handlers/keywordsBehaviorHandler.ts
Grok handler passes persona.webSearchEnabled to trimming, receives toolCalls from AI engine, prepends web-search notice when tools were used, persists tool exchanges as 'tool' messages in history.
Configuration & Process Startup
package.json, index.ts
Adds @modelcontextprotocol/sdk dependency; registers graceful shutdown handler for SIGTERM/SIGINT to cleanly close MCP before exit.
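The token-budget change in utils/tokenizer.ts can be sketched roughly as follows. The signature and the context-size clamp (the clamp is the fix the review suggests further down) are illustrative assumptions, not the PR's exact code:

```typescript
// Illustrative sketch of a dynamic reserved-token floor for web-search mode.
// Assumption: the reservation is simply the floor, clamped so that a
// small-context model never reserves more than its whole context, and
// bounded by an overall 16,384-token cap.
function computeReservedTokens(contextSize: number, webSearchEnabled: boolean): number {
  const floor = webSearchEnabled ? 12_288 : 1_024;
  return Math.min(floor, contextSize, 16_384);
}
```

Without the `contextSize` term in the clamp, an 8,192-token model with web search enabled would reserve 12,288 tokens and drive the remaining history budget negative.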

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Mewtwo2387/silverwolf#141: Both PRs modify utils/ai.ts, utils/tokenizer.ts, and classes/handlers/keywordsBehaviorHandler.ts around token/context handling and AI generation flows.
  • Mewtwo2387/silverwolf#98: Both PRs modify classes/handlers/keywordsBehaviorHandler.ts and data/aiPersonas.json to enhance grok persona behavior and configuration.
  • Mewtwo2387/silverwolf#103: Both PRs refactor the grok keyword handler to use unified generateContent flow with expanded result shapes and provider-specific logic.

Poem

🐰 A rabbit hops through web-wide searches,
Tools in pockets, perched on birches.
Grok now finds the truth it seeks,
MCP whispers through the creeks,
Token budgets stretch their wings—
Web-search brings all glimmering things!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 26.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Feat: ai web search via exa' accurately summarizes the main feature added: web search capability for AI personas using the Exa tool via MCP.
  • Linked Issues Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



coderabbitai (Bot) left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
classes/handlers/keywordsBehaviorHandler.ts (2)

140-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Exclude tool rows before token trimming.

After the first web-search turn, rawHistory includes JSON-stringified tool payloads. Feeding those rows into trimHistoryToFit() makes audit-only data consume the context budget and can push out real conversation turns on the very next request.

Suggested fix
         const rawHistory = await (message.client as any).db.aiChat.getHistory(aiSession.sessionId, 100);
+        const replayableHistory = rawHistory.filter((entry) => entry.role !== 'tool');
         hadRawHistory = rawHistory.length > 0;
 
         // Token-based sliding window: trim oldest messages to fit context
         const { trimmedHistory, warnings } = await trimHistoryToFit(
           persona.provider,
           persona.model,
           persona.systemPrompt ?? '',
-          rawHistory,
+          replayableHistory,
           prompt,
           persona.webSearchEnabled,
         );
         history = trimmedHistory;
         contextWarnings = warnings;
         historyLoaded = true;
 
-        if (rawHistory.length !== trimmedHistory.length) {
-          log(`AiChat: Trimmed history from ${rawHistory.length} to ${trimmedHistory.length} messages for session ${aiSession.sessionId}`);
+        if (replayableHistory.length !== trimmedHistory.length) {
+          log(`AiChat: Trimmed history from ${replayableHistory.length} to ${trimmedHistory.length} messages for session ${aiSession.sessionId}`);
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@classes/handlers/keywordsBehaviorHandler.ts` around lines 140 - 150,
RawHistory may contain tool rows (web-search JSON payloads) which should be
excluded before token-based trimming; filter rawHistory to remove entries with
role === 'tool' (or the equivalent tool identifier used in your message objects)
into a new variable (e.g., filteredHistory) and pass that filtered array to
trimHistoryToFit(persona.provider, persona.model, persona.systemPrompt ?? '',
filteredHistory, prompt, persona.webSearchEnabled). Keep the original
hadRawHistory detection if you need to know whether any raw history existed, but
use the filteredHistory for trimming and subsequent conversation-context logic
so tool payloads do not consume the context budget.

285-302: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don't skip persistence on tool-only or image-only turns.

This block is still gated by && text, so if generation succeeds with only toolCalls and/or images, the user turn and all tool rows are silently lost.

Suggested fix
-      if (hasMemory && aiSession && text) {
+      if (hasMemory && aiSession && (text || toolCalls?.length || images?.length)) {
         const aiRole = persona.provider === 'openrouter' ? 'assistant' : 'model';
         try {
           await (message.client as any).db.aiChat.addHistory(aiSession.sessionId, 'user', prompt);
           if (toolCalls && toolCalls.length > 0) {
             // Persist tool exchanges between the user message and the final assistant
             // text so chronological order is preserved. These rows are audit-only;
             // they're filtered out when history is replayed to the model.
             for (const tc of toolCalls) {
               // eslint-disable-next-line no-await-in-loop
               await (message.client as any).db.aiChat.addHistory(
                 aiSession.sessionId,
                 'tool',
                 JSON.stringify(tc),
               );
             }
           }
-          await (message.client as any).db.aiChat.addHistory(aiSession.sessionId, aiRole, text);
+          if (text) {
+            await (message.client as any).db.aiChat.addHistory(aiSession.sessionId, aiRole, text);
+          }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@classes/handlers/keywordsBehaviorHandler.ts` around lines 285 - 302, The
current guard (hasMemory && aiSession && text) skips persisting the user turn
and toolCalls when generation yields only toolCalls or images; change the
condition to check for any content: use (hasMemory && aiSession && (text ||
(toolCalls && toolCalls.length > 0) || (images && images.length > 0))). Always
call db.aiChat.addHistory(aiSession.sessionId, 'user', prompt) and always
persist each tool call in the toolCalls loop; only call
db.aiChat.addHistory(aiSession.sessionId, aiRole, ...) for the assistant when
text is present or when images exist (serialize image metadata or a concise
representation) so image-only or tool-only turns are not lost. Ensure
references: hasMemory, aiSession, text, toolCalls, images, persona.provider, and
aiRole (assistant/model) are updated accordingly.
🧹 Nitpick comments (3)
utils/mcp.ts (2)

49-52: 💤 Low value

Add a small jitter to backoff to avoid thundering-herd on simultaneous reconnects.

If multiple call sites await connect() after a crash, they all unblock at the same nextRetryAt. Adding ±20% jitter to computeBackoffMs smooths reconnect attempts.

♻️ Optional jitter
 function computeBackoffMs(): number {
   const exp = Math.min(RECONNECT_MAX_MS, RECONNECT_BASE_MS * 2 ** consecutiveFailures);
-  return exp;
+  const jitter = 0.8 + Math.random() * 0.4; // ±20%
+  return Math.floor(exp * jitter);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/mcp.ts` around lines 49 - 52, computeBackoffMs currently returns a
deterministic exponential backoff (using RECONNECT_BASE_MS, RECONNECT_MAX_MS and
consecutiveFailures) which can cause a thundering herd; modify computeBackoffMs
to apply a ±20% jitter by multiplying the computed backoff by a random factor
between 0.8 and 1.2 (e.g., 1 + (Math.random() * 0.4 - 0.2)), then clamp the
result so it does not exceed RECONNECT_MAX_MS and remains non-negative; update
references to computeBackoffMs to rely on this jittered value as before.

121-145: 💤 Low value

sanitizeName collisions silently overwrite toolNameMap entries.

Two real tool names that differ only in characters stripped by sanitizeName (e.g., punctuation, the exa substring, or trailing/leading underscores) will map to the same publicName, and the later one wins on Line 129 — the model would then be unable to address the shadowed tool. Today's exa toolset does not collide, but a defensive disambiguation here is cheap and avoids silent loss when the upstream tool list grows.

♻️ Suggested disambiguation
   toolNameMap.clear();
-  const sanitized: SanitizedTool[] = res.tools.map((t: any) => {
-    const publicName = sanitizeName(t.name);
-    toolNameMap.set(publicName, t.name);
+  const sanitized: SanitizedTool[] = res.tools.map((t: any) => {
+    let publicName = sanitizeName(t.name);
+    if (toolNameMap.has(publicName)) {
+      let i = 2;
+      while (toolNameMap.has(`${publicName}_${i}`)) i += 1;
+      logWarning(`[mcp] sanitized name collision for ${t.name} → ${publicName}_${i}`);
+      publicName = `${publicName}_${i}`;
+    }
+    toolNameMap.set(publicName, t.name);
     return {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/mcp.ts` around lines 121 - 145, In listSanitizedTools, sanitizeName can
produce duplicate publicName keys which silently overwrite entries in
toolNameMap; change the logic after computing publicName (and before
toolNameMap.set) to detect collisions in toolNameMap and disambiguate by
appending a deterministic suffix (e.g., "-1", "-2" or a short hash derived from
the original t.name) until the publicName is unique, update the
SanitizedTool.publicName to the disambiguated value, then set toolNameMap with
that unique publicName -> t.name mapping; ensure the logged summary uses the
final disambiguated publicName so the model can address each tool unambiguously.
utils/ai.ts (1)

177-187: ⚡ Quick win

Suggested refactor checks wrong HTTP status code for OpenRouter.

OpenRouter returns HTTP 404 (not 400) with message "No endpoints found that support tool use" when a model lacks tool support, while the proposed status === 400 && looksToolRelated check would miss this entirely. The current string-match heuristic is indeed brittle, but it works across both OpenRouter (404) and OpenAI-compatible (400) responses. If refactoring, check for status === 404 alongside message text for OpenRouter, or inspect err?.status to handle both 404 and 400 cases appropriately.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/ai.ts` around lines 177 - 187, The catch block in utils/ai.ts currently
only string-matches error messages to detect tool-rejection (variables: err,
msg, toolsAvailable, model, iter); update that logic to also check err?.status
for OpenRouter's 404 alongside 400 for OpenAI-compatible responses (e.g., treat
status === 400 || status === 404 as a tool-rejection candidate) before flipping
toolsAvailable to false, decrementing iter and continuing; keep the existing
message text heuristic as a fallback (msg.includes('tool') ||
msg.includes('function')) so both status-based and message-based detections are
covered.
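A combined status-and-message predicate along these lines would cover both providers. The error shape and the helper name `isToolRejection` are assumptions for illustration, not code from the PR:

```typescript
// Sketch: detect "model does not support tools" rejections from either
// OpenRouter (HTTP 404) or OpenAI-compatible servers (HTTP 400), keeping
// the message-text heuristic as a fallback when no status is available.
function isToolRejection(err: { status?: number; message?: string }): boolean {
  const msg = (err.message ?? '').toLowerCase();
  const looksToolRelated = msg.includes('tool') || msg.includes('function');
  const statusOk =
    err.status === undefined || err.status === 400 || err.status === 404;
  return looksToolRelated && statusOk;
}
```

On a match, the caller would flip `toolsAvailable` to false and retry the request without tools.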
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utils/ai.ts`:
- Around line 311-342: When the loop reaches MAX_TOOL_ITERATIONS and the model
still emitted function calls, response.text() can throw and yields an empty
reply; update the loop in utils/ai.ts to detect the
final-iteration-with-pending-function-calls case (check iter ===
MAX_TOOL_ITERATIONS && fnCalls.length > 0) and set fullText to a clear fallback
message (e.g. "Tool budget exhausted — the assistant could not complete the
request. Try again or simplify the request.") instead of calling
response.text(); keep toolCalls populated as before and then break so the
function returns the fallback text, images, and toolCalls; reference
MAX_TOOL_ITERATIONS, iter, fnCalls, response, fullText, toolCalls, and
chatSession.sendMessage to locate where to insert the check.

In `@utils/mcp.ts`:
- Around line 230-237: The shutdownMcp race causes transport.onclose to treat an
intentional close as a crash; add an explicit boolean flag (e.g.,
isShuttingDown) that shutdownMcp sets true before awaiting client.close() and
clears after cleanup, and update the existing transport.onclose handler to
immediately no-op when isShuttingDown is true (instead of setting state =
'crashed', incrementing consecutiveFailures, or setting nextRetryAt).
Specifically: in shutdownMcp() set isShuttingDown = true, await client.close(),
then perform the current cleanup (client = null; transport = null; state =
'disconnected'), and set isShuttingDown = false (or leave it cleared); in the
onclose handler check if (isShuttingDown) return; otherwise keep the existing
crash/backoff logic that modifies state, consecutiveFailures and nextRetryAt.

In `@utils/tokenizer.ts`:
- Around line 76-78: computeReservedTokens can reserve more tokens than the
model context when webSearchEnabled (e.g., floor = 12_288 for an 8,192 model),
causing rawAvailable to go negative; modify computeReservedTokens so the
reserved floor is clamped to the provided contextSize (e.g., compute floor =
webSearchEnabled ? Math.min(12_288, contextSize) : Math.min(1_024, contextSize)
or after computing reserve do Math.min(reserve, contextSize)) and then return
the final value (still bounded by the 16_384 cap) so the function never reserves
more tokens than the model contextSize.
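The `isShuttingDown` pattern described for utils/mcp.ts can be sketched as follows. `McpConnection` and its methods are hypothetical stand-ins for the module's actual state handling; the real code wires `onClose` into the MCP SDK transport's `onclose` callback:

```typescript
// Sketch: an explicit shutdown flag so an intentional close is not
// misclassified as a crash by the transport's onclose handler.
type State = 'connected' | 'disconnected' | 'crashed';

class McpConnection {
  state: State = 'connected';
  consecutiveFailures = 0;
  private isShuttingDown = false;

  // Invoked by the transport whenever the connection closes.
  onClose(): void {
    if (this.isShuttingDown) return; // intentional close: no-op
    this.state = 'crashed';
    this.consecutiveFailures += 1; // feeds the reconnect backoff
  }

  async shutdown(close: () => Promise<void>): Promise<void> {
    this.isShuttingDown = true;
    try {
      await close(); // may trigger onClose() via the transport
    } finally {
      this.state = 'disconnected';
      this.isShuttingDown = false;
    }
  }
}
```

Without the flag, `shutdownMcp()` races the `onclose` handler and a clean exit leaves the module in a 'crashed' state with a pending reconnect timer.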

---

Outside diff comments:
In `@classes/handlers/keywordsBehaviorHandler.ts`:
- Around line 140-150: RawHistory may contain tool rows (web-search JSON
payloads) which should be excluded before token-based trimming; filter
rawHistory to remove entries with role === 'tool' (or the equivalent tool
identifier used in your message objects) into a new variable (e.g.,
filteredHistory) and pass that filtered array to
trimHistoryToFit(persona.provider, persona.model, persona.systemPrompt ?? '',
filteredHistory, prompt, persona.webSearchEnabled). Keep the original
hadRawHistory detection if you need to know whether any raw history existed, but
use the filteredHistory for trimming and subsequent conversation-context logic
so tool payloads do not consume the context budget.
- Around line 285-302: The current guard (hasMemory && aiSession && text) skips
persisting the user turn and toolCalls when generation yields only toolCalls or
images; change the condition to check for any content: use (hasMemory &&
aiSession && (text || (toolCalls && toolCalls.length > 0) || (images &&
images.length > 0))). Always call db.aiChat.addHistory(aiSession.sessionId,
'user', prompt) and always persist each tool call in the toolCalls loop; only
call db.aiChat.addHistory(aiSession.sessionId, aiRole, ...) for the assistant
when text is present or when images exist (serialize image metadata or a concise
representation) so image-only or tool-only turns are not lost. Ensure
references: hasMemory, aiSession, text, toolCalls, images, persona.provider, and
aiRole (assistant/model) are updated accordingly.

---

Nitpick comments:
In `@utils/ai.ts`:
- Around line 177-187: The catch block in utils/ai.ts currently only
string-matches error messages to detect tool-rejection (variables: err, msg,
toolsAvailable, model, iter); update that logic to also check err?.status for
OpenRouter's 404 alongside 400 for OpenAI-compatible responses (e.g., treat
status === 400 || status === 404 as a tool-rejection candidate) before flipping
toolsAvailable to false, decrementing iter and continuing; keep the existing
message text heuristic as a fallback (msg.includes('tool') ||
msg.includes('function')) so both status-based and message-based detections are
covered.

In `@utils/mcp.ts`:
- Around line 49-52: computeBackoffMs currently returns a deterministic
exponential backoff (using RECONNECT_BASE_MS, RECONNECT_MAX_MS and
consecutiveFailures) which can cause a thundering herd; modify computeBackoffMs
to apply a ±20% jitter by multiplying the computed backoff by a random factor
between 0.8 and 1.2 (e.g., 1 + (Math.random() * 0.4 - 0.2)), then clamp the
result so it does not exceed RECONNECT_MAX_MS and remains non-negative; update
references to computeBackoffMs to rely on this jittered value as before.
- Around line 121-145: In listSanitizedTools, sanitizeName can produce duplicate
publicName keys which silently overwrite entries in toolNameMap; change the
logic after computing publicName (and before toolNameMap.set) to detect
collisions in toolNameMap and disambiguate by appending a deterministic suffix
(e.g., "-1", "-2" or a short hash derived from the original t.name) until the
publicName is unique, update the SanitizedTool.publicName to the disambiguated
value, then set toolNameMap with that unique publicName -> t.name mapping;
ensure the logged summary uses the final disambiguated publicName so the model
can address each tool unambiguously.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b492b981-d413-471a-8e45-5f60e246b682

📥 Commits

Reviewing files that changed from the base of the PR and between 0dd636c and be4ca8d.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • classes/handlers/keywordsBehaviorHandler.ts
  • data/aiPersonas.json
  • database/Database.ts
  • database/models/AiChatModel.ts
  • database/tables/aiChatHistoryTable.ts
  • index.ts
  • package.json
  • utils/ai.ts
  • utils/mcp.ts
  • utils/tokenizer.ts

- ai.ts: clearer fallback when Gemini tool-loop hits MAX_TOOL_ITERATIONS
  with pending fn calls; treat OpenRouter 400/404 as tool-rejection alongside
  message-text heuristic.
- mcp.ts: isShuttingDown flag stops onclose from flagging intentional shutdown
  as a crash; ±20% jitter on reconnect backoff; disambiguate sanitizeName
  collisions in toolNameMap.
- tokenizer.ts: clamp reserved-tokens floor to contextSize so small-context
  models can't drive rawAvailable negative.
- keywordsBehaviorHandler.ts: filter tool rows out before token-budget trim;
  persist user/tool/image-only turns instead of dropping them when text is
  empty.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@XeIris XeIris merged commit 6b86e1b into master May 8, 2026
1 check passed
