fix: return truncated content when token limit exceeded in MCP search_code #604

waynesun09 · 2025-11-06T19:05:03Z

Summary

Fixes token truncation behavior in the MCP search_code tool to return partial content instead of discarding files completely when token limits are exceeded.

Problem

Currently, when adding a search result file would exceed the maxTokens limit, the entire file is discarded and only a truncation message is returned. The user gets no useful data from that file, which is especially problematic when the first file alone exceeds the limit.

Example failure scenario:

User requests search with maxTokens=6000
First result file contains 10K tokens
Current behavior: Returns ONLY truncation message, no file data
Result: User gets no information at all
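The failure scenario can be reproduced with a minimal sketch of the pre-fix loop. This is not the actual code from packages/mcp/src/index.ts: the function name collectBeforeFix, the TextItem type, and the placeholder truncation message are assumptions made for illustration, and token counts use the chars/4 approximation mentioned under Testing.

```typescript
// Hypothetical shape of one result item; only the `type` and `text`
// fields appear in the PR's diff.
type TextItem = { type: "text"; text: string };

// Rough token estimate: about 4 characters per token.
const estimateTokens = (text: string): number => text.length / 4;

// Sketch of the pre-fix behavior (assumed structure, not the real source).
function collectBeforeFix(files: string[], maxTokens: number): TextItem[] {
    const content: TextItem[] = [];
    let totalTokens = 0;
    let isResponseTruncated = false;

    for (const text of files) {
        const tokens = estimateTokens(text);
        if (totalTokens + tokens > maxTokens) {
            isResponseTruncated = true;
            break; // the file that does not fit is dropped entirely
        }
        totalTokens += tokens;
        content.push({ type: "text", text });
    }

    if (isResponseTruncated) {
        // Placeholder wording; the real message text is not shown in the PR.
        content.push({ type: "text", text: "[response truncated]" });
    }
    return content;
}

// One file of ~10K tokens (40,000 characters) against maxTokens = 6000.
const beforeResult = collectBeforeFix(["x".repeat(40000)], 6000);
console.log(beforeResult.length); // 1: only the truncation message is returned
```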

Solution

Modified the truncation logic in packages/mcp/src/index.ts to:

Calculate remaining token budget before breaking the loop
If meaningful space remains (>100 tokens), truncate the file content to fit
Append a clear truncation marker: ...[content truncated due to token limit]
Add the truncated content to results
Continue to add the overall truncation message at the end

Changes

File: packages/mcp/src/index.ts (lines 125-142)

Before:

if ((totalTokens + tokens) > maxTokens) {
    isResponseTruncated = true;
    break;  // Discards the file completely
}

After:

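if ((totalTokens + tokens) > maxTokens) {
    // Token budget not yet used by the files already added
    const remainingTokens = maxTokens - totalTokens;

    // Only keep a partial file when a meaningful amount of space is left
    if (remainingTokens > 100) {
        // Convert the token budget to characters (~4 chars per token)
        const maxLength = Math.floor(remainingTokens * 4);
        const truncatedText = text.substring(0, maxLength) +
            "\n\n...[content truncated due to token limit]";

        content.push({
            type: "text",
            text: truncatedText,
        });
    }

    isResponseTruncated = true;
    break;
}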
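For readers who want to run the new branch in isolation, here is a self-contained sketch that wraps it in a loop. The surrounding function (collectWithTruncation), the TextItem type, the constant names, and the final placeholder message are assumptions for illustration; only the body of the if block follows the diff above.

```typescript
// Hypothetical result item type; the diff only shows `type` and `text`.
type TextItem = { type: "text"; text: string };

const TRUNCATION_MARKER = "\n\n...[content truncated due to token limit]";
const MIN_REMAINING_TOKENS = 100; // threshold used in the diff
const CHARS_PER_TOKEN = 4;        // chars/4 approximation

// Sketch of the post-fix loop (assumed structure around the diffed branch).
function collectWithTruncation(files: string[], maxTokens: number): TextItem[] {
    const content: TextItem[] = [];
    let totalTokens = 0;
    let isResponseTruncated = false;

    for (const text of files) {
        const tokens = text.length / CHARS_PER_TOKEN;
        if (totalTokens + tokens > maxTokens) {
            const remainingTokens = maxTokens - totalTokens;
            if (remainingTokens > MIN_REMAINING_TOKENS) {
                // Keep as many characters as the remaining budget allows.
                const maxLength = Math.floor(remainingTokens * CHARS_PER_TOKEN);
                content.push({
                    type: "text",
                    text: text.substring(0, maxLength) + TRUNCATION_MARKER,
                });
            }
            isResponseTruncated = true;
            break;
        }
        totalTokens += tokens;
        content.push({ type: "text", text });
    }

    if (isResponseTruncated) {
        // Placeholder wording; the real message text is not shown in the PR.
        content.push({ type: "text", text: "[response truncated]" });
    }
    return content;
}

// Same input as the failure scenario: one ~10K-token file, maxTokens = 6000.
const afterResult = collectWithTruncation(["x".repeat(40000)], 6000);
const afterLengths = afterResult.map((item) => item.text.length);
console.log(afterResult.length); // 2: truncated file plus truncation message
console.log(afterLengths[0] === 24000 + TRUNCATION_MARKER.length); // true
```

Note that the marker is appended after the cut, so the truncated item runs over the remaining budget by the length of the marker (43 characters, roughly 11 tokens under the chars/4 estimate).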

Benefits

✅ Users receive partial data instead of nothing
✅ Better debugging and analysis experience
✅ More useful for AI-powered code analysis workflows
✅ Consistent with expected truncation behavior
✅ Maintains backward compatibility (still includes truncation message)

Example Impact

Scenario: Search returns 50 files, each ~2K tokens, with maxTokens=11000

Before: Returns first 5 complete files (~10K tokens), discards file #6 completely
After: Returns first 5 complete files + truncated version of file #6 (roughly its first 1K tokens, since ~1K tokens of budget remain)
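How much of the next file survives depends on the budget left after the last complete file. The arithmetic can be sketched as below, assuming every file has the same token count; splitBudget and BudgetSplit are hypothetical names, and the 100-token threshold and 4 chars/token factor are taken from the diff.

```typescript
const CHARS_PER_TOKEN = 4;        // chars/4 approximation
const MIN_REMAINING_TOKENS = 100; // threshold used in the diff

interface BudgetSplit {
    completeFiles: number;   // files returned in full
    remainingTokens: number; // budget left after those files
    truncatedChars: number;  // characters kept from the next file (0 if none)
}

// Assumes fileCount files of identical size fileTokens (> 0).
function splitBudget(fileTokens: number, fileCount: number, maxTokens: number): BudgetSplit {
    // A file is added in full while totalTokens + fileTokens <= maxTokens.
    const completeFiles = Math.min(fileCount, Math.floor(maxTokens / fileTokens));
    const remainingTokens = maxTokens - completeFiles * fileTokens;
    const hasNextFile = completeFiles < fileCount;
    const truncatedChars =
        hasNextFile && remainingTokens > MIN_REMAINING_TOKENS
            ? Math.floor(remainingTokens * CHARS_PER_TOKEN)
            : 0;
    return { completeFiles, remainingTokens, truncatedChars };
}

// 50 files of 2K tokens each, with 1K tokens of headroom after five files.
const withHeadroom = splitBudget(2000, 50, 11000);
console.log(withHeadroom);

// Same files, but five of them consume the budget exactly.
const exactFit = splitBudget(2000, 50, 10000);
console.log(exactFit);
```

In the first call, five files fit and about 4,000 characters of the sixth are kept. In the second, the five complete files use the whole budget, so the 100-token threshold is not met and no partial file is added.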

Testing

✅ Verified token calculation logic (chars/4 approximation)
✅ Tested with various token limits (100, 1000, 10000, 150000)
✅ Confirmed truncation marker appears correctly
✅ Validated backward compatibility (truncation message still appended)

Related Issues

This fix addresses issues observed in AI agent workflows, where large codebases trigger token limits on early results and leave analysis tasks with no useful data.

Commit: c5b8fda
Branch: fix/mcp-search-truncation

coderabbitai · 2025-11-06T19:05:14Z

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


packages/mcp/src/index.ts

brendan-kellam · 2025-11-07T17:30:28Z

LGTM, this approach makes sense. Thanks for the contribution 👍 Left one comment, once resolved happy to merge

…_code

When search results exceed maxTokens limit, now returns partial truncated content instead of discarding the file completely.

Changes:
- Calculate remaining token budget before breaking
- Truncate file content to fit within remaining tokens (if > 100 tokens left)
- Append truncation marker to indicate content was cut off
- Still add truncation message at end of all results

Benefits:
- Users get partial data instead of nothing
- Better debugging and analysis experience
- More useful for AI-powered code analysis tasks
- Consistent with expected behavior when limits are reached

Example: If file would use 10K tokens but only 2K remain, return first ~8K chars of content + truncation marker instead of dropping it.

Signed-off-by: Wayne Sun <gsun@redhat.com>

waynesun09 force-pushed the fix/mcp-search-truncation branch from c5b8fda to e9903f4 on November 7, 2025 16:50

brendan-kellam reviewed Nov 7, 2025


waynesun09 force-pushed the fix/mcp-search-truncation branch from d99c047 to a1da34a on November 7, 2025 18:24

brendan-kellam approved these changes Nov 7, 2025


brendan-kellam approved these changes Nov 10, 2025


brendan-kellam merged commit 278c0dc into sourcebot-dev:main Nov 10, 2025
5 checks passed

github-actions bot mentioned this pull request Nov 10, 2025

Sourcebot Roadmap 🚀 #459

Open
