Fix token limit overflow with includeMetadata (fixes #24) #25

mkreyman · 2025-09-16T16:32:41Z

Summary

This PR fixes the critical token limit overflow issue reported in #24 where responses with includeMetadata=true exceed the MCP protocol's 25,000 token limit, causing errors.

Problem

When includeMetadata=true is used in context_get, the response can exceed token limits because:

Fixed default of 100 items doesn't account for metadata overhead
Token estimation was inaccurate (using 4 chars/token instead of 3.5)
No dynamic adjustment based on actual content size

Solution

Implemented a comprehensive token limit management system that:

Dynamically calculates safe item limits based on actual content size
More accurate token estimation (3.5 chars/token for JSON content)
Configurable via environment variables for different use cases
Centralized token management replacing scattered hardcoded values

Changes

New Module: `src/utils/token-limits.ts`

Dynamic token limit calculation
Response overhead estimation
Configurable token limits
Proper TypeScript interfaces
Safe JSON parsing with error handling

Environment Variables Added

MCP_MAX_TOKENS - Max response tokens (default: 25000, range: 1000-100000)
MCP_TOKEN_SAFETY_BUFFER - Safety margin (default: 0.8, range: 0.1-1.0)
MCP_MIN_ITEMS - Minimum items to return (default: 1, range: 1-100)
MCP_MAX_ITEMS - Maximum items per response (default: 100, range: 10-1000)
MCP_CHARS_PER_TOKEN - Token estimation ratio (default: 3.5, range: 2.5-5.0)

Code Quality Improvements

Extracted all magic numbers as named constants
Added comprehensive test coverage (5 new test files)
Proper error handling and validation
TypeScript interfaces instead of any

Testing

Added comprehensive test coverage:

issue24-reproduce.test.ts - Reproduces the original issue
issue24-final-fix.test.ts - Validates the fix
issue24-fix-validation.test.ts - Edge case testing
issue24-token-limit.test.ts - Repository-level tests
token-limits.test.ts - Unit tests for token utilities

All 1,136 tests pass ✅

Impact

Prevents "response exceeds maximum allowed tokens" errors
Automatically adjusts response size based on content
Configurable for different environments and use cases
Better visibility into token usage

Fixes #24

- Implement dynamic token limit calculation based on actual content size - Replace hardcoded limits with centralized token management module - Add configurable environment variables: - MCP_MAX_TOKENS: Max response tokens (default: 25000) - MCP_TOKEN_SAFETY_BUFFER: Safety margin (default: 0.8) - MCP_MIN_ITEMS: Minimum items to return (default: 1) - MCP_MAX_ITEMS: Maximum items per response (default: 100) - MCP_CHARS_PER_TOKEN: Token estimation ratio (default: 3.5) - More accurate token estimation (3.5 chars/token vs 4) - Add comprehensive test coverage for token limit scenarios - Extract all magic numbers as named constants - Add proper TypeScript interfaces and error handling - Update documentation with configuration options This fix prevents 'response exceeds maximum allowed tokens' errors when using includeMetadata=true by dynamically calculating safe item limits based on actual content size rather than using fixed defaults.

claude · 2025-09-16T16:32:56Z

Claude finished @mkreyman's task —— View job

🔍 Comprehensive PR Review: Token Limit Management Fix

I've completed a thorough review of this PR that addresses the critical token limit overflow issue (#24). Here's my detailed analysis:

✅ Strengths & Quality Code

Excellent Architecture Design

Centralized token management: New src/utils/token-limits.ts module consolidates all token-related logic, eliminating scattered hardcoded values
Dynamic calculation: calculateSafeItemLimit() samples actual data to determine safe limits instead of using fixed defaults
Proper TypeScript: Strong interfaces (ContextItem, TokenLimitConfig) replace any types

Improved Token Estimation Accuracy

More conservative approach: Uses 3.5 chars/token instead of 4 for safer estimation
JSON-aware calculation: Accounts for formatting overhead in token estimation
Response overhead calculation: Properly estimates pagination structure tokens

Robust Environment Configuration

5 new environment variables with comprehensive validation:
- MCP_MAX_TOKENS: Configurable token limits (1000-100000)
- MCP_TOKEN_SAFETY_BUFFER: Safety margin (0.1-1.0)
- MCP_MIN_ITEMS/MCP_MAX_ITEMS: Item count bounds
- MCP_CHARS_PER_TOKEN: Fine-tunable estimation ratio
Graceful fallbacks: Invalid values trigger warnings but don't break functionality
Bounds validation: All env vars have reasonable min/max constraints

Comprehensive Test Coverage

5 new test files with focused scenarios:
- issue24-reproduce.test.ts: Reproduces original problem
- issue24-final-fix.test.ts: Validates the fix works
- issue24-fix-validation.test.ts: Edge case testing
- token-limits.test.ts: Unit tests for utility functions
64 total test files - excellent coverage maintained

Implementation Quality

Error handling: Safe JSON parsing with try/catch blocks (src/utils/token-limits.ts:170-179)
Performance optimization: Uses sampling (10 items) for average size calculation to minimize overhead
Debug logging: Conditional logging for troubleshooting token calculations

🔧 Minor Suggestions & Observations

Code Quality Notes

Consistent error handling: The parseMetadata function properly handles malformed JSON gracefully
Magic numbers eliminated: All hardcoded values replaced with named constants
Documentation: Excellent inline documentation explaining token estimation rationale

Performance Considerations

Sampling strategy: Uses 10-item samples for calculations - good balance between accuracy and performance
Token calculation: Math operations are lightweight and won't impact response times
Dynamic limits: Calculation happens once per query, minimal overhead

Security Assessment

No security concerns: Environment variable validation prevents injection attacks
Input sanitization: Proper bounds checking on all configuration values
No sensitive data exposure: Debug logging doesn't reveal sensitive information

📊 Technical Implementation Review

Core Logic Flow (`src/index.ts:658-708`)

✅ Dynamic default limit calculation based on actual data
✅ Token limit checking before response construction
✅ Graceful truncation when limits exceeded
✅ Warning messages for users when truncation occurs

Token Estimation Logic (`src/utils/token-limits.ts:77-82`)

// Conservative 3.5 chars/token instead of OpenAI's 4 chars/token
return Math.ceil(text.length / charsPerToken);

✅ More conservative than OpenAI guidelines
✅ Configurable via environment variables
✅ Accounts for JSON formatting overhead

🎯 Final Assessment

Issue Resolution: ✅ EXCELLENT

Completely addresses the token limit overflow problem
Dynamic calculation prevents future issues as data grows
Configurable for different deployment environments

Code Quality: ✅ HIGH

Clean TypeScript with proper interfaces
Comprehensive error handling
Well-documented and maintainable

Testing: ✅ COMPREHENSIVE

Reproduces original issue and validates fix
Edge case coverage
Unit and integration tests

Performance Impact: ✅ MINIMAL

Efficient sampling strategy
Calculations only when needed
No blocking operations

🚀 Recommendation: APPROVE

This PR represents a high-quality solution to a critical issue. The implementation is thoughtful, well-tested, and provides the necessary configurability for different deployment scenarios. The centralized token management approach will prevent similar issues in the future.

No blocking issues identified. Ready for merge.

- Add safe parseMetadata helper to handle invalid JSON gracefully - Prevent crashes when metadata contains malformed JSON - Add test case for invalid JSON handling - Addresses bug found by Cursor Bugbot in PR review

claude · 2025-09-16T16:43:59Z

Claude finished @mkreyman's task —— View job

🔍 PR Review: Token Limit Management Fix

I'll provide a comprehensive review of this PR that addresses the critical token limit overflow issue. Here's my review plan:

Review Checklist:

Read and analyze the new token-limits.ts module
Review changes to src/index.ts implementation
Examine test coverage and quality
Investigate the JSON parsing bug identified by Cursor Bugbot
Assess code quality, architecture, and best practices
Evaluate security considerations
Check performance implications
Provide final recommendations

🎯 Issue Resolution Assessment: ✅ EXCELLENT

Problem Understanding

The PR correctly addresses the critical token limit overflow issue where context_get with includeMetadata=true exceeded the MCP protocol's 25,000 token limit. The root causes identified are accurate:

Fixed hardcoded limit of 100 items didn't account for metadata overhead
Inaccurate token estimation (4 chars/token vs actual ~3.5)
No dynamic adjustment based on actual content size

🏗️ Architecture & Design Quality: ✅ EXCELLENT

Centralized Token Management

The new src/utils/token-limits.ts module provides excellent architecture:

Single responsibility: All token-related logic consolidated in one module
Dynamic calculation: calculateSafeItemLimit() samples actual data (lines 149-228)
Smart sampling: Uses 10-item samples for statistical accuracy without performance overhead
Proper abstraction: Clean interfaces (ContextItem, TokenLimitConfig) replace any types

Improved Token Estimation

More conservative approach: 3.5 chars/token instead of 4 (line 61)
JSON-aware calculation: Accounts for formatting overhead in estimateTokens() (lines 77-82)
Response overhead calculation: Properly estimates pagination structure tokens (lines 98-140)

🔧 Implementation Quality: ✅ HIGH

Environment Configuration System

The implementation includes 5 new environment variables with comprehensive validation:

MCP_MAX_TOKENS: Configurable token limits (1000-100000) with bounds validation
MCP_TOKEN_SAFETY_BUFFER: Safety margin (0.1-1.0)
MCP_MIN_ITEMS/MCP_MAX_ITEMS: Item count constraints with logical validation
MCP_CHARS_PER_TOKEN: Fine-tunable estimation ratio

Validation Quality (lines 350-442):

✅ Proper bounds checking prevents invalid configurations
✅ Graceful fallbacks with warning messages for invalid values
✅ Cross-validation ensures minItems ≤ maxItems

Error Handling & Safety

Safe JSON parsing: Both helper functions (lines 170-179, 293-302) use try/catch blocks
Metadata handling: Gracefully handles malformed JSON, objects, or null values
Debug logging: Conditional logging for troubleshooting (line 215-225)

🔍 Cursor Bugbot Finding Analysis: ❌ FALSE POSITIVE

Cursor Bugbot's Report: "Token Limit Check Fails on Invalid JSON - The checkTokenLimit function directly calls JSON.parse(item.metadata) without error handling"

My Analysis: This is incorrect. The checkTokenLimit function:

Uses the parseMetadata helper function (line 312)
The helper function safely wraps JSON.parse with try/catch (lines 296-301)
Returns null for invalid JSON with a warning message
This pattern is consistent throughout the module

The code correctly handles malformed JSON metadata without crashes.

📊 Test Coverage Assessment: ✅ COMPREHENSIVE

Test Quality & Coverage

Added 5 focused test files with excellent coverage:

token-limits.test.ts: Unit tests for utility functions (lines 1-100+ show good test structure)
issue24-reproduce.test.ts: Reproduces original problem with realistic data (150 items, substantial content)
issue24-final-fix.test.ts: Validates the fix works end-to-end
issue24-fix-validation.test.ts: Edge case testing
issue24-token-limit.test.ts: Repository-level integration tests

Test Strengths:

✅ Realistic test data mimics production scenarios
✅ Both unit and integration test coverage
✅ Edge cases (empty strings, large content, metadata variations)
✅ Configuration validation testing

⚡ Performance Analysis: ✅ OPTIMIZED

Efficient Implementation

Minimal sampling overhead: Uses 10-item samples vs analyzing all data
One-time calculation: Token limits calculated once per query
Mathematical operations: Lightweight arithmetic operations only
No blocking operations: All calculations are CPU-bound and fast

Memory Usage

Sample-based approach: Doesn't load full dataset into memory for calculations
Efficient data structures: Uses existing item objects without copying

🛡️ Security Assessment: ✅ SECURE

Input Validation & Safety

Environment variable bounds: All env vars have min/max constraints preventing injection
JSON parsing safety: Malformed JSON handled gracefully without execution risks
No sensitive data exposure: Debug logging doesn't reveal credentials or sensitive info
Type safety: Strong TypeScript interfaces prevent runtime type errors

🚀 Code Quality & Best Practices: ✅ HIGH

Maintainability

✅ Well-documented: Excellent inline documentation explaining token estimation rationale
✅ Named constants: All magic numbers replaced with descriptive constants
✅ Single responsibility: Each function has a clear, focused purpose
✅ Consistent error handling: Unified approach to validation and error messages

TypeScript Usage

✅ Strong typing: Proper interfaces instead of any types
✅ Type safety: Compile-time checks prevent runtime errors
✅ Generic constraints: Appropriate use of generic types where needed

⚠️ Minor Observations

Test file token estimation: Integration tests use older 4 chars/token while production uses 3.5 - this is fine for reproduction testing
Environment variable documentation: Could benefit from README documentation of new env vars (though CHANGELOG covers this)
Backwards compatibility: Changes are backwards compatible with existing installations

🎯 Final Assessment & Recommendation

Quality Score: 9.5/10

✅ Issue Resolution: Completely addresses the critical token overflow problem
✅ Architecture: Well-designed, maintainable, and extensible solution
✅ Testing: Comprehensive test coverage with realistic scenarios
✅ Security: No security concerns identified
✅ Performance: Efficient implementation with minimal overhead
✅ Maintainability: Clean, documented, and well-structured code

🚀 RECOMMENDATION: APPROVE & MERGE

This PR represents a high-quality solution to a critical production issue. The implementation demonstrates:

Thoughtful architecture with centralized token management
Comprehensive testing that validates both the problem and solution
Production-ready configuration system for different deployment needs
Future-proofing through dynamic calculation that adapts to data growth

No blocking issues identified. The Cursor Bugbot finding is a false positive. This PR is ready for production deployment and will prevent token limit overflow errors while providing configurability for different use cases.

mkreyman self-assigned this Sep 16, 2025

This comment was marked as outdated.

Sign in to view

Fix unhandled JSON.parse in checkTokenLimit function

2e1f1ea

- Add safe parseMetadata helper to handle invalid JSON gracefully - Prevent crashes when metadata contains malformed JSON - Add test case for invalid JSON handling - Addresses bug found by Cursor Bugbot in PR review

mkreyman merged commit c3242cd into master Sep 16, 2025
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix token limit overflow with includeMetadata (fixes #24) #25

Fix token limit overflow with includeMetadata (fixes #24) #25

Uh oh!

mkreyman commented Sep 16, 2025

Uh oh!

claude bot commented Sep 16, 2025 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Uh oh!

claude bot commented Sep 16, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix token limit overflow with includeMetadata (fixes #24) #25

Fix token limit overflow with includeMetadata (fixes #24) #25

Uh oh!

Conversation

mkreyman commented Sep 16, 2025

Summary

Problem

Solution

Changes

New Module: src/utils/token-limits.ts

Environment Variables Added

Code Quality Improvements

Testing

Impact

Uh oh!

claude bot commented Sep 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 Comprehensive PR Review: Token Limit Management Fix

✅ Strengths & Quality Code

Excellent Architecture Design

Improved Token Estimation Accuracy

Robust Environment Configuration

Comprehensive Test Coverage

Implementation Quality

🔧 Minor Suggestions & Observations

Code Quality Notes

Performance Considerations

Security Assessment

📊 Technical Implementation Review

Core Logic Flow (src/index.ts:658-708)

Token Estimation Logic (src/utils/token-limits.ts:77-82)

🎯 Final Assessment

Issue Resolution: ✅ EXCELLENT

Code Quality: ✅ HIGH

Testing: ✅ COMPREHENSIVE

Performance Impact: ✅ MINIMAL

🚀 Recommendation: APPROVE

Uh oh!

This comment was marked as outdated.

Uh oh!

claude bot commented Sep 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 PR Review: Token Limit Management Fix

🎯 Issue Resolution Assessment: ✅ EXCELLENT

Problem Understanding

🏗️ Architecture & Design Quality: ✅ EXCELLENT

Centralized Token Management

Improved Token Estimation

🔧 Implementation Quality: ✅ HIGH

Environment Configuration System

Error Handling & Safety

🔍 Cursor Bugbot Finding Analysis: ❌ FALSE POSITIVE

📊 Test Coverage Assessment: ✅ COMPREHENSIVE

Test Quality & Coverage

⚡ Performance Analysis: ✅ OPTIMIZED

Efficient Implementation

Memory Usage

🛡️ Security Assessment: ✅ SECURE

Input Validation & Safety

🚀 Code Quality & Best Practices: ✅ HIGH

Maintainability

TypeScript Usage

⚠️ Minor Observations

🎯 Final Assessment & Recommendation

Quality Score: 9.5/10

🚀 RECOMMENDATION: APPROVE & MERGE

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

New Module: `src/utils/token-limits.ts`

claude bot commented Sep 16, 2025 •

edited

Loading

Core Logic Flow (`src/index.ts:658-708`)

Token Estimation Logic (`src/utils/token-limits.ts:77-82`)

claude bot commented Sep 16, 2025 •

edited

Loading