Reduce MCP server JSON response size to minimize LLM token consumption #360
@copilot rebase
Implement lightweight response structure for MCP server that reduces JSON payload size while preserving all essential security data and adding enhanced SCM context for better repository identification.

Changes:
- Create mcpAnalysisResponse struct with only essential fields: findings, rules, purl, repository, scm_type, git_ref, commit_sha, last_commit
- Remove embedded PackageInsights to eliminate heavy fields like github_actions_workflows, package_dependencies, and repo statistics
- Update all MCP handlers (analyze_repo, analyze_local, analyze_org, analyze_stale_branches) to use lightweight response
- Add comprehensive test suite to verify response structure and size
- Add SCM context fields (purl, scm_type) per reviewer feedback
- Rename 'ref' to 'git_ref' for clarity

Results:
- Lightweight response: ~182 bytes for empty findings vs kilobytes before
- All essential security findings and repository metadata preserved
- Better SCM identification with purl and scm_type fields
- All tests passing with no regressions

Fixes #359
3916e93 to b711017
I have successfully tested this with a local build against Claude Code.
Fixes #359
Byte count is not the best metric for estimating LLM usage. We should have a token estimator and compare token counts instead. Using XML or another format rather than JSON might show much lower token usage, even at the cost of raw byte size.
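A minimal sketch of what such a comparison could look like. The `approxTokens` function is a hypothetical, deliberately crude stand-in (one token per run of letters or digits, one per punctuation rune); it is not a real LLM tokenizer, and an actual measurement would need the tokenizer of the target model. The two sample payloads are illustrative only and are not taken from the PR.

```go
package main

import (
	"fmt"
	"unicode"
)

// approxTokens is a crude proxy for a tokenizer (an assumption, not a real
// model tokenizer): each run of letters/digits counts as one token, each
// punctuation or symbol rune counts as one token, and whitespace is skipped.
func approxTokens(s string) int {
	count := 0
	inWord := false
	for _, r := range s {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if !inWord {
				count++
				inWord = true
			}
		case unicode.IsSpace(r):
			inWord = false
		default:
			count++
			inWord = false
		}
	}
	return count
}

func main() {
	// Two hypothetical encodings of the same two fields.
	jsonForm := `{"scm_type":"github","git_ref":"main"}`
	flatForm := "scm_type=github git_ref=main"

	fmt.Printf("json: %d bytes, ~%d tokens\n", len(jsonForm), approxTokens(jsonForm))
	fmt.Printf("flat: %d bytes, ~%d tokens\n", len(flatForm), approxTokens(flatForm))
}
```

The point of the sketch is only that bytes and tokens are reported side by side, so a format change can be judged on the number that actually matters for LLM cost.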
    // The lightweight response should be significantly smaller than a full PackageInsights response
    // which would include many more fields like workflows, dependencies, repo stats, etc.
    assert.Less(t, len(lightweightData), 1000, "Lightweight response should be under 1KB for empty findings")
This number seems really arbitrary... We should probably compare the two results.
@Talgarr I agree, but the goal here is mainly to do the cleanup and validate that it still works, and that's enough. We know it will drastically reduce token usage in practice.
@fproulx-boostsecurity that test is pretty useless honestly
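The relative comparison suggested in this thread could be sketched roughly as follows, replacing the fixed 1000-byte threshold with a check that the lightweight payload is smaller than the full one for the same input. `fullInsights`, `lightResponse`, `toLight`, and `jsonSize` are hypothetical stand-ins; the real PackageInsights type has many more fields than shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fullInsights is a hypothetical, heavily reduced stand-in for PackageInsights.
type fullInsights struct {
	Purl                   string   `json:"purl"`
	Findings               []string `json:"findings"`
	GithubActionsWorkflows []string `json:"github_actions_workflows"`
	PackageDependencies    []string `json:"package_dependencies"`
}

// lightResponse is a hypothetical stand-in for the lightweight MCP response.
type lightResponse struct {
	Purl     string   `json:"purl"`
	Findings []string `json:"findings"`
}

// toLight projects the full structure onto the lightweight one.
func toLight(f fullInsights) lightResponse {
	return lightResponse{Purl: f.Purl, Findings: f.Findings}
}

// jsonSize returns the length in bytes of the compact JSON encoding of v.
func jsonSize(v interface{}) (int, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	// Placeholder data, not real scan output.
	full := fullInsights{
		Purl:                   "pkg:github/example-org/example-repo",
		Findings:               []string{},
		GithubActionsWorkflows: []string{"ci.yml", "release.yml"},
		PackageDependencies:    []string{"dep-a", "dep-b"},
	}

	fullSize, err := jsonSize(full)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	lightSize, err := jsonSize(toLight(full))
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Printf("full: %d bytes, lightweight: %d bytes\n", fullSize, lightSize)
}
```

In a test, the same two sizes could feed an assertion such as `assert.Less(t, lightSize, fullSize)`, which stays meaningful as the structs evolve.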
    }

    // TestMCPResponseStructure verifies the new mcpAnalysisResponse structure
    func TestMCPResponseStructure(t *testing.T) {
This test is not very useful and is essentially a copy of the previous one.
Summary
Implements lightweight response structure for MCP server that reduces JSON payload size while preserving all essential security data.

Changes
- Create mcpAnalysisResponse struct with only essential fields: purl, repository, scm_type, git_ref, commit_sha, last_commit
- Remove embedded PackageInsights to eliminate heavy fields like github_actions_workflows, package_dependencies, and repository statistics
- Update all MCP handlers (analyze_repo, analyze_local, analyze_org, analyze_stale_branches) to use lightweight response

Results
- … PackageInsights)
- … purl and scm_type fields for better repository identification

Review Feedback Addressed
- Add purl field for fully qualified package identifier
- Add scm_type field to identify SCM platform (github/gitlab)
- Rename ref to git_ref for clarity

Fixes #359