
[Code Quality] Optimize MCP tool response payloads to reduce token usage #11945

@github-actions

Description

MCP structural analysis shows that two GitHub MCP tools return bloated payloads, consuming excessive tokens and degrading performance: list_code_scanning_alerts returns roughly 24,000 tokens (97 KB) per call, and list_pull_requests duplicates the full repository object in every PR result.

Current State

Observed payload sizes (from MCP analysis):

  • list_code_scanning_alerts: ~24,000 tokens (97 KB), the largest observed payload
  • list_pull_requests: heavy because the full repository object is duplicated in each PR
  • Efficient tools for comparison: list_labels, list_branches, list_workflows (minimal payload bloat)

Impact

Token Cost:

  • Every call to these tools consumes 5-20x more tokens than necessary
  • Accumulated cost across hundreds of daily workflow runs
  • Particularly expensive for workflows that call these tools multiple times

Performance:

  • Larger context windows slow down AI agent processing
  • Increased network transfer time
  • Higher memory usage for payload processing

Efficiency Gap:

  • list_labels, list_branches, list_discussions remain highly efficient
  • Code security tools lag significantly behind in efficiency

Suggested Changes

Option 1: Return Selective Fields Only

For list_code_scanning_alerts:

// BEFORE: return the full alert objects (~97 KB payload)
return alerts;

// AFTER: Return only essential fields
return alerts.map(alert => ({
  number: alert.number,
  state: alert.state,
  severity: alert.rule.severity,
  description: alert.rule.description,
  location: alert.most_recent_instance.location,
  // Omit: tool details, full rule objects, extensive metadata
}));

For list_pull_requests:

// BEFORE: Duplicate repo object in every PR
return prs;  // Each PR includes full repository object

// AFTER: Return repo once, reference in PRs
return {
  repository: { /* repo details once */ },
  pull_requests: prs.map(pr => ({
    number: pr.number,
    title: pr.title,
    state: pr.state,
    // Omit duplicated repo object
  }))
};

Option 2: Add Summary Modes

Add optional mode parameter:

  • mode: "summary" → Minimal fields (default)
  • mode: "full" → Complete objects (when needed)
Example workflow frontmatter requesting the lightweight mode:

tools:
  github:
    toolsets: [code_scanning]
    options:
      mode: summary  # NEW: request lightweight responses
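
As a rough sketch (the handler and helper names below are illustrative, not existing github-mcp-server code), the server side could branch on the mode option like this:

// Hypothetical handler: branch on an optional "mode" to trim the payload.
// buildAlertsResponse and summarizeAlert are illustrative names, not existing APIs.
function buildAlertsResponse(alerts, options = {}) {
  const mode = options.mode ?? "summary";  // lightweight shape by default
  if (mode === "full") {
    return alerts;                         // complete objects only when requested
  }
  return alerts.map(summarizeAlert);
}

function summarizeAlert(alert) {
  return {
    number: alert.number,
    state: alert.state,
    severity: alert.rule?.severity,
    description: alert.rule?.description,
    location: alert.most_recent_instance?.location,
  };
}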

Option 3: Pagination with Field Selection

Implement field selection in pagination:

// Allow agents to specify which fields they need
GET /repos/{owner}/{repo}/code-scanning/alerts?fields=number,state,severity
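
A minimal sketch of the corresponding server-side projection (the fields query parameter is hypothetical; the real REST endpoint does not accept it today):

// Project only the requested top-level fields from each alert.
// fieldsParam comes from a hypothetical ?fields=... query parameter.
function projectFields(items, fieldsParam) {
  if (!fieldsParam) return items;  // no filter requested: return unchanged
  const wanted = fieldsParam.split(",").map(f => f.trim());
  return items.map(item =>
    Object.fromEntries(wanted.filter(f => f in item).map(f => [f, item[f]]))
  );
}

// Example: projectFields(alerts, "number,state") → [{ number: 42, state: "open" }, ...]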

Files Affected

This issue likely requires changes in the GitHub MCP server, not gh-aw directly:

  • Upstream: the github-mcp-server repository (maintained by GitHub)
  • Local: gh-aw's MCP integration layer if it can filter responses

Investigation needed: Determine if gh-aw can implement response filtering or if upstream changes are required.
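
If gh-aw does sit between the MCP client and the agent, one possible shape for such filtering is a thin wrapper around the tool call (all names below are illustrative; whether gh-aw exposes such a hook is exactly the open question):

// Hypothetical integration-layer filter: call the MCP tool, then strip heavy fields
// before the result reaches the agent's context.
async function callToolFiltered(client, name, args, filter) {
  const result = await client.callTool({ name, arguments: args });
  return filter ? filter(result) : result;
}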

Success Criteria

  • ✅ list_code_scanning_alerts payload reduced from ~24K tokens to <10K tokens
  • ✅ list_pull_requests eliminates duplicated repository objects
  • ✅ Token usage for affected workflows reduced by 30-50%
  • ✅ All existing workflows continue to function (backward compatible)
  • ✅ Documentation updated with guidance on using summary vs. full modes
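
A quick way to sanity-check the first three criteria is a rough token estimate over the serialized payload (assuming ~4 characters per token, a common heuristic; exact counts depend on the model's tokenizer):

// Rough token estimate for a tool response payload.
function estimateTokens(payload) {
  return Math.ceil(JSON.stringify(payload).length / 4);
}

// Example comparison of the before/after shapes:
// const before = estimateTokens(alerts);
// const after  = estimateTokens(alerts.map(summarizeAlert));
// console.log(`reduction: ${(100 * (1 - after / before)).toFixed(1)}%`);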

Alternative: Workflow Guidance

If upstream changes aren't feasible, add workflow documentation advising:

  • Avoid list_code_scanning_alerts unless essential
  • Use targeted queries instead of full listings
  • Filter results after retrieval to minimize context usage
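
For the last point, a workflow could apply a small post-retrieval filter before the results enter the agent's context (field names follow the code scanning alert schema; the severity ordering and helper name are assumptions):

// Keep only essential fields from alerts at or above a minimum severity.
// Alerts whose severity is missing or outside the known levels are dropped.
function keepEssential(alerts, minSeverity = "warning") {
  const order = ["note", "warning", "error"];
  return alerts
    .filter(a => order.indexOf(a.rule?.severity) >= order.indexOf(minSeverity))
    .map(a => ({ number: a.number, state: a.state, severity: a.rule?.severity }));
}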

Source

Extracted from DeepReport Intelligence Briefing discussion #11897

Relevant excerpt:

MCP structural analysis confirms list_code_scanning_alerts is the largest payload (24K tokens, 97KB) and list_pull_requests remains heavy due to duplicated repo objects.

Comparison:

  • ✅ Efficient: list_labels, list_branches, list_workflows, list_discussions
  • ❌ Bloated: list_code_scanning_alerts, list_pull_requests

Priority

High - Direct impact on token costs and performance. Token spend concentration analysis shows this affects high-frequency workflows.

Implementation Estimate

Effort: 2-3 days

  • Day 1: Investigate gh-aw vs. upstream GitHub MCP server responsibility
  • Day 2: Implement response filtering/summarization
  • Day 3: Test with real workflows, measure token reduction, document changes

AI generated by Discussion Task Miner - Code Quality Improvement Agent

  • expires on Feb 9, 2026, 9:07 PM UTC
