# MCP Tool Description Optimization

This notebook covers optimizing MCP (Model Context Protocol) tool descriptions using GEPA. When models use tools during evaluation, the quality of tool descriptions significantly impacts:

1. **Tool selection**: Which tool the model chooses to use
2. **Parameter accuracy**: How well the model fills in tool parameters
3. **Overall performance**: Task completion rate and quality

GEPA can optimize these descriptions to improve model-tool interactions.

---

## Prerequisites

MCP tool optimization requires:
1. A benchmark where models use MCP tools to answer questions
2. MCP servers configured in the verification config
3. Tool descriptions that can be optimized

**New in this version**: GEPA can automatically fetch seed descriptions directly from MCP server tool docstrings, eliminating the need to manually specify them.

---

## Setup

In [None]:
import sys
import tempfile
from pathlib import Path

sys.path.insert(0, str(Path.cwd().parent.parent.parent / "src"))

from karenina.integrations.gepa import (
    KareninaOutput,
    OptimizationConfig,
    OptimizationTarget,
    export_prompts_json,
    export_to_preset,
)
from karenina.schemas import ModelConfig, VerificationConfig

# Temp directory for outputs
OUTPUT_DIR = Path(tempfile.mkdtemp(prefix="mcp_optimization_"))
print(f"Output directory: {OUTPUT_DIR}")

---

## Understanding MCP Tool Descriptions

MCP tools are external capabilities that models can invoke during evaluation. Each tool has:
- **Name**: Identifier for the tool (e.g., `calculator`, `web_search`)
- **Description**: Text explaining what the tool does and when to use it
- **Parameters**: Schema for tool inputs

The **description** is what GEPA optimizes - it's the text shown to the model that guides tool selection.
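
To make the three parts concrete, here is a minimal sketch of a single tool definition as a plain Python dict. The `name`, `description`, and `inputSchema` fields follow the shape MCP servers use when listing tools. The `expression` parameter and the `summarize_tool` helper are made up for illustration and are not part of karenina.

```python
# Illustrative tool definition. The "expression" parameter is hypothetical;
# a real calculator server may expose a different schema.
calculator_tool = {
    "name": "calculator",
    "description": "A calculator for mathematical operations.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "Expression to evaluate",
            },
        },
        "required": ["expression"],
    },
}


def summarize_tool(tool: dict) -> str:
    """Build a one-line summary: name, parameter names, and description."""
    params = ", ".join(tool["inputSchema"].get("properties", {}))
    return f"{tool['name']}({params}): {tool['description']}"


tool_summary = summarize_tool(calculator_tool)
print(tool_summary)
```

Only the `description` string changes during optimization; the name and parameter schema stay as the server defines them.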

In [None]:
# Example: Default tool descriptions (before optimization)
default_tool_descriptions = {
    "calculator": "A calculator for mathematical operations.",
    "web_search": "Search the web for information.",
    "code_interpreter": "Execute Python code.",
    "file_reader": "Read contents of a file.",
}

print("Default tool descriptions:")
for tool, desc in default_tool_descriptions.items():
    print(f"  {tool}: {desc}")

In [None]:
# Example: Optimized tool descriptions (after GEPA optimization)
optimized_tool_descriptions = {
    "calculator": """Perform precise mathematical calculations including arithmetic, 
algebra, and numerical operations. Use this for any computation that requires 
exact numerical answers. Supports +, -, *, /, ^, sqrt, trig functions.""",
    "web_search": """Search the internet for current information, facts, or data 
that may not be in your training data. Use when the question asks about recent 
events, specific statistics, or verifiable facts.""",
    "code_interpreter": """Execute Python code to solve problems programmatically. 
Use for complex calculations, data manipulation, algorithm implementation, or 
when step-by-step computation is needed. Returns execution output.""",
    "file_reader": """Read and return the contents of a specified file. Use when 
the question references external data or documents that need to be examined.""",
}

print("Optimized tool descriptions:")
for tool, desc in optimized_tool_descriptions.items():
    print(f"\n{tool}:")
    print(f"  {desc[:80]}...")

---

## Fetching Descriptions from MCP Servers

Instead of manually specifying seed descriptions, you can fetch them directly from MCP server tool docstrings. This ensures your seed descriptions match exactly what the model sees during inference.
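
Because a remote MCP server can be unreachable, it may be worth keeping manual seeds as a fallback. The sketch below assumes the fetch step is wrapped in a zero-argument callable and that any exception means the server is unavailable. `resolve_seed_descriptions` is a hypothetical helper, not a karenina function.

```python
from typing import Callable, Dict


def resolve_seed_descriptions(
    fetch: Callable[[], Dict[str, str]],
    fallback: Dict[str, str],
) -> Dict[str, str]:
    """Return fetched descriptions, or the fallback if the fetch fails or is empty."""
    try:
        fetched = fetch()
    except Exception as exc:  # Assumption: any failure means "server unavailable"
        print(f"Fetch failed ({exc!r}); using fallback seeds")
        return dict(fallback)
    return dict(fetched) if fetched else dict(fallback)


def _unreachable_server() -> Dict[str, str]:
    """Stand-in for a fetch against a server that is down."""
    raise ConnectionError("MCP server unreachable")


manual_seeds = {"search_entities": "Search for entities by name."}

seeds_offline = resolve_seed_descriptions(_unreachable_server, manual_seeds)
seeds_online = resolve_seed_descriptions(
    lambda: {"search_entities": "Fetched docstring."},
    manual_seeds,
)
print(seeds_offline)
print(seeds_online)
```

In a real run, the callable could wrap `sync_fetch_tool_descriptions` with the `mcp_urls_dict` for your servers.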

In [None]:
from karenina.infrastructure.llm.mcp_utils import sync_fetch_tool_descriptions

# Fetch tool descriptions from a real MCP server
# Example: Open Targets Platform MCP server
OPEN_TARGETS_MCP = "https://mcp.platform.opentargets.org/mcp"

fetched_descriptions = sync_fetch_tool_descriptions(
    mcp_urls_dict={"open_targets": OPEN_TARGETS_MCP}
)

print("Fetched tool descriptions from Open Targets MCP:")
for tool_name, description in fetched_descriptions.items():
    # Show first 100 chars of each description
    preview = description[:100] + "..." if len(description) > 100 else description
    print(f"\n{tool_name}:")
    print(f"  {preview}")

In [None]:
# You can also filter to specific tools
filtered_descriptions = sync_fetch_tool_descriptions(
    mcp_urls_dict={"open_targets": OPEN_TARGETS_MCP},
    tool_filter=["search_entities", "query_open_targets_graphql"],
)

print("Filtered fetch (2 tools only):")
for tool_name in filtered_descriptions:
    print(f"  - {tool_name}")

### Auto-Fetch in KareninaAdapter

When optimizing `MCP_TOOL_DESCRIPTIONS`, the `KareninaAdapter` can automatically fetch seed descriptions from your MCP servers at initialization. This eliminates the need to manually specify `seed_mcp_tool_descriptions`.

In [None]:
from karenina.integrations.gepa import KareninaAdapter

# Create a verification config with MCP tools
open_targets_config = VerificationConfig(
    answering_models=[
        ModelConfig(
            id="claude-with-open-targets",
            model_provider="anthropic",
            model_name="claude-haiku-4-5",
            temperature=0.0,
            interface="langchain",
            system_prompt="You are a biomedical research assistant.",
            mcp_urls_dict={"open_targets": OPEN_TARGETS_MCP},
        )
    ],
    parsing_models=[
        ModelConfig(
            id="parser",
            model_provider="anthropic",
            model_name="claude-haiku-4-5",
            temperature=0.0,
            interface="langchain",
        )
    ],
    evaluation_mode="template_only",
    replicate_count=1,
)

# Create adapter with auto-fetch enabled (default behavior)
# The adapter automatically fetches tool descriptions from MCP servers
adapter = KareninaAdapter(
    benchmark=None,  # Would be a real Benchmark in practice
    verification_config=open_targets_config,
    targets=[OptimizationTarget.MCP_TOOL_DESCRIPTIONS],
    auto_fetch_tool_descriptions=True,  # This is the default
)

# Access the auto-fetched seed descriptions
print("Auto-fetched seed tool descriptions:")
if adapter.seed_tool_descriptions:
    for tool_name, desc in adapter.seed_tool_descriptions.items():
        preview = desc[:80] + "..." if len(desc) > 80 else desc
        print(f"  {tool_name}: {preview}")
else:
    print("  (No descriptions fetched - MCP server may be unavailable)")

In [None]:
# You can also provide explicit seed descriptions or disable auto-fetch
adapter_with_seeds = KareninaAdapter(
    benchmark=None,
    verification_config=open_targets_config,
    targets=[OptimizationTarget.MCP_TOOL_DESCRIPTIONS],
    # Provide your own seed descriptions instead of auto-fetching
    seed_mcp_tool_descriptions={
        "search_entities": "Custom seed description for entity search.",
        "query_open_targets_graphql": "Custom seed for GraphQL queries.",
    },
    auto_fetch_tool_descriptions=False,  # Disable auto-fetch
)

print("Adapter with explicit seeds:")
print(f"  Tools: {list(adapter_with_seeds.seed_tool_descriptions.keys())}")

---

## Configuring MCP Tool Optimization

Use `OptimizationTarget.MCP_TOOL_DESCRIPTIONS` to optimize tool descriptions.
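
The configuration cells in this section set `template_weight=0.7` and `rubric_weight=0.3`. This notebook does not spell out how the two scores are combined, so the sketch below assumes a normalized weighted average purely as a mental model. `combined_score` is a hypothetical function, not the karenina scoring code.

```python
def combined_score(
    template_score: float,
    rubric_score: float,
    template_weight: float = 0.7,
    rubric_weight: float = 0.3,
) -> float:
    """Weighted average of two scores (assumed combination rule)."""
    total = template_weight + rubric_weight
    if total <= 0:
        raise ValueError("Weights must sum to a positive number")
    weighted = template_weight * template_score + rubric_weight * rubric_score
    return weighted / total


# A candidate that passes the template check but earns half the rubric credit
score = combined_score(template_score=1.0, rubric_score=0.5)
print(f"Combined score: {score:.2f}")
```

Under this assumption, a higher `template_weight` makes correctness on the answer template dominate the signal GEPA optimizes against.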

In [None]:
# Single target: Optimize only tool descriptions
mcp_only_config = OptimizationConfig(
    targets=[OptimizationTarget.MCP_TOOL_DESCRIPTIONS],
    # Seed tool descriptions
    seed_mcp_tool_descriptions=default_tool_descriptions,
    # Scoring
    template_weight=0.7,
    rubric_weight=0.3,
    # GEPA parameters
    reflection_model="anthropic/claude-haiku-4-5",
    max_metric_calls=100,
)

print("MCP-only optimization config:")
print(f"  Targets: {[t.value for t in mcp_only_config.targets]}")
print(f"  Tools to optimize: {list(mcp_only_config.seed_mcp_tool_descriptions.keys())}")

In [None]:
# Multi-target: Optimize prompts AND tool descriptions together
multi_target_config = OptimizationConfig(
    targets=[
        OptimizationTarget.ANSWERING_SYSTEM_PROMPT,
        OptimizationTarget.MCP_TOOL_DESCRIPTIONS,
    ],
    # Seed system prompt
    seed_answering_prompt="""You are a helpful assistant with access to tools.
Use the available tools when needed to answer questions accurately.""",
    # Seed tool descriptions
    seed_mcp_tool_descriptions=default_tool_descriptions,
    # Scoring
    template_weight=0.7,
    rubric_weight=0.3,
    # GEPA parameters
    reflection_model="anthropic/claude-haiku-4-5",
    max_metric_calls=150,
)

print("Multi-target optimization config:")
print(f"  Targets: {[t.value for t in multi_target_config.targets]}")

### Getting the Seed Candidate

The `get_seed_candidate()` method builds the initial candidate dict. Each tool description is stored under a key formed by adding the `mcp_tool_` prefix to the tool name:

In [None]:
# Get seed candidate for GEPA
seed_candidate = multi_target_config.get_seed_candidate()

print("Seed candidate dict:")
for key, value in seed_candidate.items():
    preview = value[:60] + "..." if len(value) > 60 else value
    print(f"  {key}: {preview}")

**Key Convention**: Tool descriptions use the `mcp_tool_` prefix:
- `mcp_tool_calculator` → description for the `calculator` tool
- `mcp_tool_web_search` → description for the `web_search` tool
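
The convention amounts to a reversible key mapping. The sketch below shows the round trip with two hypothetical helpers, `to_candidate_keys` and `from_candidate_keys`. They illustrate the naming scheme only and are not the functions karenina uses internally.

```python
from typing import Dict

MCP_TOOL_PREFIX = "mcp_tool_"


def to_candidate_keys(descriptions: Dict[str, str]) -> Dict[str, str]:
    """Prefix each tool name so it can sit next to other candidate components."""
    return {f"{MCP_TOOL_PREFIX}{name}": text for name, text in descriptions.items()}


def from_candidate_keys(candidate: Dict[str, str]) -> Dict[str, str]:
    """Keep only the prefixed entries and strip the prefix to recover tool names."""
    return {
        key[len(MCP_TOOL_PREFIX):]: text
        for key, text in candidate.items()
        if key.startswith(MCP_TOOL_PREFIX)
    }


seeds = {
    "calculator": "A calculator for mathematical operations.",
    "web_search": "Search the web for information.",
}

candidate = to_candidate_keys(seeds)
candidate["answering_system_prompt"] = "You are a helpful assistant."

recovered = from_candidate_keys(candidate)
print(sorted(candidate))
print(recovered == seeds)
```

Non-tool components such as `answering_system_prompt` carry no prefix, so they are ignored when tool descriptions are recovered.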

---

## Verification Config with MCP Tools

To use MCP tools in verification, configure `mcp_urls_dict` on your model config:

In [None]:
# Example verification config with MCP tools
mcp_verification_config = VerificationConfig(
    answering_models=[
        ModelConfig(
            id="claude-with-tools",
            model_provider="anthropic",
            model_name="claude-haiku-4-5",
            temperature=0.0,
            interface="langchain",
            system_prompt="You are a helpful assistant with access to tools.",
            # MCP server configuration
            mcp_urls_dict={
                "calculator": "http://localhost:8001/mcp",
                "web_search": "http://localhost:8002/mcp",
                "code_interpreter": "http://localhost:8003/mcp",
            },
        )
    ],
    parsing_models=[
        ModelConfig(
            id="parser",
            model_provider="anthropic",
            model_name="claude-haiku-4-5",
            temperature=0.0,
            interface="langchain",
        )
    ],
    evaluation_mode="template_only",
    replicate_count=1,
)

print("MCP verification config:")
print(f"  MCP tools: {list(mcp_verification_config.answering_models[0].mcp_urls_dict.keys())}")

---

## How Tool Description Injection Works

When the `KareninaAdapter` evaluates a candidate with MCP tool descriptions:

1. **Extract tool descriptions** from candidate (keys starting with `mcp_tool_`)
2. **Remove prefix** to get tool names (`mcp_tool_calculator` → `calculator`)
3. **Inject as overrides** on model config via `mcp_tool_description_overrides`
4. **MCP client uses overrides** when presenting tools to the model
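
Steps 3 and 4 happen inside the library. As a rough mental model, assuming tool definitions are plain dicts with `name` and `description` fields, applying overrides could look like the sketch below. `apply_description_overrides` is a hypothetical helper, not karenina's implementation.

```python
from typing import Dict, List


def apply_description_overrides(
    tools: List[dict],
    overrides: Dict[str, str],
) -> List[dict]:
    """Return copies of the tool definitions with overridden descriptions.

    Tools without an override keep the description the server reported.
    """
    return [
        {**tool, "description": overrides.get(tool["name"], tool["description"])}
        for tool in tools
    ]


# Tool definitions as a server might report them (illustrative values)
server_tools = [
    {"name": "calculator", "description": "A calculator for mathematical operations."},
    {"name": "file_reader", "description": "Read contents of a file."},
]
overrides = {"calculator": "Perform precise mathematical calculations."}

presented = apply_description_overrides(server_tools, overrides)
for tool in presented:
    print(f"{tool['name']}: {tool['description']}")
```

The originals are left untouched, so each candidate can be evaluated against the same server-reported baseline.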

In [None]:
# Simulating what the adapter does internally
def extract_tool_overrides(candidate: dict) -> dict:
    """Extract MCP tool description overrides from a candidate."""
    tool_overrides = {}
    for key, value in candidate.items():
        if key.startswith("mcp_tool_"):
            tool_name = key[len("mcp_tool_"):]  # Strip the "mcp_tool_" prefix
            tool_overrides[tool_name] = value
    return tool_overrides


# Example candidate from GEPA
example_candidate = {
    "answering_system_prompt": "You are a helpful assistant.",
    "mcp_tool_calculator": "Perform precise mathematical calculations.",
    "mcp_tool_web_search": "Search the internet for current information.",
}

tool_overrides = extract_tool_overrides(example_candidate)

print("Extracted tool overrides:")
for tool, desc in tool_overrides.items():
    print(f"  {tool}: {desc}")

---

## KareninaOutput with MCP Tools

After optimization, `KareninaOutput` contains the optimized tool descriptions:

In [None]:
# Simulated optimization output
optimization_output = KareninaOutput(
    # Optimized system prompt
    answering_system_prompt="""You are a helpful assistant with access to specialized tools.
Always consider using tools when they can provide more accurate answers.
For calculations, use the calculator. For current events, use web search.""",
    # Optimized tool descriptions
    mcp_tool_descriptions=optimized_tool_descriptions,
    # Scores
    train_score=0.85,
    val_score=0.80,
    test_score=0.78,
    baseline_score=0.60,
    improvement=0.33,
    # Optimization metadata
    total_generations=12,
    total_metric_calls=100,
    best_generation=10,
)

print("Optimization output:")
print(f"  Val score: {optimization_output.val_score:.2%}")
print(f"  Improvement: {optimization_output.improvement:.2%}")
print(f"\nOptimized tools: {list(optimization_output.mcp_tool_descriptions.keys())}")

In [None]:
# Get all optimized prompts (including tools)
all_prompts = optimization_output.get_optimized_prompts()

print("All optimized prompts:")
for key in all_prompts:
    print(f"  - {key}")

---

## Exporting Optimized Tool Descriptions

Export optimized tool descriptions for use in production.

In [None]:
# Get optimized prompts dict
optimized_prompts = optimization_output.get_optimized_prompts()

print("Optimized prompts for export:")
for key, value in optimized_prompts.items():
    preview = value[:50] + "..." if len(value) > 50 else value
    print(f"  {key}: {preview}")

In [None]:
# Export as verification preset
preset_path = export_to_preset(
    optimized_prompts=optimized_prompts,
    base_config=mcp_verification_config,
    output_path=OUTPUT_DIR / "mcp_optimized_preset.json",
    targets=[
        OptimizationTarget.ANSWERING_SYSTEM_PROMPT,
        OptimizationTarget.MCP_TOOL_DESCRIPTIONS,
    ],
)

print(f"Exported preset: {preset_path}")

In [None]:
# View the exported preset structure
import json

with open(preset_path) as f:
    preset = json.load(f)

print("Preset structure:")
print(f"  Keys: {list(preset.keys())}")

# Check GEPA metadata
gepa_meta = preset.get("_gepa_optimization", {})
print("\nGEPA metadata:")
print(f"  Targets: {gepa_meta.get('targets')}")
print(f"  Optimized components: {gepa_meta.get('optimized_components')}")

In [None]:
# Export as lightweight JSON
prompts_path = export_prompts_json(
    optimized_prompts=optimized_prompts,
    metadata={
        "benchmark": "Tool-Enhanced QA",
        "val_score": optimization_output.val_score,
        "improvement": optimization_output.improvement,
        "tools_optimized": list(optimization_output.mcp_tool_descriptions.keys()),
    },
    output_path=OUTPUT_DIR / "mcp_optimized_prompts.json",
)

print(f"Exported prompts: {prompts_path}")

---

## Using Optimized Tool Descriptions

Load and apply optimized tool descriptions in a new verification config:

In [None]:
from karenina.integrations.gepa import load_prompts_json

# Load saved prompts
loaded_prompts, loaded_meta = load_prompts_json(prompts_path)

print(f"Loaded prompts from optimization with {loaded_meta['improvement']:.2%} improvement")
print(f"Tools optimized: {loaded_meta['tools_optimized']}")

In [None]:
# Extract tool descriptions from loaded prompts
tool_overrides = {}
system_prompt = None

for key, value in loaded_prompts.items():
    if key.startswith("mcp_tool_"):
        tool_name = key[len("mcp_tool_"):]  # Strip the "mcp_tool_" prefix
        tool_overrides[tool_name] = value
    elif key == "answering_system_prompt":
        system_prompt = value

print(f"System prompt loaded: {system_prompt is not None}")
print(f"Tool overrides: {list(tool_overrides.keys())}")

In [None]:
# Sketch of a verification config that applies the optimized descriptions.
# The snippet is printed rather than executed: in practice, set
# mcp_tool_description_overrides on the model config, and the MCP client
# uses these overrides when presenting tools to the model.

print("""
# Production usage:

optimized_config = VerificationConfig(
    answering_models=[
        ModelConfig(
            id="claude-with-tools",
            model_provider="anthropic",
            model_name="claude-haiku-4-5",
            temperature=0.0,
            interface="langchain",
            
            # Optimized system prompt
            system_prompt=loaded_prompts["answering_system_prompt"],
            
            # MCP server URLs
            mcp_urls_dict={
                "calculator": "http://localhost:8001/mcp",
                "web_search": "http://localhost:8002/mcp",
            },
            
            # Optimized tool descriptions
            mcp_tool_description_overrides=tool_overrides,
        )
    ],
    ...
)
""")

---

## Best Practices for MCP Tool Optimization

### 1. Start with Good Seed Descriptions

In [None]:
# Good: Specific, actionable descriptions
good_seeds = {
    # Implicit string concatenation keeps source indentation out of the prompt text
    "calculator": (
        "Perform mathematical calculations. Supports arithmetic (+, -, *, /), "
        "exponents (^), square root, and trigonometric functions (sin, cos, tan)."
    ),
    "web_search": (
        "Search the internet for factual information. Use when the question "
        "asks about current events, statistics, or facts that need verification."
    ),
}

# Bad: Vague, unhelpful descriptions
bad_seeds = {
    "calculator": "Do math.",
    "web_search": "Search stuff.",
}

print("Good seeds provide context and use cases.")
print("Bad seeds don't give the model enough information.")

### 2. Optimize Related Components Together

In [None]:
# Good: Optimize system prompt AND tool descriptions together
# This allows GEPA to find synergies between the two

combined_config = OptimizationConfig(
    targets=[
        OptimizationTarget.ANSWERING_SYSTEM_PROMPT,
        OptimizationTarget.MCP_TOOL_DESCRIPTIONS,
    ],
    seed_answering_prompt="You are a helpful assistant with tools.",
    seed_mcp_tool_descriptions=good_seeds,
)

print("Combined optimization allows system prompt and tool descriptions to work together.")

### 3. Include Tool Usage Examples in Descriptions

In [None]:
# Enhanced descriptions with examples
example_rich_descriptions = {
    "calculator": """Perform precise mathematical calculations.

Use for:
- Arithmetic: 2 + 3, 100 * 0.15
- Algebra: solve(x^2 - 4 = 0)
- Trigonometry: sin(45), cos(pi/4)

Returns the exact numerical result.""",
    "code_interpreter": """Execute Python code to solve problems.

Use for:
- Complex algorithms
- Data processing
- Multi-step calculations

Example: To find primes up to N, write Python code that implements a sieve.""",
}

print("Example-rich descriptions help models understand when to use each tool.")

---

## Cleanup

In [None]:
# List output files
print("Output files:")
for f in OUTPUT_DIR.iterdir():
    print(f"  {f.name}: {f.stat().st_size} bytes")

In [None]:
# Clean up
import shutil

shutil.rmtree(OUTPUT_DIR, ignore_errors=True)
print(f"Cleaned up: {OUTPUT_DIR}")

---

## Summary

### MCP Tool Optimization API

| Component | Description |
|-----------|-------------|
| `OptimizationTarget.MCP_TOOL_DESCRIPTIONS` | Target for tool description optimization |
| `seed_mcp_tool_descriptions` | Dict of initial tool descriptions |
| `mcp_tool_<name>` prefix | Convention for tool descriptions in candidates |
| `mcp_tool_description_overrides` | ModelConfig field for applying optimized descriptions |
| `sync_fetch_tool_descriptions()` | Fetch descriptions directly from MCP servers |
| `auto_fetch_tool_descriptions` | KareninaAdapter param to auto-fetch at init (default: True) |
| `adapter.seed_tool_descriptions` | Property to access fetched/provided seed descriptions |

### Key Points

1. **Tool descriptions guide model tool selection** - optimize them for better accuracy
2. **Auto-fetch from MCP servers** - no need to manually specify seed descriptions
3. **Combine with system prompt optimization** - both affect tool usage behavior
4. **Use the `mcp_tool_` prefix convention** - adapter handles extraction automatically
5. **Export preserves tool descriptions** - preset includes overrides for all tools

### When to Use MCP Tool Optimization

- Benchmarks where models use external tools (calculators, search, code execution)
- Tasks where tool selection accuracy impacts performance
- Multi-tool scenarios where models need to choose the right tool

## Related Notebooks

- [02_configuration.ipynb](02_configuration.ipynb) - OptimizationConfig basics
- [05_karenina_adapter.ipynb](05_karenina_adapter.ipynb) - How the adapter injects candidates
- [08_export_and_reuse.ipynb](08_export_and_reuse.ipynb) - Exporting optimized prompts