# Getting Started with the Tool Search Tool in Amazon Bedrock

This notebook introduces the **Tool Search Tool (TST)**, an advanced tool-use capability that lets Claude work with hundreds or even thousands of tools without loading all of their definitions into the context window upfront.

## The Problem

When building AI systems with many tools, you face a trade-off:
- **Load all tools upfront**: Consumes significant context window space (potentially 40-80K tokens)
- **Limit your tools**: Restricts the capabilities of your AI system

## The Solution: Tool Search Tool

Instead of loading every tool definition into context upfront, you mark tools with `defer_loading: true`. Claude then discovers and loads only the tools it needs through the tool search mechanism, which improves both context efficiency and tool selection accuracy.

### Key Benefits

- **Scale to massive tool libraries**: Build AI systems with 200+ tools without filling the context window with their definitions
- **Maintain tool selection accuracy**: Only the 3-5 most relevant tools are shown to Claude at a time, which helps prevent schema confusion
- **Context window optimization**: Save 40-80K tokens that would otherwise be spent on tool definitions
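
To get a feel for how much context a tool library can consume, the sketch below estimates the size of a tool list from its serialized JSON. The `estimate_tool_tokens` helper, the generated sample tools, and the four-characters-per-token ratio are all assumptions made for illustration; real counts depend on the model's tokenizer and on how detailed your schemas are.

```python
import json

# Rough rule of thumb (an assumption, not the model's real tokenizer)
CHARS_PER_TOKEN = 4


def estimate_tool_tokens(tools):
    """Return a rough token estimate for a list of tool definitions."""
    serialized = json.dumps(tools, separators=(",", ":"))
    return len(serialized) // CHARS_PER_TOKEN


# Hypothetical library: 200 near-identical tool definitions
sample_tools = [
    {
        "name": f"tool_{i}",
        "description": "Example tool used only to illustrate definition size",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
    for i in range(200)
]

estimate = estimate_tool_tokens(sample_tools)
print(f"Estimated tokens for {len(sample_tools)} tools: {estimate}")
```

Running the same helper on your own tool list gives a quick, approximate sense of whether deferring definitions is worth it.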

### Important Notes

> **Beta Header**: This feature is gated behind the beta header `tool-search-tool-2025-10-19`

> **API Availability**: At launch, this feature is only available via the **Invoke APIs** (as demonstrated in this notebook). Support for the Converse APIs is planned as a follow-up.

By the end of this notebook, you'll understand how to:
1. Configure tools with deferred loading
2. Use the Tool Search Tool to discover relevant tools
3. Handle the response structure including `server_tool_use` and `tool_reference` blocks
4. Avoid common errors when using this feature

## How Tool Search Tool Works

The Tool Search Tool introduces a new pattern for tool discovery:

### Traditional Approach
```
Request: [All tool definitions loaded] + User message
    ↓
Claude selects and uses appropriate tool
```

### With Tool Search Tool
```
Request: [Tool Search Tool] + [Deferred tool definitions] + User message
    ↓
Claude invokes Tool Search Tool with a query
    ↓
Server returns matching tool_reference(s)
    ↓
Claude uses the discovered tool
```

### Key Concepts

1. **`defer_loading: true`**: Mark tools that should not be loaded into context upfront
2. **`tool_search_tool_regex`**: A server-side tool that searches through deferred tools
3. **`server_tool_use`**: A content block type indicating Claude is using a server-side tool
4. **`tool_reference`**: The result type that references a discovered tool by name

The server handles the tool search automatically - Claude queries it, receives matching tool references, and then can use those tools normally.
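
If you already have a list of regular tool definitions, the concepts above suggest a simple conversion: prepend the search tool and mark everything else as deferred. The following is a minimal sketch of that idea; `with_tool_search` and `existing_tools` are hypothetical names introduced here, not part of any SDK.

```python
import copy

# The server-side search tool, left non-deferred
SEARCH_TOOL = {"type": "tool_search_tool_regex", "name": "tool_search_tool_regex"}


def with_tool_search(tool_definitions):
    """Return a new tools list: the search tool followed by deferred copies."""
    deferred = []
    for tool in tool_definitions:
        tool_copy = copy.deepcopy(tool)  # leave the caller's definitions untouched
        tool_copy["defer_loading"] = True
        deferred.append(tool_copy)
    return [dict(SEARCH_TOOL)] + deferred


# Hypothetical existing tool list
existing_tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]

tools_with_search = with_tool_search(existing_tools)
print([t["name"] for t in tools_with_search])
```

Copying the definitions keeps the original list usable for requests that do not use tool search.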

## Setup

Let's install the required dependencies and configure our Bedrock client.

In [None]:
!pip install boto3 botocore -qU --disable-pip-version-check

In [None]:
import boto3
import json
from botocore.config import Config

# Configure AWS region
REGION = 'us-west-2'  # Change to your preferred region

# Initialize Bedrock runtime client
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name=REGION,
    config=Config(read_timeout=300)
)

# Model configuration
MODEL_ID = 'global.anthropic.claude-opus-4-5-20251101-v1:0'

# Beta header for Tool Search Tool
TOOL_SEARCH_BETA_HEADER = 'tool-search-tool-2025-10-19'

print(f"Region: {REGION}")
print(f"Model ID: {MODEL_ID}")
print(f"Beta Header: {TOOL_SEARCH_BETA_HEADER}")

## Utility Functions

Let's create helper functions to invoke Claude with the Tool Search Tool and display the responses clearly.

In [None]:
def invoke_with_tool_search(messages, tools, max_tokens=4096):
    """
    Invoke Claude with Tool Search Tool enabled.
    
    Args:
        messages: List of message objects
        tools: List of tool definitions (including tool_search_tool_regex and deferred tools)
        max_tokens: Maximum tokens for response
        
    Returns:
        dict: The API response
    """
    request_body = {
        "anthropic_version": "bedrock-2023-05-31",
        "anthropic_beta": [TOOL_SEARCH_BETA_HEADER],
        "max_tokens": max_tokens,
        "tools": tools,
        "messages": messages
    }
    
    response = bedrock_runtime.invoke_model(
        modelId=MODEL_ID,
        body=json.dumps(request_body)
    )
    
    return json.loads(response['body'].read())


def display_response(response):
    """
    Display the response content blocks in a readable format.
    """
    print("=" * 60)
    print("RESPONSE")
    print("=" * 60)
    print(f"Stop Reason: {response.get('stop_reason', 'N/A')}")
    print(f"Usage: {response.get('usage', {})}")
    print("-" * 60)
    
    for i, block in enumerate(response.get('content', [])):
        block_type = block.get('type')
        print(f"\n[Block {i + 1}] Type: {block_type}")
        
        if block_type == 'text':
            print(f"  Text: {block.get('text')}")
            
        elif block_type == 'server_tool_use':
            print(f"  ID: {block.get('id')}")
            print(f"  Name: {block.get('name')}")
            print(f"  Input: {json.dumps(block.get('input', {}), indent=4)}")
            
        elif block_type == 'tool_search_tool_result':
            print(f"  Content: {json.dumps(block, indent=4)}")
            
        elif block_type == 'tool_result':
            print(f"  Tool Use ID: {block.get('tool_use_id')}")
            print(f"  Content: {json.dumps(block.get('content', []), indent=4)}")
            
        elif block_type == 'tool_use':
            print(f"  ID: {block.get('id')}")
            print(f"  Name: {block.get('name')}")
            print(f"  Input: {json.dumps(block.get('input', {}), indent=4)}")
            
        else:
            print(f"  Content: {json.dumps(block, indent=4)}")
    
    print("\n" + "=" * 60)

## Basic Example: Weather Tool

Let's start with a simple example that demonstrates the core functionality. We'll define:
1. The `tool_search_tool_regex` - the server-side search tool
2. A `get_weather` tool with `defer_loading: true`

When we ask about the weather, Claude will:
1. Invoke the Tool Search Tool to find relevant tools
2. Receive a `tool_reference` pointing to `get_weather`
3. Use the `get_weather` tool with the appropriate parameters

In [None]:
# Define our tools
tools = [
    # The Tool Search Tool - this is a server-side tool
    {
        "type": "tool_search_tool_regex",
        "name": "tool_search_tool_regex"
    },
    # A deferred tool - won't be loaded into context until discovered
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        },
        "defer_loading": True
    }
]

# Create our message
messages = [
    {
        "role": "user",
        "content": "What's the weather in Seattle?"
    }
]

print("Sending request with Tool Search Tool...")
print(f"User message: {messages[0]['content']}")
print()

# Invoke the model
response = invoke_with_tool_search(messages, tools)

# Display the response
display_response(response)

### Understanding the Response

Let's break down what happened in the response:

1. **`text` block**: Claude explains what it's about to do.

2. **`server_tool_use` block**: Claude invoked the `tool_search_tool_regex` with a search pattern (e.g., `"pattern": "weather"`). This is handled server-side.

3. **`tool_search_tool_result` block**: The server returned a `tool_reference` pointing to `get_weather`. This tells Claude that this tool is now available for use. Note the nested structure with a `tool_references` array.

4. **`text` block**: Claude explains it found the tool and will use it.

5. **`tool_use` block**: Claude then called `get_weather` with the appropriate parameters (location: "Seattle").

6. **`stop_reason: tool_use`**: The response stopped because Claude wants to use a tool. In a real application, you would execute the tool and return the result.

Notice that:
- The `server_tool_use` has an ID starting with `srvtoolu_bdrk_` (server tool use)
- The regular `tool_use` has an ID starting with `toolu_bdrk_`
- The tool search and result happen automatically within a single API call
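
Because only the regular `tool_use` blocks need to be executed by your application, it can help to separate them from the server-side calls. The sketch below does this by block type; `split_tool_calls` is a hypothetical helper, and the sample content is hand-written with placeholder IDs to mirror the block shapes described above.

```python
def split_tool_calls(content_blocks):
    """Separate server-side tool calls from calls the application must execute."""
    server_calls = [b for b in content_blocks if b.get("type") == "server_tool_use"]
    client_calls = [b for b in content_blocks if b.get("type") == "tool_use"]
    return server_calls, client_calls


# Hand-written sample shaped like the response blocks (IDs are placeholders)
sample_content = [
    {"type": "text", "text": "Let me look for a weather tool."},
    {
        "type": "server_tool_use",
        "id": "srvtoolu_bdrk_xxx",
        "name": "tool_search_tool_regex",
        "input": {"pattern": "weather"},
    },
    {
        "type": "tool_search_tool_result",
        "tool_use_id": "srvtoolu_bdrk_xxx",
        "content": {
            "type": "tool_search_tool_search_result",
            "tool_references": [{"type": "tool_reference", "tool_name": "get_weather"}],
        },
    },
    {
        "type": "tool_use",
        "id": "toolu_bdrk_xxx",
        "name": "get_weather",
        "input": {"location": "Seattle"},
    },
]

server_calls, client_calls = split_tool_calls(sample_content)
print("Handled server-side:", [c["name"] for c in server_calls])
print("Needs execution:", [c["name"] for c in client_calls])
```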

## Multiple Deferred Tools

Now let's add more deferred tools to demonstrate that Claude selects only the relevant one based on the user's query.

In [None]:
# Define multiple deferred tools
tools_expanded = [
    # The Tool Search Tool
    {
        "type": "tool_search_tool_regex",
        "name": "tool_search_tool_regex"
    },
    # Weather tool
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        },
        "defer_loading": True
    },
    # File search tool
    {
        "name": "search_files",
        "description": "Search through files in the workspace",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "file_types": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["query"]
        },
        "defer_loading": True
    },
    # Calculator tool
    {
        "name": "calculator",
        "description": "Perform mathematical calculations",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "Mathematical expression to evaluate"}
            },
            "required": ["expression"]
        },
        "defer_loading": True
    },
    # Send email tool
    {
        "name": "send_email",
        "description": "Send an email to a recipient",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"}
            },
            "required": ["to", "subject", "body"]
        },
        "defer_loading": True
    }
]

print(f"Total tools defined: {len(tools_expanded)} (1 search tool + {len(tools_expanded) - 1} deferred tools)")

In [None]:
# Test with a file search query
messages_files = [
    {
        "role": "user",
        "content": "Find all Python files that contain the word 'database'"
    }
]

print("Testing file search query...")
print(f"User message: {messages_files[0]['content']}")
print()

response_files = invoke_with_tool_search(messages_files, tools_expanded)
display_response(response_files)

In [None]:
# Test with a calculation query
messages_calc = [
    {
        "role": "user",
        "content": "What is 15% of 2500?"
    }
]

print("Testing calculation query...")
print(f"User message: {messages_calc[0]['content']}")
print()

response_calc = invoke_with_tool_search(messages_calc, tools_expanded)
display_response(response_calc)

## Understanding the Response Structure

Let's examine the different content block types you'll encounter when using the Tool Search Tool.

### Content Block Types

| Type | Description | When it appears |
|------|-------------|----------------|
| `text` | Regular text output from Claude | Claude's explanations |
| `server_tool_use` | Claude invoking a server-side tool | When Claude searches for tools |
| `tool_search_tool_result` | Result from the Tool Search Tool | Contains nested `tool_references` array |
| `tool_use` | Claude invoking a user-defined tool | When Claude uses a discovered tool |

### The `tool_search_tool_result` Structure

When the Tool Search Tool finds matching tools, it returns a `tool_search_tool_result` block with a nested structure:

```json
{
    "type": "tool_search_tool_result",
    "tool_use_id": "srvtoolu_bdrk_xxx",
    "content": {
        "type": "tool_search_tool_search_result",
        "tool_references": [
            {
                "type": "tool_reference",
                "tool_name": "get_weather"
            }
        ]
    }
}
```

This tells Claude that the referenced tool(s) are now available for use.

In [None]:
def analyze_tool_search_flow(response):
    """
    Analyze and display the tool search flow from a response.
    """
    print("Tool Search Flow Analysis")
    print("=" * 40)
    
    searched_tools = []
    discovered_tools = []
    used_tools = []
    
    for block in response.get('content', []):
        block_type = block.get('type')
        
        if block_type == 'server_tool_use':
            pattern = block.get('input', {}).get('pattern', 'N/A')
            searched_tools.append({
                'tool': block.get('name'),
                'pattern': pattern
            })
            
        elif block_type == 'tool_search_tool_result':
            # Handle the nested structure
            content = block.get('content', {})
            tool_refs = content.get('tool_references', [])
            for ref in tool_refs:
                if ref.get('type') == 'tool_reference':
                    discovered_tools.append(ref.get('tool_name'))
                    
        elif block_type == 'tool_use':
            used_tools.append({
                'tool': block.get('name'),
                'input': block.get('input', {})
            })
    
    print("\n1. SEARCH PHASE")
    if searched_tools:
        for search in searched_tools:
            print(f"   Tool: {search['tool']}")
            print(f"   Pattern: {search['pattern']}")
    else:
        print("   No search performed")
    
    print("\n2. DISCOVERY PHASE")
    if discovered_tools:
        for tool in discovered_tools:
            print(f"   Found: {tool}")
    else:
        print("   No tools discovered")
    
    print("\n3. USAGE PHASE")
    if used_tools:
        for tool in used_tools:
            print(f"   Tool: {tool['tool']}")
            print(f"   Input: {json.dumps(tool['input'], indent=4)}")
    else:
        print("   No tools used in this response")
    
    print("\n" + "=" * 40)

# Analyze our previous response
print("Analyzing the weather query response:\n")
analyze_tool_search_flow(response)

## Error Handling

There are specific error conditions to be aware of when using the Tool Search Tool.

### Error 1: All Tools Have `defer_loading: true`

You must have at least one tool with `defer_loading: false` (or without the `defer_loading` property). The Tool Search Tool itself does not have `defer_loading` set, so it serves this purpose.

If all tools (including the search tool) have `defer_loading: true`, you'll receive a 400 error.

### Error 2: Invalid `tool_reference`

If a `tool_reference` is returned that doesn't match any tool definition, you'll receive a 400 error.
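
Since the first error can be detected before a request is sent, a small pre-flight check may save a round trip. The sketch below is one possible way to do it; `find_config_problems` is a hypothetical helper that only covers the rules described in this section, not the full server-side validation.

```python
def find_config_problems(tools):
    """Return a list of human-readable problems found in a tools list."""
    problems = []

    # At least one tool must be loaded upfront
    if tools and all(tool.get("defer_loading", False) for tool in tools):
        problems.append("all tools are deferred; at least one must be loaded upfront")

    # The built-in search tool itself should never be deferred
    for tool in tools:
        if tool.get("type") == "tool_search_tool_regex" and tool.get("defer_loading"):
            problems.append("tool_search_tool_regex should not set defer_loading: true")

    return problems


good_config = [
    {"type": "tool_search_tool_regex", "name": "tool_search_tool_regex"},
    {"name": "some_tool", "description": "A deferred tool",
     "input_schema": {"type": "object", "properties": {}}, "defer_loading": True},
]
bad_config = [
    {"type": "tool_search_tool_regex", "name": "tool_search_tool_regex",
     "defer_loading": True},
    {"name": "some_tool", "description": "A tool",
     "input_schema": {"type": "object", "properties": {}}, "defer_loading": True},
]

print("Good config problems:", find_config_problems(good_config))
print("Bad config problems:", find_config_problems(bad_config))
```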

In [None]:
# CORRECT: Tool Search Tool without defer_loading (implicitly false)
valid_tools = [
    {
        "type": "tool_search_tool_regex",
        "name": "tool_search_tool_regex"
        # No defer_loading - this is correct!
    },
    {
        "name": "some_tool",
        "description": "A deferred tool",
        "input_schema": {"type": "object", "properties": {}},
        "defer_loading": True  # This tool is deferred
    }
]

print("Valid configuration:")
print("- tool_search_tool_regex: defer_loading not set (defaults to false)")
print("- some_tool: defer_loading = true")
print("\nThis configuration will work correctly.")

In [None]:
# INCORRECT: This would cause a 400 error (DO NOT RUN)
# Shown for educational purposes only

invalid_tools_example = """
# This configuration would fail with a 400 error:
[
    {
        "type": "tool_search_tool_regex",
        "name": "tool_search_tool_regex",
        "defer_loading": true  # ERROR: Search tool cannot be deferred!
    },
    {
        "name": "some_tool",
        "description": "A tool",
        "input_schema": {...},
        "defer_loading": true
    }
]

Error: Setting defer_loading: true for all tools will throw a 400 error
"""

print("Invalid configuration example (DO NOT USE):")
print(invalid_tools_example)

## Custom Tool Search Tools

While `tool_search_tool_regex` provides built-in regex-based search, you can implement your own custom tool search mechanism (e.g., using embeddings for semantic search).

### Key Requirements for Custom Search Tools

1. **Your custom search tool must have `defer_loading: false`** (or omit the property)
2. **Other tools should have `defer_loading: true`**
3. **Your tool must return `tool_reference` blocks** to indicate which tools Claude can use

### Conceptual Example

```python
# Custom search tool definition
custom_search_tool = {
    "name": "semantic_tool_search",
    "description": "Search for tools using semantic similarity",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"}
        },
        "required": ["query"]
    }
    # defer_loading is implicitly false
}

# When Claude calls your custom search tool, return tool_reference blocks:
tool_result = {
    "type": "tool_result",
    "tool_use_id": "toolu_xxx",
    "content": [
        {"type": "tool_reference", "tool_name": "matched_tool_1"},
        {"type": "tool_reference", "tool_name": "matched_tool_2"}
    ]
}
```

This allows you to implement sophisticated tool discovery logic while maintaining the same interface.
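
To make the search side concrete, here is a minimal sketch of a custom search function that produces `tool_reference` blocks. It uses plain word overlap between the query and each tool's name and description as a stand-in for embedding-based similarity; `tokenize`, `search_tools`, `build_search_result`, and the catalog are all hypothetical names introduced for this example.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(query, tool_catalog, max_results=3):
    """Rank tools by word overlap with the query and return tool_reference blocks."""
    query_words = tokenize(query)
    scored = []
    for tool in tool_catalog:
        # Underscores are not matched by the word pattern, so get_weather -> get, weather
        tool_words = tokenize(tool["name"] + " " + tool["description"])
        score = len(query_words & tool_words)
        if score > 0:
            scored.append((score, tool["name"]))
    # Highest score first, ties broken alphabetically for stable output
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [{"type": "tool_reference", "tool_name": name} for _, name in scored[:max_results]]


def build_search_result(tool_use_id, query, tool_catalog):
    """Wrap the matches in a tool_result block for the given tool_use ID."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": search_tools(query, tool_catalog),
    }


catalog = [
    {"name": "get_weather", "description": "Get current weather for a location"},
    {"name": "send_email", "description": "Send an email to a recipient"},
    {"name": "calculator", "description": "Perform mathematical calculations"},
]

result = build_search_result("toolu_xxx", "send an email to the team", catalog)
print(result)
```

Swapping `search_tools` for an embedding lookup would leave `build_search_result` unchanged, since only the ranking step differs.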

## Best Practices

### When to Use Tool Search Tool

**Good use cases:**
- Large tool libraries (10+ tools)
- Plugin systems where tools are dynamically added
- MCP (Model Context Protocol) servers with many capabilities
- Applications where context window optimization is important

**When it may not be needed:**
- Small number of tools (< 5) that easily fit in context
- All tools are always relevant to every query
- Simple, single-purpose applications

### Configuration Tips

1. **Always include a search tool** (the built-in Tool Search Tool or your own custom one) as a non-deferred tool
2. **Write clear tool descriptions** - the search relies on matching tool names and descriptions
3. **Test with representative queries** - ensure the right tools are discovered for your use cases
4. **Handle the tool_use response** - remember that `stop_reason: tool_use` means you need to execute the tool and continue the conversation
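
One inexpensive way to test representative queries is to preview locally which deferred tools a given regex pattern would match. The sketch below is only an approximation: it assumes case-insensitive matching against each tool's name and description, which may differ from what the server actually does, and `preview_regex_matches` is a hypothetical helper.

```python
import re


def preview_regex_matches(pattern, tools):
    """Return names of deferred tools whose name or description matches the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is an assumption
    matches = []
    for tool in tools:
        if not tool.get("defer_loading"):
            continue  # only deferred tools are searched
        haystack = f"{tool['name']} {tool.get('description', '')}"
        if regex.search(haystack):
            matches.append(tool["name"])
    return matches


preview_tools = [
    {"type": "tool_search_tool_regex", "name": "tool_search_tool_regex"},
    {"name": "get_weather", "description": "Get current weather for a location",
     "defer_loading": True},
    {"name": "search_files", "description": "Search through files in the workspace",
     "defer_loading": True},
    {"name": "send_email", "description": "Send an email to a recipient",
     "defer_loading": True},
]

for pattern in ["weather", "file|search", "calendar"]:
    print(f"{pattern!r} -> {preview_regex_matches(pattern, preview_tools)}")
```

Patterns that match nothing locally are a hint that a tool's name or description may need clearer keywords.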

### Response Handling Pattern

```python
# Typical flow for handling Tool Search Tool responses.
# execute_tool is a placeholder for your own tool dispatcher.
response = invoke_with_tool_search(messages, tools)

while response.get('stop_reason') == 'tool_use':
    # Record the assistant turn once, including any search blocks
    messages.append({"role": "assistant", "content": response['content']})

    # Execute every client-side tool call and collect the results
    tool_results = []
    for block in response['content']:
        if block['type'] == 'tool_use':
            result = execute_tool(block['name'], block['input'])
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block['id'],
                "content": result
            })

    # Return all results in a single user message
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation
    response = invoke_with_tool_search(messages, tools)
```
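
The loop above calls `execute_tool`, which is left undefined. One possible shape for it is a small dispatcher that maps tool names to Python functions. Everything in this sketch is hypothetical: the `TOOL_HANDLERS` registry, the stub `get_weather` handler with its canned data, and the choice to return results as JSON strings.

```python
import json


def get_weather(location, unit="celsius"):
    """Stub handler returning canned data; a real one would call a weather service."""
    return {"location": location, "temperature": 18, "unit": unit}


# Registry mapping tool names to the functions that implement them
TOOL_HANDLERS = {"get_weather": get_weather}


def execute_tool(name, tool_input):
    """Run the handler registered for `name` and return its result as a JSON string."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        return json.dumps(handler(**tool_input))
    except TypeError as exc:
        # Typically raised when the input does not match the handler's parameters
        return json.dumps({"error": str(exc)})


print(execute_tool("get_weather", {"location": "Seattle"}))
print(execute_tool("unknown_tool", {}))
```

Returning errors as content rather than raising keeps the conversation loop running, so Claude can see the failure and react to it.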

## Summary

In this notebook, we covered:

1. **What the Tool Search Tool does**: Enables Claude to work with many tools without loading all definitions upfront

2. **How it works**: Tools marked with `defer_loading: true` are discovered on-demand through the `tool_search_tool_regex`

3. **Key response types**:
   - `server_tool_use`: Claude invoking the search tool
   - `tool_search_tool_result` with nested `tool_references`: The discovered tools
   - `tool_use`: Claude using the discovered tool

4. **Error conditions**: Ensure at least one tool has `defer_loading: false`

5. **Custom search tools**: You can implement your own search mechanism that returns `tool_reference` blocks

### Key Observations from the Examples

- Claude may choose **not** to use tools if it can answer directly (as seen in the calculator example)
- Claude can perform **multiple searches** in a single response to find the right tool (as seen in the file search example)
- The search pattern is typically derived from keywords in the user's query
- Empty `tool_references` arrays indicate no matching tools were found for that search pattern
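
When a response contains several searches, it can be useful to pair each search pattern with the tools it returned and to spot the searches that came back empty. The sketch below matches `server_tool_use` blocks to their results by ID; `summarize_searches` is a hypothetical helper and the sample blocks are hand-written with placeholder IDs and patterns.

```python
def summarize_searches(content_blocks):
    """Pair each search pattern with the tool names its result block returned."""
    # First pass: remember the pattern used by each server-side search call
    patterns = {}
    for block in content_blocks:
        if block.get("type") == "server_tool_use":
            patterns[block.get("id")] = block.get("input", {}).get("pattern")

    # Second pass: attach discovered tool names to the matching pattern
    summary = []
    for block in content_blocks:
        if block.get("type") == "tool_search_tool_result":
            refs = block.get("content", {}).get("tool_references", [])
            summary.append({
                "pattern": patterns.get(block.get("tool_use_id")),
                "tools": [ref.get("tool_name") for ref in refs],
            })
    return summary


# Hand-written sample: the first search finds nothing, the second succeeds
sample_blocks = [
    {"type": "server_tool_use", "id": "srvtoolu_bdrk_001",
     "name": "tool_search_tool_regex", "input": {"pattern": "grep"}},
    {"type": "tool_search_tool_result", "tool_use_id": "srvtoolu_bdrk_001",
     "content": {"type": "tool_search_tool_search_result", "tool_references": []}},
    {"type": "server_tool_use", "id": "srvtoolu_bdrk_002",
     "name": "tool_search_tool_regex", "input": {"pattern": "file|search"}},
    {"type": "tool_search_tool_result", "tool_use_id": "srvtoolu_bdrk_002",
     "content": {"type": "tool_search_tool_search_result",
                 "tool_references": [{"type": "tool_reference", "tool_name": "search_files"}]}},
]

summary = summarize_searches(sample_blocks)
empty_patterns = [s["pattern"] for s in summary if not s["tools"]]
print("Searches:", summary)
print("Patterns with no matches:", empty_patterns)
```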

### Next Steps

- Experiment with larger tool libraries
- Implement a complete tool execution loop
- Consider building a custom semantic search tool for better matching
- Combine with other features like streaming for real-time responses