# Tool Optimization: Selective Tool Exposure

## Introduction

In this advanced notebook, you'll learn to optimize tool usage by exposing tools selectively based on context. When an agent has many tools, sending all of them to the LLM on every request wastes tokens and can confuse tool selection. The notebook covers the "tool shed" pattern and dynamic tool selection.

### What You'll Learn

- The tool shed pattern (selective tool exposure)
- Dynamic tool selection based on context
- Reducing tool confusion
- Measuring improvement in tool selection
- When to use tool optimization

### Prerequisites

- Completed Section 2 notebooks
- Completed `section-2-system-context/03_tool_selection_strategies.ipynb`
- Redis 8 running locally
- OpenAI API key set

## Concepts: The Tool Overload Problem

### The Problem with Many Tools

As your agent grows, you add more tools:

```python
tools = [
    search_courses,              # 1
    get_course_details,          # 2
    check_prerequisites,         # 3
    enroll_in_course,            # 4
    drop_course,                 # 5
    get_student_schedule,        # 6
    check_schedule_conflicts,    # 7
    get_course_reviews,          # 8
    submit_course_review,        # 9
    get_instructor_info,         # 10
    # ... 20 more tools
]
```

**Problems:**
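- ❌ **Token waste**: Every tool schema is sent with every request, whether or not the tool is relevant
- ❌ **Confusion**: The model has to choose among many options, most of them irrelevant to the query
- ❌ **Slower**: A longer prompt means more input for the model to process
- ❌ **Wrong selection**: Tools with similar names or descriptions are easy for the LLM to mix up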
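
To get a feel for the token cost listed above, the sketch below estimates the size of one tool schema. It is a rough illustration, not a tokenizer: the `estimate_schema_tokens` helper, the 4-characters-per-token heuristic, and the example schema are assumptions made here, and real counts depend on the model's tokenizer and on how the provider serializes tools.

```python
import json


def estimate_schema_tokens(schema: dict) -> int:
    """Rough estimate: ~4 characters per token (a heuristic, not a tokenizer)."""
    return len(json.dumps(schema)) // 4


# Hypothetical schema in the JSON shape commonly used for function calling.
search_courses_schema = {
    "name": "search_courses",
    "description": "Search for courses by topic or description.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query"}
        },
        "required": ["query"],
    },
}

per_tool = estimate_schema_tokens(search_courses_schema)
print(f"~{per_tool} tokens for one schema, ~{per_tool * 30} for 30 similar tools")
```

For exact numbers, count with the tokenizer that matches your model, or read the usage figures the API returns.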

### The Tool Shed Pattern

**Idea:** Don't show all tools at once. Show only relevant tools based on context.

```python
# Instead of showing all 30 tools...
all_tools = [tool1, tool2, ..., tool30]

# Show only relevant tools
if query_type == "search":
    relevant_tools = [search_courses, get_course_details]
elif query_type == "enrollment":
    relevant_tools = [enroll_in_course, drop_course, check_schedule_conflicts]
elif query_type == "reviews":
    relevant_tools = [get_course_reviews, submit_course_review]
```

**Benefits:**
- ✅ Fewer tokens
- ✅ Less confusion
- ✅ Faster processing
- ✅ Better tool selection
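
One way to package the pattern is a small registry that stores tools by group and hands out a subset on request. The sketch below is a minimal illustration: the `ToolShed` class and its methods are hypothetical names, and plain functions stand in for real tools.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ToolShed:
    """Hypothetical registry that stores tools by group and returns subsets."""

    def __init__(self, default_group: str) -> None:
        self._groups: Dict[str, List[Callable]] = defaultdict(list)
        self._default_group = default_group

    def register(self, group: str, tool: Callable) -> Callable:
        """Add a tool to a group and return it unchanged."""
        self._groups[group].append(tool)
        return tool

    def select(self, group: str) -> List[Callable]:
        """Return the tools for `group`, falling back to the default group."""
        if group in self._groups:
            return list(self._groups[group])
        return list(self._groups[self._default_group])

    def all_tools(self) -> List[Callable]:
        """Return every registered tool across all groups."""
        return [t for tools in self._groups.values() for t in tools]


# Plain functions standing in for real tools.
def search_courses(query: str) -> str:
    return f"Searching for: {query}"


def enroll_in_course(query: str) -> str:
    return f"Enrolling in: {query}"


shed = ToolShed(default_group="search")
shed.register("search", search_courses)
shed.register("enrollment", enroll_in_course)

print([t.__name__ for t in shed.select("enrollment")])
print([t.__name__ for t in shed.select("unknown")])  # falls back to search
```

Keeping the fallback inside the registry means callers never receive an empty tool list for an unrecognized group.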

### Dynamic Tool Selection Strategies

**1. Query-based filtering:**
```python
if "search" in query or "find" in query:
    tools = search_tools
elif "enroll" in query or "register" in query:
    tools = enrollment_tools
```

**2. Intent classification:**
```python
intent = classify_intent(query)  # "search", "enroll", "review"
tools = tool_groups[intent]
```

**3. Conversation state:**
```python
if conversation_state == "browsing":
    tools = [search, get_details]
elif conversation_state == "enrolling":
    tools = [enroll, check_conflicts]
```

**4. Hierarchical tools:**
```python
# First: Show high-level tools
tools = [search_courses, manage_enrollment, view_reviews]

# Then: Show specific tools based on choice
if user_chose == "manage_enrollment":
    tools = [enroll, drop, swap, check_conflicts]
```
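
The conversation-state strategy above is not implemented later in this notebook, so here is a minimal sketch of how it might look. The `ConversationState` enum, the `NEXT_STATE` transition table, and the rule that the last tool called decides the next state are all assumptions made for illustration; tools are represented by name only.

```python
from enum import Enum
from typing import Dict, List


class ConversationState(Enum):
    BROWSING = "browsing"
    ENROLLING = "enrolling"
    REVIEWING = "reviewing"


# Tool names exposed in each state (names follow the tools used in this notebook).
STATE_TOOLS: Dict[ConversationState, List[str]] = {
    ConversationState.BROWSING: ["search_courses", "get_course_details"],
    ConversationState.ENROLLING: ["enroll_in_course", "check_schedule_conflicts"],
    ConversationState.REVIEWING: ["get_course_reviews", "submit_course_review"],
}

# Hypothetical transition rule: the last tool called decides the next state.
NEXT_STATE: Dict[str, ConversationState] = {
    "search_courses": ConversationState.BROWSING,
    "get_course_details": ConversationState.ENROLLING,
    "enroll_in_course": ConversationState.ENROLLING,
    "get_course_reviews": ConversationState.REVIEWING,
}


def advance(state: ConversationState, last_tool: str) -> ConversationState:
    """Move to the next state, or stay put if the tool has no transition."""
    return NEXT_STATE.get(last_tool, state)


def tools_for_state(state: ConversationState) -> List[str]:
    """Return the tool names to expose in the given state."""
    return STATE_TOOLS[state]


state = ConversationState.BROWSING
state = advance(state, "get_course_details")
print(state.value, tools_for_state(state))
```

Unlike keyword filtering, this approach uses what has already happened in the conversation, so a short follow-up such as "yes, that one" still gets the right tools.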

## Setup

In [None]:
import os
import re
from typing import List, Dict, Any
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_core.tools import tool
from pydantic import BaseModel, Field
from redis_context_course import CourseManager

# Initialize
llm = ChatOpenAI(model="gpt-4o", temperature=0)
course_manager = CourseManager()

print("✅ Setup complete")

## Creating Tool Groups

Let's organize tools into logical groups.

In [None]:
# Define tools (simplified for demo)
class SearchInput(BaseModel):
    query: str = Field(description="Search query")

@tool(args_schema=SearchInput)
async def search_courses(query: str) -> str:
    """Search for courses by topic or description."""
    return f"Searching for: {query}"

@tool(args_schema=SearchInput)
async def get_course_details(query: str) -> str:
    """Get detailed information about a specific course."""
    return f"Details for: {query}"

@tool(args_schema=SearchInput)
async def check_prerequisites(query: str) -> str:
    """Check prerequisites for a course."""
    return f"Prerequisites for: {query}"

@tool(args_schema=SearchInput)
async def enroll_in_course(query: str) -> str:
    """Enroll student in a course."""
    return f"Enrolling in: {query}"

@tool(args_schema=SearchInput)
async def drop_course(query: str) -> str:
    """Drop a course from student's schedule."""
    return f"Dropping: {query}"

@tool(args_schema=SearchInput)
async def check_schedule_conflicts(query: str) -> str:
    """Check for schedule conflicts."""
    return f"Checking conflicts for: {query}"

@tool(args_schema=SearchInput)
async def get_course_reviews(query: str) -> str:
    """Get reviews for a course."""
    return f"Reviews for: {query}"

@tool(args_schema=SearchInput)
async def submit_course_review(query: str) -> str:
    """Submit a review for a course."""
    return f"Submitting review for: {query}"

# Organize into groups
TOOL_GROUPS = {
    "search": [
        search_courses,
        get_course_details,
        check_prerequisites
    ],
    "enrollment": [
        enroll_in_course,
        drop_course,
        check_schedule_conflicts
    ],
    "reviews": [
        get_course_reviews,
        submit_course_review
    ]
}

ALL_TOOLS = [
    search_courses,
    get_course_details,
    check_prerequisites,
    enroll_in_course,
    drop_course,
    check_schedule_conflicts,
    get_course_reviews,
    submit_course_review
]

print(f"✅ Created {len(ALL_TOOLS)} tools in {len(TOOL_GROUPS)} groups")

## Strategy 1: Query-Based Tool Filtering

Select tools based on keywords in the query.

In [None]:
def select_tools_by_keywords(query: str) -> List:
    """Select relevant tools based on query keywords."""
    query_lower = query.lower()
    
    # Check the specific groups first: generic words such as "what" or "show"
    # also appear in review and enrollment queries and would shadow them.
    
    # Review-related keywords
    if any(word in query_lower for word in ['review', 'rating', 'feedback', 'opinion']):
        return TOOL_GROUPS["reviews"]
    
    # Enrollment-related keywords
    elif any(word in query_lower for word in ['enroll', 'register', 'drop', 'add', 'remove', 'conflict']):
        return TOOL_GROUPS["enrollment"]
    
    # Search-related keywords
    elif any(word in query_lower for word in ['search', 'find', 'show', 'what', 'which', 'tell me about']):
        return TOOL_GROUPS["search"]
    
    # Default: return search tools
    else:
        return TOOL_GROUPS["search"]

# Test it
test_queries = [
    "I want to search for machine learning courses",
    "Can I enroll in CS401?",
    "What are the reviews for CS301?",
    "Tell me about database courses"
]

print("=" * 80)
print("QUERY-BASED TOOL FILTERING")
print("=" * 80)

for query in test_queries:
    selected_tools = select_tools_by_keywords(query)
    tool_names = [t.name for t in selected_tools]
    print(f"\nQuery: {query}")
    print(f"Selected tools: {', '.join(tool_names)}")
    print(f"Count: {len(selected_tools)} / {len(ALL_TOOLS)} tools")

print("\n" + "=" * 80)
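
Substring checks such as `word in query_lower` also match inside longer words, so "add" would match "address". The sketch below is one possible refinement using whole-word regular expressions. The `KEYWORDS` table and `match_group` helper are hypothetical names, and the function returns a group name rather than the tools themselves so it runs on its own.

```python
import re
from typing import Dict, List

# Specific groups come first so generic words ("what", "show") cannot shadow them.
KEYWORDS: Dict[str, List[str]] = {
    "reviews": ["review", "reviews", "rating", "ratings", "feedback", "opinion"],
    "enrollment": ["enroll", "register", "drop", "add", "remove", "conflict"],
    "search": ["search", "find", "show", "what", "which", "tell me about"],
}

# One compiled pattern per group; \b anchors keep "add" from matching "address".
PATTERNS = {
    group: re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    for group, words in KEYWORDS.items()
}


def match_group(query: str, default: str = "search") -> str:
    """Return the first group whose keywords appear as whole words in the query."""
    for group, pattern in PATTERNS.items():
        if pattern.search(query):
            return group
    return default


for q in ["What are the reviews for CS301?", "What is the campus address?"]:
    print(q, "->", match_group(q))
```

The trade-off is that whole-word matching no longer catches inflections such as "enrolling", so each form you care about has to be listed explicitly.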

## Strategy 2: Intent Classification

Use the LLM to classify intent, then select tools.

In [None]:
async def classify_intent(query: str) -> str:
    """Classify user intent using LLM."""
    prompt = f"""Classify the user's intent into one of these categories:
- search: Looking for courses or information
- enrollment: Enrolling, dropping, or managing courses
- reviews: Reading or writing course reviews

User query: "{query}"

Respond with only the category name (search, enrollment, or reviews).
"""
    
    messages = [
        SystemMessage(content="You are a helpful assistant that classifies user intents."),
        HumanMessage(content=prompt)
    ]
    
    # Use the async API so the call does not block the event loop
    response = await llm.ainvoke(messages)
    intent = response.content.strip().lower()
    
    # Validate intent
    if intent not in TOOL_GROUPS:
        intent = "search"  # Default
    
    return intent

async def select_tools_by_intent(query: str) -> tuple[List, str]:
    """Select tools based on classified intent. Returns (tools, intent)."""
    intent = await classify_intent(query)
    return TOOL_GROUPS[intent], intent

# Test it
print("\n" + "=" * 80)
print("INTENT-BASED TOOL FILTERING")
print("=" * 80)

for query in test_queries:
    selected_tools, intent = await select_tools_by_intent(query)
    tool_names = [t.name for t in selected_tools]
    print(f"\nQuery: {query}")
    print(f"Intent: {intent}")
    print(f"Selected tools: {', '.join(tool_names)}")
    print(f"Count: {len(selected_tools)} / {len(ALL_TOOLS)} tools")

print("\n" + "=" * 80)
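
Models do not always reply with the bare category name; a reply such as "Enrollment." or "Category: reviews" would fail an exact-match check and silently fall back to the default. The sketch below shows one way to normalize the reply before validating it. `normalize_intent` is a hypothetical helper, and it works on plain strings so it can be tested without an LLM call.

```python
import re
from typing import Sequence

VALID_INTENTS = ("search", "enrollment", "reviews")


def normalize_intent(raw: str, valid: Sequence[str], default: str = "search") -> str:
    """Map a free-form model reply onto exactly one valid label."""
    cleaned = raw.strip().lower()
    found = [
        label for label in valid
        if re.search(rf"\b{re.escape(label)}\b", cleaned)
    ]
    # Accept only an unambiguous match; otherwise use the default.
    return found[0] if len(found) == 1 else default


for reply in ["Enrollment.", "Category: reviews", "enrollment or reviews", ""]:
    print(repr(reply), "->", normalize_intent(reply, VALID_INTENTS))
```

Treating ambiguous replies as "no answer" is a deliberate choice here: exposing the default group is usually safer than guessing between two intents.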

## Comparing: All Tools vs. Filtered Tools

Let's compare tool selection with and without filtering.

In [None]:
print("\n" + "=" * 80)
print("COMPARISON: ALL TOOLS vs. FILTERED TOOLS")
print("=" * 80)

test_query = "I want to enroll in CS401"

# Approach 1: All tools
print(f"\nQuery: {test_query}")
print("\n--- APPROACH 1: Show all tools ---")
llm_all_tools = llm.bind_tools(ALL_TOOLS)
messages = [
    SystemMessage(content="You are a class scheduling agent."),
    HumanMessage(content=test_query)
]
response_all = llm_all_tools.invoke(messages)

if response_all.tool_calls:
    print(f"Selected tool: {response_all.tool_calls[0]['name']}")
print(f"Tools shown: {len(ALL_TOOLS)}")

# Approach 2: Filtered tools
print("\n--- APPROACH 2: Show filtered tools ---")
filtered_tools = select_tools_by_keywords(test_query)
llm_filtered_tools = llm.bind_tools(filtered_tools)
response_filtered = llm_filtered_tools.invoke(messages)

if response_filtered.tool_calls:
    print(f"Selected tool: {response_filtered.tool_calls[0]['name']}")
print(f"Tools shown: {len(filtered_tools)}")

print("\n✅ Benefits of filtering:")
print(f"   - Reduced tools: {len(ALL_TOOLS)} → {len(filtered_tools)}")
print(f"   - Estimated token savings: ~{(len(ALL_TOOLS) - len(filtered_tools)) * 100} tokens (assuming ~100 tokens per tool schema)")
print("   - Less confusion: Fewer irrelevant tools")

print("\n" + "=" * 80)
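
The savings printed above rest on an assumed 100 tokens per tool. If your provider reports token usage, you can compare measured input tokens instead. Recent LangChain versions attach a `usage_metadata` dictionary with an `input_tokens` key to the response message when the provider returns usage data; treat that as an assumption to verify against your installed version. The helper below is hypothetical and operates on plain dictionaries of that shape.

```python
from typing import Mapping, Optional


def input_token_savings(
    usage_all: Optional[Mapping[str, int]],
    usage_filtered: Optional[Mapping[str, int]],
) -> Optional[int]:
    """Return input tokens saved by filtering, or None if usage data is missing."""
    if not usage_all or not usage_filtered:
        return None
    return usage_all["input_tokens"] - usage_filtered["input_tokens"]


# Illustrative numbers only, not measured results.
example_all = {"input_tokens": 420, "output_tokens": 18, "total_tokens": 438}
example_filtered = {"input_tokens": 190, "output_tokens": 18, "total_tokens": 208}

print(input_token_savings(example_all, example_filtered))
print(input_token_savings(None, example_filtered))
```

In the cell above, you would pass something like `response_all.usage_metadata` and `response_filtered.usage_metadata`, assuming both are populated.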

## Strategy 3: Hierarchical Tools

Start with high-level tools, then drill down.

In [None]:
print("\n" + "=" * 80)
print("HIERARCHICAL TOOL APPROACH")
print("=" * 80)

# High-level tools
@tool
async def browse_courses(query: str) -> str:
    """Browse and search for courses. Use this for finding courses."""
    return "Browsing courses..."

@tool
async def manage_enrollment(query: str) -> str:
    """Manage course enrollment (enroll, drop, check conflicts). Use this for enrollment actions."""
    return "Managing enrollment..."

@tool
async def view_reviews(query: str) -> str:
    """View or submit course reviews. Use this for review-related queries."""
    return "Viewing reviews..."

high_level_tools = [browse_courses, manage_enrollment, view_reviews]

print("\nStep 1: Show high-level tools")
print(f"Tools: {[t.name for t in high_level_tools]}")
print(f"Count: {len(high_level_tools)} tools")

print("\nStep 2: User selects 'manage_enrollment'")
print("Now show specific enrollment tools:")
enrollment_tools = TOOL_GROUPS["enrollment"]
print(f"Tools: {[t.name for t in enrollment_tools]}")
print(f"Count: {len(enrollment_tools)} tools")

print("\n✅ Benefits:")
print("   - Start simple (3 tools)")
print("   - Drill down as needed")
print("   - User-guided filtering")

print("\n" + "=" * 80)
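
The two steps above can be captured in a small lookup that decides which tool names to expose on each turn. This is a sketch under assumptions: the `HIERARCHY` mapping and `tools_for_turn` helper are hypothetical, and the mapping from high-level tools to groups mirrors the groups defined earlier in this notebook.

```python
from typing import Dict, List, Optional

# High-level entry points mapped to the specific tools they unlock.
HIERARCHY: Dict[str, List[str]] = {
    "browse_courses": ["search_courses", "get_course_details", "check_prerequisites"],
    "manage_enrollment": ["enroll_in_course", "drop_course", "check_schedule_conflicts"],
    "view_reviews": ["get_course_reviews", "submit_course_review"],
}


def tools_for_turn(chosen: Optional[str]) -> List[str]:
    """Step 1 (nothing chosen yet): expose the high-level tools.
    Step 2: expose the specific tools behind the chosen high-level tool.
    An unrecognized choice restarts at step 1."""
    if chosen in HIERARCHY:
        return HIERARCHY[chosen]
    return list(HIERARCHY)


print(tools_for_turn(None))
print(tools_for_turn("manage_enrollment"))
```

Each turn exposes at most three tools in this example, at the cost of one extra round trip to pick the category.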

## Measuring Improvement

Let's measure the impact of tool filtering.

In [None]:
print("\n" + "=" * 80)
print("MEASURING IMPROVEMENT")
print("=" * 80)

# Test queries with expected tools
test_cases = [
    ("Find machine learning courses", "search_courses"),
    ("Enroll me in CS401", "enroll_in_course"),
    ("Show reviews for CS301", "get_course_reviews"),
    ("Drop CS201 from my schedule", "drop_course"),
    ("What are the prerequisites for CS401?", "check_prerequisites"),
]

print("\nTesting tool selection accuracy...\n")

correct_all = 0
correct_filtered = 0

for query, expected_tool in test_cases:
    # Test with all tools
    llm_all = llm.bind_tools(ALL_TOOLS)
    response_all = llm_all.invoke([
        SystemMessage(content="You are a class scheduling agent."),
        HumanMessage(content=query)
    ])
    selected_all = response_all.tool_calls[0]['name'] if response_all.tool_calls else None
    
    # Test with filtered tools
    filtered = select_tools_by_keywords(query)
    llm_filtered = llm.bind_tools(filtered)
    response_filtered = llm_filtered.invoke([
        SystemMessage(content="You are a class scheduling agent."),
        HumanMessage(content=query)
    ])
    selected_filtered = response_filtered.tool_calls[0]['name'] if response_filtered.tool_calls else None
    
    # Check correctness
    if selected_all == expected_tool:
        correct_all += 1
    if selected_filtered == expected_tool:
        correct_filtered += 1
    
    print(f"Query: {query}")
    print(f"  Expected: {expected_tool}")
    print(f"  All tools: {selected_all} {'✅' if selected_all == expected_tool else '❌'}")
    print(f"  Filtered: {selected_filtered} {'✅' if selected_filtered == expected_tool else '❌'}")
    print()

print("=" * 80)
print(f"\nAccuracy with all tools: {correct_all}/{len(test_cases)} ({correct_all/len(test_cases)*100:.0f}%)")
print(f"Accuracy with filtered tools: {correct_filtered}/{len(test_cases)} ({correct_filtered/len(test_cases)*100:.0f}%)")

print("\n✅ Tool filtering can improve (confirm with the numbers above and a larger test set):")
print("   - Selection accuracy")
print("   - Token efficiency")
print("   - Processing speed")

print("\n" + "=" * 80)
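
A filter can only help if it keeps the expected tool in the exposed set; if it drops that tool, the model cannot select it no matter how well it reasons. The sketch below measures this "filter recall" without any LLM calls. `filter_recall` and `toy_selector` are hypothetical, and the selector is deliberately naive to show how a generic keyword such as "show" can hide the right tool.

```python
from typing import Callable, List, Sequence, Tuple


def filter_recall(
    select_names: Callable[[str], Sequence[str]],
    cases: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of cases where the expected tool survives filtering."""
    if not cases:
        raise ValueError("cases must not be empty")
    hits = sum(1 for query, expected in cases if expected in select_names(query))
    return hits / len(cases)


def toy_selector(query: str) -> List[str]:
    # Deliberately naive: generic words like "show" route to search first.
    q = query.lower()
    if "show" in q or "find" in q:
        return ["search_courses", "get_course_details", "check_prerequisites"]
    if "review" in q:
        return ["get_course_reviews", "submit_course_review"]
    return ["enroll_in_course", "drop_course", "check_schedule_conflicts"]


cases = [
    ("Find machine learning courses", "search_courses"),
    ("Show reviews for CS301", "get_course_reviews"),
    ("Drop CS201 from my schedule", "drop_course"),
]

print(f"Filter recall: {filter_recall(toy_selector, cases):.2f}")
```

Filter recall is an upper bound on end-to-end accuracy with filtered tools, so it is a cheap check to run before spending tokens on LLM-based evaluation.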

## Key Takeaways

### When to Use Tool Filtering

**Use tool filtering when:**
- ✅ You have 10+ tools
- ✅ Tools have distinct use cases
- ✅ Token budget is tight
- ✅ Tool confusion is an issue

**Don't filter when:**
- ❌ You have < 5 tools
- ❌ All tools are frequently used
- ❌ Tools are highly related

### Filtering Strategies

**1. Keyword-based (Simple)**
- ✅ Fast, no LLM call
- ✅ Easy to implement
- ⚠️ Can be brittle

**2. Intent classification (Better)**
- ✅ More accurate
- ✅ Handles variations
- ⚠️ Requires LLM call

**3. Hierarchical (Best for many tools)**
- ✅ Scales well
- ✅ User-guided
- ⚠️ More complex

### Implementation Tips

1. **Group logically** - Organize tools by use case
2. **Start simple** - Use keyword filtering first
3. **Measure impact** - Track accuracy and token usage
4. **Iterate** - Refine based on real usage
5. **Have a fallback** - Default to search tools if unsure
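
Defaulting to search tools is one kind of fallback. A different option, sketched below, is to retry with the full tool set when the filtered set produces no tool call. `call_with_fallback` is a hypothetical helper, and `fake_invoke` is a stand-in for a real model call that returns the chosen tool name or `None`.

```python
from typing import Callable, Optional, Sequence, Tuple


def call_with_fallback(
    invoke: Callable[[Sequence[str]], Optional[str]],
    filtered: Sequence[str],
    all_tools: Sequence[str],
) -> Tuple[Optional[str], bool]:
    """Try the filtered tools first; retry with all tools if nothing was chosen.

    Returns (chosen_tool_name, used_fallback).
    """
    choice = invoke(filtered)
    if choice is not None:
        return choice, False
    return invoke(all_tools), True


def fake_invoke(tools: Sequence[str]) -> Optional[str]:
    # Stand-in for a model call: "picks" drop_course only if it is offered.
    return "drop_course" if "drop_course" in tools else None


everything = ["search_courses", "enroll_in_course", "drop_course"]
print(call_with_fallback(fake_invoke, ["search_courses"], everything))
print(call_with_fallback(fake_invoke, ["drop_course"], everything))
```

The retry costs a second request, so it is worth logging how often the fallback fires; a high rate suggests the filter is too aggressive.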

### Token Savings

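A typical tool schema is roughly 100 tokens. This is a rough estimate; the actual size depends on the length of the description and the number of parameters.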

**Example:**
- 30 tools × 100 tokens = 3,000 tokens
- Filtered to 5 tools × 100 tokens = 500 tokens
- **Savings: 2,500 tokens per request!**

Over 1,000 requests:
- Savings: 2.5M tokens
- Cost savings: ~$5-10 (depending on model)
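
The arithmetic above can be written as two small functions so you can plug in your own numbers. This is a sketch: the function names are hypothetical, and the per-million-token prices are placeholder values chosen to reproduce the $5-10 range, not quotes for any particular model.

```python
def token_savings(
    total_tools: int,
    shown_tools: int,
    tokens_per_tool: int = 100,
    requests: int = 1,
) -> int:
    """Tokens saved by sending only `shown_tools` schemas instead of all of them."""
    return (total_tools - shown_tools) * tokens_per_tool * requests


def cost_savings(tokens_saved: int, usd_per_million_tokens: float) -> float:
    """Convert saved tokens to dollars at a given input-token price."""
    return tokens_saved / 1_000_000 * usd_per_million_tokens


per_request = token_savings(total_tools=30, shown_tools=5)
per_thousand = token_savings(total_tools=30, shown_tools=5, requests=1000)

print(f"Per request: {per_request:,} tokens")
print(f"Per 1,000 requests: {per_thousand:,} tokens")

# Placeholder prices in USD per million input tokens.
for price in (2.0, 4.0):
    print(f"At ${price:.2f}/M tokens: ${cost_savings(per_thousand, price):.2f}")
```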

## Exercises

1. **Create tool groups**: Organize your agent's tools into logical groups. How many groups make sense?

2. **Implement filtering**: Add keyword-based filtering to your agent. Measure token savings.

3. **Test accuracy**: Create 20 test queries. Does filtering improve or hurt tool selection accuracy?

4. **Hierarchical design**: Design a hierarchical tool structure for a complex agent with 30+ tools.

## Summary

In this notebook, you learned:

- ✅ Tool filtering reduces token usage and confusion
- ✅ The tool shed pattern: show only relevant tools
- ✅ Multiple filtering strategies: keywords, intent, hierarchical
- ✅ Filtering can improve accuracy and efficiency, provided the filter keeps the right tools
- ✅ Essential for agents with many tools

**Key insight:** Don't show all tools all the time. Selective tool exposure based on context can improve tool selection, reduces token usage, and makes your agent more efficient. This matters more as your agent grows and accumulates tools.