# 04: Tool Integration - Multi-Agent System with Web Search

## The Architecture

We're building on our multi-agent system from Module 3:
- **Router**: Analyzes claims and routes to specialists
- **Technical Expert**: Handles technology/engineering claims
- **Historical Expert**: Handles historical facts
- **Current Events Expert**: Handles recent and time-sensitive claims, now with a web search tool
- **General Expert**: Handles everything else

The key insight: Only the Current Events Expert needs tools!
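
To make the routing idea concrete, here is a minimal sketch of a claim-type registry in which only one specialist carries tools. The `Expert` dataclass, the `EXPERTS` table, and `pick_expert` are hypothetical names used for illustration, not the actual `modules.m4_tools` implementation; the claim-type labels mirror the router output shown later in this notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    """Hypothetical stand-in for a specialist agent."""
    name: str
    tools: tuple[str, ...] = ()  # names of the tools this expert may call


# Assumed routing table: claim type -> specialist.
EXPERTS = {
    "technical": Expert("Technical Expert"),
    "historical": Expert("Historical Expert"),
    "current_event": Expert("Current Events Expert", tools=("search_for_information",)),
    "general": Expert("General Expert"),
}


def pick_expert(claim_type: str) -> Expert:
    """Return the specialist for a claim type, falling back to the General Expert."""
    return EXPERTS.get(claim_type, EXPERTS["general"])


for claim_type in ("technical", "current_event", "unknown"):
    expert = pick_expert(claim_type)
    print(f"{claim_type} -> {expert.name} (tools: {list(expert.tools) or 'none'})")
```

Keeping the tool list on the expert rather than on the router means adding a tool later touches one table entry and leaves the other specialists unchanged.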

## Setup

In [9]:
import sys
from pathlib import Path
sys.path.append(str(Path.cwd().parent))

# Import our tool-enhanced multi-agent BS detector
from modules.m4_tools import (
    check_claim_with_tools,
    create_tool_enhanced_bs_detector,
    ToolEnhancedState
)
from config.llm_factory import LLMFactory

print("✅ Ready! Our multi-agent system can route claims and use tools when needed.")

✅ Ready! Our multi-agent system can route claims and use tools when needed.


## How It Works: LLM Agency

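Instead of hardcoding rules for when to search, we give the LLM a search tool and let it decide when to use it. The cell below also shows the agent resolving relative dates such as "yesterday" against today's date: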

In [10]:
# Demonstrate date context awareness
from datetime import datetime

print(f"📅 Today's date: {datetime.now().strftime('%B %d, %Y')}")
print("\nWhen you say 'yesterday', the agent knows you mean the actual yesterday!")
print("This is crucial for verifying current events correctly.\n")

# Test a time-sensitive claim
claim = "Bitcoin reached $100,000 yesterday"
result = check_claim_with_tools(claim)

print(f"Claim: '{claim}'")
print(f"Used search: {'Yes' if result['used_search'] else 'No'}")

# Show the actual search query used
if result.get('search_results'):
    print(f"Search query: {result['search_results'][0].query}")
    print("Notice how it includes the specific date!")

📅 Today's date: July 22, 2025

When you say 'yesterday', the agent knows you mean the actual yesterday!
This is crucial for verifying current events correctly.



Claim: 'Bitcoin reached $100,000 yesterday'
Used search: Yes
Search query: Bitcoin price on July 21, 2025
Notice how it includes the specific date!
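
One plausible way to get a dated query like the one above is to put the current date into the expert's prompt so the model can turn "yesterday" into an absolute date. The sketch below assumes that approach; `date_context` is a hypothetical helper, and the real prompt wording in `modules.m4_tools` may differ.

```python
from datetime import date, timedelta


def date_context(today: date) -> str:
    """Build a prompt snippet that pins relative dates to absolute ones."""
    yesterday = today - timedelta(days=1)
    fmt = "%B %d, %Y"
    return (
        f"Today's date is {today.strftime(fmt)}. "
        f"'Yesterday' means {yesterday.strftime(fmt)}."
    )


# Fixed date so the example is reproducible; real code would pass date.today().
print(date_context(date(2025, 7, 22)))
```

Passing the date in as an argument, rather than calling `date.today()` inside the helper, keeps the function easy to test.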


In [11]:
import base64
from IPython.display import Image, HTML, display

# Create a mermaid diagram showing the multi-agent architecture
mermaid_diagram = """
graph TD
    A[Claim Input] --> R{Router}
    
    R -->|Technical| TE[Technical Expert]
    R -->|Historical| HE[Historical Expert]
    R -->|Current Event| CE[Current Events Expert]
    R -->|General| GE[General Expert]
    
    CE --> T{Need More Info?}
    T -->|Yes| S[🔍 Web Search Tool]
    T -->|No| CE2[Direct Analysis]
    
    S --> CE3[Analyze with Evidence]
    
    TE --> V[Verdict]
    HE --> V
    CE2 --> V
    CE3 --> V
    GE --> V
    
    style R fill:#f9f,stroke:#333,stroke-width:2px
    style CE fill:#bbf,stroke:#333,stroke-width:2px
    style S fill:#fbf,stroke:#333,stroke-width:2px
    style V fill:#bfb,stroke:#333,stroke-width:2px
"""

# Create a simple HTML visualization since mermaid.ink might have issues
html_content = f"""
<div style="text-align: center; padding: 20px;">
    <h3>🔄 Multi-Agent Tool Integration Architecture</h3>
    <div style="background-color: #f0f0f0; padding: 20px; border-radius: 10px;">
        <p><strong>Claim Input</strong> → <strong>Router</strong></p>
        <p>↙ ↓ ↓ ↘</p>
        <p>Technical | Historical | <span style="color: blue;">Current Events</span> | General</p>
        <p>Expert | Expert | <span style="color: blue;">Expert (+ Tools)</span> | Expert</p>
        <p>↓ ↓ ↓ ↓</p>
        <p><strong>Final Verdict</strong></p>
    </div>
    <p><em>Only the Current Events Expert has web search tools!</em></p>
</div>
"""

display(HTML(html_content))

# Try to render mermaid diagram too
def render_mermaid_diagram(graph_def):
    graph_bytes = graph_def.encode("utf-8")
    base64_string = base64.b64encode(graph_bytes).decode("ascii")
    image_url = f"https://mermaid.ink/img/{base64_string}?type=png"
    return Image(url=image_url)

try:
    display(render_mermaid_diagram(mermaid_diagram))
except Exception:
    print("(Mermaid diagram rendering failed, see HTML visualization above)")

In [12]:
# Show the multi-agent routing in action
# Import the function if not already imported
try:
    app = create_tool_enhanced_bs_detector()
except NameError:
    from modules.m4_tools import create_tool_enhanced_bs_detector
    app = create_tool_enhanced_bs_detector()
    
print("\n📊 Multi-Agent Graph Structure:")
print("1. Router analyzes claim type")
print("2. Routes to appropriate expert:")
print("   - Technical Expert (no tools)")
print("   - Historical Expert (no tools)")
print("   - Current Events Expert (with web search!)")
print("   - General Expert (no tools)")
print("\nOnly the Current Events Expert has tools - smart specialization!")


📊 Multi-Agent Graph Structure:
1. Router analyzes claim type
2. Routes to appropriate expert:
   - Technical Expert (no tools)
   - Historical Expert (no tools)
   - Current Events Expert (with web search!)
   - General Expert (no tools)

Only the Current Events Expert has tools - smart specialization!


In [13]:
# Test various claims to see routing and tool usage
test_claims = [
    "Water boils at 100 degrees Celsius",
    "SpaceX launched 5 rockets yesterday",
    "The Boeing 787 uses composite materials",
    "Tesla's stock price is above $300 today",
    "The moon landing was in 1969"
]

print("🎬 Watch Multi-Agent Routing and Tool Usage:\n")

for claim in test_claims:
    print(f"\n{'='*70}")
    print(f"Claim: \"{claim}\"")
    print("-" * 70)
    
    # Use the multi-agent version with tools
    result = check_claim_with_tools(claim)
    
    # Show routing info if available
    if result.get('claim_type'):
        print(f"1️⃣ Router Decision: {result['claim_type']} claim")
        print(f"2️⃣ Assigned to: {result['analyzing_agent']}")
    else:
        print("⚠️  No routing info - using direct analysis")
        
    print(f"3️⃣ Tool Usage: {'🔍 SEARCHED' if result['used_search'] else '⚡ NO SEARCH'}")
    print(f"\nVerdict: {result['verdict']} ({result['confidence']}%)")
    
    if result['used_search'] and result.get('tools_used'):
        print(f"Tools used: {', '.join(result['tools_used'])}")
    
    # Show first part of reasoning
    if result.get('reasoning'):
        print(f"\nReasoning: {result['reasoning'][:150]}...")

🎬 Watch Multi-Agent Routing and Tool Usage:


Claim: "Water boils at 100 degrees Celsius"
----------------------------------------------------------------------
1️⃣ Router Decision: general claim
2️⃣ Assigned to: General Expert
3️⃣ Tool Usage: ⚡ NO SEARCH

Verdict: LEGITIMATE (95%)

Reasoning: Under standard atmospheric pressure (1 atmosphere or 101.3 kPa), pure water boils at 100 degrees Celsius. This is a well-established scientific fact t...

Claim: "SpaceX launched 5 rockets yesterday"
----------------------------------------------------------------------


1️⃣ Router Decision: current_event claim
2️⃣ Assigned to: Current Events Expert (with tools)
3️⃣ Tool Usage: 🔍 SEARCHED

Verdict: BS (90%)
Tools used: search_for_information

Reasoning: While SpaceX is known for having a high launch cadence and is the world's dominant space launch provider as of 2025, launching 5 rockets in a single d...

Claim: "The Boeing 787 uses composite materials"
----------------------------------------------------------------------
1️⃣ Router Decision: technical claim
2️⃣ Assigned to: Technical Expert
3️⃣ Tool Usage: ⚡ NO SEARCH

Verdict: LEGITIMATE (100%)

Reasoning: The Boeing 787 Dreamliner is well-known for its extensive use of composite materials in its airframe. Approximately 50% of the primary structure, incl...

Claim: "Tesla's stock price is above $300 today"
----------------------------------------------------------------------


1️⃣ Router Decision: current_event claim
2️⃣ Assigned to: Current Events Expert (with tools)
3️⃣ Tool Usage: 🔍 SEARCHED

Verdict: BS (85%)
Tools used: search_for_information

Reasoning: The search did not return a specific, verifiable figure for Tesla's stock price on July 22, 2025. Given that Tesla's stock price has historically fluc...

Claim: "The moon landing was in 1969"
----------------------------------------------------------------------
1️⃣ Router Decision: historical claim
2️⃣ Assigned to: Historical Expert
3️⃣ Tool Usage: ⚡ NO SEARCH

Verdict: LEGITIMATE (100%)

Reasoning: The claim that the moon landing occurred in 1969 is historically accurate. The first successful manned moon landing was conducted by NASA's Apollo 11 ...
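
The cells above read a handful of keys from the result of `check_claim_with_tools`. As a reading aid, here is a sketch of the shape those accesses imply. `ClaimResult` and `summarize` are hypothetical names, and the field types are assumptions inferred from the printed output rather than taken from the module's source.

```python
from typing import TypedDict


class ClaimResult(TypedDict, total=False):
    """Assumed shape of the dict returned by check_claim_with_tools."""
    claim_type: str        # e.g. "current_event"
    analyzing_agent: str   # e.g. "Current Events Expert (with tools)"
    used_search: bool
    tools_used: list[str]
    verdict: str           # "LEGITIMATE" or "BS" in the runs above
    confidence: int        # percentage (assumed to be an integer)
    reasoning: str


def summarize(result: ClaimResult) -> str:
    """One-line summary that tolerates missing routing info."""
    agent = result.get("analyzing_agent", "direct analysis")
    searched = "searched" if result.get("used_search") else "no search"
    return f"{result['verdict']} ({result['confidence']}%) via {agent}, {searched}"


example: ClaimResult = {
    "claim_type": "technical",
    "analyzing_agent": "Technical Expert",
    "used_search": False,
    "verdict": "LEGITIMATE",
    "confidence": 100,
}
print(summarize(example))
```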


## Deep Dive: A Recent Event

In [14]:
# Let's trace through a claim that should trigger search
recent_claim = "Apple announced new AI features yesterday"

print(f"🔍 Detailed Analysis: \"{recent_claim}\"\n")

# Run the claim
result = check_claim_with_tools(recent_claim)

print("1️⃣ LLM's Decision Process:")
print(f"   Used search: {'Yes' if result['used_search'] else 'No'}")

if result['used_search']:
    print(f"\n2️⃣ Tools Used:")
    for tool in result.get('tools_used', []):
        print(f"   - {tool}")
    
    if result.get('search_results'):
        print(f"\n3️⃣ Search Results:")
        for sr in result['search_results']:
            # sr is a WebSearchResult object, not a dict
            print(f"   Query: {sr.query}")
            if sr.facts:
                print("   Facts found:")
                for fact in sr.facts[:2]:
                    print(f"     - {fact[:150]}...")

print(f"\n4️⃣ Final Analysis:")
print(f"   Verdict: {result['verdict']}")
print(f"   Confidence: {result['confidence']}%")

if result.get('claim_type'):
    print(f"   Routed to: {result['claim_type']} → {result['analyzing_agent']}")

if result.get('reasoning'):
    print(f"\n   Full Reasoning:")
    print(f"   {result['reasoning']}")

🔍 Detailed Analysis: "Apple announced new AI features yesterday"



1️⃣ LLM's Decision Process:
   Used search: Yes

2️⃣ Tools Used:
   - search_for_information

3️⃣ Search Results:
   Query: Apple new AI features announcement July 21, 2025
   Facts found:
     - Jun 20, 2025 · Explore all the new Apple Intelligence features from WWDC 2025—Li...
     - Here’s what your iPhone and Mac can now do...

4️⃣ Final Analysis:
   Verdict: BS
   Confidence: 90%
   Routed to: current_event → Current Events Expert (with tools)

   Full Reasoning:
   The latest verified announcements of new AI features from Apple occurred in June 2025, specifically around June 9 and June 20 during WWDC 2025. There is no evidence of any new AI feature announcement from Apple on July 21, 2025 (yesterday). Therefore, the claim that Apple announced new AI features yesterday is not supported by available information.
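
The loop above treats each search result as an object with `query` and `facts` attributes. A minimal sketch of such a container follows; only those two attributes come from the notebook, and the real `WebSearchResult` may carry more fields. `preview_facts` is a hypothetical helper that appends an ellipsis only when a fact is actually cut.

```python
from dataclasses import dataclass, field


@dataclass
class WebSearchResult:
    """Assumed minimal shape: only the two attributes read in the cell above."""
    query: str
    facts: list[str] = field(default_factory=list)


def preview_facts(sr: WebSearchResult, limit: int = 2, width: int = 150) -> list[str]:
    """Return up to `limit` facts, adding an ellipsis only when a fact is truncated."""
    previews = []
    for fact in sr.facts[:limit]:
        previews.append(fact if len(fact) <= width else fact[:width] + "...")
    return previews


sr = WebSearchResult(
    query="Apple new AI features announcement July 21, 2025",
    facts=["Here's what your iPhone and Mac can now do", "x" * 200, "ignored third fact"],
)
for line in preview_facts(sr):
    print(f"- {line}")
```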


## The Power of LLM Agency

Let's see how the LLM handles edge cases better than hardcoded rules:

In [16]:
# Edge cases that are hard to categorize with rules
edge_cases = [
    "The Concorde flew at Mach 2.04",  # Historical but specific number
    "Boeing is developing a new aircraft",  # Vague timeline
    "AI will replace pilots by 2030",  # Future prediction
]

print("🤔 Edge Cases - LLM Decides Based on Context:\n")

for claim in edge_cases:
    result = check_claim_with_tools(claim)
    
    print(f"\nClaim: \"{claim}\"")
    print(f"  LLM searched: {'Yes' if result['used_search'] else 'No'}")
    print(f"  Verdict: {result['verdict']} ({result['confidence']}%)")
    
    # Show reasoning snippet
    if result.get('reasoning'):
        reason_preview = result['reasoning'][:100].replace('\n', ' ')
        print(f"  Reasoning: {reason_preview}...")

🤔 Edge Cases - LLM Decides Based on Context:


Claim: "The Concorde flew at Mach 2.04"
  LLM searched: No
  Verdict: LEGITIMATE (95%)
  Reasoning: The Concorde was a supersonic passenger airliner capable of cruising at speeds over Mach 2. The typi...



Claim: "Boeing is developing a new aircraft"
  LLM searched: Yes
  Verdict: BS (80%)
  Reasoning: The search for recent information on Boeing developing a new aircraft specifically in July 2025 did ...

Claim: "AI will replace pilots by 2030"
  LLM searched: No
  Verdict: BS (85%)
  Reasoning: While AI and automation have made significant advances in aviation, fully replacing human pilots by ...
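
For contrast, here is a sketch of the kind of keyword rule that the LLM-driven decision replaces, run against the same three edge cases. The keyword list and `keyword_needs_search` are hypothetical; the LLM decisions are copied from the output above.

```python
# Hypothetical keyword rule of the kind the LLM-driven approach replaces.
TEMPORAL_KEYWORDS = ("yesterday", "today", "last week", "recently", "latest")


def keyword_needs_search(claim: str) -> bool:
    """Search only if the claim contains an explicit temporal keyword."""
    lowered = claim.lower()
    return any(keyword in lowered for keyword in TEMPORAL_KEYWORDS)


# LLM decisions copied from the run above (True = searched).
llm_searched = {
    "The Concorde flew at Mach 2.04": False,
    "Boeing is developing a new aircraft": True,
    "AI will replace pilots by 2030": False,
}

for claim, llm_choice in llm_searched.items():
    rule_choice = keyword_needs_search(claim)
    flag = "" if rule_choice == llm_choice else "  <- differs"
    print(f"{claim!r}: rule={rule_choice}, llm={llm_choice}{flag}")
```

The Boeing claim contains no temporal keyword, so this rule would skip the search that the LLM chose to run because the claim describes an ongoing situation.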


## Performance Analysis

In [17]:
import time

# Compare performance
test_cases = [
    ("Historical fact", "The Wright brothers flew in 1903"),
    ("Recent event", "SpaceX launched yesterday"),
    ("Current data", "Bitcoin price today")
]

print("⏱️ Performance Analysis:\n")

for case_type, claim in test_cases:
    start = time.perf_counter()
    result = check_claim_with_tools(claim)
    duration = time.perf_counter() - start
    
    print(f"{case_type}: \"{claim[:30]}...\"")
    print(f"  Time: {duration:.2f}s")
    print(f"  Searched: {'Yes' if result['used_search'] else 'No'}")
    print(f"  Result: {result['verdict']}")
    print()

⏱️ Performance Analysis:

Historical fact: "The Wright brothers flew in 19..."
  Time: 2.87s
  Searched: No
  Result: LEGITIMATE



Recent event: "SpaceX launched yesterday..."
  Time: 4.27s
  Searched: Yes
  Result: BS



Current data: "Bitcoin price today..."
  Time: 8.62s
  Searched: Yes
  Result: LEGITIMATE
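
Single-run timings like these are sensitive to network latency and model response time, so a fairer comparison repeats each call and reports a median. The sketch below assumes that approach; `time_call` is a hypothetical helper, and `fake_check` stands in for `check_claim_with_tools` so the example runs offline.

```python
import statistics
import time
from typing import Any, Callable


def time_call(fn: Callable[[str], Any], arg: str, repeats: int = 3) -> float:
    """Return the median wall-clock duration of fn(arg) over `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def fake_check(claim: str) -> dict:
    """Stand-in for check_claim_with_tools so the sketch runs offline."""
    time.sleep(0.01)
    return {"verdict": "LEGITIMATE", "used_search": False}


median_s = time_call(fake_check, "The Wright brothers flew in 1903")
print(f"median: {median_s:.3f}s over 3 runs")
```

With the real function, each repeat costs an LLM call (and possibly a search), so keep `repeats` small.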



In [18]:
print("📝 Implementation Pattern:\n")
print("❌ OLD WAY (Single Agent with Tools):")
print("```python")
print("# Every claim goes through the same process")
print("if needs_search(claim):")
print("    search()  # All claims evaluated the same way")
print("```")

print("\n✅ NEW WAY (Multi-Agent with Specialized Tools):")
print("```python")
print("# 1. Router analyzes claim type")
print("claim_type = router.analyze(claim)")
print("")
print("# 2. Route to specialist")
print("if claim_type == 'current_event':")
print("    # Only this expert has tools!")
print("    expert = CurrentEventsExpert(tools=[search_tool])")
print("elif claim_type == 'technical':")
print("    expert = TechnicalExpert()  # No tools needed")
print("")
print("# 3. Expert decides if they need their tools")
print("result = expert.analyze(claim)")
print("```")

print("\n🎯 Key Benefits:")
print("1. Efficient: Only current events expert has search overhead")
print("2. Specialized: Each expert optimized for their domain")
print("3. Scalable: Easy to add new experts or tools")
print("4. Smart: Tools only where they make sense")

📝 Implementation Pattern:

❌ OLD WAY (Single Agent with Tools):
```python
# Every claim goes through the same process
if needs_search(claim):
    search()  # All claims evaluated the same way
```

✅ NEW WAY (Multi-Agent with Specialized Tools):
```python
# 1. Router analyzes claim type
claim_type = router.analyze(claim)

# 2. Route to specialist
if claim_type == 'current_event':
    # Only this expert has tools!
    expert = CurrentEventsExpert(tools=[search_tool])
elif claim_type == 'technical':
    expert = TechnicalExpert()  # No tools needed

# 3. Expert decides if they need their tools
result = expert.analyze(claim)
```

🎯 Key Benefits:
1. Efficient: Only current events expert has search overhead
2. Specialized: Each expert optimized for their domain
3. Scalable: Easy to add new experts or tools
4. Smart: Tools only where they make sense


## Summary & Takeaways

### What We Built
- **Multi-Agent System** with specialized experts
- **Smart Tool Assignment** - only Current Events Expert has web search
- **Efficient Routing** - claims go to the right expert
- **Tool-calling LLM** - expert decides when to search

### Architecture Evolution
1. **Module 1**: Simple BS detector
2. **Module 2**: Added LangGraph for reliability  
3. **Module 3**: Multi-agent routing to specialists
4. **Module 4**: Tools for the expert who needs them!

### Key Implementation Details
```python
# Extend existing multi-agent system
from langchain_core.tools import tool  # LangChain's @tool decorator

from modules.m3_routing import (
    router_node,
    technical_expert_node,
    historical_expert_node,
    general_expert_node
)

# Enhance only the current events expert
@tool
def search_for_information(query: str) -> str:
    """Web search tool..."""

# Current events expert with tools
def current_events_expert_with_tools_node(state):
    # `llm` is assumed to be a chat model created elsewhere
    llm_with_tools = llm.bind_tools([search_for_information])
    # Expert decides when to search
```
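
The "expert decides when to search" step is elided in the snippet above. The sketch below illustrates one common control flow for it: call the model, run any tool it requests, feed the result back, and stop once it answers without requesting a tool. Everything here is a stub for illustration (`Reply`, `StubLLM`, `fake_search`, `run_expert`); the stub's "yesterday" check merely stands in for the model's own judgment, and a real node would use the tool-bound LLM instead.

```python
from dataclasses import dataclass, field


@dataclass
class Reply:
    """Stub model reply: text plus any tool calls the model requested."""
    content: str
    tool_calls: list[dict] = field(default_factory=list)


class StubLLM:
    """Hypothetical stand-in for a tool-bound chat model."""

    def invoke(self, messages: list[str]) -> Reply:
        has_evidence = any(m.startswith("TOOL:") for m in messages)
        # Stand-in for the model judging that it lacks sufficient information.
        if "yesterday" in messages[0] and not has_evidence:
            call = {"name": "search_for_information", "args": {"query": messages[0]}}
            return Reply("", [call])
        basis = "evidence" if has_evidence else "prior knowledge"
        return Reply(f"verdict based on {basis}")


def fake_search(query: str) -> str:
    """Stub tool; a real one would query a search API."""
    return f"no results confirming: {query}"


def run_expert(claim: str, llm: StubLLM, max_rounds: int = 3) -> dict:
    """Call the model, execute requested tools, and feed results back."""
    messages, tools_used = [claim], []
    for _ in range(max_rounds):
        reply = llm.invoke(messages)
        if not reply.tool_calls:
            return {"answer": reply.content, "used_search": bool(tools_used),
                    "tools_used": tools_used}
        for call in reply.tool_calls:
            tools_used.append(call["name"])
            messages.append("TOOL: " + fake_search(call["args"]["query"]))
    return {"answer": "gave up", "used_search": bool(tools_used), "tools_used": tools_used}


print(run_expert("SpaceX launched 5 rockets yesterday", StubLLM()))
print(run_expert("The moon landing was in 1969", StubLLM()))
```

The `max_rounds` cap is a design choice of this sketch: it bounds cost if the model keeps requesting searches.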

### Why This Architecture Works
1. **Separation of Concerns**
   - Router focuses on classification
   - Experts focus on their domains
   - Tools only where needed

2. **Efficiency**
   - Historical/Technical claims: Fast, no search needed
   - Current events: Search when necessary
   - No search calls for claims that don't need them

3. **Maintainability**
   - Easy to add new experts
   - Easy to add tools to specific experts
   - Clear responsibility boundaries

### Next Steps
In Module 5, we could add:
- More specialized tools (fact databases, calculators)
- Human-in-the-loop for low confidence
- Multi-expert consensus for complex claims


## Summary & Takeaways

### What We Built
- **Tool-calling LLM agent** that decides when it needs more information
- **Information sufficiency-based decisions** - not keyword matching
- **Proper tool binding pattern** using LangChain's @tool decorator

### Key Implementation Details
```python
from langchain_core.tools import tool

# 1. Define tool with clear documentation
@tool
def search_for_information(query: str) -> str:
    """Search the web when you lack sufficient information..."""

# 2. Bind tools to LLM (`llm` is assumed to be a chat model created elsewhere)
llm_with_tools = llm.bind_tools([search_for_information])

# 3. System prompt emphasizes information sufficiency
system_prompt = (
    "If you lack sufficient information to make a confident assessment, "
    "use the search_for_information tool"
)
```

### Why This Approach Works
1. **LLM Agency > Hardcoded Rules**
   - Adapts to context naturally
   - No brittle keyword matching
   - Handles edge cases intelligently

2. **Tool Documentation Matters**
   - Clear docstrings guide the LLM
   - Explains WHEN to use the tool
   - Self-documenting code

3. **Information Sufficiency Focus**
   - Searches based on actual need
   - Not triggered by keywords
   - More efficient and accurate

### Next Steps
In Module 5, we'll add human-in-the-loop review for cases where even search isn't enough!

In [20]:
# Try your own claims!
print("🎯 Your Turn!\n")
print("Try these claims and see how the LLM decides:")
print("1. 'ChatGPT-5 was released last week'")
print("2. 'Quantum computers can break all encryption'")
print("3. 'The price of gold is $2000 per ounce'")
print("\nNotice how the LLM's decisions are more nuanced than simple keyword matching!")

🎯 Your Turn!

Try these claims and see how the LLM decides:
1. 'ChatGPT-5 was released last week'
2. 'Quantum computers can break all encryption'
3. 'The price of gold is $2000 per ounce'

Notice how the LLM's decisions are more nuanced than simple keyword matching!
