# Building an AI BS Detector with LangGraph

## Workshop Overview

In this workshop, we'll build a sophisticated BS detector using LangGraph, starting from a simple prompt and progressively adding:
- Structured outputs with Pydantic
- State management with LangGraph
- Retry logic and error handling
- Multiple expert agents with routing
- Tool integration for evidence gathering
- Human-in-the-loop for low confidence cases
- Memory for learning from past claims

### What You'll Learn
1. **LangGraph Fundamentals** - States, nodes, edges, and graphs
2. **Agent Design** - Building specialized agents for different claim types
3. **Tool Integration** - Adding search capabilities to agents
4. **Human-in-the-Loop** - Using interrupts for human review
5. **Memory Systems** - Building context from previous interactions

### Prerequisites
- Basic Python knowledge
- Familiarity with LangChain concepts helpful but not required

## Setup and Imports

First, let's import everything we'll need for the workshop.

In [1]:
# Core imports
import sys
from pathlib import Path
sys.path.append(str(Path.cwd().parent))

# LangChain imports
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.tools import tool
from pydantic import BaseModel, Field
from typing import List, Optional, Dict, Any, Literal, Annotated

# LangGraph imports
from langgraph.graph import StateGraph, END, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import Command, interrupt

# Utilities
import time
import re
from datetime import datetime
from collections import defaultdict
import base64
from IPython.display import Image, display

# Our LLM factory (only external dependency)
from config.llm_factory import LLMFactory

print("✅ All imports successful!")

✅ All imports successful!


## Helper Functions

Let's define some helper functions we'll use throughout the workshop.

In [2]:
def render_mermaid(graph_definition: str) -> Image:
    """Render a mermaid diagram in the notebook"""
    graph_bytes = graph_definition.encode("utf-8")
    base64_string = base64.b64encode(graph_bytes).decode("ascii")
    image_url = f"https://mermaid.ink/img/{base64_string}?type=png"
    return Image(url=image_url)

# Initialize our LLM
llm = LLMFactory.create_llm()
print(f"✅ LLM initialized: {type(llm).__name__}")

✅ LLM initialized: ChatOpenAI


## Part 1: Simple BS Detector

Let's start with the simplest possible BS detector - just a prompt and an LLM call.

In [3]:
# What we'll build first
simple_flow = """
graph LR
    A[Claim] --> B[LLM]
    B --> C[Verdict]
"""
display(render_mermaid(simple_flow))

In [4]:
# Define a simple verdict model
class SimpleVerdict(BaseModel):
    """Simple verdict output"""
    verdict: Literal["BS", "LEGITIMATE"] = Field(
        description="The verdict on whether the claim is BS or LEGITIMATE"
    )

def simple_bs_detector(claim: str) -> str:
    """The simplest possible BS detector"""
    prompt = f"""Is this claim BS or LEGITIMATE? 
    
Claim: {claim}

Answer with just BS or LEGITIMATE."""
    
    # Use structured output even for simple detector
    structured_llm = llm.with_structured_output(SimpleVerdict)
    result = structured_llm.invoke(prompt)
    return result.verdict

# Test it
test_claims = [
    "The Boeing 747 has four engines",
    "Cats can fly at supersonic speeds"
]

for claim in test_claims:
    verdict = simple_bs_detector(claim)
    print(f"Claim: {claim}")
    print(f"Verdict: {verdict}\n")

Claim: The Boeing 747 has four engines
Verdict: LEGITIMATE

Claim: Cats can fly at supersonic speeds
Verdict: BS



## Part 2: Structured Output with Pydantic

Let's improve our detector by getting structured output with confidence scores.

## Important: How LangGraph Handles State

Before we dive into LangGraph, it's important to understand how state works:

1. **Input**: You pass a dictionary to `invoke()`
2. **Processing**: Nodes receive state objects and return dictionaries with updates
3. **Output**: LangGraph returns the final state as a dictionary

This means:
- When invoking: `app.invoke({"claim": "..."})`
- When accessing results: `result['verdict']` (not `result.verdict`)
- But within our functions, we work with Pydantic models directly

### Key Pattern to Remember:
```python
# Inside a node function:
structured_llm = llm.with_structured_output(BSDetectorOutput)
result = structured_llm.invoke(prompt)  # This is a Pydantic model
print(result.verdict)  # Access as attributes

# But the node returns a dictionary:
return {
    "verdict": result.verdict,
    "confidence": result.confidence
}

# And LangGraph returns dictionaries:
final_result = app.invoke({"claim": "..."})
print(final_result['verdict'])  # Access as dictionary
```

Let's see this in action:

In [5]:
# Define our output structure
class BSDetectorOutput(BaseModel):
    """Structured output for BS detection"""
    verdict: Literal["BS", "LEGITIMATE", "UNCERTAIN"] = Field(
        description="The verdict on whether the claim is BS"
    )
    confidence: int = Field(
        description="Confidence level from 0-100",
        ge=0,
        le=100
    )
    reasoning: str = Field(
        description="Brief explanation of the verdict"
    )

def structured_bs_detector(claim: str) -> BSDetectorOutput:
    """BS detector with structured output"""
    prompt = f"""Analyze if this claim is BS or legitimate.
    
Claim: {claim}

Provide your analysis with a verdict, confidence score, and reasoning."""
    
    # Use structured output
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    return structured_llm.invoke(prompt)

# Test it
result = structured_bs_detector("The Earth is flat")
print(f"Verdict: {result.verdict}")
print(f"Confidence: {result.confidence}%")
print(f"Reasoning: {result.reasoning}")

Verdict: BS
Confidence: 100%
Reasoning: Extensive scientific evidence from astronomy, physics, and direct observations (such as satellite imagery and circumnavigation) conclusively shows that the Earth is an oblate spheroid, not flat. The claim that the Earth is flat contradicts well-established facts and empirical data.


## Part 3: LangGraph State Management

Now let's use LangGraph to manage state and add retry logic for robustness.

In [6]:
# Define our state
class BSDetectorState(BaseModel):
    """State for the BS detector graph"""
    claim: str
    verdict: Optional[str] = None
    confidence: Optional[int] = None
    reasoning: Optional[str] = None
    retry_count: int = 0
    max_retries: int = 3
    error: Optional[str] = None

# Define our nodes
def detect_bs_node(state: BSDetectorState) -> dict:
    """Node that detects BS"""
    try:
        result = structured_bs_detector(state.claim)
        return {
            "verdict": result.verdict,
            "confidence": result.confidence,
            "reasoning": result.reasoning
        }
    except Exception as e:
        return {
            "error": str(e),
            "retry_count": state.retry_count + 1
        }

def retry_node(state: BSDetectorState) -> dict:
    """Node that handles retries"""
    print(f"Retrying... (attempt {state.retry_count + 1}/{state.max_retries})")
    time.sleep(1)  # Brief pause before retry
    return {}

# Define routing logic
def route_after_detection(state: BSDetectorState) -> str:
    """Decide where to go after detection attempt"""
    if state.verdict and not state.error:
        return "success"
    elif state.retry_count < state.max_retries:
        return "retry"
    else:
        return "error"

# Build the graph
def create_bs_detector_graph():
    """Create the BS detector graph"""
    workflow = StateGraph(BSDetectorState)
    
    # Add nodes
    workflow.add_node("detect_bs", detect_bs_node)
    workflow.add_node("retry", retry_node)
    
    # Add edges
    workflow.add_edge(START, "detect_bs")
    
    # Conditional routing
    workflow.add_conditional_edges(
        "detect_bs",
        route_after_detection,
        {
            "success": END,
            "retry": "retry",
            "error": END
        }
    )
    
    workflow.add_edge("retry", "detect_bs")
    
    return workflow.compile()

# Test it
app = create_bs_detector_graph()
state_input = {"claim": "Airplanes fly by flapping their wings"}
result = app.invoke(state_input)

# Access result as dictionary
print(f"\nVerdict: {result['verdict']}")
print(f"Confidence: {result['confidence']}%")
print(f"Reasoning: {result['reasoning']}")


Verdict: BS
Confidence: 95%
Reasoning: Airplanes do not fly by flapping their wings; they generate lift through fixed wings and engine thrust. Flapping wings are characteristic of birds or insects, not airplanes.


In [7]:
# What we'll build with LangGraph
langgraph_flow = """
graph TD
    A[Start] --> B[Detect BS]
    B --> C{Success?}
    C -->|Yes| D[End]
    C -->|No| E[Retry]
    E --> B
    E --> F{Max Retries?}
    F -->|Yes| G[Error]
    F -->|No| B
"""
display(render_mermaid(langgraph_flow))

In [8]:
# Define routing output model
class RoutingOutput(BaseModel):
    """Output for claim routing"""
    claim_type: Literal["technical", "historical", "current_event", "general"] = Field(
        description="The category of the claim"
    )

# Enhanced state with routing
class RoutingState(BSDetectorState):
    """State with routing information"""
    claim_type: Optional[str] = None
    analyzing_agent: Optional[str] = None

# Router node
def router_node(state: RoutingState) -> dict:
    """Route claims to appropriate expert"""
    routing_prompt = f"""Categorize this claim into one of these types:
- technical: Engineering, physics, specifications
- historical: Past events, dates, historical facts
- current_event: Recent news, current happenings
- general: Everything else

Claim: {state.claim}

Respond with just the category name."""
    
    # Use structured output for routing
    structured_llm = llm.with_structured_output(RoutingOutput)
    result = structured_llm.invoke(routing_prompt)
    
    return {"claim_type": result.claim_type}

# Expert nodes
def technical_expert_node(state: RoutingState) -> dict:
    """Technical expert for engineering/physics claims"""
    prompt = f"""You are a technical expert in engineering and physics.
Analyze this technical claim for accuracy.

Claim: {state.claim}

Consider physical laws, engineering principles, and technical feasibility."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[Technical Expert] {result.reasoning}",
        "analyzing_agent": "technical_expert"
    }

def historical_expert_node(state: RoutingState) -> dict:
    """Historical expert for past events"""
    prompt = f"""You are a historical expert.
Analyze this historical claim for accuracy.

Claim: {state.claim}

Consider historical records, dates, and documented events."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[Historical Expert] {result.reasoning}",
        "analyzing_agent": "historical_expert"
    }

def current_events_expert_node(state: RoutingState) -> dict:
    """Current events expert"""
    prompt = f"""You are a current events expert.
Analyze this claim about recent events.

Claim: {state.claim}

Note: You may not have access to events after your training cutoff.
Be transparent about limitations."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": min(result.confidence, 70),  # Cap confidence for current events
        "reasoning": f"[Current Events Expert] {result.reasoning}",
        "analyzing_agent": "current_events_expert"
    }

def general_expert_node(state: RoutingState) -> dict:
    """General expert for other claims"""
    prompt = f"""Analyze if this claim is BS or legitimate.

Claim: {state.claim}

Provide your analysis with a verdict, confidence score, and reasoning."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[General Expert] {result.reasoning}",
        "analyzing_agent": "general_expert"
    }

# Build multi-agent graph
def create_multi_agent_graph():
    """Create the multi-agent BS detector"""
    workflow = StateGraph(RoutingState)
    
    # Add nodes
    workflow.add_node("router", router_node)
    workflow.add_node("technical_expert", technical_expert_node)
    workflow.add_node("historical_expert", historical_expert_node)
    workflow.add_node("current_events_expert", current_events_expert_node)
    workflow.add_node("general_expert", general_expert_node)
    
    # Add edges
    workflow.add_edge(START, "router")
    
    # Route to appropriate expert
    def route_to_expert(state: RoutingState) -> str:
        return f"{state.claim_type}_expert"
    
    workflow.add_conditional_edges(
        "router",
        route_to_expert,
        {
            "technical_expert": "technical_expert",
            "historical_expert": "historical_expert",
            "current_event_expert": "current_events_expert",
            "general_expert": "general_expert"
        }
    )
    
    # All experts go to END
    for expert in ["technical_expert", "historical_expert", "current_events_expert", "general_expert"]:
        workflow.add_edge(expert, END)
    
    return workflow.compile()

# Test with different claims
multi_agent_app = create_multi_agent_graph()

test_claims = [
    "The Boeing 747 uses anti-gravity propulsion",
    "Napoleon invaded Russia in 1812",
    "A major tech company announced layoffs yesterday",
    "Eating carrots improves night vision"
]

for claim in test_claims:
    result = multi_agent_app.invoke({"claim": claim})  # Pass dictionary
    print(f"\nClaim: {claim}")
    print(f"Type: {result['claim_type']}")
    print(f"Agent: {result['analyzing_agent']}")
    print(f"Verdict: {result['verdict']} ({result['confidence']}%)")


Claim: The Boeing 747 uses anti-gravity propulsion
Type: technical
Agent: technical_expert
Verdict: BS (95%)

Claim: Napoleon invaded Russia in 1812
Type: historical
Agent: historical_expert
Verdict: LEGITIMATE (95%)

Claim: A major tech company announced layoffs yesterday
Type: current_event
Agent: current_events_expert
Verdict: UNCERTAIN (70%)

Claim: Eating carrots improves night vision
Type: general
Agent: general_expert
Verdict: UNCERTAIN (70%)


## Part 4: Multiple Expert Agents

Let's add specialized agents for different types of claims.

In [9]:
# Multi-agent architecture
multi_agent_flow = """
graph TD
    A[Claim] --> B[Router]
    B -->|Technical| C[Technical Expert]
    B -->|Historical| D[Historical Expert]
    B -->|Current Events| E[Current Events Expert]
    B -->|General| F[General Expert]
    C --> G[Verdict]
    D --> G
    E --> G
    F --> G
"""
display(render_mermaid(multi_agent_flow))

In [10]:
# Reuse the RoutingOutput model defined earlier
# Enhanced state with routing
class RoutingState(BSDetectorState):
    """State with routing information"""
    claim_type: Optional[str] = None
    analyzing_agent: Optional[str] = None

# Router node
def router_node(state: RoutingState) -> dict:
    """Route claims to appropriate expert"""
    routing_prompt = f"""Categorize this claim into one of these types:
- technical: Engineering, physics, specifications
- historical: Past events, dates, historical facts
- current_event: Recent news, current happenings
- general: Everything else

Claim: {state.claim}

Respond with just the category name."""
    
    # Use structured output for routing
    structured_llm = llm.with_structured_output(RoutingOutput)
    result = structured_llm.invoke(routing_prompt)
    
    return {"claim_type": result.claim_type}

# Expert nodes
def technical_expert_node(state: RoutingState) -> dict:
    """Technical expert for engineering/physics claims"""
    prompt = f"""You are a technical expert in engineering and physics.
Analyze this technical claim for accuracy.

Claim: {state.claim}

Consider physical laws, engineering principles, and technical feasibility."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[Technical Expert] {result.reasoning}",
        "analyzing_agent": "technical_expert"
    }

def historical_expert_node(state: RoutingState) -> dict:
    """Historical expert for past events"""
    prompt = f"""You are a historical expert.
Analyze this historical claim for accuracy.

Claim: {state.claim}

Consider historical records, dates, and documented events."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[Historical Expert] {result.reasoning}",
        "analyzing_agent": "historical_expert"
    }

def current_events_expert_node(state: RoutingState) -> dict:
    """Current events expert"""
    prompt = f"""You are a current events expert.
Analyze this claim about recent events.

Claim: {state.claim}

Note: You may not have access to events after your training cutoff.
Be transparent about limitations."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": min(result.confidence, 70),  # Cap confidence for current events
        "reasoning": f"[Current Events Expert] {result.reasoning}",
        "analyzing_agent": "current_events_expert"
    }

def general_expert_node(state: RoutingState) -> dict:
    """General expert for other claims"""
    prompt = f"""Analyze if this claim is BS or legitimate.

Claim: {state.claim}

Provide your analysis with a verdict, confidence score, and reasoning."""
    
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(prompt)
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": f"[General Expert] {result.reasoning}",
        "analyzing_agent": "general_expert"
    }

# Build multi-agent graph
def create_multi_agent_graph():
    """Create the multi-agent BS detector"""
    workflow = StateGraph(RoutingState)
    
    # Add nodes
    workflow.add_node("router", router_node)
    workflow.add_node("technical_expert", technical_expert_node)
    workflow.add_node("historical_expert", historical_expert_node)
    workflow.add_node("current_events_expert", current_events_expert_node)
    workflow.add_node("general_expert", general_expert_node)
    
    # Add edges
    workflow.add_edge(START, "router")
    
    # Route to appropriate expert
    def route_to_expert(state: RoutingState) -> str:
        return f"{state.claim_type}_expert"
    
    workflow.add_conditional_edges(
        "router",
        route_to_expert,
        {
            "technical_expert": "technical_expert",
            "historical_expert": "historical_expert",
            "current_event_expert": "current_events_expert",
            "general_expert": "general_expert"
        }
    )
    
    # All experts go to END
    for expert in ["technical_expert", "historical_expert", "current_events_expert", "general_expert"]:
        workflow.add_edge(expert, END)
    
    return workflow.compile()

# Test with different claims
multi_agent_app = create_multi_agent_graph()

test_claims = [
    "The Boeing 747 uses anti-gravity propulsion",
    "Napoleon invaded Russia in 1812",
    "A major tech company announced layoffs yesterday",
    "Eating carrots improves night vision"
]

for claim in test_claims:
    result = multi_agent_app.invoke({"claim": claim})  # Pass dictionary
    print(f"\nClaim: {claim}")
    print(f"Type: {result['claim_type']}")
    print(f"Agent: {result['analyzing_agent']}")
    print(f"Verdict: {result['verdict']} ({result['confidence']}%)")


Claim: The Boeing 747 uses anti-gravity propulsion
Type: technical
Agent: technical_expert
Verdict: BS (95%)

Claim: Napoleon invaded Russia in 1812
Type: historical
Agent: historical_expert
Verdict: LEGITIMATE (95%)

Claim: A major tech company announced layoffs yesterday
Type: current_event
Agent: current_events_expert
Verdict: UNCERTAIN (70%)

Claim: Eating carrots improves night vision
Type: general
Agent: general_expert
Verdict: UNCERTAIN (75%)


## Part 5: Adding Tools for Evidence

Let's add search capabilities so our agents can gather evidence.

In [11]:
# Tool-enhanced architecture
tool_flow = """
graph TD
    A[Claim] --> B[Router]
    B --> C[Expert]
    C --> D{Need Evidence?}
    D -->|Yes| E[Search Tool]
    D -->|No| F[Verdict]
    E --> G[Analyze Evidence]
    G --> F
"""
display(render_mermaid(tool_flow))

In [12]:
# Define search tool
@tool
def search_evidence(query: str) -> str:
    """Search for evidence about a claim.
    
    In a real implementation, this would use web search.
    For the workshop, we'll simulate with aviation facts.
    """
    # Simulated knowledge base
    facts = {
        "747": "The Boeing 747 has 4 engines and cruises at Mach 0.85",
        "a380": "The Airbus A380 is the world's largest passenger airliner",
        "wright": "The Wright Brothers first flew on December 17, 1903",
        "concorde": "The Concorde could fly at Mach 2.04",
        "jet fuel": "Jet fuel is a type of aviation fuel, not a conspiracy theory",
        "speed of sound": "The speed of sound is approximately 767 mph at sea level"
    }
    
    query_lower = query.lower()
    relevant_facts = []
    
    for key, fact in facts.items():
        if key in query_lower:
            relevant_facts.append(fact)
    
    if relevant_facts:
        return " | ".join(relevant_facts)
    else:
        return "No specific evidence found for this query."

# Enhanced state with tools
class ToolState(RoutingState):
    """State with tool usage tracking"""
    messages: List[Any] = Field(default_factory=list)
    evidence: Optional[str] = None
    used_tools: List[str] = Field(default_factory=list)

# Tool-enhanced expert node
def expert_with_tools_node(state: ToolState) -> dict:
    """Expert that can use tools"""
    # Bind tools to LLM
    llm_with_tools = llm.bind_tools([search_evidence])
    
    # Initial prompt
    messages = [
        SystemMessage(content="""You are a BS detector with access to search.
        First search for evidence about the claim, then make your verdict.
        Always use the search tool before making a decision."""),
        HumanMessage(content=f"Analyze this claim: {state.claim}")
    ]
    
    # Get LLM response
    response = llm_with_tools.invoke(messages)
    
    # Check if tool was called
    if response.tool_calls:
        tool_call = response.tool_calls[0]
        tool_result = search_evidence.invoke(tool_call['args'])
        
        # Add tool result to messages
        messages.extend([
            response,
            {"role": "tool", "content": tool_result, "tool_call_id": tool_call['id']}
        ])
        
        # Get final response with evidence
        final_prompt = f"""Based on the evidence, analyze if this claim is BS:
        
Claim: {state.claim}
Evidence: {tool_result}

Provide verdict, confidence, and reasoning."""
        
        structured_llm = llm.with_structured_output(BSDetectorOutput)
        result = structured_llm.invoke(final_prompt)
        
        return {
            "verdict": result.verdict,
            "confidence": result.confidence,
            "reasoning": result.reasoning,
            "evidence": tool_result,
            "used_tools": ["search_evidence"]
        }
    
    # Fallback if no tool used
    return detect_bs_node(state)

# Test tool-enhanced detection
print("Testing tool-enhanced BS detection:\n")

state = ToolState(claim="The Boeing 747 can fly faster than the speed of sound")
result = expert_with_tools_node(state)

print(f"Claim: {state.claim}")
print(f"Evidence found: {result['evidence']}")
print(f"Verdict: {result['verdict']} ({result['confidence']}%)")
print(f"Reasoning: {result['reasoning']}")

Testing tool-enhanced BS detection:

Claim: The Boeing 747 can fly faster than the speed of sound
Evidence found: The Boeing 747 has 4 engines and cruises at Mach 0.85
Verdict: BS (95%)
Reasoning: The Boeing 747 is a large commercial airliner designed to cruise at subsonic speeds, specifically around Mach 0.85, which is below the speed of sound (Mach 1). While it has 4 engines, these are not capable of pushing the aircraft into supersonic speeds. Therefore, the claim that the Boeing 747 can fly faster than the speed of sound is false.


## Part 6: Human-in-the-Loop

Add human review for low confidence cases using LangGraph's interrupt feature.

In [13]:
# Human-in-the-loop flow
human_flow = """
graph TD
    A[Claim] --> B[AI Analysis]
    B --> C{Confidence > 70%?}
    C -->|Yes| D[Final Verdict]
    C -->|No| E[Interrupt for Human]
    E --> F[Human Review]
    F --> G[Resume with Human Input]
    G --> D
"""
display(render_mermaid(human_flow))

In [14]:
# Human review tool
@tool
def request_human_review(claim: str, ai_verdict: str, confidence: int, reasoning: str) -> str:
    """Request human review when confidence is low."""
    # Use LangGraph's interrupt to pause execution
    human_response = interrupt({
        "claim": claim,
        "ai_verdict": ai_verdict,
        "confidence": confidence,
        "reasoning": reasoning,
        "message": f"Low confidence ({confidence}%) - requesting human review"
    })
    return human_response.get("verdict", ai_verdict)

# State with human review
class HumanState(ToolState):
    """State with human review tracking"""
    needs_human_review: bool = False
    human_reviewed: bool = False
    human_verdict: Optional[str] = None

# Human-in-the-loop node
def bs_detector_with_human(state: HumanState) -> dict:
    """BS detector that requests human input for low confidence"""
    # First, get AI verdict
    ai_result = expert_with_tools_node(state)
    
    # Check if human review needed
    if ai_result['confidence'] < 70:
        # Bind human review tool
        llm_with_human = llm.bind_tools([request_human_review])
        
        # Create message requesting human review
        review_prompt = f"""The AI analysis has low confidence. Please use the request_human_review tool.
        
Claim: {state.claim}
AI Verdict: {ai_result['verdict']}
Confidence: {ai_result['confidence']}%
Reasoning: {ai_result['reasoning']}"""
        
        response = llm_with_human.invoke(review_prompt)
        
        # This will interrupt execution
        if response.tool_calls:
            return {
                **ai_result,
                "needs_human_review": True
            }
    
    return ai_result

# Build human-in-the-loop graph
def create_human_loop_graph():
    """Create graph with human-in-the-loop"""
    workflow = StateGraph(HumanState)
    
    workflow.add_node("analyze", bs_detector_with_human)
    
    workflow.add_edge(START, "analyze")
    workflow.add_edge("analyze", END)
    
    # Compile with memory for interrupts
    memory = MemorySaver()
    return workflow.compile(
        checkpointer=memory,
        interrupt_before=["analyze"]  # Can interrupt before analysis
    )

# Simulate human-in-the-loop
print("Simulating human-in-the-loop:\n")

# This would normally interrupt and wait for human input
# For demo, we'll show the concept
low_confidence_claim = "Quantum computers can solve NP-complete problems in polynomial time"
state = HumanState(claim=low_confidence_claim)
result = bs_detector_with_human(state)

print(f"Claim: {low_confidence_claim}")
print(f"AI Verdict: {result['verdict']} ({result['confidence']}%)")
print(f"Needs human review: {result.get('needs_human_review', False)}")

if result.get('needs_human_review'):
    print("\n⚠️  In a real system, execution would pause here for human input.")
    print("The human could provide a final verdict and additional context.")

Simulating human-in-the-loop:

Claim: Quantum computers can solve NP-complete problems in polynomial time
AI Verdict: BS (90%)
Needs human review: False


## Part 7: Memory System

Finally, let's add memory so our BS detector learns from past claims.

In [15]:
# Memory architecture
memory_flow = """
graph LR
    A[New Claim] --> B[Extract Entities]
    B --> C[Search Memory]
    C --> D[Enhanced Context]
    D --> E[AI Analysis]
    E --> F[Store Result]
    F --> G[Update Patterns]
"""
display(render_mermaid(memory_flow))

In [16]:
# Simple in-memory storage
MEMORY_STORE = {
    "claims": [],
    "entities": defaultdict(list),
    "patterns": defaultdict(int)
}

class MemoryManager:
    """Manages memory operations"""
    
    @staticmethod
    def extract_entities(text: str) -> List[str]:
        """Extract entities from text"""
        # Simple entity extraction
        entities = []
        
        # Find capitalized words (proper nouns)
        entities.extend(re.findall(r'\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b', text))
        
        # Find numbers with context (747, A380, etc)
        entities.extend(re.findall(r'\b[A-Z]?\d+\b', text))
        
        return list(set(entities))  # Remove duplicates
    
    @staticmethod
    def store_claim(claim: str, verdict: str, confidence: int, reasoning: str):
        """Store claim and verdict in memory"""
        entities = MemoryManager.extract_entities(claim)
        
        # Store claim record
        claim_record = {
            "claim": claim,
            "verdict": verdict,
            "confidence": confidence,
            "reasoning": reasoning,
            "entities": entities,
            "timestamp": datetime.now().isoformat()
        }
        MEMORY_STORE["claims"].append(claim_record)
        
        # Index by entities
        for entity in entities:
            MEMORY_STORE["entities"][entity].append(len(MEMORY_STORE["claims"]) - 1)
        
        # Track BS patterns
        if verdict == "BS":
            bs_keywords = ['quantum', 'perpetual', 'free energy', 'anti-gravity']
            for keyword in bs_keywords:
                if keyword in claim.lower():
                    MEMORY_STORE["patterns"][keyword] += 1
    
    @staticmethod
    def get_context(claim: str) -> dict:
        """Get relevant context from memory"""
        entities = MemoryManager.extract_entities(claim)
        
        # Find related claims
        related_claims = []
        for entity in entities:
            for idx in MEMORY_STORE["entities"].get(entity, []):
                if idx < len(MEMORY_STORE["claims"]):
                    related_claims.append(MEMORY_STORE["claims"][idx])
        
        # Check for BS patterns
        detected_patterns = []
        claim_lower = claim.lower()
        for pattern, count in MEMORY_STORE["patterns"].items():
            if pattern in claim_lower and count >= 2:
                detected_patterns.append(pattern)
        
        return {
            "related_claims": related_claims[:3],  # Limit to 3
            "detected_patterns": detected_patterns,
            "entities": entities
        }

# Memory-enhanced state
class MemoryState(HumanState):
    """State with memory context"""
    memory_context: Optional[dict] = None
    related_claims: List[dict] = Field(default_factory=list)

# Memory-enhanced BS detector
def memory_enhanced_detector(state: MemoryState) -> dict:
    """BS detector with memory"""
    # Get memory context
    context = MemoryManager.get_context(state.claim)
    
    # Build enhanced prompt with context
    context_prompt = "Analyze this claim for accuracy.\n\n"
    
    if context["related_claims"]:
        context_prompt += "Related previous claims:\n"
        for rc in context["related_claims"]:
            context_prompt += f"- {rc['claim']}: {rc['verdict']} ({rc['confidence']}%)\n"
        context_prompt += "\n"
    
    if context["detected_patterns"]:
        context_prompt += f"Warning: Contains known BS patterns: {', '.join(context['detected_patterns'])}\n\n"
    
    context_prompt += f"Current claim: {state.claim}"
    
    # Get verdict with context
    structured_llm = llm.with_structured_output(BSDetectorOutput)
    result = structured_llm.invoke(context_prompt)
    
    # Store in memory
    MemoryManager.store_claim(
        state.claim,
        result.verdict,
        result.confidence,
        result.reasoning
    )
    
    return {
        "verdict": result.verdict,
        "confidence": result.confidence,
        "reasoning": result.reasoning,
        "memory_context": context,
        "related_claims": context["related_claims"]
    }

# Test memory system
print("Testing memory system:\n")

# Clear memory
MEMORY_STORE["claims"].clear()
MEMORY_STORE["entities"].clear()
MEMORY_STORE["patterns"].clear()

# Process related claims
test_sequence = [
    "The Boeing 747 has four engines",
    "The Boeing 747 can fly at Mach 2",  # Should use context about 747
    "A quantum engine can achieve light speed",
    "Another quantum breakthrough defies physics"  # Should detect pattern
]

for i, claim in enumerate(test_sequence):
    state = MemoryState(claim=claim)
    result = memory_enhanced_detector(state)
    
    print(f"\nClaim {i+1}: {claim}")
    print(f"Verdict: {result['verdict']} ({result['confidence']}%)")
    
    if result['related_claims']:
        print(f"Used context from {len(result['related_claims'])} related claims")
    
    if result['memory_context']['detected_patterns']:
        print(f"⚠️  Detected BS pattern: {result['memory_context']['detected_patterns']}")

# Show memory statistics
print(f"\n\n📊 Memory Statistics:")
print(f"Total claims stored: {len(MEMORY_STORE['claims'])}")
print(f"Unique entities: {len(MEMORY_STORE['entities'])}")
print(f"BS patterns detected: {dict(MEMORY_STORE['patterns'])}")

Testing memory system:


Claim 1: The Boeing 747 has four engines
Verdict: LEGITIMATE (95%)

Claim 2: The Boeing 747 can fly at Mach 2
Verdict: BS (90%)
Used context from 2 related claims

Claim 3: A quantum engine can achieve light speed
Verdict: BS (95%)

Claim 4: Another quantum breakthrough defies physics
Verdict: UNCERTAIN (60%)


📊 Memory Statistics:
Total claims stored: 4
Unique entities: 5
BS patterns detected: {'quantum': 1}


## Putting It All Together

Now let's build the complete BS detector with all features.

In [17]:
# Complete architecture
complete_flow = """
graph TD
    A[Claim] --> B[Memory Retrieval]
    B --> C[Router]
    C --> D[Expert + Tools]
    D --> E{Confidence Check}
    E -->|High| F[Store & Return]
    E -->|Low| G[Human Review]
    G --> F
    F --> H[Update Memory]
"""
display(render_mermaid(complete_flow))

In [18]:
# Complete state
class CompleteState(BaseModel):
    """Complete state with all features"""
    # Core fields
    claim: str
    verdict: Optional[str] = None
    confidence: Optional[int] = None
    reasoning: Optional[str] = None
    
    # Routing
    claim_type: Optional[str] = None
    analyzing_agent: Optional[str] = None
    
    # Tools
    evidence: Optional[str] = None
    used_tools: List[str] = Field(default_factory=list)
    
    # Human review
    needs_human_review: bool = False
    human_reviewed: bool = False
    
    # Memory
    memory_context: Optional[dict] = None
    related_claims: List[dict] = Field(default_factory=list)

# Build complete graph
def create_complete_bs_detector():
    """Create the complete BS detector graph"""
    workflow = StateGraph(CompleteState)
    
    # Memory retrieval node
    def memory_node(state: CompleteState) -> dict:
        context = MemoryManager.get_context(state.claim)
        return {"memory_context": context}
    
    # Enhanced router with memory
    def smart_router(state: CompleteState) -> dict:
        # Use memory context in routing
        context_info = ""
        if state.memory_context and state.memory_context["detected_patterns"]:
            context_info = f"(Known BS patterns detected: {state.memory_context['detected_patterns']})"
        
        routing_prompt = f"""Categorize this claim {context_info}:
        
Categories: technical, historical, current_event, general

Claim: {state.claim}"""
        
        # Use structured output for routing
        structured_llm = llm.with_structured_output(RoutingOutput)
        result = structured_llm.invoke(routing_prompt)
        
        return {"claim_type": result.claim_type}
    
    # Expert with all features
    def complete_expert(state: CompleteState) -> dict:
        # Build context-aware prompt
        prompt = f"Analyze this {state.claim_type} claim:\n\n"
        
        if state.memory_context and state.memory_context["related_claims"]:
            prompt += "Related claims:\n"
            for rc in state.memory_context["related_claims"]:
                prompt += f"- {rc['claim']}: {rc['verdict']}\n"
            prompt += "\n"
        
        prompt += f"Claim: {state.claim}"
        
        # Get verdict
        structured_llm = llm.with_structured_output(BSDetectorOutput)
        result = structured_llm.invoke(prompt)
        
        # Store in memory
        MemoryManager.store_claim(
            state.claim,
            result.verdict,
            result.confidence,
            result.reasoning
        )
        
        return {
            "verdict": result.verdict,
            "confidence": result.confidence,
            "reasoning": result.reasoning,
            "analyzing_agent": f"{state.claim_type}_expert",
            "needs_human_review": result.confidence < 70
        }
    
    # Add nodes
    workflow.add_node("memory", memory_node)
    workflow.add_node("router", smart_router)
    workflow.add_node("expert", complete_expert)
    
    # Add edges
    workflow.add_edge(START, "memory")
    workflow.add_edge("memory", "router")
    workflow.add_edge("router", "expert")
    workflow.add_edge("expert", END)
    
    return workflow.compile()

# Test complete system
print("Testing complete BS detector:\n")

complete_app = create_complete_bs_detector()

# Test various claims
final_test_claims = [
    "The Wright Brothers first flew in 1903",
    "The Wright Brothers used jet engines",
    "Quantum computers violate thermodynamics",
    "Another quantum device creates free energy"
]

for claim in final_test_claims:
    result = complete_app.invoke({"claim": claim})  # Pass dictionary
    
    print(f"\nClaim: {claim}")
    print(f"Type: {result['claim_type']} (Agent: {result['analyzing_agent']})")
    print(f"Verdict: {result['verdict']} ({result['confidence']}%)")
    print(f"Reasoning: {result['reasoning'][:100]}...")
    
    if result['needs_human_review']:
        print("⚠️  Flagged for human review")
    
    if result['memory_context'] and result['memory_context']["detected_patterns"]:
        print(f"🚨 BS Pattern: {result['memory_context']['detected_patterns']}")

Testing complete BS detector:


Claim: The Wright Brothers first flew in 1903
Type: historical (Agent: historical_expert)
Verdict: LEGITIMATE (95%)
Reasoning: The claim that the Wright Brothers first flew in 1903 is well-documented and widely accepted by hist...

Claim: The Wright Brothers used jet engines
Type: historical (Agent: historical_expert)
Verdict: BS (95%)
Reasoning: The Wright Brothers achieved their first powered flight in 1903 using a piston engine, not jet engin...

Claim: Quantum computers violate thermodynamics
Type: technical (Agent: technical_expert)
Verdict: BS (90%)
Reasoning: Quantum computers operate based on the laws of quantum mechanics and do not violate the fundamental ...

Claim: Another quantum device creates free energy
Type: technical (Agent: technical_expert)
Verdict: BS (90%)
Reasoning: The claim that a quantum device creates free energy violates the fundamental laws of physics, specif...
🚨 BS Pattern: ['quantum']


## Summary and Next Steps

### What We Built
1. **Simple Detector** → Basic prompt engineering
2. **Structured Output** → Pydantic models for consistent results
3. **LangGraph Basics** → State management and retry logic
4. **Multi-Agent System** → Specialized experts for different domains
5. **Tool Integration** → Evidence gathering with search
6. **Human-in-the-Loop** → Interrupt pattern for human review
7. **Memory System** → Learning from past claims

### Key LangGraph Concepts
- **StateGraph** - Manages state throughout execution
- **Nodes** - Functions that process and update state
- **Edges** - Define flow between nodes
- **Conditional Edges** - Dynamic routing based on state
- **Tools** - Extend agent capabilities
- **Interrupts** - Pause for human input
- **Memory** - Persist state across runs

### Production Considerations
1. **Persistence** - Use proper database for memory
2. **Search** - Integrate real web search APIs
3. **Monitoring** - Track performance and accuracy
4. **Scaling** - Consider async execution
5. **Security** - Validate inputs and outputs

### Exercises
1. Add a new expert type (e.g., medical claims)
2. Implement confidence calibration
3. Add citation tracking for evidence
4. Build a UI for human review
5. Export memory to analyze patterns

### Resources
- [LangGraph Documentation](https://github.com/langchain-ai/langgraph)
- [LangChain Tools](https://python.langchain.com/docs/modules/tools/)
- [Pydantic Models](https://docs.pydantic.dev/)

Thank you for participating! 🚀