# Advanced LangChain Patterns

This notebook covers advanced agent patterns:

1. **Structured Output** - Extracting typed data from LLM responses
2. **Dynamic Prompts** - Runtime prompt customization via middleware
3. **Human-in-the-Loop (HITL)** - Requiring approval before tool execution

---

## Setup

In [17]:
# %pip install -qU langchain langchain-openai langchain-community langgraph

In [35]:
import os
import getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")

---

# Part 1: Structured Output

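You often need structured data from an LLM, not just free text. In LangChain, the `response_format` parameter of `create_agent` makes the agent return output that matches a given schema, available under `result["structured_response"]`. For comparison, the first cells below do the same extraction with the OpenAI SDK directly, using `client.responses.parse` and a Pydantic model.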

## 1.1 Using TypedDict

In [36]:
from openai import OpenAI

client = OpenAI()

In [37]:
from pydantic import BaseModel, Field
from typing import List


class NamesInfo(BaseModel):
    names: List[str] = Field(description="names in the notes")

raw_notes = """
Lucas has implemented the architecture and asked to John
from product about the features to include.
John talked to Tiffany the designer about the layout design.
"""


response = client.responses.parse(
    model="gpt-5-mini",
    input=f"Extract the names from these meeting notes: {raw_notes}",
    text_format=NamesInfo
)

response

ParsedResponse[NamesInfo](id='resp_0cf040e01edc34eb006941900400c08190836f3b1b5e5d47a3', created_at=1765904388.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-5-mini-2025-08-07', object='response', output=[ResponseReasoningItem(id='rs_0cf040e01edc34eb006941900485bc81909e50a06b99cb2fee', summary=[], type='reasoning', content=None, encrypted_content=None, status=None), ParsedResponseOutputMessage[NamesInfo](id='msg_0cf040e01edc34eb0069419006064c81909601f02d703f1d18', content=[ParsedResponseOutputText[NamesInfo](annotations=[], text='{"names":["Lucas","John","Tiffany"]}', type='output_text', logprobs=[], parsed=NamesInfo(names=['Lucas', 'John', 'Tiffany']))], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, prompt_cache_ret

In [50]:
response.output_parsed.names

['Lucas', 'John', 'Tiffany']

In [51]:
from typing_extensions import TypedDict
from langchain.agents import create_agent

# Define the output schema
class ContactInfo(TypedDict):
    name: str
    email: str
    phone: str

# Create agent with structured output
contact_extractor = create_agent(
    model="openai:gpt-4o-mini", 
    response_format=ContactInfo
)

In [52]:
# Extract contact info from unstructured text
recorded_conversation = """
We talked with John Doe. He works over at Example Corp. 
His number is five, five, five, one two three, four five six seven. 
And his email was john at example.com. 
He wanted to order 50 boxes of supplies.
"""

result = contact_extractor.invoke(
    {"messages": recorded_conversation}
)

# Access the structured response
print("Structured output:")
print(result["structured_response"])

Structured output:
{'name': 'John Doe', 'email': 'john@example.com', 'phone': '5551234567'}


In [53]:
# Access individual fields
contact = result["structured_response"]
print(f"Name: {contact['name']}")
print(f"Email: {contact['email']}")
print(f"Phone: {contact['phone']}")

Name: John Doe
Email: john@example.com
Phone: 5551234567
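`TypedDict` is a static typing aid: Python does not check a plain dict against it at runtime. The following is a minimal sketch of such a check using only the standard library. `check_typed_dict` is a hypothetical helper (not part of LangChain), and it only handles flat schemas whose field types are plain classes such as `str`.

```python
from typing import TypedDict, get_type_hints


class ContactInfo(TypedDict):
    name: str
    email: str
    phone: str


def check_typed_dict(data: dict, schema: type) -> list:
    """Return a list of problems; an empty list means the dict fits the schema."""
    problems = []
    hints = get_type_hints(schema)
    for field_name, expected in hints.items():
        if field_name not in data:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(data[field_name], expected):
            actual = type(data[field_name]).__name__
            problems.append(f"{field_name}: expected {expected.__name__}, got {actual}")
    # Report keys that the schema does not declare
    for field_name in sorted(data.keys() - hints.keys()):
        problems.append(f"unexpected field: {field_name}")
    return problems


good = {"name": "John Doe", "email": "john@example.com", "phone": "5551234567"}
bad = {"name": "John Doe", "phone": 5551234567}  # made-up malformed example

print(check_typed_dict(good, ContactInfo))
print(check_typed_dict(bad, ContactInfo))
```

A check like this is one reason to prefer a Pydantic model when the values need validating, since Pydantic performs it for you.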


## 1.2 Using Pydantic Models

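Pydantic provides validation and richer type support. With a `BaseModel` schema, `structured_response` is a model instance, so fields are read as attributes (`notes.title`) rather than dict keys: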

In [54]:
from pydantic import BaseModel, Field
from typing import List, Optional

class MeetingNote(BaseModel):
    title: str = Field(description="Brief title for the meeting")
    attendees: List[str] = Field(description="List of people who attended")
    action_items: List[str] = Field(description="Tasks to be completed")
    next_meeting: Optional[str] = Field(description="Date/time of next meeting if mentioned")
    summary: str = Field(description="2-3 sentence summary of the meeting")

meeting_summarizer = create_agent(
    model="openai:gpt-5-mini",
    response_format=MeetingNote,
    system_prompt="Extract structured meeting notes from the transcript."
)

In [55]:
transcript = """
Meeting started at 2pm with Alice, Bob, and Charlie present.

Alice: Let's discuss the Q4 roadmap. We need to finalize features by Friday.

Bob: I can handle the API documentation. Should be done by Wednesday.

Charlie: I'll review the security audit report and send recommendations.

Alice: Great. Let's meet again next Monday at 10am to review progress.

Meeting ended at 2:30pm.
"""

result = meeting_summarizer.invoke({"messages": transcript})
notes = result["structured_response"]

print(f"Title: {notes.title}")
print(f"Attendees: {', '.join(notes.attendees)}")
print("\nAction Items:")
for item in notes.action_items:
    print(f"  - {item}")
print(f"\nNext Meeting: {notes.next_meeting}")
print(f"\nSummary: {notes.summary}")

Title: Q4 Roadmap Meeting
Attendees: Alice, Bob, Charlie

Action Items:
  - Finalize Q4 roadmap features by Friday.
  - Bob to complete API documentation by Wednesday.
  - Charlie to review the security audit report and send recommendations.

Next Meeting: Next Monday at 10am

Summary: The team reviewed the Q4 roadmap and set deadlines: features must be finalized by Friday. Bob will complete API documentation by Wednesday, and Charlie will review the security audit report and send recommendations. A follow-up meeting is scheduled for next Monday at 10am.


## 1.3 Complex Nested Structures

In [56]:
from typing import Literal

class Task(BaseModel):
    description: str
    assignee: str
    priority: Literal["high", "medium", "low"]
    estimated_hours: float

class ProjectPlan(BaseModel):
    project_name: str
    objective: str
    tasks: List[Task]
    total_estimated_hours: float
    risks: List[str]

project_planner = create_agent(
    model="openai:gpt-5-mini",
    response_format=ProjectPlan,
    system_prompt="Create a detailed project plan from the given requirements."
)

In [57]:
requirements = """
We need to build a customer feedback dashboard.
Team: Sarah (frontend), Mike (backend), Lisa (design)
Must be done in 2 weeks.
Features: sentiment analysis, charts, export to PDF.
"""

result = project_planner.invoke({"messages": requirements})
plan = result["structured_response"]

print(f"Project: {plan.project_name}")
print(f"Objective: {plan.objective}")
print("\nTasks:")
for task in plan.tasks:
    print(f"  [{task.priority.upper()}] {task.description}")
    print(f"      Assignee: {task.assignee}, Est: {task.estimated_hours}h")
print(f"\nTotal Hours: {plan.total_estimated_hours}")
print("\nRisks:")
for risk in plan.risks:
    print(f"  - {risk}")

Project: Customer Feedback Dashboard
Objective: Deliver a production-ready dashboard in 2 weeks that ingests customer feedback, runs sentiment analysis, displays interactive charts, and allows exporting reports to PDF.

Tasks:
  [HIGH] Kickoff meeting to confirm scope, success criteria, data sources, roles, timeline, and acceptance tests. Create a prioritized feature backlog for the 2-week delivery.
      Assignee: Sarah, Mike, Lisa, Est: 4.0h
  [HIGH] Create UX wireframes and define layout, user flows, chart types, and PDF report layout. Deliver design tokens and basic design system (colors, typography, spacing).
      Assignee: Lisa, Est: 12.0h
  [HIGH] Produce high-fidelity mockups for dashboard screens (overview, detail, filters, export) and iterate with team for sign-off.
      Assignee: Lisa, Est: 8.0h
  [MEDIUM] Design handoff: export assets, spec component states, and provide CSS/asset guidance for frontend implementation.
      Assignee: Lisa, Est: 6.0h
  [HIGH] Define backend

In [61]:
result.keys()

dict_keys(['messages', 'structured_response'])
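Matching the schema does not guarantee that the values are internally consistent: nothing forces `total_estimated_hours` to equal the sum of the task estimates. A minimal sketch of a post-hoc check follows. It uses standard-library dataclasses as stand-ins for the Pydantic models above so that it runs without an API call; `hours_mismatch` and the sample numbers are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    description: str
    assignee: str
    priority: str
    estimated_hours: float


@dataclass
class ProjectPlan:
    project_name: str
    tasks: List[Task] = field(default_factory=list)
    total_estimated_hours: float = 0.0


def hours_mismatch(plan: ProjectPlan, tolerance: float = 0.01) -> float:
    """Return reported total minus summed task hours, or 0.0 if within tolerance."""
    summed = math.fsum(task.estimated_hours for task in plan.tasks)
    diff = plan.total_estimated_hours - summed
    return 0.0 if abs(diff) <= tolerance else diff


plan = ProjectPlan(
    project_name="Customer Feedback Dashboard",
    tasks=[
        Task("Kickoff meeting", "Sarah", "high", 4.0),
        Task("UX wireframes", "Lisa", "high", 12.0),
        Task("High-fidelity mockups", "Lisa", "high", 8.0),
    ],
    total_estimated_hours=30.0,  # made-up value, deliberately inconsistent
)

diff = hours_mismatch(plan)
if diff:
    print(f"Reported total is off by {diff:+.1f}h")
```

With the Pydantic models from this section, the same function could be applied to `result["structured_response"]` directly, since it only reads `tasks`, `estimated_hours`, and `total_estimated_hours`.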

---

# Part 2: Dynamic Prompts

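Sometimes you need to customize the system prompt at runtime based on context. LangChain middleware enables this with `@dynamic_prompt`: the decorated function receives a `ModelRequest` and returns the system prompt string to use for that model call.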

## 2.1 Role-Based Access Control

In [26]:
from dataclasses import dataclass
from langchain_community.utilities import SQLDatabase
from langchain_core.tools import tool
from langgraph.runtime import get_runtime
from langchain.agents.middleware.types import ModelRequest, dynamic_prompt

db = SQLDatabase.from_uri("sqlite:///./assets-resources/Chinook.db")

@dataclass
class RuntimeContext:
    is_employee: bool  # Access control flag
    db: SQLDatabase

@tool
def execute_sql(query: str) -> str:
    """Execute a SQLite SELECT query and return results."""
    runtime = get_runtime(RuntimeContext)
    try:
        return runtime.context.db.run(query)
    except Exception as e:
        return f"Error: {e}"

In [27]:
DATABASE_SCHEMA = """
Tables and columns:
- Album(AlbumId, Title, ArtistId)
- Artist(ArtistId, Name)
- Customer(CustomerId, FirstName, LastName, Company, Address, City, State, Country, PostalCode, Phone, Fax, Email, SupportRepId)
- Employee(EmployeeId, LastName, FirstName, Title, ReportsTo, BirthDate, HireDate, Address, City, State, Country, PostalCode, Phone, Fax, Email)
- Genre(GenreId, Name)
- Invoice(InvoiceId, CustomerId, InvoiceDate, BillingAddress, BillingCity, BillingState, BillingCountry, BillingPostalCode, Total)
- InvoiceLine(InvoiceLineId, InvoiceId, TrackId, UnitPrice, Quantity)
- MediaType(MediaTypeId, Name)
- Playlist(PlaylistId, Name)
- PlaylistTrack(PlaylistId, TrackId)
- Track(TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)
"""

SYSTEM_PROMPT_TEMPLATE = """You are a SQLite analyst for a music store database.

Database Schema:
{schema}

Rules:
- Use execute_sql for SELECT queries only.
- Limit to 5 rows unless asked otherwise.
{table_limits}
- If errors occur, revise and retry.
"""

@dynamic_prompt
def access_controlled_prompt(request: ModelRequest) -> str:
    """Generate prompt based on user's access level."""
    if not request.runtime.context.is_employee:
        # Non-employees have limited table access
        table_limits = "- You can ONLY access: Album, Artist, Genre, Playlist, PlaylistTrack, Track."
    else:
        # Employees have full access
        table_limits = "- You have access to all tables."
    
    return SYSTEM_PROMPT_TEMPLATE.format(schema=DATABASE_SCHEMA, table_limits=table_limits)
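To see what the prompt function returns for each role without calling a model, the branching logic can be exercised on its own. This is a minimal sketch under stated assumptions: `build_prompt` is a hypothetical undecorated stand-in for `access_controlled_prompt`, `fake_request` uses `SimpleNamespace` to mimic the `request.runtime.context` attribute path, and the template is abbreviated.

```python
from types import SimpleNamespace

# Abbreviated version of the system prompt template
TEMPLATE = """You are a SQLite analyst for a music store database.

Rules:
- Use execute_sql for SELECT queries only.
{table_limits}
"""


def build_prompt(request) -> str:
    """Same branching as access_controlled_prompt, without the decorator."""
    if not request.runtime.context.is_employee:
        table_limits = "- You can ONLY access: Album, Artist, Genre, Playlist, PlaylistTrack, Track."
    else:
        table_limits = "- You have access to all tables."
    return TEMPLATE.format(table_limits=table_limits)


def fake_request(is_employee: bool) -> SimpleNamespace:
    """Build an object exposing only request.runtime.context.is_employee."""
    context = SimpleNamespace(is_employee=is_employee)
    return SimpleNamespace(runtime=SimpleNamespace(context=context))


guest_prompt = build_prompt(fake_request(False))
staff_prompt = build_prompt(fake_request(True))

print(guest_prompt)
print(staff_prompt)
```

Keeping the branching in a plain function like this also makes it easy to unit-test the prompt text separately from the agent.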

In [28]:
from langchain.agents import create_agent

# Create agent with dynamic prompt middleware
access_controlled_agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[execute_sql],
    middleware=[access_controlled_prompt],  # <-- Dynamic prompt
    context_schema=RuntimeContext,
)

In [29]:
# Non-employee: the prompt limits the model to catalog tables.
# This is a soft restriction; execute_sql itself does not enforce it.
question = "What is the most costly purchase by Frank Harris?"

print("=== Non-Employee Access ===")
for step in access_controlled_agent.stream(
    {"messages": question},
    context=RuntimeContext(is_employee=False, db=db),
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

=== Non-Employee Access ===

What is the most costly purchase by Frank Harris?
Tool Calls:
  execute_sql (call_hqQAhcldGcXkGAksZ3Jyr6nv)
 Call ID: call_hqQAhcldGcXkGAksZ3Jyr6nv
  Args:
    query: SELECT Invoice.InvoiceId, Invoice.Total FROM Invoice JOIN Customer ON Invoice.CustomerId = Customer.CustomerId WHERE Customer.FirstName = 'Frank' AND Customer.LastName = 'Harris' ORDER BY Invoice.Total DESC LIMIT 1;
Name: execute_sql

[(145, 13.86)]

The most costly purchase by Frank Harris was an invoice with a total amount of $13.86 (Invoice ID: 145).


In [30]:
# Employee: Should have full access
print("\n=== Employee Access ===")
for step in access_controlled_agent.stream(
    {"messages": question},
    context=RuntimeContext(is_employee=True, db=db),
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


=== Employee Access ===

What is the most costly purchase by Frank Harris?
Tool Calls:
  execute_sql (call_yaLQ1BgnX2izDGxQtR25xGeW)
 Call ID: call_yaLQ1BgnX2izDGxQtR25xGeW
  Args:
    query: SELECT Invoice.InvoiceId, Invoice.Total FROM Invoice
JOIN Customer ON Invoice.CustomerId = Customer.CustomerId
WHERE Customer.FirstName = 'Frank' AND Customer.LastName = 'Harris'
ORDER BY Invoice.Total DESC
LIMIT 1;
Name: execute_sql

[(145, 13.86)]

The most costly purchase by Frank Harris was an invoice with a total of $13.86, specifically for Invoice ID 145.
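In the non-employee run above, the agent still queried `Customer` and `Invoice`. The prompt tells the model which tables it may use, but `execute_sql` runs whatever query it receives. One way to enforce the restriction is to check the query inside the tool before running it. The sketch below is a deliberately simple illustration: `ALLOWED_FOR_GUESTS`, `referenced_tables`, and `is_query_allowed` are hypothetical names, and the regex only catches identifiers that directly follow `FROM` or `JOIN`. It is not a SQL parser and would miss comma joins and quoted identifiers.

```python
import re

# Tables a non-employee may read (taken from the prompt text)
ALLOWED_FOR_GUESTS = {"album", "artist", "genre", "playlist", "playlisttrack", "track"}

# Matches the identifier right after FROM or JOIN
_TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def referenced_tables(query: str) -> set:
    """Return the lower-cased table names that follow FROM or JOIN."""
    return {name.lower() for name in _TABLE_REF.findall(query)}


def is_query_allowed(query: str, is_employee: bool) -> bool:
    """Allow SELECT only; restrict non-employees to the guest table list."""
    if not query.lstrip().lower().startswith("select"):
        return False
    if is_employee:
        return True
    return referenced_tables(query) <= ALLOWED_FOR_GUESTS


customer_query = (
    "SELECT Invoice.InvoiceId, Invoice.Total FROM Invoice "
    "JOIN Customer ON Invoice.CustomerId = Customer.CustomerId "
    "WHERE Customer.FirstName = 'Frank' ORDER BY Invoice.Total DESC LIMIT 1;"
)
catalog_query = "SELECT Name FROM Track JOIN Album ON Track.AlbumId = Album.AlbumId LIMIT 5"

print(is_query_allowed(customer_query, is_employee=False))
print(is_query_allowed(customer_query, is_employee=True))
print(is_query_allowed(catalog_query, is_employee=False))
```

Inside `execute_sql`, a check like this could read `runtime.context.is_employee` and return an error string instead of running a disallowed query. For real access control, database-level permissions or a read-only connection limited to the permitted tables would be more robust than string matching.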


---

# Part 3: Human-in-the-Loop (HITL)

For sensitive operations, you may want human approval before the agent executes tools. The `HumanInTheLoopMiddleware` provides this capability.

## 3.1 Basic HITL Setup

In [31]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver

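# Redefines RuntimeContext without the access flag; execute_sql only reads `db`.
# `dataclass` and `SQLDatabase` were imported in Part 2.
@dataclass
class RuntimeContext:
    db: SQLDatabase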

SYSTEM_PROMPT = f"""You are a SQLite analyst for a music store database.

Database Schema:
{DATABASE_SCHEMA}

Rules:
- Use execute_sql for SELECT queries only.
- Limit to 5 rows unless asked otherwise.
- If the database is offline, ask the user to try again later.
"""

# Create agent with HITL middleware
hitl_agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[execute_sql],
    system_prompt=SYSTEM_PROMPT,
    checkpointer=InMemorySaver(),  # Required for HITL: holds the paused run until a decision arrives
    context_schema=RuntimeContext,
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "execute_sql": {"allowed_decisions": ["approve", "reject"]}
            },
        ),
    ],
)
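The cell above only configures the middleware. Conceptually, the agent pauses when the model proposes an `execute_sql` call and proceeds only after a human decision of `approve` or `reject`. The sketch below illustrates that gate in plain Python so it can run without a model or API key. `ToolCall`, `run_with_approval`, and `fake_execute_sql` are hypothetical and are not LangChain's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    name: str
    args: Dict[str, str]


def run_with_approval(
    call: ToolCall,
    tools: Dict[str, Callable[..., str]],
    decide: Callable[[ToolCall], str],
    interrupt_on: frozenset = frozenset({"execute_sql"}),
) -> str:
    """Run the tool unless a reviewer rejects a call listed in interrupt_on."""
    if call.name in interrupt_on:
        decision = decide(call)
        if decision not in ("approve", "reject"):
            raise ValueError(f"unsupported decision: {decision!r}")
        if decision == "reject":
            return f"Tool call {call.name} was rejected by the reviewer."
    return tools[call.name](**call.args)


def fake_execute_sql(query: str) -> str:
    """Stand-in tool that echoes the query instead of touching a database."""
    return f"ran: {query}"


tools = {"execute_sql": fake_execute_sql}
call = ToolCall("execute_sql", {"query": "SELECT Name FROM Artist LIMIT 5"})

approved = run_with_approval(call, tools, decide=lambda c: "approve")
rejected = run_with_approval(call, tools, decide=lambda c: "reject")
print(approved)
print(rejected)
```

In a notebook, `decide` could prompt the reviewer with `input()` after printing the proposed query. With the real middleware, the pause and resume go through the checkpointer rather than a callback.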

---

## Summary

In this notebook, we covered:

1. **Structured Output** - Extracting typed data:
   - `TypedDict` for simple schemas
   - Pydantic `BaseModel` for validation and complex types
   - Nested structures with Lists and Optional fields
   - Access via `result["structured_response"]`

2. **Dynamic Prompts** - Runtime customization:
   - `@dynamic_prompt` decorator
   - Access runtime context via `request.runtime.context`
   - Role-based access control example

3. **Human-in-the-Loop** - Approval workflows:
   - `HumanInTheLoopMiddleware` for tool approval

---

**Next:** [Notebook 4: Modern RAG with LangChain](./2.0-modern-rag-langchain.ipynb)