# 🎓 Week 14 - Day 4: LangChain II - Advanced Features

## Today's Goals:
✅ Create structured outputs with Pydantic  
✅ Implement advanced memory patterns  
✅ Use LangSmith for debugging and monitoring  
✅ Build production-ready LLM applications  
✅ Optimize chain performance

## ⏱️ Estimated Time: 90 minutes

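**Note:** We'll use OpenAI and LangSmith. LangSmith has a free tier; OpenAI API calls are billed per token, so check your account's usage limits before running the cells.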


## 🔧 Part 1: Setup - Install & Import All Libraries

**IMPORTANT:** Run ALL cells in this part sequentially!


In [1]:
# STEP 1: Install required packages

!pip install -q langchain==0.1.20
!pip install -q langchain-openai==0.1.7
!pip install -q pydantic==2.6.4
# langchain 0.1.20 requires langsmith>=0.1.17,<0.2.0, so pin within that range
!pip install -q "langsmith>=0.1.17,<0.2.0"

print("✅ All libraries installed!")


✅ All libraries installed!


In [2]:
# STEP 2: Import libraries

from langchain.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
    ConversationSummaryMemory
)
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field
from typing import List
import os

print("✅ Libraries imported!")


✅ Libraries imported!


In [3]:
# STEP 3: Set up API keys

from getpass import getpass

# OpenAI API key (required). Prompting keeps the secret out of the notebook.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

# LangSmith API key (optional, for tracing). Get one from smith.langchain.com.
# Leave the prompt blank to skip tracing.
langsmith_key = os.environ.get("LANGCHAIN_API_KEY") or getpass("LangSmith API key (optional): ")
if langsmith_key:
    os.environ["LANGCHAIN_API_KEY"] = langsmith_key
    os.environ["LANGCHAIN_TRACING_V2"] = "true"

print("✅ API keys configured!")


✅ API keys configured!


### 💡 Key Insights:
- **Pydantic** for data validation
- **LangSmith** for debugging (optional but recommended)
- All features work without LangSmith too


## 📝 Part 2: The Problem with Unstructured Outputs

See why we need structured outputs!


In [4]:
# STEP 1: Make a normal LLM call

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

prompt = "Extract the person's name and age from: 'John is 25 years old'. Return as JSON."
response = llm.invoke(prompt)

print("🤖 LLM Response:")
print(response.content)
print("\n⚠️  Issues:")
print("1. Response is a string, not structured data")
print("2. May include extra text")
print("3. Format can vary")
print("4. Hard to parse reliably")



### 💡 Key Insights:
- **Raw LLM output** = String (unpredictable)
- Need to **parse manually** (error-prone)
- **Not production-ready**
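
To make the fragility concrete, here is a minimal sketch of manual parsing using only the standard library. The two reply strings are made-up examples of how a model might answer the prompt above (they are not real API output), and `extract_json` is a hypothetical helper written for this illustration.

```python
import json
import re

# Two hypothetical replies to the same prompt (illustrative, not real API output).
clean_reply = '{"name": "John", "age": 25}'
chatty_reply = 'Sure! Here is the JSON: {"name": "John", "age": 25} Let me know if you need more.'


def extract_json(reply: str) -> dict:
    """Naive manual parsing: grab the first {...} span and decode it."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))


# Direct decoding only works when the reply is pure JSON.
print(json.loads(clean_reply))

try:
    json.loads(chatty_reply)
except json.JSONDecodeError as err:
    print(f"Direct decoding failed: {err.msg}")

# The regex workaround handles this case, but it is still a guess about the format.
print(extract_json(chatty_reply))
```

The workaround breaks again as soon as the reply contains two objects or a stray brace, which is why the next part moves the format contract into a schema.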


## ✅ Part 3: Structured Outputs with Pydantic

Define exact output format!


In [None]:
# STEP 1: Define output schema with Pydantic

class Person(BaseModel):
    """Person information"""
    name: str = Field(description="The person's full name")
    age: int = Field(description="The person's age in years")

print("✅ Person schema created!")
print(f"Fields: {list(Person.__fields__)}")


In [None]:
# STEP 2: Create parser and prompt

parser = JsonOutputParser(pydantic_object=Person)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract person information. {format_instructions}"),
    ("human", "{text}")
])

# Add format instructions
prompt = prompt.partial(format_instructions=parser.get_format_instructions())

print("✅ Parser and prompt ready!")


In [None]:
# STEP 3: Create chain with structured output

chain = prompt | llm | parser

# Test it
result = chain.invoke({"text": "Sarah is 30 years old and lives in NYC"})

print("✅ Structured Result:")
print(f"Type: {type(result)}")
print(f"Name: {result['name']}")
print(f"Age: {result['age']}")
print("\n💡 Clean Python dict, parsed from the model's JSON!")


In [None]:
# STEP 4: Try multiple examples

texts = [
    "Mike is 45 years old",
    "Emma turned 22 yesterday",
    "The doctor, Alex Chen, is 38"
]

print("📊 Batch Processing:\n")
for text in texts:
    result = chain.invoke({"text": text})
    print(f"Text: {text}")
    print(f"  → Name: {result['name']}, Age: {result['age']}")
    print()


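### 💡 Key Insights:
- **Pydantic schema** defines the structure the model is asked to produce
- **JsonOutputParser** adds the schema to the prompt and parses the reply into a dict; it does not validate field types
- **More consistent** output than free-form text, though the model can still return malformed or incomplete JSON
- For **strict validation**, swap in `PydanticOutputParser`, which returns a validated `Person` instance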
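
Since the chain above returns a plain dict (`result['name']`, `result['age']`), a type check after parsing is a useful safety net. The sketch below shows the kind of check a schema validator performs, using only the standard library. `validate_person` is a hypothetical helper, not part of LangChain or Pydantic, and the sample dicts are made up.

```python
def validate_person(data: dict) -> dict:
    """Check that a parsed dict matches the Person schema (name: str, age: int)."""
    expected = {"name": str, "age": int}
    errors = []
    for field, field_type in expected.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(data[field], bool) or not isinstance(data[field], field_type):
            errors.append(
                f"{field} should be {field_type.__name__}, "
                f"got {type(data[field]).__name__}"
            )
    if errors:
        raise ValueError("; ".join(errors))
    return data


# A well-formed result passes through unchanged.
print(validate_person({"name": "Sarah", "age": 30}))

# A reply where the model wrote the age as text is rejected.
try:
    validate_person({"name": "Sarah", "age": "thirty"})
except ValueError as err:
    print(f"Rejected: {err}")
```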


## 📋 Part 4: Complex Structured Outputs

Extract multiple entities!


In [None]:
# STEP 1: Define complex schema

class Task(BaseModel):
    """A single task"""
    title: str = Field(description="Task title")
    priority: str = Field(description="Priority: high, medium, or low")
    completed: bool = Field(description="Whether task is done")

class TaskList(BaseModel):
    """List of tasks"""
    tasks: List[Task] = Field(description="List of tasks")

print("✅ Complex schema created!")


In [None]:
# STEP 2: Create parser for complex schema

parser = JsonOutputParser(pydantic_object=TaskList)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract tasks from the text. {format_instructions}"),
    ("human", "{text}")
])

prompt = prompt.partial(format_instructions=parser.get_format_instructions())

chain = prompt | llm | parser

print("✅ Complex chain ready!")


In [None]:
# STEP 3: Test complex extraction

text = """
I need to:
1. Finish the project report (high priority, not done yet)
2. Reply to emails (medium priority, already done)
3. Buy groceries (low priority, not done)
"""

result = chain.invoke({"text": text})

print("📋 Extracted Tasks:\n")
for i, task in enumerate(result['tasks'], 1):
    status = "✅" if task['completed'] else "⬜"
    print(f"{i}. {status} [{task['priority'].upper()}] {task['title']}")


### 💡 Key Insights:
- **Nested structures** are supported through nested Pydantic models
- **Lists of objects** are described by `List[Task]` in the schema
- **Complex extraction** needs only a schema change; the chain stays the same
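
The `priority` field is declared as a free-form `str`, so nothing stops the model from returning "High" or "urgent". Here is a small post-processing sketch that normalizes and checks the values before using them. The `sample` dict is hypothetical data with the same shape as `result`, and `pending_by_priority` is an illustrative helper, not a LangChain function.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

# Hypothetical parsed output with the same shape as result["tasks"].
sample = {
    "tasks": [
        {"title": "Buy groceries", "priority": "Low", "completed": False},
        {"title": "Finish the project report", "priority": "high", "completed": False},
        {"title": "Reply to emails", "priority": "medium", "completed": True},
    ]
}


def pending_by_priority(result: dict) -> list:
    """Return unfinished tasks, most urgent first, rejecting unknown priorities."""
    pending = []
    for task in result["tasks"]:
        priority = task["priority"].strip().lower()  # models may vary the casing
        if priority not in PRIORITY_ORDER:
            raise ValueError(f"unexpected priority: {task['priority']!r}")
        if not task["completed"]:
            pending.append({**task, "priority": priority})
    return sorted(pending, key=lambda t: PRIORITY_ORDER[t["priority"]])


for task in pending_by_priority(sample):
    print(f"[{task['priority'].upper()}] {task['title']}")
```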


## 🪟 Part 5: Window Memory - Recent Context Only

Keep only last N messages for efficiency!


In [None]:
# STEP 1: Create window memory (keep last 2 exchanges)

memory = ConversationBufferWindowMemory(
    k=2  # Keep last 2 exchanges (4 messages)
)
# Note: the default memory_key ("history") is kept because ConversationChain's
# default prompt expects a {history} variable.

print("✅ Window memory created!")
print(f"Window size: {memory.k} exchanges")


In [None]:
# STEP 2: Simulate conversation

from langchain.chains import ConversationChain

conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

print("💬 Conversation:\n")

# Message 1
r1 = conversation.predict(input="Hi! My favorite color is blue.")
print("You: Hi! My favorite color is blue.")
print(f"AI: {r1[:60]}...\n")

# Message 2
r2 = conversation.predict(input="I also love pizza.")
print("You: I also love pizza.")
print(f"AI: {r2[:60]}...\n")

# Message 3
r3 = conversation.predict(input="And I have a dog named Max.")
print("You: And I have a dog named Max.")
print(f"AI: {r3[:60]}...\n")


In [None]:
# STEP 3: Test memory (only remembers recent messages)

# This should remember dog (recent)
r4 = conversation.predict(input="What's my dog's name?")
print("You: What's my dog's name?")
print(f"AI: {r4}\n")

# This might NOT remember color (too old - outside window)
r5 = conversation.predict(input="What's my favorite color?")
print("You: What's my favorite color?")
print(f"AI: {r5}")
print("\n💡 Window memory forgot old messages!")


### 💡 Key Insights:
- **Window memory** = Only last K exchanges
- **Efficient** for long conversations
- **Recent context** matters most
- Trade-off: **Forgets old info**
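
The windowing idea can be sketched in a few lines of plain Python. This is a toy model of the behavior, not LangChain's implementation; `WindowBuffer` and the canned AI replies are made up for the illustration.

```python
from collections import deque


class WindowBuffer:
    """Toy sliding-window memory: keeps only the last k (user, ai) exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen drops the oldest item automatically when full.
        self.exchanges = deque(maxlen=k)

    def save(self, user: str, ai: str) -> None:
        self.exchanges.append((user, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.exchanges)


buffer = WindowBuffer(k=2)
buffer.save("My favorite color is blue.", "Nice!")
buffer.save("I also love pizza.", "Yum!")
buffer.save("I have a dog named Max.", "Cute!")

# Only the pizza and dog exchanges remain; the color exchange was dropped.
print(buffer.history())
```

With `k=2`, the third `save` pushes the first exchange out, which is why the chatbot above can answer the dog question but not the color question.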


## 📝 Part 6: Summary Memory - Compress Old Messages

Summarize old messages to save tokens!


In [None]:
# STEP 1: Create summary memory

# Keep the default memory_key ("history"): ConversationChain's default prompt
# expects a {history} variable.
summary_memory = ConversationSummaryMemory(llm=llm)

print("✅ Summary memory created!")
print("Old messages will be summarized")


In [None]:
# STEP 2: Use summary memory

conversation = ConversationChain(
    llm=llm,
    memory=summary_memory,
    verbose=True  # See summaries
)

print("💬 Conversation with Summary:\n")

# Have a longer conversation
messages = [
    "I'm planning a trip to Japan next month.",
    "I want to visit Tokyo, Kyoto, and Osaka.",
    "My budget is around $3000.",
    "I love trying new foods.",
    "What should I pack?"
]

for msg in messages:
    response = conversation.predict(input=msg)
    print(f"You: {msg}")
    print(f"AI: {response[:80]}...\n")


In [None]:
# STEP 3: Check the summary

print("📝 Conversation Summary:")
print(summary_memory.load_memory_variables({}))


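### 💡 Key Insights:
- **Summary memory** compresses old messages into a running summary
- **Saves tokens** in long conversations, at the cost of an extra LLM call per turn to update the summary
- **Retains key information**, though details can be lost in summarization
- Best for **cost optimization** when conversations are long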
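
Here is a toy sketch of the running-summary idea in plain Python. It is not LangChain's implementation: `SummaryBuffer` is a made-up class, and `fake_summarize` is a stand-in for the LLM call that simply keeps the first five words of each human turn.

```python
class SummaryBuffer:
    """Toy summary memory: stores one running summary instead of every message."""

    def __init__(self, summarize):
        self.summarize = summarize  # callable(previous_summary, new_lines) -> str
        self.summary = ""
        self.calls = 0  # how many times the summarizer was invoked

    def save(self, user: str, ai: str) -> None:
        new_lines = f"Human: {user}\nAI: {ai}"
        self.summary = self.summarize(self.summary, new_lines)
        self.calls += 1


def fake_summarize(previous: str, new_lines: str) -> str:
    """Stand-in for the LLM: appends the first five words of the human turn."""
    human_text = new_lines.splitlines()[0][len("Human: "):]
    note = " ".join(human_text.split()[:5])
    return f"{previous} | {note}".strip(" |")


memory = SummaryBuffer(fake_summarize)
memory.save("I'm planning a trip to Japan next month.", "Sounds exciting!")
memory.save("My budget is around $3000.", "That is workable.")

print(memory.summary)
print(f"Summarizer calls: {memory.calls}")
```

The stored state is a single string no matter how many turns happen, but every `save` triggers one summarizer call. With a real LLM, that is the extra cost to weigh against the tokens saved.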


## 🔍 Part 7: LangSmith - Debugging & Tracing

See what happens inside your chains!


In [None]:
# STEP 1: Enable LangSmith tracing

# If you set the env vars in Part 1, tracing is already on!
# Otherwise:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = "your-key"

print("✅ LangSmith tracing enabled!")
print("\n💡 All LLM calls will be traced")
print("View at: https://smith.langchain.com")


In [None]:
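# STEP 2: Create a chain with multiple steps

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

# Multi-step chain
joke_prompt = ChatPromptTemplate.from_template(
    "Tell a short joke about {topic}"
)

explain_prompt = ChatPromptTemplate.from_template(
    "Explain why this joke is funny: {joke}"
)

# Chain: topic → joke → explanation
# StrOutputParser turns the AIMessage into plain text before it reaches {joke}
joke_chain = joke_prompt | llm | StrOutputParser()
explain_chain = explain_prompt | llm

# RunnableParallel is explicit here: `dict | dict` would be a plain Python
# dict merge (3.9+), not a two-step pipeline.
full_chain = (
    {"topic": RunnablePassthrough()}
    | RunnableParallel(joke=joke_chain)
    | explain_chain
)

print("✅ Multi-step chain created!")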


In [None]:
# STEP 3: Run chain (will be traced in LangSmith)

result = full_chain.invoke("programming")

print("🤖 Result:")
print(result.content)

print("\n✅ Check LangSmith dashboard to see:")
print("  • Each step of the chain")
print("  • Input/output at each stage")
print("  • Token counts")
print("  • Latency")
print("\nLink: https://smith.langchain.com")


### 💡 Key Insights:
- **LangSmith** traces every step
- **Debug** complex chains easily
- **Monitor** production apps
- **Free tier** available
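
To get a feel for what a trace records, here is a minimal sketch that captures the input, output, and latency of each step with a decorator. It is a conceptual illustration only and does not use LangSmith's API; `traced`, `trace_log`, and the two placeholder functions are made up for this example.

```python
import time
from functools import wraps

trace_log = []  # one record per traced step


def traced(step_name: str):
    """Decorator that records input, output, and latency for one step."""
    def decorator(func):
        @wraps(func)
        def wrapper(value):
            start = time.perf_counter()
            output = func(value)
            trace_log.append({
                "step": step_name,
                "input": value,
                "output": output,
                "seconds": time.perf_counter() - start,
            })
            return output
        return wrapper
    return decorator


@traced("joke")
def make_joke(topic: str) -> str:
    return f"A placeholder joke about {topic}"  # stands in for an LLM call


@traced("explain")
def explain(joke: str) -> str:
    return f"Explanation of: {joke}"  # stands in for a second LLM call


explain(make_joke("programming"))

for record in trace_log:
    print(f"{record['step']}: {record['input']!r} -> {record['output']!r} "
          f"({record['seconds']:.6f}s)")
```

Each record pairs a step name with what went in and what came out, which is the same per-step view the dashboard gives you for the joke chain above.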


## 🏗️ Part 8: Production-Ready Chain with Everything

Combine structured outputs, memory, and tracing!


In [None]:
# STEP 1: Define structured output for chatbot

class ChatResponse(BaseModel):
    """Chatbot response with metadata"""
    message: str = Field(description="Response message")
    sentiment: str = Field(description="Sentiment: positive, neutral, or negative")
    topics: List[str] = Field(description="Main topics discussed")

print("✅ Response schema defined!")


In [None]:
# STEP 2: Create production chain

from langchain_core.prompts import MessagesPlaceholder

parser = JsonOutputParser(pydantic_object=ChatResponse)

system_prompt = """You are a helpful assistant.
Respond to the user and analyze the conversation.

{format_instructions}
"""

# MessagesPlaceholder inserts the stored messages as real chat turns
# instead of formatting a list of message objects into the system text.
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}")
])

prompt = prompt.partial(format_instructions=parser.get_format_instructions())

# Use window memory (return_messages=True so the history is a list of messages)
memory = ConversationBufferWindowMemory(
    k=3,
    return_messages=True,
    memory_key="chat_history"
)

print("✅ Production chain ready!")


In [None]:
# STEP 3: Create chat function

def chat(user_input):
    """Chat with structured output and memory"""
    
    # Get chat history
    history = memory.load_memory_variables({})["chat_history"]
    
    # Run chain
    chain = prompt | llm | parser
    result = chain.invoke({
        "input": user_input,
        "chat_history": history
    })
    
    # Save to memory
    memory.save_context(
        {"input": user_input},
        {"output": result["message"]}
    )
    
    return result

print("✅ Chat function created!")


In [None]:
# STEP 4: Test production chatbot

print("💬 Testing Production Chatbot:\n")

messages = [
    "I just got accepted to my dream university!",
    "But I'm worried about the tuition costs.",
    "What should I do?"
]

for msg in messages:
    response = chat(msg)
    
    print(f"You: {msg}")
    print(f"AI: {response['message']}")
    print(f"📊 Sentiment: {response['sentiment']}")
    print(f"🏷️  Topics: {', '.join(response['topics'])}")
    print()


### 💡 Key Insights:
- **Structured output** for analysis
- **Memory** for context
- **LangSmith** traces everything
- **Production-ready** pattern
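
One piece a production chatbot usually needs on top of this is handling for replies that fail to parse. Below is a minimal retry sketch in plain Python. `parse_with_retry` is a hypothetical helper, `call_model` stands for any function that returns the model's raw text, and the fake replies are made up so the example runs without an API key.

```python
import json


def parse_with_retry(call_model, user_input: str, max_attempts: int = 3) -> dict:
    """Call the model until its reply decodes as JSON with the required keys."""
    required = {"message", "sentiment", "topics"}
    last_error = None
    for _ in range(max_attempts):
        reply = call_model(user_input)
        try:
            data = json.loads(reply)
            if not isinstance(data, dict):
                raise ValueError("reply is not a JSON object")
            missing = required - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            last_error = err
    raise RuntimeError(f"no valid reply after {max_attempts} attempts: {last_error}")


# Fake model: fails once, then returns valid JSON (illustrative only).
replies = iter([
    "Sorry, here you go:",
    '{"message": "Congrats!", "sentiment": "positive", "topics": ["university"]}',
])
result = parse_with_retry(lambda text: next(replies), "I got accepted!")
print(result["sentiment"])
```

In the `chat` function above, the same idea would wrap the `chain.invoke` call so that a single malformed reply does not crash the conversation.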


## 🎯 Challenge Time!

### 🏆 Beginner Challenge:

**Task:** Build a smart email classifier!

**Requirements:**
1. Define Pydantic schema:
   - category: str (work, personal, spam)
   - priority: str (high, medium, low)
   - summary: str (brief summary)
   - action_needed: bool

2. Create chain with structured output

3. Test with sample emails

4. Add window memory (bonus)

**Example:**
```
Email: "Meeting tomorrow at 2pm about project deadline"
Output: {
  "category": "work",
  "priority": "high",
  "summary": "Project deadline meeting",
  "action_needed": true
}
```

**Try it yourself!** 🚀


## 📚 Summary: What We Learned

### ✅ Key Concepts:

**1. Structured Outputs:**
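- Pydantic schemas describe the expected format
- JsonOutputParser parses the reply into a dict; PydanticOutputParser also validates it
- More predictable data than free-form text
- Suited to production use when paired with validation and error handling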

**2. Pydantic Benefits:**
- Field validation
- Type checking
- Nested structures
- List handling

**3. Advanced Memory:**
- ConversationBufferWindowMemory: Recent context only
- ConversationSummaryMemory: Compress old messages
- Choose based on use case
- Optimize token usage

**4. LangSmith:**
- Trace every LLM call
- Debug complex chains
- Monitor production
- Optimize performance

**5. Production Patterns:**
- Combine all features
- Structured + Memory + Tracing
- Reliable, maintainable apps
- Ready to deploy

### 🎯 Key Takeaways:

1️⃣ Structured outputs solve the unpredictability problem

2️⃣ Different memory types for different needs

3️⃣ LangSmith is essential for production

4️⃣ These patterns make apps reliable

5️⃣ You're ready to build production LLM apps!

---

## 🚀 Next Steps:

- **Practice:** Build your own structured applications
- **Experiment:** Try different memory types
- **Explore:** LangSmith dashboard features
- **Build:** Production-ready chatbots

---

## 📖 Resources:

- **Pydantic Docs:** https://docs.pydantic.dev/
- **LangChain Memory:** https://python.langchain.com/docs/modules/memory/
- **LangSmith:** https://smith.langchain.com

---

**🎉 Congratulations! You've learned advanced LangChain features! 🎉**

**You're now ready to build production-ready LLM applications!** 🚀
