# LangChain Chat Models & Prompt Templates - Comprehensive Guide

This notebook provides a deep dive into LangChain's chat models and prompt engineering techniques using Anthropic's Claude.

## 📚 Learning Objectives
By the end of this notebook, you will understand:
- How to configure and initialize chat models
- Advanced prompt engineering patterns
- Memory management for conversational AI
- Streaming responses for better UX
- Error handling and production-ready practices
- Real-world applications and use cases

## 🎯 Prerequisites
- Python 3.9+ (recent LangChain releases no longer support Python 3.8)
- Anthropic API key
- Basic understanding of LLMs and chat interfaces

---
## Section 1: Environment Setup

### 📖 What This Cell Does:
Imports all necessary libraries for working with LangChain and Claude.

### 🎓 Key Learning:
- **langchain_anthropic**: Provides the ChatAnthropic interface to Claude models
- **langchain_core.messages**: Message types (HumanMessage, SystemMessage, AIMessage) for chat
- **langchain_core.prompts**: Tools for creating structured, reusable prompt templates
- **langchain_core.runnables**: Chain composition and message history management
- **dotenv**: Secure environment variable management for API keys

### ⚠️ Best Practice:
Always use environment variables for API keys - never hardcode them in your code!

In [None]:
# Import required libraries
import os
from dotenv import load_dotenv
from typing import List, Dict, Any

# LangChain imports
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory

print("✓ All libraries imported successfully")

---
### 📖 What This Cell Does:
Loads API keys from the `.env` file and validates they exist.

### 🎓 Key Learning:
- **load_dotenv()**: Reads `.env` file and loads variables into `os.environ`
- **ANTHROPIC_API_KEY**: Required for accessing Claude models
- **LANGSMITH_API_KEY**: Optional - enables tracing and debugging of LangChain applications

### 💡 Task Performed:
1. Load environment variables from `.env` file
2. Retrieve API keys from environment
3. Validate that the Anthropic API key exists
4. Display configuration status (masking sensitive key data)

### ⚠️ Troubleshooting:
If you get "ANTHROPIC_API_KEY not found" error:
1. Ensure `.env` file exists in project root
2. Verify `.env` contains: `ANTHROPIC_API_KEY=your_key_here`
3. Get your API key from: https://console.anthropic.com/
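
### 🔍 Sketch: Masking a Key Safely
The status output in the next cell shows only the last four characters of the key. The same idea can be written as a small reusable helper. This is a minimal sketch: `mask_secret` and its `visible` parameter are hypothetical names introduced here for illustration, not part of LangChain or python-dotenv. Unlike a fixed-width mask, it fully hides very short values so the visible tail never reveals most of the secret.

```python
from typing import Optional


def mask_secret(secret: Optional[str], visible: int = 4) -> str:
    """Return a display-safe version of a secret (hypothetical helper)."""
    if not secret:
        return "Not set"
    # Short values are fully masked so the tail does not give most of them away.
    if len(secret) <= visible * 2:
        return "*" * len(secret)
    # Otherwise keep only the last `visible` characters readable.
    return "*" * (len(secret) - visible) + secret[-visible:]


# Placeholder value for demonstration only, not a real key.
print(mask_secret("sk-ant-example-0000"))
```

Keeping the masking logic in one function makes it harder to accidentally print a full key from a different cell.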

In [None]:
# Load environment variables
load_dotenv()

# Verify API keys are loaded
anthropic_key = os.getenv("ANTHROPIC_API_KEY")
langsmith_key = os.getenv("LANGSMITH_API_KEY")

if not anthropic_key:
    raise ValueError("ANTHROPIC_API_KEY not found. Please create a .env file with your API key.")

print("✓ Environment variables loaded successfully")
print(f"✓ Anthropic API Key: {'*' * 20}{anthropic_key[-4:] if anthropic_key else 'Not set'}")
print(f"✓ LangSmith tracking: {'Enabled' if langsmith_key else 'Disabled (optional)'}")

---
## Section 2: Chat Model Initialization

### 📖 What This Cell Does:
Creates a reusable function to initialize Claude chat models with custom configurations.

### 🎓 Key Learning:
**Model Configuration Parameters:**
- **model_name**: Which Claude version to use (e.g., `claude-3-5-sonnet-20241022`)
  - Sonnet: Balanced performance and speed
  - Opus: Highest intelligence for complex tasks
  - Haiku: Fastest, most cost-effective

- **temperature** (0.0-1.0): Controls randomness/creativity
  - 0.0-0.3: Focused, deterministic (good for factual Q&A, code generation)
  - 0.4-0.7: Balanced (good for general chat; this notebook defaults to 0.7)
  - 0.8-1.0: Creative, varied (good for creative writing, brainstorming)

- **max_tokens**: Maximum length of the response, in tokens (the upper limit depends on the model; this notebook uses 4096)
  - Higher = longer responses but more cost
  - Lower = more concise responses, faster, cheaper

### 💡 Task Performed:
1. Define factory function for creating chat models
2. Set sensible defaults (temperature=0.7, max_tokens=4096)
3. Create a default model instance
4. Display model configuration for verification

### 🔍 When to Adjust:
- **Lower temperature** when you need consistent, accurate answers
- **Higher temperature** when you want diverse, creative outputs
- **Lower max_tokens** for quick responses or cost optimization
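
### 🔍 Sketch: What Temperature Does to a Distribution
To make the temperature ranges above more concrete, here is a toy sketch of temperature-scaled softmax over made-up scores. It follows the standard textbook formulation and is not Claude's actual sampler; `softmax_with_temperature` and the example logits are assumptions chosen for illustration.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat zero temperature as greedy selection: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temp in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring candidate, which is why they suit factual or code tasks, while higher values spread probability across alternatives.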

In [None]:
# Initialize Claude model with various configurations
def create_chat_model(
    model_name: str = "claude-3-5-sonnet-20241022",
    temperature: float = 0.7,
    max_tokens: int = 4096
) -> ChatAnthropic:
    """Create and configure a Claude chat model."""
    return ChatAnthropic(
        model=model_name,
        temperature=temperature,
        max_tokens=max_tokens,
        anthropic_api_key=anthropic_key
    )

# Create default model instance
chat_model = create_chat_model()
print(f"✓ Chat model initialized: {chat_model.model}")
print(f"  - Temperature: {chat_model.temperature}")
print(f"  - Max tokens: {chat_model.max_tokens}")

---
## Section 3: Basic Chat Interaction

### 📖 What This Cell Does:
Demonstrates the simplest way to interact with Claude - sending a single message and getting a response.

### 🎓 Key Learning:
- **HumanMessage**: Represents user input in the conversation
- **invoke()**: Synchronous method to send messages and wait for complete response
- **response.content**: The actual text content of Claude's reply
- **Error handling**: Always wrap API calls in try-except blocks

### 💡 Task Performed:
1. Create a simple chat function that takes a question string
2. Wrap the question in a HumanMessage object
3. Send it to Claude using invoke()
4. Extract and return the text response
5. Handle any errors gracefully

### 🎯 Use Case:
Perfect for:
- Single-turn Q&A
- Simple queries that don't need context
- Quick tests and prototyping

### ⚡ Performance Note:
This is synchronous (blocking) - for real applications with long responses, consider streaming (covered later).
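
### 🔍 Sketch: Retrying Transient Failures
The error-handling point above can be taken one step further by retrying calls that fail for temporary reasons (network hiccups, rate limits). The sketch below uses only the standard library; `call_with_retries`, its parameters, and the `flaky` stand-in function are hypothetical names for illustration and do not call the Anthropic API.

```python
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")


calls: Dict[str, int] = {"count": 0}


def flaky() -> str:
    """Stand-in for an API call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays: List[float] = []
# Record the delays instead of actually sleeping so the demo runs instantly.
result = call_with_retries(flaky, attempts=3, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

In this notebook the wrapped callable could be something like `lambda: chat_model.invoke(messages)`. A real application would likely catch only transient error types rather than every `Exception`; LangChain runnables also expose a `with_retry` method that may be a better fit.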

In [None]:
# Simple chat interaction
def simple_chat(question: str) -> str:
    """Send a simple question to Claude and get a response."""
    try:
        messages = [HumanMessage(content=question)]
        response = chat_model.invoke(messages)
        return response.content
    except Exception as e:
        return f"Error: {str(e)}"

# Test basic chat
question = "What is LangChain and why is it useful?"
print(f"Question: {question}\n")
response = simple_chat(question)
print(f"Response:\n{response}")

---
## Section 4: Advanced Prompt Engineering

### 📖 What This Cell Does:
Creates reusable prompt templates for different specialized tasks.

### 🎓 Key Learning - Prompt Templates:
**Why use templates?**
- **Reusability**: Write once, use with different inputs
- **Consistency**: Ensure same structure across multiple calls
- **Maintainability**: Change behavior in one place
- **Explicit inputs**: Each template declares the variables it needs, so a missing one fails fast

**Template Structure:**
```python
ChatPromptTemplate.from_messages([
    ("system", "System instructions with {variable}"),
    ("human", "{user_input}")
])
```

### 💡 Three Template Patterns:

**1. Q&A Template (qa_prompt)**
- **Purpose**: Factual, accurate responses with expertise
- **Variables**: {domain}, {question}
- **Best for**: Technical documentation, research, explanations
- **Temperature**: Low (0.2-0.4) for accuracy

**2. Creative Writing Template (creative_prompt)**
- **Purpose**: Generate engaging narratives and content
- **Variables**: {genre}, {request}
- **Best for**: Stories, marketing copy, creative content
- **Temperature**: High (0.7-0.9) for variety

**3. Code Generation Template (code_prompt)**
- **Purpose**: Generate production-ready code with best practices
- **Variables**: {language}, {task}
- **Best for**: Code generation, refactoring, examples
- **Temperature**: Low (0.2-0.3) for deterministic output

### 🎯 Task Performed:
Define three specialized prompt templates, each with:
1. Role definition (system message)
2. Behavioral guidelines
3. Output expectations
4. Variable placeholders for dynamic content
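
### 🔍 Sketch: How Placeholders Get Filled
The variable placeholders described above behave much like Python format fields. The following standard-library sketch mimics that behavior so the mechanics are visible without an API call. It is an analogy, not LangChain's implementation; `template_variables` and `render` are hypothetical helpers.

```python
from string import Formatter
from typing import Set


def template_variables(template: str) -> Set[str]:
    """Collect the {placeholder} names that appear in a template string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template: str, **values: str) -> str:
    """Fill a template, failing early if any placeholder has no value."""
    missing = template_variables(template) - set(values)
    if missing:
        raise KeyError(f"Missing template variables: {sorted(missing)}")
    return template.format(**values)


system_template = "You are an expert AI assistant specializing in {domain}."
print(template_variables(system_template))
print(render(system_template, domain="cybersecurity"))
```

Declaring the variables up front is what lets one template serve many domains, genres, or languages while keeping the surrounding instructions identical.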

In [None]:
# Create advanced prompt templates

# Template 1: Question Answering with Domain Expertise
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert AI assistant specializing in {domain}. 
    Your responses should be:
    - Accurate and well-researched
    - Clear and concise
    - Include examples when appropriate
    - Cite sources or reasoning when possible"""),
    ("human", "{question}")
])

# Template 2: Creative Writing Assistant
creative_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a creative writing assistant with expertise in {genre}.
    Help users craft compelling narratives with:
    - Rich descriptions and vivid imagery
    - Strong character development
    - Engaging plot structures
    - Appropriate tone and style"""),
    ("human", "{request}")
])

# Template 3: Code Generation Assistant
code_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert {language} programmer.
    When generating code:
    - Follow best practices and design patterns
    - Include clear comments and documentation
    - Consider edge cases and error handling
    - Write clean, maintainable, and efficient code"""),
    ("human", "{task}")
])

print("✓ Prompt templates created successfully")

---
### 📖 What This Cell Does:
Demonstrates the Q&A template in action with domain expertise.

### 🎓 Key Learning - Chain Composition:
**The Pipe Operator `|`:**
```python
chain = prompt | chat_model | output_parser
```

This creates a data pipeline:
1. **prompt**: Formats the template with variables
2. **chat_model**: Sends to Claude and gets response
3. **output_parser**: Extracts string content from response object
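
### 🔍 Sketch: How a Pipe Can Compose Steps
The three-stage pipeline above relies on Python's `|` operator being overloadable. The toy class below shows the general idea with plain functions. It is a simplified analogy under that assumption, not LangChain's actual Runnable implementation; `Step` and the three stand-in stages are hypothetical.

```python
from typing import Any, Callable


class Step:
    """A tiny pipeline stage that can be chained with the | operator."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Composing two steps yields a new step that runs them in order.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Stand-ins for prompt formatting, the model call, and output parsing.
format_prompt = Step(lambda variables: f"Q: {variables['question']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})
parse_output = Step(lambda response: response["content"])

chain = format_prompt | fake_model | parse_output
print(chain.invoke({"question": "hi"}))
```

Each stage only needs to accept the previous stage's output, which is why swapping a prompt or parser does not require touching the rest of the chain.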

### 💡 Task Performed:
1. Create a chain: `qa_prompt | chat_model | StrOutputParser()`
2. Invoke with specific domain ("machine learning") and question
3. Get a parsed string response

### 🎯 Example Use Case:
Building a technical Q&A chatbot where:
- Domain can be: "cybersecurity", "data science", "web development"
- Users ask specific technical questions
- Responses are authoritative and educational

### 📊 Expected Output:
A comprehensive explanation of supervised vs unsupervised learning with:
- Clear definitions
- Real-world examples (e.g., email spam filtering, customer segmentation)
- Key differences highlighted

In [None]:
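# Demonstrate Q&A with domain expertise
# Use a lower temperature here, matching the 0.2-0.4 range recommended for factual answers
qa_model = create_chat_model(temperature=0.3)
qa_chain = qa_prompt | qa_model | StrOutputParser()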

qa_response = qa_chain.invoke({
    "domain": "machine learning and artificial intelligence",
    "question": "Explain the difference between supervised and unsupervised learning with real-world examples."
})

print("=== Q&A Example ===")
print(qa_response)

---
### 📖 What This Cell Does:
Uses the creative writing template to generate narrative content.

### 🎓 Key Learning:
**Same chain pattern, different purpose:**
- Same technical structure (`prompt | model | parser`)
- Different system instructions
- Different variables ({genre}, {request})
- Different output style (creative vs factual)

### 💡 Task Performed:
1. Create creative writing chain
2. Specify genre ("science fiction")
3. Request specific creative content
4. Get creative, narrative-style response

### 🎯 Use Cases:
- Content marketing (blog posts, product descriptions)
- Story generation for games or entertainment
- Creative brainstorming
- Character development

### 📊 Expected Output:
An engaging opening paragraph with:
- Vivid descriptions
- Emotional depth
- Compelling narrative hook
- Genre-appropriate tone

In [None]:
# Demonstrate creative writing
creative_chain = creative_prompt | chat_model | StrOutputParser()

creative_response = creative_chain.invoke({
    "genre": "science fiction",
    "request": "Write the opening paragraph of a story about an AI that discovers emotions for the first time."
})

print("=== Creative Writing Example ===")
print(creative_response)

---
### 📖 What This Cell Does:
Demonstrates code generation with best practices.

### 🎓 Key Learning:
**Code Generation Best Practices:**
1. **Specify language**: Python, JavaScript, Java, etc.
2. **Clear requirements**: What should the code do?
3. **Quality expectations**: Error handling, edge cases, documentation
4. **Lower temperature**: More deterministic, reliable code

### 💡 Task Performed:
1. Create code generation chain
2. Specify language ("Python")
3. Request specific algorithm with requirements
4. Get code that should include (output varies between runs):
   - Function definition
   - Docstrings
   - Type hints
   - Error handling
   - Example usage

### 🎯 Use Cases:
- Code assistance tools
- Learning platforms
- Rapid prototyping
- Code review and refactoring

### 📊 Expected Output:
Complete binary search implementation with:
- Input validation
- Edge case handling (empty list, not found)
- Clear comments
- Time complexity notes

In [None]:
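# Demonstrate code generation
# Use a lower temperature here, matching the 0.2-0.3 range recommended for code
code_model = create_chat_model(temperature=0.2)
code_chain = code_prompt | code_model | StrOutputParser()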

code_response = code_chain.invoke({
    "language": "Python",
    "task": "Create a function that implements a binary search algorithm with proper error handling."
})

print("=== Code Generation Example ===")
print(code_response)

---
## Section 5: Conversational Memory

### 📖 What This Cell Does:
Sets up conversation memory to enable multi-turn dialogues where the AI remembers previous context.

### 🎓 Key Learning - Memory Management:

**Why Memory Matters:**
- Enables natural conversations
- AI remembers user preferences and previous statements
- Allows follow-up questions without repeating context
- Creates more helpful, context-aware interactions

**Memory Architecture:**
```
store = {}  # Dictionary to hold all sessions
  └─ session_id_1
       └─ InMemoryChatMessageHistory()
           └─ [HumanMessage, AIMessage, HumanMessage, ...]
  └─ session_id_2
       └─ InMemoryChatMessageHistory()
```

**Components:**
1. **store**: Dictionary holding all session histories
2. **session_id**: Unique identifier for each conversation
3. **InMemoryChatMessageHistory**: Stores message list in memory
4. **get_session_history()**: Retrieves or creates history for a session

### 💡 Task Performed:
1. Initialize empty store dictionary
2. Create function to get/create session history
3. Define conversational prompt with MessagesPlaceholder
4. Create base chain
5. Wrap with RunnableWithMessageHistory for automatic memory

### 🔍 MessagesPlaceholder:
- Special placeholder for injecting conversation history
- Dynamically inserts previous messages
- Must match history_messages_key in RunnableWithMessageHistory
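
### 🔍 Sketch: What History Injection Produces
The placeholder behavior described above amounts to splicing prior turns between the system message and the new human message. The sketch below imitates that with plain tuples and canned replies. It assumes a simplified message shape and is not how LangChain represents messages; `toy_store`, `build_messages`, and `toy_turn` are hypothetical names.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)
toy_store: Dict[str, List[Message]] = {}  # one history list per session_id


def build_messages(system: str, history: List[Message], user_input: str) -> List[Message]:
    """System message first, then prior turns, then the new human message."""
    return [("system", system), *history, ("human", user_input)]


def toy_turn(session_id: str, user_input: str, reply: str) -> List[Message]:
    """Return what would be sent to the model, then record the turn.

    `reply` is a canned answer standing in for a real model response.
    """
    history = toy_store.setdefault(session_id, [])
    sent = build_messages("You are a helpful assistant.", history, user_input)
    # After the call, both sides of the turn are appended to the session's history.
    history.extend([("human", user_input), ("ai", reply)])
    return sent


first = toy_turn("a", "Hi, I'm Alex.", "Hello Alex!")
second = toy_turn("a", "What was my name?", "Alex.")
other = toy_turn("b", "What was my name?", "I don't know yet.")
print(len(first), len(second), len(other))
```

The second turn in session `a` carries the earlier exchange along with it, while session `b` starts empty, which mirrors how separate `session_id` values keep conversations isolated.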

### ⚠️ Production Note:
This uses in-memory storage (lost when the program stops or the kernel restarts).
For production, use:
- Database storage (PostgreSQL, MongoDB)
- Redis for caching
- Cloud storage (AWS DynamoDB, Google Firestore)
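
### 🔍 Sketch: A Persistent History Table
As a rough illustration of the database option above, the sketch below stores messages in SQLite using only the standard library. `SqliteHistory`, the table layout, and the role strings are assumptions made for this example; it does not implement LangChain's `BaseChatMessageHistory` interface.

```python
import sqlite3
from typing import Dict, List


class SqliteHistory:
    """Store one session's messages in a SQLite table (hypothetical helper)."""

    def __init__(self, conn: sqlite3.Connection, session_id: str) -> None:
        self.conn = conn
        self.session_id = session_id
        conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def add(self, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (self.session_id, role, content),
        )
        self.conn.commit()

    def messages(self) -> List[Dict[str, str]]:
        # Order by the autoincrement id so turns come back in insertion order.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (self.session_id,),
        ).fetchall()
        return [{"role": role, "content": content} for role, content in rows]


# ":memory:" keeps the demo self-contained; a file path would persist across restarts.
conn = sqlite3.connect(":memory:")
alex = SqliteHistory(conn, "demo_session_1")
alex.add("human", "Hi! My name is Alex.")
alex.add("ai", "Nice to meet you, Alex!")
print(alex.messages())
```

Keying every row by `session_id` preserves the isolation that the in-memory dictionary provides, while moving the data somewhere that can outlive the process.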

In [None]:
# Set up conversation memory storage
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    """Retrieve or create chat history for a session."""
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Create conversational prompt with memory
conversational_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant engaged in a natural conversation.
    Remember the context of our conversation and refer back to previous messages when relevant.
    Be friendly, informative, and maintain consistency throughout the dialogue."""),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

# Create the conversational chain
conversational_chain = conversational_prompt | chat_model

# Wrap with message history
chat_with_memory = RunnableWithMessageHistory(
    conversational_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

print("✓ Conversational chain with memory initialized")

---
### 📖 What This Cell Does:
Demonstrates a multi-turn conversation where Claude remembers previous context.

### 🎓 Key Learning - Context Awareness:

**Conversation Flow:**
```
Turn 1: User introduces themselves
  └─ History: Records the name and the interest in Python

Turn 2: User asks follow-up ("resources")
  └─ AI: Knows context (Python for beginners)
  └─ AI: Recommends relevant resources

Turn 3: User tests memory ("my name?")
  └─ AI: Recalls "Alex" from Turn 1
```

### 💡 Task Performed:
1. Create helper function for conversational interaction
2. Start conversation with introduction (establishes context)
3. Ask follow-up question (tests context understanding)
4. Test memory recall (validates memory works)

### 🎯 What Makes This Powerful:
- **No context repetition**: User doesn't need to re-explain
- **Natural flow**: Like talking to a human
- **Contextual responses**: AI tailors answers based on history
- **Pronoun resolution**: AI understands "my" and "this" references

### 🔍 Session Management:
```python
config={"configurable": {"session_id": session_id}}
```
- Each session_id = separate conversation
- Enables multiple concurrent users
- Isolates conversations from each other

### 📊 Expected Output:
- Turn 1: Friendly greeting, acknowledgment of name and interest
- Turn 2: Python learning resources (books, courses, practice sites)
- Turn 3: Correctly recalls "Alex" from first message

In [None]:
# Demonstrate multi-turn conversation
def chat_with_context(message: str, session_id: str = "default") -> str:
    """Send a message and get a response with conversation history."""
    try:
        response = chat_with_memory.invoke(
            {"input": message},
            config={"configurable": {"session_id": session_id}}
        )
        return response.content
    except Exception as e:
        return f"Error: {str(e)}"

# Multi-turn conversation example
session_id = "demo_session_1"

print("=== Multi-Turn Conversation Demo ===")

# Turn 1
msg1 = "Hi! My name is Alex and I'm learning Python programming."
print(f"\nUser: {msg1}")
resp1 = chat_with_context(msg1, session_id)
print(f"Assistant: {resp1}")
print("-" * 80)

# Turn 2
msg2 = "Can you recommend some good resources for beginners?"
print(f"\nUser: {msg2}")
resp2 = chat_with_context(msg2, session_id)
print(f"Assistant: {resp2}")
print("-" * 80)

# Turn 3
msg3 = "What was my name again?"
print(f"\nUser: {msg3}")
resp3 = chat_with_context(msg3, session_id)
print(f"Assistant: {resp3}")

---
## Section 6: Streaming Responses

### 📖 What This Cell Does:
Implements streaming to display AI responses in real-time as they're generated.

### 🎓 Key Learning - Streaming vs Invoke:

**invoke() - All at Once:**
```
User sends message → [Wait...] → Complete response appears
```
- Simple to implement
- User waits for entire response
- No feedback during generation

**stream() - Chunk by Chunk:**
```
User sends message → The → response → appears → word → by → word
```
- Better user experience
- Immediate feedback
- Feels more interactive
- Similar to the ChatGPT interface

### 💡 Task Performed:
1. Create function that uses `chat_model.stream()` instead of `invoke()`
2. Iterate over response chunks
3. Print each chunk immediately with `flush=True`
4. Handle errors gracefully

### 🎯 When to Use Streaming:
- ✅ Web chat interfaces
- ✅ Long-form content generation
- ✅ Interactive applications
- ❌ Batch processing
- ❌ When you need the complete response before acting on it

### 🔍 Technical Details:
```python
for chunk in chat_model.stream(...):
    print(chunk.content, end="", flush=True)
```
- `end=""`: No newline after each chunk
- `flush=True`: Force immediate output
- Each chunk contains a partial response (often a few tokens, not exactly one word)

### 📊 Expected Output:
The story appears incrementally in real time, a few tokens at a time, instead of all at once.
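
### 🔍 Sketch: Printing and Collecting Chunks
Streaming code often needs the full text afterwards as well, for example to store it in history. The sketch below shows that pattern with a fake generator in place of a model. `fake_stream` and `print_and_collect` are hypothetical helpers, and the fixed-size chunks are an assumption; real chunk boundaries are decided by the API.

```python
from typing import Iterable, Iterator, List


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Yield a string in fixed-size pieces to imitate a streamed response."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the assembled text."""
    parts: List[str] = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # show partial output immediately
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = print_and_collect(fake_stream("The robot dipped its brush in blue."))
```

Collecting into a list and joining once at the end avoids repeated string concatenation while still giving the caller the complete response.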

In [None]:
# Demonstrate streaming for real-time responses
def stream_response(message: str) -> None:
    """Stream a response chunk by chunk as it is generated."""
    print("Assistant: ", end="", flush=True)
    try:
        for chunk in chat_model.stream([HumanMessage(content=message)]):
            print(chunk.content, end="", flush=True)
        print("\n")
    except Exception as e:
        print(f"\nError during streaming: {str(e)}")

print("=== Streaming Response Demo ===")
print("\nUser: Tell me a short story about a robot learning to paint.\n")
stream_response("Tell me a short story about a robot learning to paint.")

---
## Section 7: Summary

### 🎉 What You've Learned:

#### 1️⃣ **Chat Model Basics**
- Initialize Claude models with custom configurations
- Understand temperature and token limits
- Send simple messages and receive responses

#### 2️⃣ **Prompt Engineering**
- Create reusable prompt templates
- Use system messages to define AI behavior
- Build specialized assistants (Q&A, Creative, Code)
- Compose chains with pipe operator (|)

#### 3️⃣ **Memory & Context**
- Implement conversation memory
- Enable multi-turn dialogues
- Manage multiple concurrent sessions
- Use MessagesPlaceholder for history injection

#### 4️⃣ **Advanced Features**
- Stream responses for better UX
- Handle errors gracefully
- Parse outputs efficiently

### 🚀 Next Steps:
1. **RAG (Retrieval-Augmented Generation)**: Add knowledge bases
2. **Function Calling**: Enable AI to use tools
3. **Multi-Agent Systems**: Multiple specialized AIs working together
4. **Production Deployment**: Build web apps with FastAPI/Flask
5. **Vector Databases**: Semantic search with embeddings

### 📚 Resources:
- [LangChain Documentation](https://python.langchain.com/)
- [Anthropic Claude Docs](https://docs.anthropic.com/)
- [Prompt Engineering Guide](https://www.promptingguide.ai/)

### 💡 Best Practices Recap:
✅ Use environment variables for API keys  
✅ Implement error handling and retries  
✅ Adjust temperature based on use case  
✅ Use templates for reusability  
✅ Stream long responses  
✅ Implement session management for multi-user apps  
✅ Consider persistent storage for production  

In [None]:
# Summary statistics
print("=== Session Summary ===")
print(f"\nActive Sessions: {len(store)}")
print(f"Session IDs: {list(store.keys())}")

if store:
    for session_id, history in store.items():
        print(f"\n  {session_id}:")
        print(f"    Messages: {len(history.messages)}")
        print(f"    User turns: {sum(1 for m in history.messages if isinstance(m, HumanMessage))}")
        print(f"    AI turns: {sum(1 for m in history.messages if isinstance(m, AIMessage))}")