# Module 26: LangChain

**Building LLM Applications with Composable Components**

---

## 1. Objectives

- ✅ Understand LangChain architecture
- ✅ Master chains and prompt templates
- ✅ Build agents with tools
- ✅ Implement conversation memory
- ✅ Create RAG pipelines with LangChain

## 2. Prerequisites

- [Module 19: Prompt Engineering](../19_prompt_engineering/19_prompt_engineering.ipynb)
- [Module 21: RAG](../21_rag/21_rag.ipynb)

## 3. What is LangChain?

### Core Concept

LangChain provides **composable building blocks** for LLM applications:

```
┌─────────────────────────────────────────────────────────────┐
│                    LangChain Architecture                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  [Prompts] ──→ [LLM/Chat Model] ──→ [Output Parsers]        │
│       ↑                                      ↓               │
│  [Memory]                               [Chains]             │
│       ↑                                      ↓               │
│  [Retrievers] ←── [Vector Stores] ←── [Documents]           │
│                                                              │
│              [Agents] ←── [Tools]                            │
│                                                              │
└─────────────────────────────────────────────────────────────┘
```

### Key Components

| Component | Purpose |
|-----------|--------|
| Models | LLM wrappers (OpenAI, Anthropic, local) |
| Prompts | Template management |
| Chains | Compose multiple steps |
| Memory | Persist conversation state |
| Agents | Dynamic tool selection |
| Retrievers | Document retrieval for RAG |

In [None]:
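# Install: pip install langchain langchain-openai langchain-community chromadb
# Depending on your LangChain version, hub.pull() in Section 7 may also need: pip install langchainhub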

# Note: Set your API key
# import os
# os.environ["OPENAI_API_KEY"] = "your-key-here"

from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda

print("LangChain imported! (v0.2+ with LCEL)")

## 4. Prompt Templates

### Theory

Prompt templates separate **structure** from **content**:
- Reusable across different inputs
- Easy to version and manage
- Support for few-shot examples

In [None]:
# Basic prompt template
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "You are an expert in {domain}. Answer this question: {question}"
)

# Format with variables
prompt = template.format(
    domain="machine learning",
    question="What is gradient descent?"
)
print("Formatted prompt:")
print(prompt)

In [None]:
# Chat prompt template (for chat models)
from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful {role}. Be concise."),
    ("human", "{input}")
])

messages = chat_template.format_messages(
    role="Python tutor",
    input="Explain list comprehensions"
)

print("Chat messages:")
for msg in messages:
    print(f"  [{msg.type}]: {msg.content}")

In [None]:
# Few-shot prompt template
from langchain_core.prompts import FewShotPromptTemplate

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "hot", "output": "cold"},
    {"input": "big", "output": "small"}
]

example_template = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}"
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_template,
    prefix="Give the opposite of each word:",
    suffix="Input: {word}\nOutput:",
    input_variables=["word"]
)

print(few_shot_prompt.format(word="fast"))
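
To make the assembled prompt concrete, here is a dependency-free sketch of how the prefix, the rendered examples, and the suffix fit together. It assumes the pieces are joined by a blank line (the default `example_separator`, to the best of our knowledge); `render_few_shot` and `demo_examples` are hypothetical names used only for illustration, not part of LangChain.

```python
def render_few_shot(prefix, examples, example_template, suffix, separator="\n\n", **variables):
    """Join prefix, formatted examples, and formatted suffix into one prompt string."""
    rendered_examples = [example_template.format(**ex) for ex in examples]
    pieces = [prefix, *rendered_examples, suffix.format(**variables)]
    return separator.join(pieces)


demo_examples = [
    {"input": "happy", "output": "sad"},
    {"input": "hot", "output": "cold"},
]

few_shot_text = render_few_shot(
    prefix="Give the opposite of each word:",
    examples=demo_examples,
    example_template="Input: {input}\nOutput: {output}",
    suffix="Input: {word}\nOutput:",
    word="fast",
)
print(few_shot_text)
```

The trailing `Output:` is left open on purpose so the model completes the pattern set by the examples.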

## 5. LCEL (LangChain Expression Language)

### Theory

LCEL is LangChain's **declarative composition** syntax:

```python
chain = prompt | llm | output_parser
```

Benefits:
- Streaming through `.stream()`
- Async variants such as `.ainvoke()` and `.astream()`
- Batch processing through `.batch()`
- Inspectable structure for debugging, e.g. `chain.get_graph()`
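
The pipe syntax can look like magic, so here is a toy sketch of the idea behind it: overloading `|` so that each step's output becomes the next step's input. `Step`, `fake_llm`, and the other names are hypothetical stand-ins; this is an illustration of the composition pattern, not LangChain's actual implementation.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

    def batch(self, values):
        # Sequential here for simplicity; a real implementation may run inputs concurrently.
        return [self.invoke(v) for v in values]


fill_prompt = Step(lambda inputs: f"Write a one-sentence summary of: {inputs['topic']}")
fake_llm = Step(lambda prompt_text: f"[model reply to: {prompt_text}]")  # no API call
strip_brackets = Step(lambda reply: reply.strip("[]"))

toy_chain = fill_prompt | fake_llm | strip_brackets
print(toy_chain.invoke({"topic": "machine learning"}))
print(toy_chain.batch([{"topic": "RAG"}, {"topic": "agents"}]))
```

Because every step exposes the same small interface, the composed chain gets `invoke` and `batch` for free, which is the main design payoff of a uniform runnable protocol.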

In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# Define components
prompt = ChatPromptTemplate.from_template(
    "Write a one-sentence summary of: {topic}"
)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
output_parser = StrOutputParser()

# Compose with LCEL (pipe operator)
chain = prompt | llm | output_parser

# Invoke
result = chain.invoke({"topic": "machine learning"})
print(result)

In [None]:
# Streaming with LCEL
for chunk in chain.stream({"topic": "neural networks"}):
    print(chunk, end="", flush=True)

In [None]:
# Parallel chains with RunnableParallel
from langchain_core.runnables import RunnableParallel

summary_chain = ChatPromptTemplate.from_template(
    "Summarize in one sentence: {topic}"
) | llm | StrOutputParser()

keywords_chain = ChatPromptTemplate.from_template(
    "Extract 3 keywords from: {topic}"
) | llm | StrOutputParser()

# Run both in parallel
parallel_chain = RunnableParallel(
    summary=summary_chain,
    keywords=keywords_chain
)

result = parallel_chain.invoke({"topic": "Transformers in NLP"})
print(f"Summary: {result['summary']}")
print(f"Keywords: {result['keywords']}")
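
Conceptually, a parallel step sends the same input to every branch and returns a dict keyed by branch name. The sketch below shows that fan-out with a thread pool; `run_parallel` and `toy_branches` are hypothetical names, and the use of threads is an assumption for illustration rather than a description of LangChain internals.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches, value):
    """Call every branch with the same input and collect results under the branch name."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}


toy_branches = {
    "upper": lambda inputs: inputs["topic"].upper(),
    "length": lambda inputs: len(inputs["topic"]),
}
toy_result = run_parallel(toy_branches, {"topic": "Transformers in NLP"})
print(toy_result)
```

With real LLM calls the branches are I/O-bound, so running them concurrently brings total latency close to that of the slowest branch instead of the sum of all branches.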

## 6. Memory

### Theory

Memory persists information across conversation turns:

| Type | What it Stores | Best For |
|------|---------------|----------|
| ConversationBufferMemory | All messages | Short conversations |
| ConversationSummaryMemory | Running summary | Long conversations |
| ConversationBufferWindowMemory | Last k exchanges | Medium-length conversations |
| VectorStoreRetrieverMemory | Relevant history | Large history |
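
To show what "keep only the recent turns" means in practice, here is a minimal plain-Python sketch of a sliding-window memory. `WindowMemory` is a hypothetical class written for this illustration; it mimics the idea behind a window-style memory without using any LangChain code.

```python
from collections import deque


class WindowMemory:
    """Keep only the most recent `k` exchanges (one exchange = user turn + assistant turn)."""

    def __init__(self, k):
        self.exchanges = deque(maxlen=k)  # older exchanges fall off automatically

    def save_context(self, user_text, assistant_text):
        self.exchanges.append((user_text, assistant_text))

    def load(self):
        # Flatten exchanges into (role, text) pairs, oldest first.
        history = []
        for user_text, assistant_text in self.exchanges:
            history.append(("human", user_text))
            history.append(("ai", assistant_text))
        return history


window = WindowMemory(k=2)
window.save_context("My name is Alice", "Hello Alice!")
window.save_context("I'm learning Python", "Great choice.")
window.save_context("What is a list?", "An ordered, mutable sequence.")
print(window.load())  # the first exchange has been dropped
```

The trade-off is visible here: the prompt stays bounded, but the assistant no longer "knows" the user's name once that exchange leaves the window. Summary and vector-store memories exist to soften that loss.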

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import MessagesPlaceholder

# Create memory
memory = ConversationBufferMemory(return_messages=True)

# Add to prompt
prompt_with_memory = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

# Simulate conversation
memory.save_context(
    {"input": "My name is Alice"},
    {"output": "Hello Alice! How can I help you today?"}
)

memory.save_context(
    {"input": "I'm learning Python"},
    {"output": "That's great! Python is a wonderful language to learn."}
)

# Load memory
history = memory.load_memory_variables({})
print("Conversation history:")
for msg in history['history']:
    print(f"  {msg.type}: {msg.content}")

In [None]:
# Summary memory for long conversations
from langchain.memory import ConversationSummaryMemory

summary_memory = ConversationSummaryMemory(
    llm=llm,
    return_messages=True
)

# Each save_context() call asks the LLM to fold the new exchange into a running summary
print("Summary memory created - keeps a running summary instead of the full transcript")

## 7. Agents and Tools

### Theory

Agents **dynamically select** which tools to use based on the input:

```
User Query ──→ Agent ──→ Think: Which tool?
                  ↓
            [Tool 1] [Tool 2] [Tool 3]
                  ↓
            Execute Tool ──→ Observe Result
                  ↓
            Think: Done? ──→ No ──→ Loop back
                  ↓ Yes
            Final Answer
```
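
The loop in the diagram can be sketched in a few lines of plain Python. Here the LLM's decision is replaced by `scripted_decide`, a hypothetical hard-coded policy, so the example runs without an API key; `run_agent`, the action dict format, and `toy_tools` are all assumptions made for this illustration.

```python
def run_agent(query, tools, decide, max_steps=5):
    """Loop: ask `decide` for the next action, run the tool, record the observation."""
    scratchpad = []  # (tool_name, tool_input, observation) triples
    for _ in range(max_steps):
        action = decide(query, scratchpad)
        if action["type"] == "final":
            return action["answer"]
        observation = tools[action["tool"]](action["input"])
        scratchpad.append((action["tool"], action["input"], observation))
    return "Stopped: step limit reached."


def scripted_decide(query, scratchpad):
    """Hypothetical stand-in for the LLM: call one tool, then answer."""
    if not scratchpad:
        return {"type": "tool", "tool": "word_length", "input": "artificial"}
    _, word, length = scratchpad[-1]
    return {"type": "final", "answer": f"'{word}' has {length} letters."}


toy_tools = {"word_length": len}
agent_answer = run_agent("How long is the word 'artificial'?", toy_tools, scripted_decide)
print(agent_answer)
```

The `max_steps` cap matters in real agents too: without it, a model that never emits a final answer would loop indefinitely.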

In [None]:
from langchain_core.tools import tool

# Define custom tools
@tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression. Input should be a valid Python math expression."""
    # WARNING: eval() runs arbitrary Python. Fine for a demo, unsafe on untrusted input.
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

@tool
def get_word_length(word: str) -> int:
    """Get the length of a word."""
    return len(word)

@tool
def search_knowledge(query: str) -> str:
    """Search for information. Returns relevant facts."""
    # Simulated search
    knowledge_base = {
        "python": "Python is a programming language created by Guido van Rossum.",
        "langchain": "LangChain is a framework for building LLM applications.",
        "transformer": "Transformers use self-attention for sequence modeling."
    }
    for key, value in knowledge_base.items():
        if key in query.lower():
            return value
    return "No information found."

tools = [calculator, get_word_length, search_knowledge]
print(f"Defined {len(tools)} tools:")
for t in tools:
    print(f"  - {t.name}: {t.description[:50]}...")
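
Because the agent passes model-generated text straight into the calculator, calling `eval()` on it can execute arbitrary code. One possible hardening, sketched below with the standard-library `ast` module, is to walk the parsed expression and allow only numeric literals and basic arithmetic. `safe_eval` is a hypothetical helper, and the set of allowed operators is an assumption you would adjust to your needs.

```python
import ast
import operator

_ALLOWED_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_ALLOWED_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression):
    """Evaluate +, -, *, / over numeric literals; reject everything else."""

    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_BINARY:
            return _ALLOWED_BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_UNARY:
            return _ALLOWED_UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    # ast.parse raises SyntaxError on malformed input; callers may want to catch that too.
    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("25 * 4 + 10"))
```

Function calls, attribute access, and names all fall through to the `ValueError` branch, so an input such as `__import__('os')` is rejected instead of executed.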

In [None]:
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain import hub

# Get a prompt for the agent
prompt = hub.pull("hwchase17/openai-tools-agent")

# Create the agent
agent = create_openai_tools_agent(llm, tools, prompt)

# Create executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True  # See the agent's thinking
)

# Run the agent
result = agent_executor.invoke({
    "input": "What is 25 * 4 + 10? Also, how long is the word 'artificial'?"
})
print(f"\nFinal answer: {result['output']}")

## 8. RAG with LangChain

### Complete Pipeline

In [None]:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.runnables import RunnablePassthrough

# Sample documents
documents = [
    "LangChain is a framework for developing LLM applications.",
    "It provides tools for prompt management, memory, and agents.",
    "LCEL (LangChain Expression Language) enables chain composition.",
    "Agents can use tools to interact with external systems."
]

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_texts(documents, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

print(f"Created vector store with {len(documents)} documents")
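
What the retriever does with `k` can be illustrated without a vector database. The sketch below swaps real embeddings for a toy bag-of-words vector and ranks documents by cosine similarity; `embed`, `cosine`, `top_k`, and `toy_docs` are hypothetical names, and word counts are only a stand-in for a learned embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real system would call an embedding model)."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, texts, k=2):
    """Return the k texts most similar to the query, best match first."""
    query_vec = embed(query)
    return sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)[:k]


toy_docs = [
    "LangChain is a framework for developing LLM applications.",
    "LCEL (LangChain Expression Language) enables chain composition.",
    "Agents can use tools to interact with external systems.",
]
best_match = top_k("How does LCEL chain composition work?", toy_docs, k=1)
print(best_match)
```

A real vector store follows the same embed, score, and take-top-k pattern, but with dense embeddings and an index that avoids comparing the query against every document.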

In [None]:
# RAG chain with LCEL
from langchain_core.prompts import ChatPromptTemplate

rag_prompt = ChatPromptTemplate.from_template("""
Answer based on the context below. If unsure, say "I don't know."

Context: {context}

Question: {question}

Answer:""")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

# Query
answer = rag_chain.invoke("What is LCEL?")
print(f"Answer: {answer}")

## 9. Output Parsers

Convert LLM text output to structured data:

In [None]:
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

# Define output schema
class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: int = Field(description="Rating out of 10")
    summary: str = Field(description="One-line summary")

parser = JsonOutputParser(pydantic_object=MovieReview)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract movie review info.\n{format_instructions}"),
    ("human", "{review}")
]).partial(format_instructions=parser.get_format_instructions())

chain = prompt | llm | parser

result = chain.invoke({
    "review": "Inception is mind-bending! The visuals are stunning and the plot keeps you guessing. Solid 9/10."
})
print(f"Parsed: {result}")
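
Under the hood, structured parsing comes down to two steps: find the JSON in the model's text, then check it against a schema. The following standard-library sketch shows both steps; `Review`, `parse_review`, and `fake_reply` are hypothetical and only approximate what a JSON output parser with a Pydantic schema does.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Review:
    title: str
    rating: int
    summary: str


def parse_review(llm_text):
    """Pull the first JSON object out of the model's text and check its fields."""
    match = re.search(r"\{.*\}", llm_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    data = json.loads(match.group(0))
    review = Review(**data)  # raises TypeError on missing or unexpected keys
    if not isinstance(review.rating, int):
        raise ValueError("rating must be an integer")
    return review


# Hypothetical model reply with chatter around the JSON, as chat models often produce.
fake_reply = 'Sure, here is the review: {"title": "Inception", "rating": 9, "summary": "Mind-bending heist."} Hope that helps.'
parsed_review = parse_review(fake_reply)
print(parsed_review)
```

Validation failures are worth surfacing rather than swallowing: a common pattern is to catch the error and re-prompt the model with the error message attached.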

## 10. Interview Questions

**Q1: What is the difference between chains and agents?**
<details><summary>Answer</summary>

- **Chains**: Fixed sequence of steps; the control flow is set in code, even though model outputs can vary
- **Agents**: The LLM chooses which tool to call at each step and can loop until it decides it is done
- Use chains for straightforward pipelines, agents for complex decision-making
</details>

**Q2: What is LCEL and why use it?**
<details><summary>Answer</summary>

LCEL (LangChain Expression Language) is a declarative way to compose chains using the `|` operator:
- Built-in streaming, async, and batching
- Easier debugging with `chain.get_graph()`
- Cleaner code than legacy `LLMChain`
</details>

**Q3: How would you handle a long conversation in LangChain?**
<details><summary>Answer</summary>

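Choose a memory type that keeps the prompt within the context window:
- `ConversationSummaryMemory`: Replaces the transcript with a running summary
- `ConversationBufferWindowMemory`: Keeps only the last k exchanges
- `VectorStoreRetrieverMemory`: Retrieves only the history relevant to the current input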
</details>

## 11. Summary

| Component | Purpose |
|-----------|--------|
| Prompts | Template management |
| LCEL | Declarative chain composition |
| Memory | Conversation persistence |
| Agents | Dynamic tool selection |
| Retrievers | RAG document retrieval |

## 12. References

- [LangChain Docs](https://python.langchain.com/docs/)
- [LCEL Guide](https://python.langchain.com/docs/expression_language/)
- [LangChain Hub](https://smith.langchain.com/hub)

---
**Next:** [Module 27: LangGraph](../27_langgraph/27_langgraph.ipynb)