# LangChain: From Zero to Hero

Welcome to this comprehensive LangChain course! By the end of this tutorial, you'll have a solid understanding of how to build powerful applications with Large Language Models (LLMs) using the LangChain framework.

## Course Outline

1. Introduction to LangChain
2. Setting Up Your Environment
3. LangChain Components
   - Models
   - Prompts
   - Memory
   - Chains
   - Agents
   - Tools
4. Building Simple Applications
5. Advanced Use Cases
6. Best Practices
7. Project Implementation

## 1. Introduction to LangChain

LangChain is a framework for developing applications powered by language models. It enables the creation of applications that are:
- **Data-aware**: connect LLMs to other data sources
- **Agentic**: allow LLMs to interact with their environment

### Why LangChain?

- Simplifies integration with various LLMs (OpenAI, Anthropic, Hugging Face, etc.)
- Provides components for building complex chains and agents
- Offers tools for enhancing LLM capabilities
- Enables memory management for contextual conversations

## 2. Setting Up Your Environment

Let's start by installing the required packages and setting up our environment.

In [1]:
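# Install necessary packages
# langchain-community provides the integrations imported via langchain.llms, langchain.vectorstores, etc.
# numexpr is required by the "llm-math" tool used in the agent examples
!pip install langchain langchain-community openai chromadb tiktoken bs4 python-dotenv numexpr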

Before proceeding, you'll need to set up your API keys. For this course, we'll use OpenAI's models, but LangChain supports many other providers.

Create a `.env` file in your project directory with:

```
OPENAI_API_KEY=your_openai_api_key
```

Let's load the environment variables:

In [None]:
# Load environment variables
import os
from dotenv import load_dotenv

load_dotenv()

# Check if the API key is loaded
if os.getenv("OPENAI_API_KEY"):
    print("OpenAI API key loaded successfully!")
else:
    print("OpenAI API key not found. Please check your .env file.")

## 3. LangChain Components

LangChain provides several key components that you can combine to create powerful applications. Let's explore each one.

### 3.1 Models

LangChain supports various types of models:
- Language models (LLMs)
- Chat models
- Text embedding models

Let's start with a simple example using an LLM:

In [None]:
from langchain.llms import OpenAI

# Initialize the LLM
llm = OpenAI(temperature=0.7)

# Ask a question
response = llm.invoke("What is the capital of France?")
print(response)

Now, let's try a chat model, which is designed for conversational interactions:

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

# Initialize the chat model
chat = ChatOpenAI(temperature=0.7)

# Create a simple conversation
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hello! Can you tell me about LangChain?")
]

# Get a response
response = chat.invoke(messages)
print(response.content)

### 3.2 Prompts

Prompts are structured inputs to language models. LangChain provides tools for managing prompts:

- **PromptTemplates**: Create dynamic templates with variables
- **Few-shot examples**: Include examples in prompts
- **OutputParsers**: Structure the output from LLMs

In [None]:
from langchain.prompts import PromptTemplate

# Simple prompt template
template = """
Answer the question based on the context below.

Context: {context}

Question: {question}

Answer:
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["context", "question"]
)

# Format the prompt with actual values
formatted_prompt = prompt.format(
    context="Paris is the capital and most populous city of France.",
    question="What is the capital of France?"
)

print(formatted_prompt)

In [None]:
# Use the formatted prompt with our LLM
response = llm.invoke(formatted_prompt)
print(response)

Let's try a more complex example with few-shot prompting, where the prompt itself contains worked examples:

In [None]:
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

# Define our examples
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "big", "antonym": "small"}
]

# Create an example template
example_template = """
Word: {word}
Antonym: {antonym}
"""

example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template=example_template
)

# Create a few-shot prompt template
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of the following word:",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n\n"
)

# Format the prompt with our input
formatted_prompt = few_shot_prompt.format(input="hot")
print(formatted_prompt)

In [None]:
# Use the formatted prompt with our LLM
response = llm.invoke(formatted_prompt)
print(response)
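
The list at the top of this section mentions OutputParsers, which turn a model's free-text reply into structured data. Here is a minimal, library-free sketch of that idea. The function name `parse_comma_separated` and the sample reply are made up for illustration and are not part of LangChain's API.

```python
def parse_comma_separated(text: str) -> list[str]:
    """Split a reply such as 'red, green, blue' into clean items."""
    # Strip whitespace around each item and drop empty fragments
    return [item.strip() for item in text.split(",") if item.strip()]


# Hypothetical raw reply from a model that was asked for a comma-separated list
raw_reply = " cold, chilly ,freezing,, "
print(parse_comma_separated(raw_reply))
```

In a real application you would also tell the model which format to produce (for example, "answer with a comma-separated list") so that the parser has something predictable to work with.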

### 3.3 Memory

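Each LLM call is stateless: the model sees only the prompt it is given. Memory components store earlier turns and add them to the next prompt, so the model can retain context across interactions. This is crucial for building conversational applications.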

Let's implement a simple conversation with memory:

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Initialize the conversation with memory
conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)

# First interaction
response1 = conversation.invoke("My name is Alex.")
print(response1["response"])

# Second interaction (the model should remember the name)
response2 = conversation.invoke("What's my name?")
print(response2["response"])

Let's explore different types of memory:

In [None]:
from langchain.memory import ConversationBufferWindowMemory

# Initialize with window memory (only remembers last k interactions)
conversation_window = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=2)
)

# Conversation with limited memory
response1 = conversation_window.invoke("Hi, I'm Sarah.")
print("Response 1:", response1["response"])

response2 = conversation_window.invoke("I live in New York.")
print("Response 2:", response2["response"])

response3 = conversation_window.invoke("I work as a software engineer.")
print("Response 3:", response3["response"])

# This should only remember the last 2 interactions, not the first one
response4 = conversation_window.invoke("Where do I live and what's my job?")
print("Response 4:", response4["response"])

response5 = conversation_window.invoke("What's my name?")  # Should not remember
print("Response 5:", response5["response"])
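
To see why the name is forgotten, here is a toy sketch of a windowed memory built on `collections.deque`. The `WindowMemory` class is a hypothetical stand-in written for this tutorial, not LangChain's implementation, and the AI replies are invented.

```python
from collections import deque


class WindowMemory:
    """Toy stand-in for a windowed conversation memory."""

    def __init__(self, k: int):
        # deque(maxlen=k) discards the oldest turn once k turns are stored
        self.turns = deque(maxlen=k)

    def save(self, user: str, ai: str) -> None:
        self.turns.append((user, ai))

    def as_prompt_history(self) -> str:
        # This text is what would be prepended to the next prompt
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


memory = WindowMemory(k=2)
memory.save("Hi, I'm Sarah.", "Nice to meet you!")
memory.save("I live in New York.", "New York is a great city.")
memory.save("I work as a software engineer.", "That sounds interesting.")

# Only the last two turns remain, so the name is no longer in the history
print(memory.as_prompt_history())
```

The trade-off is prompt size versus recall: a small `k` keeps prompts short and cheap, but anything said more than `k` turns ago is gone.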

### 3.4 Chains

Chains combine multiple components to create a specific sequence of operations. Let's look at some examples:

In [None]:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Create a simple chain that formats a prompt and passes it to an LLM
template = "What is the capital of {country}?"
prompt_template = PromptTemplate(template=template, input_variables=["country"])

chain = LLMChain(llm=llm, prompt=prompt_template)

# Run the chain
response = chain.invoke("France")
print(response["text"])

Let's create a more complex chain that combines multiple steps:

In [None]:
from langchain.chains import SequentialChain, LLMChain

# First chain: Generate a short story about a topic
template1 = "Write a very short story about {topic} in 3 sentences."
prompt_template1 = PromptTemplate(template=template1, input_variables=["topic"])
story_chain = LLMChain(llm=llm, prompt=prompt_template1, output_key="story")

# Second chain: Generate a title for the story
template2 = "Create a title for this story: {story}"
prompt_template2 = PromptTemplate(template=template2, input_variables=["story"])
title_chain = LLMChain(llm=llm, prompt=prompt_template2, output_key="title")

# Combine the chains
sequential_chain = SequentialChain(
    chains=[story_chain, title_chain],
    input_variables=["topic"],
    output_variables=["story", "title"]
)

# Run the combined chain
response = sequential_chain.invoke("a magical forest")
print("Title:", response["title"])
print("\nStory:", response["story"])
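
The data flow in a sequential chain can be pictured as a shared dictionary that each step reads from and writes to. Here is a plain-Python sketch of that idea, assuming each step is just a function from dict to dict. `run_sequential`, `story_step`, and `title_step` are hypothetical names, and the two steps fake the LLM calls with string formatting.

```python
from typing import Callable

Step = Callable[[dict], dict]


def run_sequential(steps: list[Step], inputs: dict) -> dict:
    """Run steps in order, merging each step's outputs into a shared state."""
    state = dict(inputs)
    for step in steps:
        state.update(step(state))
    return state


# Hypothetical stand-ins for the two LLM calls
def story_step(state: dict) -> dict:
    return {"story": f"Once upon a time there was {state['topic']}."}


def title_step(state: dict) -> dict:
    return {"title": f"Title for: {state['story']}"}


result = run_sequential([story_step, title_step], {"topic": "a magical forest"})
print(result["title"])
print(result["story"])
```

This is why the output key of one chain has to match an input variable of the next: the second step looks up `story` by name in the shared state.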

### 3.5 Agents

Agents use LLMs to determine which actions to take and in what order. They combine LLMs with tools to interact with external systems.

Let's create a simple agent that can perform calculations:

In [None]:
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.tools import Tool

# Load tools
tools = load_tools(["llm-math"], llm=llm)

# Initialize agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Run the agent
agent.invoke("What is the square root of 4 plus the square root of 9?")

Let's create a more complex agent that can search the internet and perform calculations:

In [None]:
# Let's create a custom tool
from langchain.agents import Tool

# A real search tool needs Google Search API keys, for example:
# from langchain.utilities import GoogleSearchAPIWrapper
# search = GoogleSearchAPIWrapper()

# For demo purposes, let's create a dummy search tool
def dummy_search(query):
    if "weather" in query.lower():
        return "The weather is sunny and 75 degrees."
    elif "population" in query.lower():
        return "The population of the mentioned location is approximately 3 million."
    else:
        return "I found several websites related to your query."

# Create our tools
tools = [
    Tool(
        name="Search",
        func=dummy_search,
        description="Useful for when you need to search for information on the internet."
    ),
    *load_tools(["llm-math"], llm=llm)
]

# Initialize the agent with our tools
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Run the agent with a complex query
agent.invoke("What's the weather like today? Convert that temperature from Fahrenheit to Celsius using the formula (F - 32) / 1.8.")
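
It helps to know the expected answer before trusting the agent's arithmetic. The dummy search tool reports 75 degrees, so here is the conversion worked out directly. The function name is just for illustration.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    # Subtract 32 first, then divide the result by 1.8
    return (temp_f - 32) / 1.8


print(round(fahrenheit_to_celsius(75), 1))  # 23.9

# Reading the sentence without parentheses gives a different (wrong) number
print(round(75 - 32 / 1.8, 1))  # 57.2
```

If the agent reports something close to 57, it has applied the division before the subtraction.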

### 3.6 Tools

Tools allow language models to interact with external systems. LangChain provides many built-in tools, and you can create custom ones.

Let's create a custom tool:

In [None]:
from langchain.tools import BaseTool
from langchain.agents import initialize_agent, AgentType

class CalculatorTool(BaseTool):
    # Overridden fields need type annotations on Pydantic-based tools
    name: str = "Calculator"
    description: str = "Useful for performing basic arithmetic operations"
    
    def _run(self, query: str) -> str:
        try:
            # WARNING: eval() executes arbitrary code, so use it only in a local demo
            result = eval(query)
            return str(result)
        except Exception as e:
            return f"Error: {str(e)}"
    
    def _arun(self, query: str):
        # For async implementation
        raise NotImplementedError("CalculatorTool does not support async")

# Create an instance of our custom tool
calculator_tool = CalculatorTool()

# Create an agent with our custom tool
tools = [calculator_tool]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Test the agent with an arithmetic query
agent.invoke("What is 123 * 456?")
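
The calculator above passes the model's text straight to `eval`, which will run any Python expression it is given. As a hedged alternative, here is a sketch of a restricted evaluator that walks the expression's syntax tree and allows only numbers and basic arithmetic. `safe_arithmetic` is a hypothetical helper, not something LangChain provides.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; any other syntax is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_arithmetic(expression: str) -> Number:
    """Evaluate +, -, *, / and parentheses without calling eval()."""

    def _eval(node: ast.AST) -> Number:
        if isinstance(node, ast.Constant):
            # Accept plain numbers only (bool is a subclass of int, so exclude it)
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_arithmetic("123 * 456"))
```

Inside a tool's `_run` method you could call `safe_arithmetic(query)` in place of `eval(query)`. Function calls, names, and attribute access all raise `ValueError` instead of being executed.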

## 4. Building Simple Applications

Now that we understand the core components, let's build some simple applications.

### 4.1 Question-Answering over Documents

In [None]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains import RetrievalQA

# First, let's create a sample document
with open("sample_document.txt", "w") as f:
    f.write("""
LangChain is a framework for developing applications powered by language models.
It enables applications that are context-aware and reason-driven.
LangChain provides modules for working with language models, prompts, memory, indexes, chains and agents.
The framework is designed to be modular and extensible, making it easy to build complex applications.
LangChain was developed to make it easier for developers to create applications using large language models.
It provides a standard interface for connecting language models to other data sources and allowing language models to interact with their environment.
One of the key features of LangChain is its ability to chain together multiple components to create more complex applications.
"""
    )

# Load the document
loader = TextLoader("sample_document.txt")
documents = loader.load()

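# Split the text into chunks
# The sample file uses single newlines, so split on "\n" (the default separator is "\n\n")
text_splitter = CharacterTextSplitter(separator="\n", chunk_size=300, chunk_overlap=50)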
chunks = text_splitter.split_documents(documents)

# Create embeddings and store in vector database
embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_documents(chunks, embeddings)

# Create a retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectordb.as_retriever()
)

# Ask questions
question = "What is LangChain used for?"
response = qa_chain.invoke(question)
print(response["result"])
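
The `chunk_size` and `chunk_overlap` settings are easier to reason about with a tiny example. Below is a simplified sliding-window sketch that works on raw characters. It is an illustration only: LangChain's splitter first splits on a separator and then merges pieces, so its chunk boundaries will differ. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

The overlap means the last character of one chunk is repeated at the start of the next, which reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.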

### 4.2 Building a Chatbot with Memory

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Create memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Create a conversational retrieval chain
retrieval_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectordb.as_retriever(),
    memory=memory
)

# First question
response = retrieval_chain.invoke({"question": "What is LangChain?"})
print("Response 1:", response["answer"])

# Follow-up question (should use context from previous question)
response = retrieval_chain.invoke({"question": "What are its key features?"})
print("Response 2:", response["answer"])

## 5. Advanced Use Cases

Now let's explore some more advanced applications of LangChain.

### 5.1 Text Summarization

In [None]:
from langchain.chains.summarize import load_summarize_chain

# Let's create a longer text for summarization
with open("article.txt", "w") as f:
    f.write("""
Artificial Intelligence (AI) is revolutionizing industries across the globe. From healthcare to finance, transportation to entertainment, AI is transforming how businesses operate and how people live their daily lives.

In healthcare, AI systems are being used to diagnose diseases, analyze medical images, and develop personalized treatment plans. These technologies can process vast amounts of medical data far more quickly than human doctors, potentially leading to earlier diagnoses and better patient outcomes.

The financial sector has embraced AI for fraud detection, algorithmic trading, and customer service. AI-powered chatbots now handle routine customer inquiries, freeing up human representatives for more complex issues. Meanwhile, sophisticated algorithms analyze market trends and execute trades at speeds impossible for human traders.

Transportation is being revolutionized by autonomous vehicle technology. Companies like Tesla, Waymo, and Uber are investing heavily in self-driving cars, promising to reduce accidents, ease traffic congestion, and provide mobility options for those unable to drive.

In the entertainment industry, streaming services like Netflix and Spotify use AI to analyze user preferences and recommend content. These recommendation engines have transformed how people discover new movies, shows, and music.

Despite these advancements, AI raises important ethical considerations. Issues of privacy, bias in algorithms, job displacement, and the need for regulatory frameworks are ongoing challenges that society must address as AI technology continues to evolve.

As we look to the future, the integration of AI into everyday life is likely to accelerate. Those businesses and individuals who adapt to this changing technological landscape will be best positioned to thrive in an increasingly AI-driven world.
"""
    )

# Load the document
loader = TextLoader("article.txt")
documents = loader.load()

# Create a summarization chain
summarize_chain = load_summarize_chain(llm, chain_type="map_reduce")

# Generate summary
summary = summarize_chain.invoke(documents)
print(summary["output_text"])
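
The `map_reduce` chain type follows a two-phase pattern: summarize each chunk on its own, then summarize the summaries. Here is a minimal sketch of that pattern in plain Python. The `first_sentence` function is a hypothetical stand-in for an LLM call, and the two chunks are shortened paraphrases used only for the demo.

```python
from typing import Callable


def map_reduce_summarize(chunks: list[str], summarize: Callable[[str], str]) -> str:
    # Map: summarize each chunk independently
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenated partial summaries
    return summarize(" ".join(partial_summaries))


def first_sentence(text: str) -> str:
    """Stand-in for an LLM call: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


chunks = [
    "AI is changing healthcare. It helps analyze medical images.",
    "Finance uses AI for fraud detection. Chatbots answer routine questions.",
]
print(map_reduce_summarize(chunks, first_sentence))
```

With two chunks the summarizer is called three times (two map calls and one reduce call), so this approach costs more calls than putting everything into one prompt but copes with documents that exceed the model's context window.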

### 5.2 Creating an Agent with Multiple Tools

In [None]:
from langchain.agents import Tool, initialize_agent, AgentType
from langchain.tools import BaseTool

# Create several custom tools

class WeatherTool(BaseTool):
    # Overridden fields need type annotations on Pydantic-based tools
    name: str = "Weather"
    description: str = "Get current weather in a location"
    
    def _run(self, location: str) -> str:
        # In a real scenario, you would call a weather API
        return f"The weather in {location} is currently sunny and 75°F (24°C)."
    
    def _arun(self, location: str):
        raise NotImplementedError("WeatherTool does not support async")

class NewsTool(BaseTool):
    name: str = "News"
    description: str = "Get latest news on a topic"
    
    def _run(self, topic: str) -> str:
        # In a real scenario, you would call a news API
        return f"Latest news on {topic}: New developments have been reported recently."
    
    def _arun(self, topic: str):
        raise NotImplementedError("NewsTool does not support async")

# Initialize our tools
weather_tool = WeatherTool()
news_tool = NewsTool()
calculator_tool = CalculatorTool()  # From earlier example

# Create an agent with all our tools
tools = [weather_tool, news_tool, calculator_tool]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Test the agent with a complex query
agent.invoke("What's the weather like in Paris? Also, can you tell me the latest news on AI? Finally, what is 17 * 38?")

### 5.3 Document Question-Answering with Sources

In [None]:
from langchain.chains import RetrievalQAWithSourcesChain

# Create a QA chain that returns sources
qa_with_sources_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectordb.as_retriever()
)

# Ask a question
response = qa_with_sources_chain.invoke({"question": "What are the key features of LangChain?"})
print("Answer:", response["answer"])
print("\nSources:", response["sources"])

## 6. Best Practices

Here are some best practices for working with LangChain:

### 6.1 Prompt Engineering

In [None]:
# Example of a well-structured prompt template
from langchain.prompts import PromptTemplate

good_prompt = PromptTemplate(
    template="""
You are an expert {profession}.

Context: {context}

Based on the context provided, please answer the following question in a {tone} tone. 
Be {detail_level} in your response.

Question: {question}

Answer:
""",
    input_variables=["profession", "context", "tone", "detail_level", "question"]
)

formatted_prompt = good_prompt.format(
    profession="data scientist",
    context="This is a dataset of customer purchase history over 5 years.",
    tone="professional",
    detail_level="detailed",
    question="What trends can we identify in this data?"
)

print(formatted_prompt)
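
With five placeholders it is easy to forget one, and a missing value only surfaces when the template is formatted. Here is a small sketch, using only the standard library, that lists the placeholders in a `str.format`-style template and reports which ones have no value yet. The helper names are hypothetical.

```python
from string import Formatter


def template_variables(template: str) -> set[str]:
    """Return the placeholder names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_variables(template: str, values: dict) -> set[str]:
    """Return placeholders that have no entry in values."""
    return template_variables(template) - set(values)


template = "You are an expert {profession}. Answer in a {tone} tone.\nQuestion: {question}"
print(sorted(template_variables(template)))
print(missing_variables(template, {"profession": "data scientist", "tone": "professional"}))
```

Running a check like this in a unit test catches a renamed or forgotten variable before any tokens are spent on a malformed prompt.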

### 6.2 Handling Rate Limits and Errors

In [None]:
import time
import random

# Example function showing how to handle rate limits
def safe_llm_call(llm, prompt, max_retries=3):
    retries = 0
    while retries < max_retries:
        try:
            return llm.invoke(prompt)
        except Exception as e:
            if "rate limit" in str(e).lower():
                wait_time = (2 ** retries) + random.random()  # Exponential backoff
                print(f"Rate limit hit. Waiting {wait_time:.2f} seconds...")
                time.sleep(wait_time)
                retries += 1
            else:
                print(f"Error: {str(e)}")
                return "An error occurred while processing your request."
    
    return "Maximum retries exceeded. Please try again later."

# Test with a simple prompt
result = safe_llm_call(llm, "Tell me a short joke about programming")
print(result)
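
The function above returns an error string when it gives up, which makes failures hard to tell apart from real answers. As an alternative sketch, the variant below re-raises once retries are exhausted and accepts an injectable `sleep` function so the backoff schedule can be tested without waiting. `RateLimitError` here is a hypothetical stand-in exception, not the class raised by any particular provider.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the error to the caller
            # Wait about 1s, 2s, 4s, ... plus up to 1s of random jitter
            sleep(2 ** attempt + random.random())
    raise RuntimeError("max_retries must be at least 1")


# Demo: a function that fails twice with a rate limit, then succeeds
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] in (1, 2):
        raise RateLimitError("rate limit exceeded")
    return "ok"


waits: list[float] = []
print(call_with_backoff(flaky, sleep=waits.append))  # records waits instead of sleeping
print([round(w, 2) for w in waits])
```

The jitter spreads out retries from many clients that hit the limit at the same moment, so they do not all retry in lockstep.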

### 6.3 Evaluating LLM Responses

In [None]:
from langchain.evaluation import load_evaluator

# Create an evaluator
# "correctness" is judged against a reference answer, so use the labeled variant
evaluator = load_evaluator("labeled_criteria", criteria="correctness", llm=llm)

# Define the question, a prediction, and a reference answer
question = "What is the capital of France?"
prediction = "The capital of France is Paris."
reference = "Paris is the capital city of France."

# Evaluate the prediction
evaluation_result = evaluator.evaluate_strings(
    input=question,
    prediction=prediction,
    reference=reference
)

print("Evaluation Result:", evaluation_result)

## 7. Project Implementation

Let's put everything together to build a complete project: a document Q&A chatbot with memory and sources.

In [None]:
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

class DocumentQASystem:
    def __init__(self, document_path, model_name="gpt-3.5-turbo"):
        # Initialize the language model; the handler streams tokens to stdout
        self.llm = ChatOpenAI(
            model_name=model_name,
            temperature=0,
            streaming=True,
            callbacks=[StreamingStdOutCallbackHandler()]
        )
        
        # Load and process documents
        self.documents = self._load_documents(document_path)
        self.vectorstore = self._create_vectorstore(self.documents)
        
        # Set up memory and chain
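        self.memory = ConversationBufferMemory(
            memory_key="chat_history",
            return_messages=True,
            # The chain returns both "answer" and "source_documents";
            # tell the memory which one to store
            output_key="answer"
        )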
        
        self.qa_chain = ConversationalRetrievalChain.from_llm(
            llm=self.llm,
            retriever=self.vectorstore.as_retriever(),
            memory=self.memory,
            return_source_documents=True
        )
    
    def _load_documents(self, document_path):
        # Load the document
        loader = TextLoader(document_path)
        documents = loader.load()
        
        # Split the text into chunks
        text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
        return text_splitter.split_documents(documents)
    
    def _create_vectorstore(self, documents):
        # Create embeddings and store in vector database
        embeddings = OpenAIEmbeddings()
        return Chroma.from_documents(documents, embeddings)
    
    def ask(self, question):
        # Get response from QA chain
        result = self.qa_chain.invoke({"question": question})
        
        # Extract answer and sources
        answer = result["answer"]
        source_docs = result["source_documents"]
        
        # Format sources
        sources = []
        for i, doc in enumerate(source_docs):
            source = f"Source {i+1}: " + doc.page_content[:100] + "..."
            sources.append(source)
        
        return {
            "answer": answer,
            "sources": sources
        }

# Create a sample document
with open("knowledge_base.txt", "w") as f:
    f.write("""
LangChain is a framework for developing applications powered by language models.
It enables applications that are context-aware and reason-driven.

LangChain consists of several core modules:
1. Models: Interfaces to language models from various providers
2. Prompts: Templates and techniques for effective prompting
3. Memory: State persistence between chain or agent calls
4. Indexes: Techniques to structure documents for efficient retrieval
5. Chains: Sequences of operations for common tasks
6. Agents: LLMs that can use tools based on user inputs

One of the key advantages of LangChain is its flexibility. Developers can use the entire framework or just the needed components.

LangChain applications can connect to various data sources including APIs, databases, and file systems.

The framework supports both Python and JavaScript/TypeScript languages.

LangChain was created by Harrison Chase and has gained significant popularity in the AI developer community.

Recent updates to LangChain have improved its documentation, added more integrations, and enhanced its agent capabilities.

To get started with LangChain, users typically install the package, set up their API keys, and begin with simple chains before moving to more complex applications.
"""
    )

# Initialize our QA system
qa_system = DocumentQASystem("knowledge_base.txt")

# Test with some questions (each answer streams to stdout as it is generated)
print("\nQuestion 1: What is LangChain?")
response = qa_system.ask("What is LangChain?")
print("\nSources:")
for source in response["sources"]:
    print(f"- {source}")

print("\nQuestion 2: What are the core modules of LangChain?")
response = qa_system.ask("What are the core modules of LangChain?")
print("\nSources:")
for source in response["sources"]:
    print(f"- {source}")

print("\nQuestion 3: Who created LangChain?")
response = qa_system.ask("Who created LangChain?")
print("\nSources:")
for source in response["sources"]:
    print(f"- {source}")

## Conclusion

In this comprehensive course, we've covered the fundamentals of LangChain and how to use it to build powerful LLM-powered applications. We started with the basics of LangChain components and gradually moved to more complex applications.

Here's what we've learned:

1. **LangChain Components**:
   - Models (LLMs, Chat models)
   - Prompts and prompt engineering
   - Memory for maintaining context
   - Chains for combining operations
   - Agents and tools for complex reasoning

2. **Building Applications**:
   - Question-answering systems
   - Chatbots with memory
   - Document analysis and summarization
   - Multi-tool agents

3. **Best Practices**:
   - Effective prompt engineering
   - Error handling and rate limiting
   - Evaluation of LLM outputs

### Next Steps

To continue your LangChain journey:

1. Explore the [official LangChain documentation](https://python.langchain.com/docs/get_started/introduction)
2. Join the LangChain Discord community
3. Experiment with different models (Anthropic, Hugging Face, etc.)
4. Build more complex applications with custom agents and tools

Remember that LangChain is rapidly evolving, so keep an eye on updates and new features!