# Practical LangChain Course for Beginners: Real-life Examples (Python)

## Table of Contents

- Introduction to LangChain
- Setting Up Your Environment
- LangChain Core Concepts
- Working with LLMs
- Prompt Engineering with LangChain
- Building Conversational Agents
- Document Processing and Retrieval
- Real-life Example: Building a Personal Knowledge Base
- Real-life Example: Creating a Customer Support Bot
- Real-life Example: Data Analysis Assistant
- Troubleshooting and Best Practices
- Next Steps and Advanced Concepts

## Introduction to LangChain

### What is LangChain?

LangChain is a framework designed to simplify the development of applications powered by language models. It provides a standard interface for chains, integrations with many external tools, and end-to-end chains for common applications.

### Why Use LangChain?

- **Unified Interface:** Consistent ways to interact with different language models
- **Components:** Reusable modules for working with language models
- **Pre-built Chains:** Standard implementations for common use cases
- **Integration:** Easy connection with external data sources and tools

### Evolution and Current State

LangChain has evolved from a simple tool for chaining prompts into a framework for building complex LLM applications. Recent versions emphasize composability, modularity, and integration with external tools and data sources.


## Setting Up Your Environment

### Installation

In [26]:
# Create a virtual environment (recommended)
#python -m venv langchain-env
#source langchain-env/bin/activate  # On Windows, use: langchain-env\Scripts\activate

# Install LangChain and related packages
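#!pip install langchain openai tiktoken unstructured
# Later sections also need: chromadb (Chroma vector store), pypdf (PyPDFLoader), pandas (data analysis agent)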

In [39]:
#!pip install langchain-community

### API Key Configuration

In [29]:
#!pip install python-dotenv

In [31]:
# Set up API keys (best to use environment variables)
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    raise ValueError("OPENAI_API_KEY is not set. Check your .env file.")
else:
    print("API key loaded successfully!")


API key loaded successfully!


### Testing Your Setup

In [None]:
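# Basic example of using LangChain
# Note: these are the legacy import paths; newer releases move these classes
# into the langchain_community and langchain_openai packages.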
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Initialize the language model
llm = OpenAI(temperature=0.7)

# Create a prompt template
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)

# Create a chain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain
result = chain.run("eco-friendly water bottles")
print(result)

### LangChain Core Concepts

#### Components Architecture

LangChain's modular architecture consists of several key components:

- **Models:** Wrappers around LLMs (OpenAI, Anthropic, etc.)
- **Prompts:** Templates and management for model inputs
- **Indexes:** Tools for structuring document data
- **Chains:** Sequences of operations for specific tasks
- **Agents:** Dynamic chains that use LLMs to determine actions
- **Memory:** State retention between chain runs
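
To make the relationship between these components more concrete, here is a minimal sketch in plain Python with no LangChain imports. The names `render_prompt`, `fake_llm`, and `run_chain` are hypothetical stand-ins, not LangChain APIs; the sketch only illustrates how a prompt, a model, and a chain fit together.

```python
from typing import Callable, Dict


def render_prompt(template: str, variables: Dict[str, str]) -> str:
    """Prompt: fill a template's placeholders with concrete values."""
    return template.format(**variables)


def fake_llm(prompt: str) -> str:
    """Model: a stand-in that echoes the prompt instead of calling an API."""
    return f"[model response to: {prompt}]"


def run_chain(template: str, llm: Callable[[str], str], variables: Dict[str, str]) -> str:
    """Chain: render the prompt, then pass it to the model."""
    return llm(render_prompt(template, variables))


result = run_chain(
    "What is a good name for a company that makes {product}?",
    fake_llm,
    {"product": "eco-friendly water bottles"},
)
print(result)
```

Swapping `fake_llm` for a function that calls a real provider would leave `run_chain` unchanged, which is the kind of decoupling the component list describes.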

#### Data Connections

LangChain provides connectors for various data sources:


In [None]:
# Example of loading data from a text file
from langchain.document_loaders import TextLoader

loader = TextLoader("./data/sample_text.txt")
documents = loader.load()
print(f"Loaded {len(documents)} document(s)")
print(f"Preview: {documents[0].page_content[:100]}...")

#### Understanding Chains

Chains combine multiple components to create end-to-end applications:


In [None]:
# LLMChain is used below, so it must be imported alongside SimpleSequentialChain
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# First chain: Generate a product idea
llm = OpenAI(temperature=0.7)
prompt_template_1 = PromptTemplate(
    input_variables=["subject"],
    template="Generate an innovative product idea related to {subject}.",
)
chain_1 = LLMChain(llm=llm, prompt=prompt_template_1)

# Second chain: Generate marketing copy for the product
prompt_template_2 = PromptTemplate(
    input_variables=["product_idea"],
    template="Write a short marketing description for this product: {product_idea}",
)
chain_2 = LLMChain(llm=llm, prompt=prompt_template_2)

# Combine chains in a sequence
overall_chain = SimpleSequentialChain(chains=[chain_1, chain_2], verbose=True)

# Run the chain
result = overall_chain.run("renewable energy")
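
Conceptually, a simple sequential chain feeds each step's single output into the next step as its single input. The sketch below shows that idea with plain functions; `make_step`, `run_sequential`, and `stub_llm` are hypothetical helpers for illustration, not part of LangChain.

```python
from functools import reduce
from typing import Callable, List

Step = Callable[[str], str]


def make_step(template: str, llm: Step) -> Step:
    """Build a step that formats its single input into a prompt and calls the model."""
    def step(text: str) -> str:
        return llm(template.format(input=text))
    return step


def run_sequential(steps: List[Step], first_input: str) -> str:
    """Feed each step's output into the next step, left to right."""
    return reduce(lambda text, step: step(text), steps, first_input)


def stub_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call: wraps the prompt in angle brackets.
    return f"<{prompt}>"


steps = [
    make_step("idea about {input}", stub_llm),
    make_step("marketing copy for {input}", stub_llm),
]
print(run_sequential(steps, "renewable energy"))
```

The nesting in the printed result makes the data flow visible: the second step's prompt contains the first step's entire output.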

### Working with LLMs

#### Supported Models

LangChain supports various LLM providers:

- OpenAI (GPT models)
- Anthropic (Claude)
- Hugging Face models
- Local LLMs (via wrappers such as LlamaCpp)


In [None]:
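# Using OpenAI (this completion-style wrapper expects a completion model such as
# gpt-3.5-turbo-instruct; chat-only models like gpt-3.5-turbo need a chat wrapper)
from langchain.llms import OpenAI
openai_llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)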

# Using Anthropic
from langchain.llms import Anthropic
anthropic_llm = Anthropic(model="claude-2", temperature=0)

# Using a Hugging Face model
from langchain.llms import HuggingFaceHub
huggingface_llm = HuggingFaceHub(
    repo_id="google/flan-t5-xl", 
    model_kwargs={"temperature": 0.5, "max_length": 64}
)

#### Model Parameters
Key parameters for controlling LLM behavior:

In [None]:
from langchain.llms import OpenAI

# Temperature controls randomness (0 = most deterministic, higher values = more varied output)
deterministic_llm = OpenAI(temperature=0)
creative_llm = OpenAI(temperature=0.9)

# Max tokens controls response length
concise_llm = OpenAI(max_tokens=50)
detailed_llm = OpenAI(max_tokens=500)

# Examples
prompt = "Explain quantum computing"
print("Deterministic:", deterministic_llm(prompt))
print("Creative:", creative_llm(prompt))
print("Concise:", concise_llm(prompt))
print("Detailed:", detailed_llm(prompt))
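
One common way to picture temperature is as a divisor applied to the model's raw token scores before they are turned into probabilities. The sketch below assumes that textbook formulation (softmax with temperature); the logits are made-up numbers, and the exact sampling procedure of any given provider is not specified here.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Temperature 0 is usually treated as "always pick the top token" instead.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.1, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability mass lands on the top-scoring token, which is why low-temperature output looks repeatable; at higher temperature the alternatives become more likely to be sampled.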

#### Streaming Responses
Streaming prints tokens as they are generated, which improves the user experience for longer outputs:

In [None]:
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Initialize LLM with streaming
llm = OpenAI(
    streaming=True, 
    callbacks=[StreamingStdOutCallbackHandler()],
    temperature=0.7
)

# This will stream the response token by token
llm("Write a short story about a robot learning to paint")
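
The callback pattern behind streaming can be sketched without any API call: a producer yields pieces of text, and a handler is invoked for each piece as it arrives. `fake_token_stream` and `consume_stream` are hypothetical names, and splitting on spaces is only a stand-in for real tokenization.

```python
from typing import Callable, Iterator, List


def fake_token_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model: yields one word-sized piece at a time."""
    words = text.split(" ")
    for i, word in enumerate(words):
        # Keep the separating space on every piece except the last one.
        yield word if i == len(words) - 1 else word + " "


def consume_stream(stream: Iterator[str], on_token: Callable[[str], None]) -> str:
    """Call on_token for every piece as it arrives and return the full text."""
    pieces: List[str] = []
    for token in stream:
        on_token(token)
        pieces.append(token)
    return "".join(pieces)


full_text = consume_stream(
    fake_token_stream("A robot picked up a brush"),
    on_token=lambda tok: print(tok, end="", flush=True),
)
print()
```

The handler plays the role that `StreamingStdOutCallbackHandler` plays above: it reacts to each piece immediately, while the caller still receives the complete text at the end.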

### Prompt Engineering with LangChain
#### Prompt Templates
Creating reusable, parameterized prompts:

In [None]:
from langchain.prompts import PromptTemplate

# Simple template with one variable
template = PromptTemplate(
    input_variables=["topic"],
    template="Provide three interesting facts about {topic}."
)

# Generate a prompt
prompt = template.format(topic="deep sea creatures")
print(prompt)

# Multiple variables
multi_template = PromptTemplate(
    input_variables=["product", "audience", "tone"],
    template="Write a {tone} advertisement for {product} targeted at {audience}."
)

ad_prompt = multi_template.format(
    product="smart water bottle",
    audience="fitness enthusiasts",
    tone="energetic"
)
print(ad_prompt)
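
A frequent source of errors with templates is a mismatch between the declared variables and the placeholders in the string. The following stdlib-only sketch shows one way to inspect a format-style template and fail early; `template_variables` and `safe_format` are hypothetical helpers, not LangChain functions.

```python
from string import Formatter
from typing import Dict, List


def template_variables(template: str) -> List[str]:
    """Return the placeholder names in a format-style template, in order of first use."""
    names: List[str] = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def safe_format(template: str, values: Dict[str, str]) -> str:
    """Format the template, raising a clear error if any variable is missing."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return template.format(**values)


ad_template = "Write a {tone} advertisement for {product} targeted at {audience}."
print(template_variables(ad_template))
print(safe_format(ad_template, {
    "tone": "energetic",
    "product": "smart water bottle",
    "audience": "fitness enthusiasts",
}))
```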

#### Few-Shot Learning
Using examples to guide model responses:

In [None]:
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

# Define example formatter
example_formatter_template = """
Input: {query}
Output: {answer}
"""
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_formatter_template
)

# Define examples
examples = [
    {"query": "What is the capital of France?", "answer": "The capital of France is Paris."},
    {"query": "Who wrote Romeo and Juliet?", "answer": "William Shakespeare wrote Romeo and Juliet."},
    {"query": "What is the boiling point of water?", "answer": "The boiling point of water is 100 degrees Celsius."}
]

# Create the few-shot prompt template
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Answer the following questions based on these examples:",
    suffix="Input: {query}\nOutput:",
    input_variables=["query"],
    example_separator="\n\n"
)

# Format the prompt with our query
query = "Who painted the Mona Lisa?"
print(few_shot_prompt.format(query=query))

# Use with an LLM
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
response = llm(few_shot_prompt.format(query=query))
print(response)
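
It can help to see roughly what a few-shot prompt looks like once assembled: a prefix, the formatted examples, and the formatted suffix, joined by the separator. The sketch below builds that string by hand; `build_few_shot_prompt` is a hypothetical helper that approximates the assembly and is not LangChain's implementation.

```python
from typing import Dict, List


def build_few_shot_prompt(
    prefix: str,
    examples: List[Dict[str, str]],
    example_template: str,
    suffix: str,
    separator: str = "\n\n",
    **inputs: str,
) -> str:
    """Join the prefix, each formatted example, and the formatted suffix with the separator."""
    rendered = [example_template.format(**example) for example in examples]
    return separator.join([prefix, *rendered, suffix.format(**inputs)])


prompt_text = build_few_shot_prompt(
    prefix="Answer the following questions based on these examples:",
    examples=[
        {"query": "What is the capital of France?", "answer": "The capital of France is Paris."},
    ],
    example_template="Input: {query}\nOutput: {answer}",
    suffix="Input: {query}\nOutput:",
    query="Who painted the Mona Lisa?",
)
print(prompt_text)
```

Printing the assembled prompt before sending it to a model is a cheap way to check that the examples and the final question follow the same `Input:` / `Output:` pattern.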

#### Output Parsers
Structuring LLM outputs:

In [None]:
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from pydantic import BaseModel, Field
from typing import List

# Define the structure we want
class Movie(BaseModel):
    title: str = Field(description="Title of the movie")
    director: str = Field(description="Director of the movie")
    year: int = Field(description="Year the movie was released")
    genres: List[str] = Field(description="Genres of the movie")

# Create a parser
parser = PydanticOutputParser(pydantic_object=Movie)

# Create a prompt template
template = """
Provide information about a movie based on this description: {description}

{format_instructions}
"""
prompt = PromptTemplate(
    template=template,
    input_variables=["description"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

# Format the prompt
input_text = prompt.format(description="A sci-fi film directed by Christopher Nolan featuring dreams within dreams")

# Get the LLM response
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
output = llm(input_text)

# Parse the output
movie = parser.parse(output)
print(f"Title: {movie.title}")
print(f"Director: {movie.director}")
print(f"Year: {movie.year}")
print(f"Genres: {', '.join(movie.genres)}")
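
The core job of an output parser is to turn free-form model text into a validated object, and to fail loudly when the text does not fit. Below is a stdlib-only sketch of that idea using `json` and a dataclass; `MovieRecord` and `parse_movie` are hypothetical, and the sample string is hand-written, not real model output.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class MovieRecord:
    title: str
    director: str
    year: int
    genres: List[str]


def parse_movie(raw_output: str) -> MovieRecord:
    """Parse the JSON object spanning the first '{' to the last '}' and coerce its fields."""
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    data = json.loads(raw_output[start : end + 1])
    return MovieRecord(
        title=str(data["title"]),
        director=str(data["director"]),
        year=int(data["year"]),  # raises ValueError if the year is not numeric
        genres=[str(genre) for genre in data["genres"]],
    )


# Hand-written sample standing in for a model response.
sample_output = (
    "Here is the movie:\n"
    '{"title": "Inception", "director": "Christopher Nolan", '
    '"year": "2010", "genres": ["Sci-Fi", "Thriller"]}'
)
movie_record = parse_movie(sample_output)
print(movie_record)
```

Tolerating text around the JSON and coercing `"2010"` to an integer reflects the kind of cleanup that is often needed, since models do not always return exactly the requested format.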

### Building Conversational Agents
#### Implementing Memory
Adding context retention to conversations:

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.llms import OpenAI
from langchain.chains import ConversationChain

# Initialize conversation with memory
conversation = ConversationChain(
    llm=OpenAI(temperature=0.7),
    memory=ConversationBufferMemory(),
    verbose=True
)

# First interaction
response1 = conversation.predict(input="Hi there! My name is Alice.")
print(response1)

# Second interaction (the model remembers the name)
response2 = conversation.predict(input="What's my name?")
print(response2)

# Check what's in memory
print(conversation.memory.buffer)
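
Buffer memory is simple at heart: every turn is stored, and the stored transcript is prepended to the next prompt so the model can see earlier turns. The sketch below shows that mechanism in plain Python; `SimpleBufferMemory` and `build_prompt` are hypothetical names, and the `Human:` / `AI:` labels are an assumed transcript format.

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Keeps every (human, ai) turn and renders them as a transcript."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save_turn(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        return "\n".join(f"Human: {human}\nAI: {ai}" for human, ai in self.turns)


def build_prompt(memory: SimpleBufferMemory, user_input: str) -> str:
    """Prepend the stored transcript so the model can see earlier turns."""
    history = memory.buffer
    current = f"Human: {user_input}\nAI:"
    return f"{history}\n{current}" if history else current


memory = SimpleBufferMemory()
memory.save_turn("Hi there! My name is Alice.", "Hello Alice!")
print(build_prompt(memory, "What's my name?"))
```

Because the whole transcript is resent on every turn, the prompt grows with the conversation, which is the main cost of this memory type.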

#### Different Memory Types
LangChain offers various memory implementations:

In [None]:
# Buffer Memory (stores all interactions)
from langchain.memory import ConversationBufferMemory
buffer_memory = ConversationBufferMemory()

# Summary Memory (stores a summary of conversation)
from langchain.memory import ConversationSummaryMemory
from langchain.llms import OpenAI
summary_memory = ConversationSummaryMemory(llm=OpenAI())

# Entity Memory (remembers specific entities)
from langchain.memory import ConversationEntityMemory
entity_memory = ConversationEntityMemory(llm=OpenAI())

# Example with entity memory
from langchain.chains import ConversationChain
conversation = ConversationChain(
    llm=OpenAI(temperature=0.7),
    memory=entity_memory,
    verbose=True
)

response1 = conversation.predict(input="My favorite food is pizza")
response2 = conversation.predict(input="I have a pet dog named Max")
response3 = conversation.predict(input="What's my favorite food? And what's my pet's name?")
print(response3)

#### Tools and Agents
Enabling LLMs to use tools and make decisions:

In [None]:
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI

# Load tools (requires SERPAPI_API_KEY for search)
tools = load_tools(["serpapi", "llm-math"], llm=OpenAI(temperature=0))

# Initialize agent
agent = initialize_agent(
    tools, 
    OpenAI(temperature=0), 
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Run the agent
agent.run("What was the highest grossing movie of 2022, and what is the square root of its worldwide box office gross in billions?")
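
The loop an agent runs can be sketched as: ask the model for the next action, run the matching tool, record the observation, and repeat until the model gives a final answer. In the sketch below the "model" is a scripted function and the tools are toy lambdas; all names and the lookup value are placeholders for illustration, not real data or LangChain APIs.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: each maps a string input to a string observation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "sqrt": lambda arg: str(math.sqrt(float(arg))),
    # Placeholder value, not a real box office figure.
    "lookup": lambda arg: {"box office (billions)": "4.0"}.get(arg, "unknown"),
}


def scripted_policy(observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: picks the next (action, input) from what it has seen so far."""
    if not observations:
        return ("lookup", "box office (billions)")
    if len(observations) == 1:
        return ("sqrt", observations[0])
    return ("final", f"The square root is {observations[1]}")


def run_agent(
    policy: Callable[[List[str]], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate between choosing an action and observing a tool result until done."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, action_input = policy(observations)
        if action == "final":
            return action_input
        observations.append(tools[action](action_input))
    raise RuntimeError("Agent did not finish within max_steps")


print(run_agent(scripted_policy, TOOLS))
```

The `max_steps` cap is a design choice worth keeping in real agents too, since a model that never emits a final answer would otherwise loop indefinitely.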

### Document Processing and Retrieval
#### Document Loading
Loading documents from various sources:

In [None]:
# Text files
from langchain.document_loaders import TextLoader
text_loader = TextLoader("./data/sample.txt")
text_docs = text_loader.load()

# PDFs
from langchain.document_loaders import PyPDFLoader
pdf_loader = PyPDFLoader("./data/sample.pdf")
pdf_docs = pdf_loader.load()

# Web pages
from langchain.document_loaders import WebBaseLoader
web_loader = WebBaseLoader("https://www.example.com")
web_docs = web_loader.load()

# CSV files
from langchain.document_loaders.csv_loader import CSVLoader
csv_loader = CSVLoader("./data/sample.csv")
csv_docs = csv_loader.load()

#### Document Splitting
Breaking documents into manageable chunks:

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Initialize the text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Number of characters per chunk
    chunk_overlap=200,  # Overlap between chunks to maintain context
    length_function=len
)

# Load a document
from langchain.document_loaders import TextLoader
loader = TextLoader("./data/long_document.txt")
document = loader.load()[0]

# Split the document into chunks
chunks = text_splitter.split_documents([document])
print(f"Split into {len(chunks)} chunks")

# Preview the first chunk
print(f"First chunk: {chunks[0].page_content[:100]}...")
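
To see what `chunk_size` and `chunk_overlap` mean in practice, here is a simplified fixed-window splitter. It is deliberately simpler than `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences; `split_with_overlap` is a hypothetical helper used only to show the overlap arithmetic.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)
print([len(chunk) for chunk in chunks])
print(chunks[0][-4:], chunks[1][:4])  # the shared overlap between neighbours
```

With a window of 12 and an overlap of 4, each chunk starts 8 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.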

#### Vector Stores
Creating searchable document embeddings:

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader

# Load document
loader = TextLoader("./data/state_of_the_union.txt")
documents = loader.load()

# Split document into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Create embeddings and store in vector database
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(docs, embeddings)

# Perform a similarity search
query = "What did the president say about healthcare?"
docs = db.similarity_search(query)
print(docs[0].page_content)
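
Under the hood, a similarity search embeds the query and ranks stored chunks by how close their vectors are to it. The sketch below uses word counts as a toy "embedding" and cosine similarity as the distance measure; a real setup would use an embedding model and a vector database, and the sample sentences are invented.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, documents: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


sample_docs = [
    "the plan lowers healthcare costs for families",
    "new roads and bridges will be built",
    "schools will receive more funding",
]
print(similarity_search("what about healthcare costs", sample_docs))
```

Word counts only match exact words, whereas learned embeddings can also match paraphrases; the ranking logic is the part that carries over.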

#### Retrieval QA Chains
Combining document retrieval with LLM question answering:

In [None]:
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator

# Load document and create an index
loader = TextLoader("./data/company_info.txt")
index = VectorstoreIndexCreator().from_loaders([loader])

# Create a question-answering chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=index.vectorstore.as_retriever()
)

# Ask questions
query = "What is the company's mission statement?"
response = qa.run(query)
print(response)
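
The `"stuff"` chain type takes all retrieved chunks and places them into a single prompt together with the question. A minimal sketch of that assembly follows; `build_stuff_prompt` is a hypothetical helper, the instruction wording is illustrative and differs from LangChain's built-in prompt, and the chunks are invented.

```python
from typing import List


def build_stuff_prompt(question: str, retrieved_chunks: List[str]) -> str:
    """Concatenate every retrieved chunk into one context block ahead of the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Invented chunks standing in for retriever results.
retrieved = [
    "Our mission is to make clean water accessible.",
    "The company was founded to reduce plastic waste.",
]
stuff_prompt = build_stuff_prompt("What is the company's mission statement?", retrieved)
print(stuff_prompt)
```

Because everything goes into one prompt, this approach works well for a handful of small chunks but can exceed the model's context window when many or large chunks are retrieved.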

### Real-life Example: Building a Personal Knowledge Base
Let's build a personal knowledge base that can process multiple document types and answer questions:

In [None]:
import os
from langchain.document_loaders import TextLoader, PyPDFLoader, CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

def build_knowledge_base(directory_path):
    # List to store all documents
    all_documents = []
    
    # Process all files in the directory
    for file in os.listdir(directory_path):
        file_path = os.path.join(directory_path, file)
        
        # Skip directories
        if os.path.isdir(file_path):
            continue
            
        # Process based on file extension
        if file.endswith(".txt"):
            loader = TextLoader(file_path)
            all_documents.extend(loader.load())
        elif file.endswith(".pdf"):
            loader = PyPDFLoader(file_path)
            all_documents.extend(loader.load())
        elif file.endswith(".csv"):
            loader = CSVLoader(file_path)
            all_documents.extend(loader.load())
    
    # Split documents into chunks
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200
    )
    split_documents = text_splitter.split_documents(all_documents)
    
    # Create embeddings and vector store
    embeddings = OpenAIEmbeddings()
    vector_store = Chroma.from_documents(
        documents=split_documents,
        embedding=embeddings,
        persist_directory="./knowledge_base_data"
    )
    
    # Create a retrieval QA chain
    qa_chain = RetrievalQA.from_chain_type(
        llm=OpenAI(),
        chain_type="stuff",
        retriever=vector_store.as_retriever()
    )
    
    return qa_chain

# Example usage
if __name__ == "__main__":
    kb = build_knowledge_base("./documents")
    
    # Interactive query loop
    while True:
        query = input("\nAsk a question (or type 'exit' to quit): ")
        if query.lower() == 'exit':
            break
        
        response = kb.run(query)
        print("\nAnswer:", response)

To use this system:

- Create a directory called `documents` in your working directory
- Add various files (text, PDF, CSV) containing your personal knowledge
- Run the script and ask questions about your documents
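
Before building the index, it can be useful to check which loader each file would get. The sketch below mirrors the extension checks in `build_knowledge_base` with a lookup table; unlike the `endswith` checks above, it compares suffixes case-insensitively, so a file such as `Report.PDF` is not silently skipped. `LOADER_BY_SUFFIX` and `plan_loading` are hypothetical names, and the filenames are made up.

```python
from pathlib import Path
from typing import Dict, List

# Maps a lowercase file suffix to the name of the loader used for that file type.
LOADER_BY_SUFFIX: Dict[str, str] = {
    ".txt": "TextLoader",
    ".pdf": "PyPDFLoader",
    ".csv": "CSVLoader",
}


def plan_loading(filenames: List[str]) -> Dict[str, List[str]]:
    """Group filenames by loader name; unsupported types go under 'skipped'."""
    plan: Dict[str, List[str]] = {}
    for name in filenames:
        loader = LOADER_BY_SUFFIX.get(Path(name).suffix.lower(), "skipped")
        plan.setdefault(loader, []).append(name)
    return plan


print(plan_loading(["notes.txt", "Report.PDF", "sales.csv", "photo.png"]))
```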

### Real-life Example: Creating a Customer Support Bot
Let's create a support bot for a fictional product:

In [None]:
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.llms import OpenAI
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Load product documentation
loader = TextLoader("./data/product_manual.txt")
documents = loader.load()

# Split documents
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Create embeddings and vector store
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(docs, embeddings)

# Create a retrieval QA system
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=db.as_retriever()
)

# Define the support bot template
template = """
You are a helpful customer support agent for our product. 
Use the following information to answer the customer's question. 
If you don't know the answer based on the provided information, 
say "I'm not sure about that, but I can connect you with a human agent."

Previous conversation:
{chat_history}

Customer: {input}
AI Assistant:"""

prompt = PromptTemplate(
    input_variables=["chat_history", "input"],
    template=template
)

memory = ConversationBufferMemory(memory_key="chat_history")
conversation = ConversationChain(
    llm=OpenAI(temperature=0.7),
    prompt=prompt,
    memory=memory,
    verbose=True
)

# Function to handle customer queries
def handle_query(query):
    # First try to find information from documentation
    try:
        doc_answer = qa.run(query)
        
        # Use the retrieved information to enhance the conversation
        enhanced_query = f"Based on our product information: {doc_answer}\n\nCustomer query: {query}"
        response = conversation.predict(input=enhanced_query)
    except Exception:
        # If retrieval fails, fall back to conversation only
        response = conversation.predict(input=query)
        
    return response

# Example interaction
if __name__ == "__main__":
    print("Customer Support Bot (type 'exit' to quit)")
    print("=" * 50)
    
    while True:
        user_input = input("\nCustomer: ")
        if user_input.lower() == 'exit':
            break
            
        response = handle_query(user_input)
        print(f"\nSupport Agent: {response}")

For this example, create a `product_manual.txt` file in the `./data` directory with information about your product's features, troubleshooting steps, and FAQs.
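
The fallback logic in `handle_query` can be tested without any API calls by passing the retrieval and conversation steps in as functions. The sketch below is one possible restructuring under that assumption; `answer_with_fallback`, `broken_retriever`, and `echo_model` are hypothetical stand-ins.

```python
from typing import Callable


def answer_with_fallback(
    query: str,
    retrieve: Callable[[str], str],
    converse: Callable[[str], str],
) -> str:
    """Try retrieval first; if it raises, answer from the conversation alone."""
    try:
        doc_answer = retrieve(query)
    except Exception as exc:
        print(f"Retrieval failed ({exc}); falling back to conversation only")
        return converse(query)
    return converse(
        f"Based on our product information: {doc_answer}\n\nCustomer query: {query}"
    )


def broken_retriever(query: str) -> str:
    # Simulates a retrieval outage.
    raise RuntimeError("vector store unavailable")


def echo_model(prompt: str) -> str:
    # Stand-in for the conversation chain: echoes what it was asked.
    return f"reply to: {prompt}"


print(answer_with_fallback("How do I reset the device?", broken_retriever, echo_model))
```

Only the retrieval call sits inside the `try` block here, so an error raised by the conversation step surfaces to the caller instead of triggering a second model call.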

### Real-life Example: Data Analysis Assistant
Let's create an assistant that can help analyze CSV data:

In [None]:
import pandas as pd
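# Note: in newer LangChain releases this helper lives in the separate
# langchain_experimental package rather than langchain.agents.
from langchain.agents import create_pandas_dataframe_agent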
from langchain.llms import OpenAI

def create_data_analysis_assistant(csv_file_path):
    # Load the CSV file into a pandas DataFrame
    df = pd.read_csv(csv_file_path)
    
    # Print basic information about the dataset
    print(f"Loaded CSV with {len(df)} rows and {len(df.columns)} columns")
    print("Columns:", df.columns.tolist())
    
    # Create a pandas DataFrame agent
    agent = create_pandas_dataframe_agent(
        llm=OpenAI(temperature=0),
        df=df,
        verbose=True
    )
    
    return agent

# Example usage
if __name__ == "__main__":
    agent = create_data_analysis_assistant("./data/sales_data.csv")
    
    print("\nData Analysis Assistant Ready!")
    print("You can ask questions about your CSV data.")
    print("Example: 'What is the total revenue?' or 'Show me trends by month'")
    print("=" * 60)
    
    while True:
        question = input("\nYour question (or type 'exit' to quit): ")
        if question.lower() == 'exit':
            break
            
        try:
            response = agent.run(question)
            print("\nResult:", response)
        except Exception as e:
            print(f"Error: {str(e)}")

This assistant can answer questions like:

- "What is the average sales amount?"
- "Which product category had the highest revenue?"
- "Show me a breakdown of sales by region"
- "What was the trend in monthly sales for 2022?"
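
For a sense of what the agent has to compute behind such questions, here is a stdlib-only sketch that answers three of them directly. It uses `csv` in place of pandas so it runs without extra packages; the sample rows and the column names `region`, `category`, and `amount` are invented stand-ins for `./data/sales_data.csv`.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented sample rows; the column names are assumptions about the CSV layout.
SAMPLE_CSV = """region,category,amount
North,Gadgets,120.0
South,Gadgets,80.0
North,Books,50.0
South,Books,160.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# "What is the average sales amount?"
average_amount = mean(float(row["amount"]) for row in rows)

# "Show me a breakdown of sales by region"
sales_by_region = defaultdict(float)
for row in rows:
    sales_by_region[row["region"]] += float(row["amount"])

# "Which product category had the highest revenue?"
revenue_by_category = defaultdict(float)
for row in rows:
    revenue_by_category[row["category"]] += float(row["amount"])
top_category = max(revenue_by_category, key=revenue_by_category.get)

print(average_amount, dict(sales_by_region), top_category)
```

Computing a few answers by hand like this gives a baseline for checking the agent's results, which is worthwhile because the agent writes and executes its own analysis code.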