[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain-academy/blob/main/session-4/why-langchain-and-langgraph.ipynb) [![Open in LangChain Academy](https://cdn.prod.website-files.com/65b8cd72835ceeacd4449a53/66e9eba12c7b7688aa3dbb5e_LCA-badge-green.svg)](https://academy.langchain.com/courses/take/intro-to-langgraph/lessons/58239974-lesson-4-why-langchain-and-langgraph)

# Why LangChain and LangGraph?

## 🎯 Learning Goals

This notebook demonstrates the **evolution from raw LLM calls to sophisticated AI workflows**:

1. **Plain LLM Calls** → **LangChain**: See how LangChain simplifies common patterns
2. **LangChain** → **LangGraph**: Understand when you need stateful, multi-step workflows
3. **Real-world Examples**: Document Q&A, Research Assistant, and Multi-Agent Systems

## 🧠 The Problem

As AI applications grow more complex, developers face increasing challenges:

- **Repetitive boilerplate code** for common patterns
- **No standardization** across teams and projects
- **Complex state management** for multi-step workflows
- **Difficulty in testing, debugging, and monitoring** AI pipelines

Let's see how LangChain and LangGraph solve these problems step by step.

In [1]:
%%capture --no-stderr
%pip install --quiet -U langgraph langchain_openai langchain_community langchain_core tavily-python wikipedia faiss-cpu scikit-learn python-dotenv

## Setup

In [2]:
# Load environment variables from .env file
from dotenv import load_dotenv
import os

# Load the .env file
load_dotenv()

# Verify the keys are loaded (optional - remove in production)
print("✅ Environment variables loaded:")
print(f"OPENAI_API_KEY: {'✓ Set' if os.environ.get('OPENAI_API_KEY') else '✗ Missing'}")


✅ Environment variables loaded:
OPENAI_API_KEY: ✓ Set


In [3]:
from langchain_openai import ChatOpenAI
import openai

# We'll use both direct OpenAI client and LangChain for comparison
openai_client = openai.OpenAI()
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# 📚 Example 1: Plain LLM Call vs LangChain

## 🧠 Task: Document Q&A
"Answer a question about a document uploaded by the user."

This is one of the most common AI use cases. Let's see how the approach evolves from a naive implementation to LangChain.

## ⚙️ A. Plain LLM Call (Naive Implementation)

### 😫 What you have to do manually:

In [4]:
# First, let's create some sample documents
sample_documents = [
    "LangChain is a framework for developing applications powered by language models. It enables applications that are data-aware and agentic.",
    "LangGraph is a library for building stateful, multi-actor applications with LLMs. It extends LangChain with the ability to coordinate multiple chains across multiple steps.",
    "Vector databases store high-dimensional vectors and allow for efficient similarity search. They are essential for RAG (Retrieval Augmented Generation) applications.",
    "RAG combines the power of retrieval systems with generative models. It retrieves relevant documents and uses them as context for generating responses."
]

print("Sample documents loaded:")
for i, doc in enumerate(sample_documents, 1):
    print(f"{i}. {doc[:50]}...")

Sample documents loaded:
1. LangChain is a framework for developing applicatio...
2. LangGraph is a library for building stateful, mult...
3. Vector databases store high-dimensional vectors an...
4. RAG combines the power of retrieval systems with g...


In [5]:
# Naive implementation - lots of manual work!
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class NaiveDocumentQA:
    def __init__(self, documents):
        self.documents = documents
        # Manual embedding with TF-IDF (not even proper embeddings!)
        self.vectorizer = TfidfVectorizer()
        self.doc_vectors = self.vectorizer.fit_transform(documents)
    
    def answer_question(self, question):
        # Manual retrieval
        question_vector = self.vectorizer.transform([question])
        similarities = cosine_similarity(question_vector, self.doc_vectors)
        best_doc_idx = np.argmax(similarities)
        
        # Manual context formatting
        context = self.documents[best_doc_idx]
        
        # Manual prompt construction
        prompt = f"""Answer the question based on the context below.
        
Context: {context}

Question: {question}

Answer:"""
        
        # Manual LLM call
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0
        )
        
        return response.choices[0].message.content

# Test the naive implementation
naive_qa = NaiveDocumentQA(sample_documents)
result = naive_qa.answer_question("What is LangGraph?")
print("Naive Implementation Result:")
print(result)

Naive Implementation Result:
LangGraph is a library designed for building stateful, multi-actor applications using large language models (LLMs). It enhances LangChain by enabling the coordination of multiple chains across various steps.


### 😫 Pain Points with Naive Implementation:

1. **Manual glue logic** - You write all the embedding, retrieval, and formatting code
2. **No prompt templating** - Hard-coded strings everywhere
3. **Poor embeddings** - Using TF-IDF instead of proper semantic embeddings
4. **No memory, retry, logging** - Zero observability or error handling
5. **Not modular** - Hard to swap components or test individual parts
6. **No optimization** - No chunk sizing, overlap, or retrieval tuning
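
Pain point 6 is easy to underestimate, so here is a minimal sketch of what "chunk sizing and overlap" means in practice. The function name `chunk_text` and its default sizes are assumptions made for illustration; they are not part of the notebook or of any library. It slides a fixed-size character window over the text so that neighbouring chunks share some context.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of at most `chunk_size` characters.

    Consecutive windows share `overlap` characters, so a sentence cut at a
    chunk boundary still appears whole in one of the two neighbours.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical usage on a short string
for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
    print(chunk)
```

In the naive class above you would have to write, tune, and test this yourself before indexing any real document.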

## ⚙️ B. LangChain Version

### ✅ Let's see how LangChain simplifies this:

In [6]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain_core.documents import Document

# Convert strings to Document objects
docs = [Document(page_content=doc) for doc in sample_documents]

# Create embeddings and vector store - two lines!
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)

# Create QA chain - just a few lines!
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Ask the same question
result = qa_chain.invoke({"query": "What is LangGraph?"})["result"]
print("LangChain Implementation Result:")
print(result)

LangChain Implementation Result:
LangGraph is a library for building stateful, multi-actor applications with large language models (LLMs). It extends LangChain by enabling the coordination of multiple chains across multiple steps.


### ✅ Benefits of LangChain:

1. **Powerful Abstractions**: `RetrievalQA`, `Retriever`, `VectorStore`
2. **Modular Design**: Easy to swap LLMs, embeddings, or vector stores
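3. **Built-in Best Practices**: Semantic embeddings, a default prompt template for the `stuff` chain, and configurable retries on model calls
4. **Observability**: Callback hooks for logging and tracing, with optional LangSmith integration for evaluation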
5. **Community**: Pre-built integrations with 100+ services
6. **Maintenance**: Framework handles updates and optimizations

**🧠 Why it's intuitive:** Everyone has tried to make LLMs answer questions about docs. LangChain shows how a handful of lines replace a hand-written pipeline.
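
To make the "Modular Design" benefit concrete, here is a plain-Python sketch of the idea behind it: components sit behind small interfaces, so any of them can be swapped or faked. Everything in it (`Retriever`, `KeywordRetriever`, `answer_question`, `echo_llm`) is a hypothetical name invented for this illustration and is not LangChain's actual API.

```python
import re
from typing import Callable, List, Protocol, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"\w+", text.lower()))


class Retriever(Protocol):
    """Anything that can return the k most relevant documents for a query."""

    def retrieve(self, query: str, k: int = 1) -> List[str]: ...


class KeywordRetriever:
    """Hypothetical retriever that ranks documents by shared words."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        query_words = tokenize(query)
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_words & tokenize(doc)),
            reverse=True,
        )
        return ranked[:k]


PROMPT_TEMPLATE = (
    "Answer the question based on the context below.\n\n"
    "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
)


def answer_question(question: str, retriever: Retriever, llm: Callable[[str], str]) -> str:
    """Glue code that depends only on the interfaces, not on concrete classes."""
    context = "\n".join(retriever.retrieve(question))
    return llm(PROMPT_TEMPLATE.format(context=context, question=question))


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call so the pipeline can be tested offline."""
    return prompt


docs = [
    "LangChain is a framework for developing applications powered by language models.",
    "LangGraph is a library for building stateful, multi-actor applications with LLMs.",
]
print(answer_question("What is LangGraph?", KeywordRetriever(docs), echo_llm))
```

Swapping `KeywordRetriever` for a vector-store-backed class, or `echo_llm` for a real model call, would not touch `answer_question`. LangChain applies the same principle with its own retriever and model interfaces.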

# 🎯 Summary: The Evolution

## 📈 Complexity vs Capability

| Approach | Complexity | Capabilities | Best For |
|----------|------------|-------------|----------|
| **Plain LLM** | 🟢 Low | 🔴 Limited | Simple, one-off queries |
| **LangChain** | 🟡 Medium | 🟡 Good | Standard patterns, moderate complexity |
| **LangGraph** | 🔴 High | 🟢 Excellent | Complex workflows, multi-step processes |

## 🚀 When to Use What?

### Use **Plain LLM Calls** when:
- Simple, one-time queries
- Prototyping and experimentation
- Full control over every aspect
- Minimal dependencies

### Use **LangChain** when:
- Document Q&A, summarization, translation
- Need standard patterns and abstractions
- Want community integrations
- Linear workflows are sufficient

### Use **LangGraph** when:
- Multi-step, conditional workflows
- Need state management between steps
- Human-in-the-loop requirements
- Multiple agents working together
- Error recovery and retry logic
- Complex business processes
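
The LangGraph criteria above (shared state, conditional steps, retries) can be sketched in plain Python to show the shape of the problem. This is not LangGraph's API: the node functions, the `END` sentinel, and `run_graph` are assumptions made for illustration, and the "LLM call" is a stub that fails on its first attempt.

```python
from typing import Callable, Dict, TypedDict

END = "__end__"  # hypothetical sentinel marking the end of the workflow
MAX_ATTEMPTS = 3


class State(TypedDict):
    question: str
    draft: str
    attempts: int
    answer: str


def generate(state: State) -> State:
    """Stand-in for an LLM call; the first attempt deliberately returns nothing."""
    attempts = state["attempts"] + 1
    draft = "" if attempts == 1 else f"Answer to: {state['question']}"
    return {**state, "draft": draft, "attempts": attempts}


def finalize(state: State) -> State:
    """Publish the draft, or a fallback message if every attempt failed."""
    return {**state, "answer": state["draft"] or "Sorry, no answer was produced."}


def route_after_generate(state: State) -> str:
    """Conditional edge: retry on an empty draft until the attempt budget runs out."""
    if not state["draft"] and state["attempts"] < MAX_ATTEMPTS:
        return "generate"
    return "finalize"


NODES: Dict[str, Callable[[State], State]] = {"generate": generate, "finalize": finalize}
ROUTERS: Dict[str, Callable[[State], str]] = {
    "generate": route_after_generate,
    "finalize": lambda state: END,
}


def run_graph(entry: str, state: State, max_steps: int = 20) -> State:
    """Run nodes one at a time, letting each router pick the next node."""
    current, steps = entry, 0
    while current != END:
        if steps >= max_steps:
            raise RuntimeError("workflow did not finish within max_steps")
        state = NODES[current](state)
        current = ROUTERS[current](state)
        steps += 1
    return state


initial: State = {"question": "What is LangGraph?", "draft": "", "attempts": 0, "answer": ""}
final = run_graph("generate", initial)
print(final["attempts"], final["answer"])
```

Once a workflow needs this kind of loop, hand-rolled routing tables get hard to maintain. That is the point at which a graph library with explicit state and conditional edges, such as LangGraph's `StateGraph`, starts to pay off.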

# 🛠️ Next Steps

## Try It Yourself:

1. **Start with LangChain**: Build a document Q&A system for your domain
2. **Add Complexity**: When you need multi-step workflows, explore LangGraph
3. **Production**: Add monitoring, error handling, and human oversight

## Resources:

- [LangChain Documentation](https://docs.langchain.com/)
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [LangChain Academy](https://academy.langchain.com/)
- [Example Applications](https://github.com/langchain-ai/langchain/tree/master/templates)

**Remember**: Choose the right tool for the job. Sometimes a simple LLM call is all you need!