# LangChain & LangGraph Practice Notebook

This notebook is for hands-on practice with LangChain and LangGraph concepts.


In [1]:
## Enabling LangSmith tracing
import os
import getpass

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGCHAIN_PROJECT"] = "RAG-Assignment"

# Verify setup
print("LangSmith tracing enabled:", os.getenv("LANGCHAIN_TRACING_V2", "false"))
print("Project name:", os.getenv("LANGCHAIN_PROJECT", "Not set"))

LangSmith tracing enabled: true
Project name: RAG-Assignment


In [2]:
## Setup and Imports
from langgraph.graph import START, StateGraph
from typing import TypedDict
from langchain_core.documents import Document

class State(TypedDict):
  question: str
  context: list[Document]
  answer: str

In [3]:
# NOTE: We'll be using an async loader during document ingestion, but our Jupyter kernel
# is already running in an async loop! This means we'll want the ability to *nest* async loops.
import nest_asyncio
nest_asyncio.apply()

In [4]:
# Load our documents through the PyMuPDFLoader
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.document_loaders import PyMuPDFLoader

directory_loader = DirectoryLoader("data", glob="**/*.pdf", loader_cls=PyMuPDFLoader)
ai_usage_knowledge_resources = directory_loader.load()

# Check if the document was loaded
ai_usage_knowledge_resources[0].page_content[:1000] # show the first 1000 characters of the first page of the document

'NBER WORKING PAPER SERIES\nHOW PEOPLE USE CHATGPT\nAaron Chatterji\nThomas Cunningham\nDavid J. Deming\nZoe Hitzig\nChristopher Ong\nCarl Yan Shan\nKevin Wadman\nWorking Paper 34255\nhttp://www.nber.org/papers/w34255\nNATIONAL BUREAU OF ECONOMIC RESEARCH\n1050 Massachusetts Avenue\nCambridge, MA 02138\nSeptember 2025\nWe acknowledge help and comments from Joshua Achiam, Hemanth Asirvatham, Ryan \nBeiermeister, Rachel Brown, Cassandra Duchan Solis, Jason Kwon, Elliott Mokski, Kevin Rao, \nHarrison Satcher, Gawesha Weeratunga, Hannah Wong, and Analytics & Insights team. We \nespecially thank Tyna Eloundou and Pamela Mishkin who in several ways laid the foundation for \nthis work. This study was approved by Harvard IRB (IRB25-0983). A repository containing all \ncode run to produce the analyses in this paper is available on request. The views expressed herein \nare those of the authors and do not necessarily reflect the views of the National Bureau of \nEconomic Research.\nAt least one c

In [5]:
"""
Chunking: we'll use `RecursiveCharacterTextSplitter` for the text splitting.
It will split based on the following rules:

  - Each chunk has a maximum size of 750 tokens, measured with `tiktoken_len` (this matches `chunk_size=750` below)
  - It will try to split first on `\n\n`, then on `\n`, then on the `<SPACE>` character, and finally on individual characters.
"""
import tiktoken
from langchain.text_splitter import RecursiveCharacterTextSplitter


def tiktoken_len(text):
    # Using cl100k_base encoding which is a good general-purpose tokenizer
    # This works well for estimating token counts even with Ollama models
    tokens = tiktoken.get_encoding("cl100k_base").encode(
        text,
    )
    return len(tokens)


text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=750,
    chunk_overlap=0,
    length_function=tiktoken_len,
)

In [6]:
ai_usage_knowledge_chunks = text_splitter.split_documents(ai_usage_knowledge_resources)
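
To make the splitting rules above concrete, here is a minimal pure-Python sketch of the recursive idea. It is a simplification for illustration, not LangChain's implementation: `recursive_split` and `sample` are hypothetical names, there is no overlap handling, and length is measured in characters by default (the `length` argument plays the role of `length_function`).

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", ""), length=len):
    """Split on the coarsest separator, recurse into oversized pieces, then merge."""
    if length(text) <= chunk_size:
        return [text]

    sep, rest = separators[0], separators[1:]
    # The empty separator means "fall back to individual characters"
    pieces = list(text) if sep == "" else text.split(sep)

    # Break up any piece that is still too long using the finer separators
    small = []
    for piece in pieces:
        if length(piece) > chunk_size and rest:
            small.extend(recursive_split(piece, chunk_size, rest, length))
        else:
            small.append(piece)

    # Greedily merge neighbouring pieces back together, up to chunk_size
    chunks, current = [], ""
    for piece in small:
        candidate = piece if not current else current + sep + piece
        if length(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
toy_chunks = recursive_split(sample, chunk_size=12)
print(toy_chunks)  # ['alpha beta', 'gamma delta', 'epsilon', 'zeta']
```

The middle paragraph is too long for one chunk, so it alone is split further on spaces; the paragraph boundaries are respected everywhere else.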

In [7]:
"""
We'll be using Ollama's `embeddinggemma` model as our embedding model today! This is a powerful open-source embedding model that runs locally.
"""
from langchain_ollama import OllamaEmbeddings

# Using embeddinggemma which is a powerful open-source embedding model
embedding_model = OllamaEmbeddings(model="embeddinggemma:latest")
embedding_dim = 768

In [8]:
"""
Adding a vector DB: Qdrant, in our case. LangChain uses the term "Vector Store" or "Vector Library" for this.

We are going to use an "in-memory" Qdrant client, which means that our vectors will be held in our system's memory (RAM).
This is useful for prototyping and development at smaller scales, but would need to be modified when moving to production.
Luckily for us, this modification is trivial!
"""
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

client = QdrantClient(":memory:")


In [9]:
"""
Next, we need to create a collection - a collection is a specific...collection of vectors within the Qdrant client.
These are useful as they allow us to create multiple different "warehouses" in a single client, which can be leveraged for personalization and more!
Also notice that we define what our vector shapes are (embedding dim) as well as our desired distance metric.
"""
client.create_collection(
  collection_name="ai_usage_knowledge_index", 
  vectors_config=VectorParams(size=embedding_dim, distance=Distance.COSINE)
)



True

In [10]:
"""
Now we can assemble our vector database! Notice that we provide our client, our created collection, and our embedding model!
"""
vector_store = QdrantVectorStore(
  client=client,
  collection_name="ai_usage_knowledge_index",
  embedding=embedding_model,
)

In [11]:
"""
Now that we have our vector database set-up, we can add our documents into it!
"""
_ = vector_store.add_documents(documents=ai_usage_knowledge_chunks)

#### Creating a Retriever

Now that we have an idea of how we're getting our most relevant information, let's see how we could create a pipeline that automatically extracts the chunks closest to our query and uses them as context for our prompt!

This will involve a popular LangChain vector store method, `as_retriever`!

> NOTE: We can still specify how many documents we wish to retrieve per query.
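
As a rough picture of what "closest chunks" means, the sketch below ranks a toy index by cosine similarity and keeps the top `k`. It is an illustration only, not how Qdrant searches internally: `cosine_similarity`, `top_k`, and the three-dimensional vectors in `toy_index` are made up for the example (the real embeddings here have 768 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical (text, embedding) pairs standing in for stored chunks
toy_index = [
    ("chunk about work usage", [0.9, 0.1, 0.0]),
    ("chunk about coding", [0.1, 0.9, 0.0]),
    ("chunk about cooking", [0.0, 0.1, 0.9]),
]

toy_results = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(toy_results)  # ['chunk about work usage', 'chunk about coding']
```

The `k` argument here plays the same role as `search_kwargs={"k": 5}` on the retriever: it caps how many chunks come back for a single query.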

In [12]:
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

In [13]:
retriever.invoke("How do people use AI in their daily work?")

[Document(metadata={'producer': 'macOS Version 15.4.1 (Build 24E263) Quartz PDFContext, AppendMode 1.1', 'creator': 'LaTeX with hyperref', 'creationdate': '2025-09-12T20:05:32+00:00', 'source': 'data/howpeopleuseai.pdf', 'file_path': 'data/howpeopleuseai.pdf', 'total_pages': 64, 'format': 'PDF 1.6', 'title': 'How People Use ChatGPT', 'author': '', 'subject': '', 'keywords': '', 'moddate': '2025-09-15T10:32:36-04:00', 'trapped': '', 'modDate': "D:20250915103236-04'00'", 'creationDate': 'D:20250912200532Z', 'page': 34, '_id': '86b8ff6273a145468f7b43d47f3a6f91', '_collection_name': 'ai_usage_knowledge_index'}, page_content='Panel A. Work Related\nPanel B1. Asking.\nPanel B2. Doing.\nFigure 23: (continued on next page)\n33'),
 Document(metadata={'producer': 'macOS Version 15.4.1 (Build 24E263) Quartz PDFContext, AppendMode 1.1', 'creator': 'LaTeX with hyperref', 'creationdate': '2025-09-12T20:05:32+00:00', 'source': 'data/howpeopleuseai.pdf', 'file_path': 'data/howpeopleuseai.pdf', 'total_

#### Creating the Node

We're finally ready to create our node!

In [14]:
def retrieve(state: State) -> State:
  retrieved_docs = retriever.invoke(state["question"])
  return { "context": retrieved_docs }
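
Note that `retrieve` returns only the `context` key rather than a full `State`. By default LangGraph overwrites each returned key in the running state and leaves the others untouched. The pure-Python sketch below mimics that merge under simplifying assumptions: `ToyState`, `apply_update`, and `toy_retrieve` are hypothetical stand-ins, and the unknown-key check is only a loose analogy for why nodes should return keys declared in the state schema.

```python
from typing import TypedDict


class ToyState(TypedDict, total=False):
    question: str
    context: list[str]
    answer: str


def apply_update(state, update):
    """Merge a node's partial update into the state, rejecting undeclared keys."""
    unknown = set(update) - set(ToyState.__annotations__)
    if unknown:
        raise KeyError(f"unknown state keys: {sorted(unknown)}")
    return {**state, **update}


def toy_retrieve(state):
    # Stand-in for the real retriever call: returns only the key it changes
    return {"context": [f"stub passage for: {state['question']}"]}


toy_state = {"question": "How do people use AI?"}
toy_state = apply_update(toy_state, toy_retrieve(toy_state))
print(toy_state)
```

After the merge, `toy_state` still holds the original `question` and now also carries `context`.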

In [15]:
from langchain_core.prompts import ChatPromptTemplate

HUMAN_TEMPLATE = """
#CONTEXT:
{context}

QUERY:
{query}

Use the provided context to answer the provided user query. Only use the provided context to answer the query. If you do not know the answer, or it's not contained in the provided context respond with "I don't know"
"""

chat_prompt = ChatPromptTemplate.from_messages([("human", HUMAN_TEMPLATE)])

In [16]:
chat_prompt.invoke({"context": "OUR CONTEXT HERE", "query": "OUR QUERY HERE"}).messages[0].content

'\n#CONTEXT:\nOUR CONTEXT HERE\n\nQUERY:\nOUR QUERY HERE\n\nUse the provided context to answer the provided user query. Only use the provided context to answer the query. If you do not know the answer, or it\'s not contained in the provided context respond with "I don\'t know"\n'

In [17]:
from langchain_ollama import ChatOllama

ollama_chat_model = ChatOllama(model="gpt-oss:20b", temperature=0.6)

In [18]:
ollama_chat_model.invoke(chat_prompt.invoke({"context": "Paris is the capital of France", "query": "What is the capital of France?"}))

AIMessage(content='Paris', additional_kwargs={}, response_metadata={'model': 'gpt-oss:20b', 'created_at': '2025-09-22T21:06:50.172157Z', 'done': True, 'done_reason': 'stop', 'total_duration': 32457728708, 'load_duration': 27535166958, 'prompt_eval_count': 132, 'prompt_eval_duration': 3301341916, 'eval_count': 47, 'eval_duration': 1620068917, 'model_name': 'gpt-oss:20b'}, id='run--5ca55fe4-251d-4b47-a779-6e4fdeeb427b-0', usage_metadata={'input_tokens': 132, 'output_tokens': 47, 'total_tokens': 179})

In [19]:
from langchain_core.output_parsers import StrOutputParser

generator_chain = chat_prompt | ollama_chat_model | StrOutputParser()

generator_chain.invoke(
    {
        "context": "Paris is the capital of France",
        "query": "What is the capital of France?",
    }
)


'Paris'
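
The `|` operator in the cell above chains steps so that each step's output becomes the next step's input. A minimal sketch of that composition idea in plain Python follows. The `Step` class and the three toy steps are hypothetical and far simpler than LangChain's runnables; they only show how `__or__` can build a pipeline.

```python
class Step:
    """Wraps a function and supports `a | b` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Run self first, then feed its result into the next step
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy stand-ins for the prompt, the chat model, and the output parser
fill_prompt = Step(lambda inputs: f"CONTEXT: {inputs['context']}\nQUERY: {inputs['query']}")
fake_model = Step(lambda prompt: {"content": "Paris", "prompt_chars": len(prompt)})
parse_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | parse_text
toy_answer = toy_chain.invoke(
    {"context": "Paris is the capital of France", "query": "What is the capital of France?"}
)
print(toy_answer)  # Paris
```

The parser step is what turns the message-like object into a bare string, which is why the chained call returns `'Paris'` instead of a full `AIMessage`.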

In [20]:
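def generate(state: State) -> State:
    generator_chain = chat_prompt | ollama_chat_model | StrOutputParser()
    # Join the retrieved chunks into plain text so the prompt sees passages, not Document reprs
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    response = generator_chain.invoke(
        {
            "query": state["question"],
            "context": docs_content,
        }
    )
    # The key must match the `answer` field declared in State
    return { "answer": response }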

In [21]:
graph_builder = StateGraph(State)

In [22]:
graph_builder.add_sequence([retrieve, generate])

<langgraph.graph.state.StateGraph at 0x1680a7e60>

In [23]:
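# Edges refer to nodes by name. add_sequence registered each function under its
# function name, so the entry edge must target "retrieve". Passing the function object
# instead makes compile() fail with:
# ValueError: Found edge ending at unknown node `<function retrieve at 0x127f22840>`
graph_builder.add_edge(START, "retrieve")

<langgraph.graph.state.StateGraph at 0x1680a7e60>

In [24]:
graph = graph_builder.compile()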
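
Edges in the graph are looked up by node name, so an edge that points at a function object cannot be matched to any registered node. The toy builder below illustrates that lookup under simplifying assumptions. `TinyGraph`, `TOY_START`, and the two stub nodes are hypothetical; this is not LangGraph's implementation, only a sketch of name-keyed nodes, compile-time edge validation, and sequential execution.

```python
TOY_START = "__start__"


class TinyGraph:
    def __init__(self):
        self.nodes = {}   # node name -> function
        self.edges = []   # (source, target) pairs

    def add_sequence(self, funcs):
        # Register each function under its own name and chain them in order
        names = [func.__name__ for func in funcs]
        for name, func in zip(names, funcs):
            self.nodes[name] = func
        self.edges.extend(zip(names, names[1:]))
        return self

    def add_edge(self, source, target):
        self.edges.append((source, target))
        return self

    def compile(self):
        # Every edge target must be a registered node name
        for _, target in self.edges:
            if target not in self.nodes:
                raise ValueError(f"Found edge ending at unknown node `{target}`")
        return self

    def invoke(self, state):
        successors = dict(self.edges)
        current = successors.get(TOY_START)
        while current is not None:
            state = {**state, **self.nodes[current](state)}
            current = successors.get(current)
        return state


def retrieve_stub(state):
    return {"context": ["stub passage"]}


def generate_stub(state):
    return {"answer": f"{len(state['context'])} passage(s) used"}


# Passing the function object: the target is not a registered name
bad_graph = TinyGraph().add_sequence([retrieve_stub, generate_stub])
bad_graph.add_edge(TOY_START, retrieve_stub)
try:
    bad_graph.compile()
    bad_error = None
except ValueError as exc:
    bad_error = str(exc)

# Passing the node name: compiles and runs both nodes in order
good_graph = TinyGraph().add_sequence([retrieve_stub, generate_stub])
good_graph.add_edge(TOY_START, "retrieve_stub").compile()
final_state = good_graph.invoke({"question": "How do people use AI?"})
print(final_state["answer"])  # 1 passage(s) used
```

In the real notebook the equivalent fix is to target the string `"retrieve"` when adding the entry edge, on a freshly built `StateGraph` so the earlier invalid edge is not carried over.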