# 2.5 RAG Question & Answering

![RAG - query pipeline](https://python.langchain.com/assets/images/rag_retrieval_generation-1046a4668d6bb08786ef73c56d4f228a.png)

## Setup

### Install dependencies

In [1]:
%pip install python-dotenv~=1.0 docarray~=0.40.0 pypdf~=5.1 --upgrade --quiet
%pip install chromadb~=0.5.18 sentence-transformers~=3.3 --upgrade --quiet 
%pip install langchain~=0.3.7 langchain_openai~=0.2.6 langchain_community~=0.3.5 --upgrade --quiet

# If running locally, you can do this instead:
#%pip install -r ../requirements.txt


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


### Load environment variables

In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

# If running in Google Colab, you can use this code instead:
# from google.colab import userdata
# os.environ["AZURE_OPENAI_API_KEY"] = userdata.get("AZURE_OPENAI_API_KEY")
# os.environ["AZURE_OPENAI_ENDPOINT"] = userdata.get("AZURE_OPENAI_ENDPOINT")
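Before creating the model clients, it can help to confirm that the credentials were actually loaded. The following is an optional sketch: `missing_env_vars` is a hypothetical helper (not part of python-dotenv), and the two variable names are the ones used in the Colab snippet above. Adjust them if your `.env` file uses different names.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the Colab snippet above (assumed to match your .env file).
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")

def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in `env`."""
    return [name for name in names if not env.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

Running this right after `load_dotenv` surfaces a misconfigured `.env` file early, rather than as an authentication error at the first model call.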

### Setup Chat Model

In [3]:
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
api_version = "2024-10-01-preview"
llm = AzureChatOpenAI(deployment_name="gpt-4o", temperature=0.0, openai_api_version=api_version)
embedding_model = AzureOpenAIEmbeddings(model="text-embedding-3-large", openai_api_version=api_version)

### Setup LangSmith tracing for this notebook

In [4]:
import os

# The API key and related settings are read from the .env file.
# Uncomment the lines below to enable tracing for this notebook.
# my_name = "Totoro"
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_PROJECT"] = f"tokyo24-test-{my_name}"

### Setup path to data 

In [5]:
data_path = "../data"

## Initialize VectorDB

We've discussed `Document Loading` and `Splitting` as well as `Indexing` and `Retrieval` already.

Let's build our vector DB the same way as in chapter 2.3. _If you already have a persisted vector DB, you can skip to "Vector DB - Indexing / Store" below._

### Load docs

In [6]:
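# PyPDFLoader lives in langchain_community (the langchain.document_loaders path is deprecated)
from langchain_community.document_loaders import PyPDFLoader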

# Load PDFs
loaders = [
    PyPDFLoader(f"{data_path}/MachineLearning-Lecture01.pdf"),
    PyPDFLoader(f"{data_path}/MachineLearning-Lecture01.pdf"),  # Listed twice, so Lecture01 chunks are indexed twice
    PyPDFLoader(f"{data_path}/MachineLearning-Lecture03.pdf")
]
docs = []
for loader in loaders:
    docs.extend(loader.load())

### Split docs

In [7]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1500,    # maximum number of characters per chunk
    chunk_overlap=150,  # characters shared between neighbouring chunks
)
splits = text_splitter.split_documents(docs)
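To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch using a fixed-size sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); `sliding_window_chunks` is a hypothetical helper that only illustrates what the two parameters mean.

```python
from typing import List

def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size character windows that overlap by `chunk_overlap`."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

toy_chunks = sliding_window_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(toy_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one. The overlap means that text cut at a chunk boundary still appears whole in at least one chunk, as long as it is no longer than the overlap.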

### Vector DB - Indexing / Store


In [8]:
# Chroma lives in langchain_community (the langchain.vectorstores path is deprecated)
from langchain_community.vectorstores import Chroma

# Optional persist_directory to save the database
persist_directory = './db/chroma-ML-docs/'

vectordb = Chroma.from_documents(
    collection_name="ml_docs",
    documents=splits,
    embedding=embedding_model,
    #persist_directory=persist_directory # Optionally persist the database
)

In [9]:
print(vectordb._collection.count())

161


In [10]:
question = "What are major topics for this class?"
docs = vectordb.similarity_search(question, k=3)  # return the 3 most similar chunks
len(docs)

3
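Conceptually, `similarity_search(question, k=3)` embeds the question and returns the `k` stored chunks whose embeddings are closest to it. The sketch below shows that idea with toy 2-D vectors and cosine similarity. Both are assumptions made for illustration: real embeddings have far more dimensions, and the distance metric the vector store actually uses depends on how the collection is configured.

```python
import math
from typing import List, Sequence, Tuple

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: Sequence[float], vectors: Sequence[Sequence[float]], k: int) -> List[Tuple[int, float]]:
    """Return (index, score) pairs of the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 2-D "embeddings" standing in for the stored chunks.
toy_vectors = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (-1.0, 0.0)]
toy_query = (1.0, 0.1)
nearest = top_k(toy_query, toy_vectors, k=2)
print([index for index, _ in nearest])
# [0, 2]
```

A production vector store uses an index rather than scoring every vector, but the result has the same meaning: the `k` best-matching chunks for the query.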

## Create a RAG chain

### Simple RAG chain

In [11]:
from typing import List
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser

def format_docs(docs: List[Document]) -> str:
    return "\n\n".join(doc.page_content for doc in docs)

# Setup chain using LCEL
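qa_chain = (
    vectordb.as_retriever()  # question -> list of Documents
    | format_docs            # Documents -> one context string
    | llm                    # NOTE: the LLM only sees the context, not the question
    | StrOutputParser()      # chat message -> plain string
)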

In [12]:
qa_chain.invoke(question)

'This passage seems to be from a lecture or a course introduction, possibly related to a machine learning class. The speaker shares an anecdote about a former student who found the course valuable, particularly because of learning MATLAB, which is a programming environment often used for numerical computing and data analysis. The speaker emphasizes the importance of MATLAB and mentions that there will be a tutorial for those unfamiliar with it.\n\nThe passage also outlines the structure and purpose of the discussion sections in the course. These sections, led by teaching assistants, are optional and will be recorded for those who cannot attend. Initially, they will cover prerequisites like probability, statistics, and algebra to help students refresh their knowledge. Later in the course, the discussion sections will delve into extensions of the main lecture material, providing additional insights into the vast field of machine learning.\n\nOverall, the passage highlights the practical 
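The answer above summarizes the retrieved text instead of answering the question. That follows from how the chain is wired: each step receives only the previous step's output, so once the retriever has turned the question into documents, the question itself is gone. The sketch below reproduces this data flow in plain Python. `pipe`, `fake_retriever`, `join_chunks` and `echo_llm` are hypothetical stand-ins, not LangChain APIs.

```python
from typing import Callable, List

def pipe(*steps: Callable) -> Callable:
    """Compose steps left to right, feeding each output into the next step."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

# Hypothetical stand-ins for the retriever, the formatter and the model.
def fake_retriever(query: str) -> List[str]:
    return ["chunk about MATLAB", "chunk about discussion sections"]

def join_chunks(chunks: List[str]) -> str:
    return "\n\n".join(chunks)

def echo_llm(prompt: str) -> str:
    return f"LLM received: {prompt!r}"

simple_chain = pipe(fake_retriever, join_chunks, echo_llm)
result = simple_chain("What are major topics for this class?")
print(result)  # the question does not appear in what the "LLM" received
```

To get an actual answer, the question has to travel alongside the retrieved context, which is what the prompt-based chain in the next section does.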

### Using a prompt

In [13]:
from langchain_core.prompts import ChatPromptTemplate

# Build prompt
system_template = """Use the following pieces of context to answer the question. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum. Keep the answer as concise as possible. Always say "thanks for asking!" at the end of the answer. 
<context>
{context}
</context>
"""
q_and_a_prompt = ChatPromptTemplate([
    ("system", system_template),
    ("human", "{input}"),
])


#### Build a chain with the prompt, injecting the context and question

In [14]:
from langchain_core.runnables import RunnablePassthrough

# Setup chain using LCEL
qa_chain = (
    { # This is a shorthand for a RunnableMap / RunnableParallel
        "context": vectordb.as_retriever() | format_docs,
        "input": RunnablePassthrough(),
    }
    | q_and_a_prompt
    | llm
    | StrOutputParser()
)
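The dict at the start of the chain passes the same input (the question) to every value and collects the results under the corresponding keys, which then fill the `{context}` and `{input}` placeholders of the prompt. A plain-Python sketch of that behaviour follows. `run_parallel_map`, `fake_context` and `passthrough` are hypothetical stand-ins, and the sketch runs the branches one after another, whereas LangChain may run them concurrently.

```python
from typing import Any, Callable, Dict

def run_parallel_map(steps: Dict[str, Callable[[Any], Any]], value: Any) -> Dict[str, Any]:
    """Call every step with the same input and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}

# Hypothetical stand-ins: a retriever-plus-formatter and a passthrough.
def fake_context(query: str) -> str:
    return "I also assume familiarity with basic probability and statistics."

def passthrough(value: Any) -> Any:
    return value

prompt_inputs = run_parallel_map(
    {"context": fake_context, "input": passthrough},
    "Is probability a class topic?",
)
print(prompt_inputs)
```

The resulting dict contains both the retrieved context and the unchanged question, so the prompt can present them to the model together.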

In [15]:
question = "Is probability a class topic?"

In [16]:
qa_chain.invoke(question)

'Yes, probability is a class topic, as the course assumes familiarity with basic probability and statistics. Thanks for asking!'

### Alternative - using helper functions to create the chain

#### Create the alternative chain

In [17]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

combine_docs_chain = create_stuff_documents_chain(llm, q_and_a_prompt)
alt_rag_chain = create_retrieval_chain(vectordb.as_retriever(), combine_docs_chain)

In [18]:
alt_result = alt_rag_chain.invoke({"input": question})

In [19]:
alt_result["answer"]

'Yes, probability is a class topic, as the course assumes familiarity with basic probability and statistics. Thanks for asking!'

In [20]:
# Get first source document
alt_result["context"][0]

Document(metadata={'page': 4, 'source': '../data/MachineLearning-Lecture01.pdf'}, page_content="of this class will not be very programming intensive, although we will do some \nprogramming, mostly in either MATLAB or Octave. I'll say a bit more about that later.  \nI also assume familiarity with basic probability and statistics. So most undergraduate \nstatistics class, like Stat 116 taught here at Stanford, will be more than enough. I'm gonna \nassume all of you know what random variables are, that all of you know what expectation \nis, what a variance or a random variable is. And in case of some of you, it's been a while \nsince you've seen some of this material. At some of the discussion sections, we'll actually \ngo over some of the prerequisites, sort of as a refresher course under prerequisite class. \nI'll say a bit more about that later as well.  \nLastly, I also assume familiarity with basic linear algebra. And again, most undergraduate \nlinear algebra courses are more than e
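Because the result keeps the retrieved documents under `"context"`, it is straightforward to list the sources behind an answer. The sketch below uses a hypothetical stand-in with the same shape as `alt_result` (`SimpleNamespace` mimics the `.metadata` attribute of a `Document`). The Lecture03 entry is made up for illustration, and `unique_sources` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple

def unique_sources(result: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Return (source, page) pairs from result["context"], de-duplicated, in order."""
    seen: List[Tuple[str, int]] = []
    for doc in result["context"]:
        key = (doc.metadata["source"], doc.metadata["page"])
        if key not in seen:
            seen.append(key)
    return seen

# Hypothetical stand-in shaped like alt_result: "input", "context" and "answer" keys.
fake_result = {
    "input": "Is probability a class topic?",
    "context": [
        SimpleNamespace(metadata={"page": 4, "source": "../data/MachineLearning-Lecture01.pdf"}),
        SimpleNamespace(metadata={"page": 4, "source": "../data/MachineLearning-Lecture01.pdf"}),
        SimpleNamespace(metadata={"page": 0, "source": "../data/MachineLearning-Lecture03.pdf"}),
    ],
    "answer": "Yes, probability is a class topic.",
}
print(unique_sources(fake_result))
```

De-duplication matters here because Lecture01 is loaded twice in this notebook, so the same page can show up more than once among the retrieved documents. With the real chain, `unique_sources(alt_result)` should work the same way, assuming every document carries `source` and `page` metadata as in the output above.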

#### Have a look at the trace in LangSmith
Example: https://smith.langchain.com/public/6d3ebe1f-fc1e-434d-90b5-f60e2fe1d286/r

### Next step will add chat memory!