# Building RAG Apps with LangChain (OpenAI, ChromaDB)

This guide demonstrates how to build a retrieval-augmented generation (RAG) application with LangChain. The application answers questions about the textbook "An Introduction to Statistical Learning" by retrieving relevant passages from the book and passing them to the model as context. Chroma is the vector database that stores the embeddings. OpenAI models are used both to compute the embeddings and as the LLM that generates the responses.

## Setup
#### Load the API key and relevant libraries.

In [2]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain import hub
from langchain_community.document_loaders import PyPDFLoader
from langchain_chroma import Chroma
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
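
The imports above do not themselves load an API key. The OpenAI integrations read the key from the `OPENAI_API_KEY` environment variable, so one option is to check for it before building any models. The following is a minimal sketch under that assumption; the helper name `require_api_key` is hypothetical and not part of LangChain.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the named environment variable, or fail early.

    `require_api_key` is an illustrative helper, not a LangChain function.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running this guide."
        )
    return key
```

Calling `require_api_key()` once at the top of the notebook surfaces a missing key immediately instead of at the first model call.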

**In this guide, we use GPT-3.5 Turbo (`gpt-3.5-turbo-0125`) as our LLM.**

In [6]:
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

In [14]:
loader = PyPDFLoader("ISLR_First_Printing.pdf")

documents = loader.load()

## Create Embeddings of PDF

1. Using `RecursiveCharacterTextSplitter`, split the document into chunks of at most 1000 characters with a 200-character overlap between consecutive chunks, and record each chunk's start index in its metadata. The splitter tries the separators `["\n\n", "\n", " ", ""]` in order, so it prefers to break at paragraph boundaries, then line breaks, then spaces.
2. Embed the chunks with the `OpenAIEmbeddings` model and store the resulting vectors in Chroma, the vector database.
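
To make the chunk size, overlap, and start index concrete, here is a simplified sketch in plain Python. It is a fixed-width sliding window, not the separator-aware algorithm that `RecursiveCharacterTextSplitter` implements, and the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-width chunks that share `chunk_overlap` characters.

    Simplification: real splitters prefer to cut at separators such as
    paragraph breaks; this sketch cuts at exact character offsets.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


# Small values make the overlap easy to see.
for chunk in split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1):
    print(chunk["start_index"], chunk["text"])
```

Each chunk begins with the last character of the previous one, which is the role the 200-character overlap plays at full scale: text that straddles a chunk boundary still appears intact in at least one chunk.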

In [16]:
text_splitter = RecursiveCharacterTextSplitter(
                    chunk_size=1000, chunk_overlap=200, add_start_index=True
                    )
all_splits = text_splitter.split_documents(documents)

In [17]:
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())

## Question Answering from PDF

1. Embed the prompt, compare its embedding with the stored chunk embeddings, and retrieve the 10 most similar chunks (`k=10`).
2. Answer questions about the document, using the retrieved chunks as context.
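
The retrieval step can be illustrated with a small sketch in plain Python, assuming cosine similarity as the metric; the distance a vector store actually applies depends on how it is configured. The two-dimensional vectors and the names `cosine_similarity` and `top_k` are made up for illustration, and real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=10):
    """Return the ids of the k chunks most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk_id)
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy "embeddings" keyed by a hypothetical chunk id.
chunk_vecs = {
    "regression": [1.0, 0.0],
    "trees": [0.0, 1.0],
    "mixed": [1.0, 1.0],
}
print(top_k([0.9, 0.1], chunk_vecs, k=2))
```

A vector database performs the same ranking, but uses an index so that it does not have to score every stored chunk on each query.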

In [21]:
retriever = vectorstore.as_retriever(search_type='similarity', search_kwargs={"k": 10})

In [22]:
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
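
The first step of the chain builds a dictionary with two keys: `context`, which holds the retrieved chunks joined into one string, and `question`, which holds the user's question unchanged. The sketch below reproduces that step in plain Python without calling any model. `FakeDoc` and `build_prompt_inputs` are hypothetical stand-ins, assuming only that each retrieved document exposes a `page_content` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in for a retrieved document; only `page_content` is modelled."""
    page_content: str


def format_docs(docs):
    # Same helper as in the cell above: join chunk texts with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


def build_prompt_inputs(question, retrieved_docs):
    """Mimic the dict step of the chain: formatted context plus the raw question."""
    return {"context": format_docs(retrieved_docs), "question": question}


inputs = build_prompt_inputs(
    "What is Linear Regression?",
    [FakeDoc("First retrieved chunk."), FakeDoc("Second retrieved chunk.")],
)
print(inputs["context"])
```

The prompt template then fills its `context` and `question` slots from this dictionary before the result is passed to the LLM.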

In [26]:
for chunk in rag_chain.stream("What is Linear Regression?"):
    print(chunk, end="", flush=True)

Linear regression is a simple approach for predicting a quantitative response by fitting a linear relationship between variables. It involves estimating the intercept and slope coefficients using least squares regression. The goal is to create a model that closely fits the available data points.