### Midterm Challenge: Building and Deploying a RAG Application

#### Build 🏗️

- Data: Meta 10-K Filings
- LLM: OpenAI GPT-3.5-turbo
- Embedding Model: text-embedding-3-small
- Infrastructure: LangChain or LlamaIndex (you choose)
- Vector Store: Qdrant
- Deployment: Chainlit, Hugging Face

#### Ship 🚢

Evaluate your answers to the following questions:
- "What was the total value of 'Cash and cash equivalents' as of December 31, 2023?"
- "Who are Meta's 'Directors' (i.e., members of the Board of Directors)?"
- Record a Loom video walkthrough (under 10 minutes)
- Extra Credit: Baseline retrieval performance with RAGAS, change something about your RAG system to improve it, then show the improvement quantitatively!

### Installing Required Libraries

In [170]:
!pip install -qU langchain langchain-core langchain-community langchain-openai

In [172]:
!pip install -qU qdrant-client


In [171]:
!pip install -qU tiktoken pymupdf

#### Set Environment Variables

In [4]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

#### Data Collection

In [173]:
from langchain_community.document_loaders import PyMuPDFLoader

docs = PyMuPDFLoader("https://d18rn0p25nwr6d.cloudfront.net/CIK-0001326801/c7318154-f6ae-4866-89fa-f0c589f2ee3d.pdf").load()

#### Chunking our Meta 10-K Filing Document

In [174]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
import tiktoken

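# Build the tokenizer once and reuse it for every length measurement
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def tiktoken_len(text):
    # Measure length in tokens (not characters) so chunk_size is a token budget
    tokens = enc.encode(
        text,
    )
    return len(tokens)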

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 200,
    chunk_overlap = 0, # No overlap: consecutive chunks share no tokens
    length_function = tiktoken_len,
)

split_chunks = text_splitter.split_documents(docs)

In [175]:
len(split_chunks)

663

We now have 663 chunks, each at most roughly 200 tokens long.
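
To illustrate what the `chunk_size` and `chunk_overlap` settings control, here is a minimal sketch of a fixed-size sliding window over a toy token list. It is a simplification and not the algorithm `RecursiveCharacterTextSplitter` actually uses (which splits on separators first); the function name `sliding_window_chunks` and the toy tokens are assumptions made for illustration.

```python
def sliding_window_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of at most chunk_size tokens.

    Consecutive windows share chunk_overlap tokens. This is a hypothetical
    helper for illustration, not part of LangChain.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the token list
        if start + chunk_size >= len(tokens):
            break
    return chunks


tokens = "a b c d e f g h i j".split()  # 10 toy "tokens"
no_overlap = sliding_window_chunks(tokens, chunk_size=4, chunk_overlap=0)
with_overlap = sliding_window_chunks(tokens, chunk_size=4, chunk_overlap=2)

print(no_overlap)
print(with_overlap)
```

With an overlap of 0, as in the cell above, a sentence that straddles a chunk boundary is cut in two; a non-zero overlap repeats the boundary tokens in both neighbouring chunks at the cost of producing more chunks.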

#### Embeddings and Vector Storage

In [176]:
from langchain_community.vectorstores import Qdrant

from langchain_openai.embeddings import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

qdrant_vectorstore = Qdrant.from_documents(
    split_chunks,
    embedding_model,
    location=":memory:",
    collection_name="meta_10k_filings",
)

#### Setting up our retriever using the LangChain retriever method

In [177]:
qdrant_retriever = qdrant_vectorstore.as_retriever()
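
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea in plain Python, assuming cosine similarity as the distance measure; the three-dimensional vectors, the chunk labels, and the helper names are made up for illustration and are not what Qdrant or `text-embedding-3-small` produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "vector store": (chunk text, embedding) pairs with invented values
toy_index = [
    ("cash and cash equivalents table", [0.9, 0.1, 0.0]),
    ("board of directors signatures", [0.1, 0.9, 0.0]),
    ("risk factors", [0.0, 0.2, 0.9]),
]
toy_query = [0.8, 0.2, 0.0]  # pretend embedding of a question about cash

results = top_k(toy_query, toy_index, k=2)
print(results)
```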

### Setting up our LangChain-based RAG

#### Setting up our Prompt template

In [154]:
from langchain_core.prompts import ChatPromptTemplate

RAG_PROMPT = """
CONTEXT:
{context}

QUERY:
{question}

RESPONSE:
- If the QUERY is directly related to the provided CONTEXT, generate a detailed, structured answer using the information from the CONTEXT.
- If the QUERY does not pertain to the provided CONTEXT, state that the question is unrelated and suggest checking the appropriate source or document for the correct information.
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_PROMPT)
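
The template has two placeholders, `{context}` and `{question}`, that are filled in at call time. As a rough sketch of that substitution step, the snippet below does the same thing with plain `str.format`. The shortened template, the `build_prompt` helper, and the choice to join chunk texts with blank lines are all assumptions for illustration; the chain in this notebook passes the retriever output to the prompt as-is.

```python
# Shortened stand-in for RAG_PROMPT with the same two placeholders
TEMPLATE = "CONTEXT:\n{context}\n\nQUERY:\n{question}\n"


def build_prompt(chunks, question):
    """Join chunk texts into one context string and fill the template."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)


prompt_text = build_prompt(
    ["Chunk one text.", "Chunk two text."],  # placeholder chunk texts
    "What was the total value of 'Cash and cash equivalents'?",
)
print(prompt_text)
```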


#### RAG Chain

In [155]:
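from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# The chain below needs a chat model; GPT-3.5-turbo is the LLM listed in the Build section
openai_chat_model = ChatOpenAI(model="gpt-3.5-turbo")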

retrieval_augmented_qa_chain = (
    # INVOKE CHAIN WITH: {"question" : "<>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and piping it into qdrant_retriever
    {"context": itemgetter("question") | qdrant_retriever, "question": itemgetter("question")}
    # "context"  : re-assigned from the previous step's "context" value; every other key
    #              (here, "question") is passed through unchanged
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object, which is then piped
    #              into the LLM; the result is stored under the "response" key
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": rag_prompt | openai_chat_model, "context": itemgetter("context")}
)
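
The data flow through the three stages of the chain can be traced with ordinary functions and dictionaries. The sketch below is only an analogy: `fake_retriever`, `fake_llm`, `format_prompt`, and `run_chain` are hypothetical stand-ins, so it runs without an API key and does not reproduce LangChain's runnable machinery.

```python
def fake_retriever(question):
    # Hypothetical stand-in for qdrant_retriever: returns a list of "chunks"
    return [f"chunk relevant to: {question}"]


def fake_llm(prompt_text):
    # Hypothetical stand-in for the chat model
    return f"ANSWER based on {len(prompt_text)} prompt characters"


def format_prompt(inputs):
    # Stand-in for rag_prompt: fills the context and question slots
    return f"CONTEXT:\n{inputs['context']}\n\nQUERY:\n{inputs['question']}"


def run_chain(inputs):
    # Stage 1: build a dict with the retrieved context and the original question
    stage1 = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Stage 2: analogous to RunnablePassthrough.assign(context=...);
    # all keys pass through and "context" is re-assigned to its own value
    stage2 = {**stage1, "context": stage1["context"]}
    # Stage 3: answer from the formatted prompt, keeping the context for inspection
    return {"response": fake_llm(format_prompt(stage2)), "context": stage2["context"]}


result = run_chain({"question": "Who are the directors?"})
print(result["response"])
print(result["context"])
```

Returning the context alongside the response is what makes it possible to inspect which chunks the answer was based on.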

In [156]:
question = "What was the total value of 'Cash and cash equivalents' as of December 31, 2023?"
response = retrieval_augmented_qa_chain.invoke({"question": question})


In [147]:
print(response["response"].content)

The total value of 'Cash and cash equivalents' as of December 31, 2023, was $41.862 billion. This information can be found in the document on page 107 under the section 'Inputs (Level 3).' 

Please verify this information on page 107 of the document provided.


In [135]:
# for context in response["context"]:
#   print("Context:")
#   print(context)
#   print("----")

In [159]:
question = "Who are Meta's 'Directors' (i.e., members of the Board of Directors)?"
response = retrieval_augmented_qa_chain.invoke({"question": question})

In [160]:
print(response["response"].content)

The members of Meta's Board of Directors are as follows:
1. Peggy Alford
2. Marc L. Andreessen
3. Andrew W. Houston
4. Nancy Killefer
5. Robert M. Kimmitt
6. Sheryl K. Sandberg
7. Tracey T. Travis
8. Tony Xu

These names were listed on page 132 of the document provided in the CONTEXT.
