# RAG ChatBot 

## Importing necessary libraries

In [1]:

import os
import textwrap
from pathlib import Path

from IPython.display import Markdown
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import FlashrankRerank
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Qdrant
from langchain_community.document_loaders import UnstructuredMarkdownLoader
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from llama_parse import LlamaParse


def print_response(response):
    response_txt = response["result"]
    for chunk in response_txt.split("\n"):
        if not chunk:
            print()
            continue
        print("\n".join(textwrap.wrap(chunk, 100, break_long_words=False)))

## Parsing the PDF document

In [3]:
# Instruction that tells the parser what kind of document to expect
instruction = """The provided document is Meta First Quarter 2024 Results.
This form provides detailed financial information about the company's performance for a specific quarter.
It includes unaudited financial statements, management discussion and analysis, and other relevant disclosures required by the SEC.
It contains many tables.
Try to be precise while answering the questions"""     

# Set up LlamaParse to parse the PDF document
parser = LlamaParse(
    api_key="<your_api_key>",  # LlamaCloud API key
    result_type="markdown",  # convert the PDF to Markdown
    parsing_instruction=instruction,
    max_timeout=5000,
)

llama_parse_documents = await parser.aload_data("data/meta-earnings.pdf")

Started parsing the file under job_id 3ba5d124-731a-4984-bc29-3c4f5ad0aff0


In [4]:
parsed_doc = llama_parse_documents[0]
Markdown(parsed_doc.text[0:4096])

#

# Meta First Quarter 2024 Results

# Meta Reports First Quarter 2024 Results

MENLO PARK, Calif. – April 24, 2024 – Meta Platforms, Inc. (Nasdaq: META) today reported financial results for the quarter ended March 31, 2024.

"It's been a good start to the year," said Mark Zuckerberg, Meta founder and CEO. "The new version of Meta AI with Llama 3 is another step towards building the world's leading AI. We're seeing healthy growth across our apps and we continue making steady progress building the metaverse as well."

# First Quarter 2024 Financial Highlights

| |Three Months Ended March 31, 2024|Three Months Ended March 31, 2023|% Change|
|---|---|---|---|
|Revenue|$36,455|$28,645|27%|
|Costs and expenses|$22,637|$21,418|6%|
|Income from operations|$13,818|$7,227|91%|
|Operating margin|38%|25%| |
|Provision for income taxes|$1,814|$1,598|14%|
|Effective tax rate|13%|22%| |
|Net income|$12,369|$5,709|117%|
|Diluted earnings per share (EPS)|$4.71|$2.20|114%|

# First Quarter 2024 Operational and Other Financial Highlights

- Family daily active people (DAP) – DAP was 3.24 billion on average for March 2024, an increase of 7% year-over-year.
- Ad impressions – Ad impressions delivered across our Family of Apps increased by 20% year-over-year.
- Average price per ad – Average price per ad increased by 6% year-over-year.
- Revenue – Total revenue and revenue on a constant currency basis were $36.46 billion and $36.35 billion, respectively, both of which increased by 27% year-over-year.
- Costs and expenses – Total costs and expenses were $22.64 billion, an increase of 6% year-over-year.
- Capital expenditures – Capital expenditures, including principal payments on finance leases, were $6.72 billion.
- Capital return program – Share repurchases were $14.64 billion of our Class A common stock and dividends payments were $1.27 billion.
- Cash, cash equivalents, and marketable securities – Cash, cash equivalents, and marketable securities were $58.12 billion as of March 31, 2024. Free cash flow was $12.53 billion.
- Headcount – Headcount was 69,329 as of March 31, 2024, a decrease of 10% year-over-year.
---
#

# Meta First Quarter 2024 Results

# CFO Outlook Commentary

We expect second quarter 2024 total revenue to be in the range of $36.5-39 billion. Our guidance assumes foreign currency is a 1% headwind to year-over-year total revenue growth, based on current exchange rates.

We expect full-year 2024 total expenses to be in the range of $96-99 billion, updated from our prior outlook of $94-99 billion due to higher infrastructure and legal costs. For Reality Labs, we continue to expect operating losses to increase meaningfully year-over-year due to our ongoing product development efforts and our investments to further scale our ecosystem.

We anticipate our full-year 2024 capital expenditures will be in the range of $35-40 billion, increased from our prior range of $30-37 billion as we continue to accelerate our infrastructure investments to support our artificial intelligence (AI) roadmap. While we are not providing guidance for years beyond 2024, we expect capital expenditures will continue to increase next year as we invest aggressively to support our ambitious AI research and product development efforts.

Absent any changes to our tax landscape, we expect our full-year 2024 tax rate to be in the mid-teens.

In addition, we continue to monitor an active regulatory landscape, including the increasing legal and regulatory headwinds in the EU and the U.S. that could significantly impact our business and our financial results.

Q1 was a good start to the year. We're seeing strong momentum within our Family of Apps and are making important progress on our longer-term AI and Reality Labs initiatives that have the potential to transform the way people interact with our services over the coming years.
---
#

# Meta First Quarter 2024 Results

# Webcast and Conference Call Information

Meta will host a conference call to discuss the results at 2:00 p.m. PT / 5:00 p.m. ET today. The live webcast of Meta's earnings conference call can 

## Saving and loading the parsed Markdown

In [5]:
# Write parsed_doc to a Markdown file ("w" overwrites, so rerunning the cell does not duplicate the text)
document_path = Path("data/parsed_document.md")
with document_path.open("w") as f:
    f.write(parsed_doc.text)
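
The file mode matters if a cell like this is run more than once: append mode (`"a"`) adds another copy of the text on every run, while write mode (`"w"`) replaces the file. Duplicate text in the file later shows up as duplicate chunks in the vector store. A minimal sketch of the difference, using a temporary directory and a made-up one-line document in place of `parsed_doc.text`:

```python
import tempfile
from pathlib import Path

text = "# Parsed document\n"  # hypothetical stand-in for parsed_doc.text

with tempfile.TemporaryDirectory() as tmp:
    appended = Path(tmp) / "appended.md"
    overwritten = Path(tmp) / "overwritten.md"

    # Simulate running the notebook cell twice with each mode.
    for _ in range(2):
        with appended.open("a") as f:
            f.write(text)
        with overwritten.open("w") as f:
            f.write(text)

    appended_copies = appended.read_text().count(text)
    overwritten_copies = overwritten.read_text().count(text)

print(appended_copies, overwritten_copies)
```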

In [6]:
#Load the markdown file
loader = UnstructuredMarkdownLoader(document_path)
loaded_documents = loader.load()


## Splitting the data into chunks

In [7]:

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2048, chunk_overlap=128)
docs = text_splitter.split_documents(loaded_documents)
len(docs)
     

37
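
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking. It cuts fixed-size character windows only; `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line breaks, so this is not its algorithm. The function name `split_with_overlap` and the sample string are hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut `text` into windows of `chunk_size` characters, each sharing
    `chunk_overlap` characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.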

In [8]:
print(docs[0].page_content)

Meta First Quarter 2024 Results

Meta Reports First Quarter 2024 Results

MENLO PARK, Calif. – April 24, 2024 – Meta Platforms, Inc. (Nasdaq: META) today reported financial results for the quarter ended March 31, 2024.

"It's been a good start to the year," said Mark Zuckerberg, Meta founder and CEO. "The new version of Meta AI with Llama 3 is another step towards building the world's leading AI. We're seeing healthy growth across our apps and we continue making steady progress building the metaverse as well."

First Quarter 2024 Financial Highlights

Three Months Ended March 31, 2024 Three Months Ended March 31, 2023 % Change Revenue $36,455 million $28,645 million 27% Costs and expenses $22,637 million $21,418 million 6% Income from operations $13,818 million $7,227 million 91% Operating margin 38% 25% Provision for income taxes $1,814 million $1,598 million 14% Effective tax rate 13% 22% Net income $12,369 million $5,709 million 117% Diluted earnings per share (EPS) $4.71 $2.20 114%

## Setting up embeddings and a vector database

In [9]:
#get the embeddings
embeddings = FastEmbedEmbeddings(model_name="BAAI/bge-base-en-v1.5")    

Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]

In [10]:
#Use the Qdrant vector similarity search engine
qdrant = Qdrant.from_documents(
    docs,
    embeddings,
    path="./db",
    collection_name="document_embeddings",
)
     

In [11]:

query = "What is the most important innovation from Meta?"
# Find the chunks most similar to the query in the vector database (cosine similarity)
similar_docs = qdrant.similarity_search_with_score(query)

In [12]:
for doc, score in similar_docs:
    print(f"text: {doc.page_content[:256]}\n")
    print(f"score: {score}")
    print("-" * 80)
    print()

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

score: 0.6781093167424701
--------------------------------------------------------------------------------

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

score: 0.6206513248343075
--------------------------------------------------------------------------------

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around th
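
The scores above are cosine similarities between the query embedding and each chunk embedding. The sketch below shows that computation on toy 3-dimensional vectors; real embeddings have far more dimensions, and the vectors and document names here are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [1.0, 1.0, 0.0],  # partly aligned
    "doc_c": [0.0, 1.0, 0.0],  # orthogonal to the query
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
```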

## Retrieving relevant documents from the vector DB

In [15]:

%%time
retriever = qdrant.as_retriever(search_kwargs={"k": 5})  # use Qdrant as a retriever that returns the top 5 chunks
retrieved_docs = retriever.invoke(query)

CPU times: user 216 ms, sys: 150 ms, total: 366 ms
Wall time: 275 ms


In [16]:
for doc in retrieved_docs:
    print(f"id: {doc.metadata['_id']}\n")
    print(f"text: {doc.page_content[:256]}\n")
    print("-" * 80)
    print()

id: 82c9cb5a172a4ff89eed6397cf674631

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

--------------------------------------------------------------------------------

id: e909fc73499f427c87ba9ef5b0fe2460

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

--------------------------------------------------------------------------------

id: 7ae249a5ed0f4e15843cfe9f247acab6

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger

In [18]:
compressor = FlashrankRerank(model="ms-marco-MiniLM-L-12-v2")  # reranks the retrieved chunks and keeps only the most relevant ones
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

In [19]:
%%time
reranked_docs = compression_retriever.invoke(query)
len(reranked_docs)

CPU times: user 3.95 s, sys: 257 ms, total: 4.21 s
Wall time: 966 ms


3
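
Here the base retriever fetched 5 candidates and the reranker kept 3. The two-stage pattern can be sketched as follows. The scoring function is a toy word-overlap measure standing in for the reranking model, and all names and texts are hypothetical.

```python
def toy_relevance(query, text):
    """Hypothetical stand-in for a reranking model:
    the fraction of query words that also occur in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank(query, candidates, top_n=3):
    """Score every candidate against the query and keep the best `top_n`."""
    scored = [(toy_relevance(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Pretend these five chunks came back from the vector search.
candidates = [
    "meta reported revenue growth",
    "weather was sunny",
    "revenue for the first quarter",
    "meta revenue and ad impressions",
    "contact investor relations",
]
top = rerank("meta revenue", candidates, top_n=3)
for score, text in top:
    print(f"{score:.2f}  {text}")
```

The first stage is cheap and broad; the second stage is slower per document but only has to look at a handful of candidates, which matches the timings above.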

In [20]:
for doc in reranked_docs:
    print(f"id: {doc.metadata['_id']}\n")
    print(f"text: {doc.page_content[:256]}\n")
    print(f"score: {doc.metadata['relevance_score']}")
    print("-" * 80)
    print()

id: 82c9cb5a172a4ff89eed6397cf674631

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

score: 0.7225201725959778
--------------------------------------------------------------------------------

id: e909fc73499f427c87ba9ef5b0fe2460

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the worl

score: 0.6118301749229431
--------------------------------------------------------------------------------

id: 7ae249a5ed0f4e15843cfe9f247acab6

text: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it

## Using ChatGroq as our pretrained LLM

In [21]:
llm = ChatGroq(
    temperature=0,
    model_name="llama3-70b-8192",
    groq_api_key="<your_chatgroq_api_key>",
)

## Prompt Engineering

In [22]:
prompt_template = """
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: {context}
Question: {question}

Answer the question and provide additional helpful information,
based on the pieces of information, if applicable. Be succinct.

Responses should be properly formatted to be easily read.
"""

prompt = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
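
To see what the model eventually receives, the sketch below fills a shortened version of the template with plain `str.format`. The two chunks are short excerpts of figures from the parsed report, and joining them with a blank line is an assumption based on how the chunks appear in the verbose chain output.

```python
# Shortened version of the prompt template, for illustration only.
template = """Context: {context}
Question: {question}"""

# Stand-ins for the chunks returned by the retriever.
chunks = ["Revenue was $36,455 million.", "Headcount was 69,329."]

# All chunks are placed into the single {context} slot.
context = "\n\n".join(chunks)
filled = template.format(context=context, question="What was the revenue?")
print(filled)
```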

## Creating our RAG pipeline
<img src="image-1.png" alt="Pipeline" style="width:750px;">


In [23]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=compression_retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt, "verbose": True},
)
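
Because `return_source_documents=True` is set, the response dictionary carries the supporting chunks under `"source_documents"` next to the answer under `"result"`. A small sketch of how those could be listed, using a hand-made dictionary in place of a real chain response; the helper name `summarize_sources` and the scores are made up.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the dictionary returned by qa.invoke(...).
response = {
    "result": "Headcount was 69,329 as of March 31, 2024.",
    "source_documents": [
        SimpleNamespace(
            page_content="Headcount was 69,329 as of March 31, 2024, a decrease of 10% year-over-year.",
            metadata={"relevance_score": 0.91},
        ),
        SimpleNamespace(
            page_content="Capital expenditures were $6.72 billion.",
            metadata={"relevance_score": 0.12},
        ),
    ],
}


def summarize_sources(response, preview_chars=40):
    """Return (score, preview) pairs for the chunks that backed an answer."""
    return [
        (doc.metadata.get("relevance_score"), doc.page_content[:preview_chars])
        for doc in response["source_documents"]
    ]


for score, preview in summarize_sources(response):
    print(f"{score:.2f}  {preview}")
```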

## Evaluation

In [24]:

%%time
response = qa.invoke("What is the most significant innovation from Meta?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram, and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology.

Contacts

Investors:

Kenneth Dorell

investor@meta.com / investor.fb.com

Press:

Ryan Moore

press@meta.com / about.fb.com/news/

Meta First Quarter 2024 Results

Meta First Quarter 2024 Results

About Meta

Meta builds technologies that help people connect, find communit

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 5.8 s, sys: 327 ms, total: 6.13 s
Wall time: 3.43 s


In [25]:
print_response(response)

Based on the provided information, the most significant innovation from Meta is the move towards
immersive experiences like augmented and virtual reality to help build the next evolution in social
technology.

This innovation is significant because it marks a shift beyond 2D screens and has the potential to
revolutionize the way people connect, find communities, and grow businesses.


The chatbot provided a relevant and appropriate response to the query.

In [26]:
%%time
response = qa.invoke("What is the revenue for 2023?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: Meta First Quarter 2024 Results

Meta First Quarter 2024 Results

Reconciliation of GAAP to Non-GAAP Results

Three Months Ended March 31, 2024 Three Months Ended March 31, 2023 GAAP revenue $36,455 $28,645 Foreign exchange effect on 2024 revenue using 2023 rates ($106) Revenue excluding foreign exchange effect $36,349 GAAP revenue year-over-year change % 27% Revenue excluding foreign exchange effect year-over-year change % 27% GAAP advertising revenue $35,635 $28,101 Foreign exchange effect on 2024 advertising revenue using 2023 rates ($105) Advertising revenue excluding foreign exchange effect $35,530 GAAP advertising revenue year-over-year change % 27% Advertising revenue excludi

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 4.41 s, sys: 305 ms, total: 4.72 s
Wall time: 3.12 s


In [27]:
print_response(response)

The revenue for 2023 is $28,645 million.

Additional information: The revenue for 2024 is $36,455 million, which represents a 27% year-over-
year increase.


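The response also includes the 2024 figure, which the user did not ask for. Note that both numbers are quarterly: $28,645 million is the revenue for the three months ended March 31, 2023, not for the full year 2023.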

In [28]:
%%time
response = qa.invoke(
    "What were the ad impressions in 2022?"
)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: Family daily active people (DAP) – DAP was 3.24 billion on average for March 2024, an increase of 7% year-over-year.

Ad impressions – Ad impressions delivered across our Family of Apps increased by 20% year-over-year.

Average price per ad – Average price per ad increased by 6% year-over-year.

Revenue – Total revenue and revenue on a constant currency basis were $36.46 billion and $36.35 billion, respectively, both of which increased by 27% year-over-year.

Costs and expenses – Total costs and expenses were $22.64 billion, an increase of 6% year-over-year.

Capital expenditures – Capital expenditures, including principal payments on finance leases, were $6.72 billion.

Capital ret

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 3.78 s, sys: 350 ms, total: 4.13 s
Wall time: 3.57 s


In [29]:
print_response(response)

The question asks about ad impressions in 2022, but the provided information only mentions that ad
impressions delivered across the Family of Apps increased by 20% year-over-year, but it does not
provide the exact number of ad impressions in 2022 or any other year.

However, I can provide some additional information that might be helpful:

* The average price per ad increased by 6% year-over-year.
* Total revenue and revenue on a constant currency basis were $36.46 billion and $36.35 billion,
respectively, both of which increased by 27% year-over-year.

If you have any further questions or need clarification on any of the provided information, feel
free to ask!


The model did not hallucinate data it did not have. Instead, it told the user that the required data was not available.

In [32]:
%%time
response = qa.invoke(
    "Can you elaborate on the previous question again?"
)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: We anticipate our full-year 2024 capital expenditures will be in the range of $35-40 billion, increased from our prior range of $30-37 billion as we continue to accelerate our infrastructure investments to support our artificial intelligence (AI) roadmap. While we are not providing guidance for years beyond 2024, we expect capital expenditures will continue to increase next year as we invest aggressively to support our ambitious AI research and product development efforts.

Absent any changes to our tax landscape, we expect our full-year 2024 tax rate to be in the mid-teens.

In addition, we continue to monitor an active regulatory landscape, including the increasing legal and regul

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 3.8 s, sys: 133 ms, total: 3.93 s
Wall time: 1.89 s


In [33]:

print_response(response)

I apologize, but there is no previous question to elaborate on. This conversation just started, and
I don't have any context or previous question to refer to. If you have a specific question related
to the provided text, I'd be happy to help!


One major drawback of this pipeline is that it does not remember the conversation history: each call to `qa.invoke` is handled independently of the previous ones.
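
One simple way to work around this is to keep the recent turns yourself and prepend them to each new question. The sketch below wraps any stateless `ask(question) -> answer` callable. `make_chat` and `fake_ask` are hypothetical names; in this notebook `ask` could be something like `lambda q: qa.invoke(q)["result"]`, which is an assumption and is not exercised here.

```python
def make_chat(ask, max_turns=3):
    """Wrap a stateless `ask(question) -> answer` callable so that the
    most recent turns are prepended to every new question."""
    history = []  # list of (question, answer) pairs

    def chat(question):
        recent = history[-max_turns:]
        transcript = "".join(f"User: {q}\nAssistant: {a}\n" for q, a in recent)
        full_question = f"{transcript}User: {question}" if recent else question
        answer = ask(full_question)
        history.append((question, answer))
        return answer

    return chat


# Stand-in for the real chain: records what it was asked.
seen = []


def fake_ask(question):
    seen.append(question)
    return f"answer {len(seen)}"


chat = make_chat(fake_ask)
chat("What were the ad impressions in 2022?")
chat("Can you elaborate on the previous question?")
print(seen[1])
```

One caveat of this design: the retriever also sees the prepended transcript, so retrieval is influenced by earlier turns as well as by the new question.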

In [34]:
%%time
response = qa.invoke("Can you predict how meta will perform in the year 2025?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: We anticipate our full-year 2024 capital expenditures will be in the range of $35-40 billion, increased from our prior range of $30-37 billion as we continue to accelerate our infrastructure investments to support our artificial intelligence (AI) roadmap. While we are not providing guidance for years beyond 2024, we expect capital expenditures will continue to increase next year as we invest aggressively to support our ambitious AI research and product development efforts.

Absent any changes to our tax landscape, we expect our full-year 2024 tax rate to be in the mid-teens.

In addition, we continue to monitor an active regulatory landscape, including the increasing legal and regul

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 3.88 s, sys: 309 ms, total: 4.19 s
Wall time: 3.71 s


In [35]:
Markdown(response["result"])

**Prediction for Meta's Performance in 2025:**

Based on the provided information, it's difficult to make a precise prediction for Meta's performance in 2025. However, we can infer some trends and insights from the Q1 2024 results:

* Meta has increased its capital expenditures to $35-40 billion in 2024, which is expected to continue growing in 2025 to support its AI roadmap and product development efforts.
* The company has reported strong growth in revenue (27% year-over-year) and net income (117% year-over-year) in Q1 2024, indicating a positive trend.
* Meta's Family of Apps is showing healthy growth, with a 7% year-over-year increase in daily active people and a 20% year-over-year increase in ad impressions.

**Additional Insights:**

* Meta is investing heavily in AI research and product development, which could lead to new revenue streams and growth opportunities in 2025.
* The company is also making progress in building the metaverse, which could be a significant growth driver in the future.
* However, Meta is also facing increasing legal and regulatory headwinds in the EU and the U.S., which could impact its business and financial results in 2025.

**Conclusion:**

While it's difficult to make a precise prediction for Meta's performance in 2025, the company's strong Q1 2024 results and investments in AI and product development suggest a positive trend. However, regulatory headwinds and increasing capital expenditures could impact its performance in 2025.

In [36]:
%%time
response = qa.invoke("Wht r da advertising incomes 4 meta in 2023 n 2024")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: Meta First Quarter 2024 Results

Meta First Quarter 2024 Results

Reconciliation of GAAP to Non-GAAP Results

Three Months Ended March 31, 2024 Three Months Ended March 31, 2023 GAAP revenue $36,455 $28,645 Foreign exchange effect on 2024 revenue using 2023 rates ($106) Revenue excluding foreign exchange effect $36,349 GAAP revenue year-over-year change % 27% Revenue excluding foreign exchange effect year-over-year change % 27% GAAP advertising revenue $35,635 $28,101 Foreign exchange effect on 2024 advertising revenue using 2023 rates ($105) Advertising revenue excluding foreign exchange effect $35,530 GAAP advertising revenue year-over-year change % 27% Advertising revenue excludi

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"



> Finished chain.

> Finished chain.
CPU times: user 4.11 s, sys: 247 ms, total: 4.36 s
Wall time: 1.88 s


In [37]:
Markdown(response["result"])


**Advertising Revenue for Meta in 2023 and 2024:**

* 2023: $28,101 million
* 2024: $35,635 million

The model performed well on a query full of spelling mistakes and abbreviations, and it gave a concise and relevant answer. As before, the figures are for the first quarter of each year, not the full year.