In [1]:
!nvidia-smi

Mon Dec 23 13:34:52 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla V100-SXM2-32GB           On  |   00000000:06:00.0 Off |                    0 |
| N/A   38C    P0             71W /  300W |   16717MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla V100-SXM2-32GB           On  |   00

In [2]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '5'

# Advanced RAG System

In [3]:
# !pip install langchain-groq langchain llama-parse qdrant-client "unstructured[md]" fastembed flashrank

In [4]:
import os
import textwrap
from pathlib import Path
 
# from google.colab import userdata
from IPython.display import Markdown
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import FlashrankRerank
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Qdrant
from langchain_community.document_loaders import UnstructuredMarkdownLoader
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from llama_parse import LlamaParse
from dotenv import load_dotenv

load_dotenv()

import nltk
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng') 
 
def print_response(response):
    response_txt = response["result"]
    for chunk in response_txt.split("\n"):
        if not chunk:
            print()
            continue
        print("\n".join(textwrap.wrap(chunk, 100, break_long_words=False)))

[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger_eng is already up-to-
[nltk_data]       date!


## Document Parsing

In [5]:
# !mkdir data
!gdown 1VEwxk_SWVycWO4IKwAm6KumHUjXhvH84 -O "data/nvidia-report.pdf"

Downloading...
From: https://drive.google.com/uc?id=1VEwxk_SWVycWO4IKwAm6KumHUjXhvH84
To: /workspace/mlexpert/data/nvidia-report.pdf
100%|██████████████████████████████████████| 75.5k/75.5k [00:00<00:00, 2.67MB/s]


Challenges in document parsing:

- Extracting the layout of every page
- Converting content such as tables, headings, paragraphs, and images into a textual representation of the document
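
To make the second challenge concrete, here is a minimal sketch of turning already-extracted table cells into Markdown text, the kind of textual representation a parser is asked to produce. `rows_to_markdown` is a hypothetical helper, not part of LlamaParse, and the "Metric" column label is an assumption; the figures are taken from the non-GAAP table in the report.

```python
def rows_to_markdown(header, rows):
    """Render a header and a list of rows as a Markdown table string."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


# "Metric" is an illustrative label; the values come from the report's table.
header = ["Metric", "Q1 FY25", "Q4 FY24", "Q1 FY24"]
rows = [
    ["Revenue", "$26,044", "$22,103", "$7,192"],
    ["Gross margin", "78.9%", "76.7%", "66.8%"],
]
table_md = rows_to_markdown(header, rows)
print(table_md)
```

Keeping each value next to its column label in the text is what later lets a retriever and an LLM answer questions such as "gross margin for Q1 FY25".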

In [7]:
instruction = """The provided document is NVIDIA Results for First Quarter Fiscal 2025.
This form provides detailed financial information about the company's
performance for a specific quarter.
It includes unaudited financial statements, management discussion and
analysis, and other relevant disclosures required by the SEC.
It contains many tables."""

parser = LlamaParse(
    api_key=os.getenv("LLAMA_PARSE"),
    result_type="markdown",
    parsing_instruction=instruction,
    max_timeout=5000,
)
 
llama_parse_documents = await parser.aload_data("./data/nvidia-report.pdf")
parsed_doc = llama_parse_documents[0]

Started parsing the file under job_id ad200447-9647-47ef-8aa4-f346e31eb5c0


In [8]:
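document_path = Path("data/parsed_document.md")
# Write as UTF-8 explicitly: the parsed text contains non-ASCII characters (curly quotes, dashes)
with document_path.open("w", encoding="utf-8") as f:
    f.write(parsed_doc.text)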

## Document Chunking

In [9]:
loader = UnstructuredMarkdownLoader(document_path)

In [10]:
loaded_documents = loader.load()

In [11]:
loaded_documents[:1]

[Document(page_content="NVIDIA Announces Financial Results for First Quarter Fiscal 2025\n\nRecord quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% from a year ago\n\nRecord quarterly Data Center revenue of $22.6 billion, up 23% from Q4 and up 427% from a year ago\n\nTen-for-one forward stock split effective June 7, 2024\n\nQuarterly cash dividend raised 150% to $0.01 per share on a post-split basis\n\nNVIDIA (NASDAQ: NVDA) today reported revenue for the first quarter ended April 28, 2024, of $26.0 billion, up 18% from the previous quarter and up 262% from a year ago.\n\nFor the quarter, GAAP earnings per diluted share was $5.98, up 21% from the previous quarter and up 629% from a year ago. Non-GAAP earnings per diluted share was $6.12, up 19% from the previous quarter and up 461% from a year ago.\n\n“The next industrial revolution has begun — companies and countries are partnering with NVIDIA to shift the trillion-dollar traditional data centers to accelerated computing 

In [20]:
# With the default length function, chunk_size and chunk_overlap are measured in characters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2048, chunk_overlap=128)

In [21]:
docs = text_splitter.split_documents(loaded_documents)

In [22]:
len(docs)

12
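
What `chunk_size` and `chunk_overlap` mean can be illustrated with a simplified fixed-width splitter. This is a minimal sketch, not the algorithm `RecursiveCharacterTextSplitter` uses (that class tries to split on separators such as paragraph breaks before falling back to smaller units). `sliding_window_chunks` is a hypothetical helper.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-width chunks; consecutive chunks share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.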

In [25]:
print(docs[0].page_content)

NVIDIA Announces Financial Results for First Quarter Fiscal 2025

Record quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% from a year ago

Record quarterly Data Center revenue of $22.6 billion, up 23% from Q4 and up 427% from a year ago

Ten-for-one forward stock split effective June 7, 2024

Quarterly cash dividend raised 150% to $0.01 per share on a post-split basis

NVIDIA (NASDAQ: NVDA) today reported revenue for the first quarter ended April 28, 2024, of $26.0 billion, up 18% from the previous quarter and up 262% from a year ago.

For the quarter, GAAP earnings per diluted share was $5.98, up 21% from the previous quarter and up 629% from a year ago. Non-GAAP earnings per diluted share was $6.12, up 19% from the previous quarter and up 461% from a year ago.

“The next industrial revolution has begun — companies and countries are partnering with NVIDIA to shift the trillion-dollar traditional data centers to accelerated computing and build a new type of data center — 

### Embedding and Vector Database

In [26]:
embeddings = FastEmbedEmbeddings(model_name="BAAI/bge-base-en-v1.5")

Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]

In [27]:
qdrant = Qdrant.from_documents(
    docs,
    embeddings,
    # location=":memory:",
    path="./db",
    collection_name="document_embeddings",
)
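
A vector store ranks stored chunks by how close their embedding is to the query embedding. The sketch below assumes cosine similarity as the metric (the distance configured for this collection is not shown above) and uses tiny hand-made 3-dimensional vectors instead of real embeddings. All names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "index": chunk label -> made-up embedding
toy_index = {
    "revenue chunk": [0.9, 0.1, 0.0],
    "outlook chunk": [0.2, 0.8, 0.1],
    "dividend chunk": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]

# Sort chunks from most to least similar to the query
ranked = sorted(
    toy_index.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
for name, vector in ranked:
    print(f"{name}: {cosine_similarity(query_vector, vector):.3f}")
```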

In [29]:
query = "What was NVIDIA's total revenue for Q1 FY2025?"
similar_docs = qdrant.similarity_search_with_score(query)
 
for doc, score in similar_docs:
    print(f"text: {doc.page_content[:256]}\n")
    print(f"score: {score}")
    print("-" * 80)
    print()

text: NVIDIA Announces Financial Results for First Quarter Fiscal 2025

Record quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% from a year ago

Record quarterly Data Center revenue of $22.6 billion, up 23% from Q4 and up 427% from a year ago

Ten-

score: 0.8070521052124007
--------------------------------------------------------------------------------

text: NVIDIA Announces Financial Results for First Quarter Fiscal 2025

Record quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% from a year ago

Record quarterly Data Center revenue of $22.6 billion, up 23% from Q4 and up 427% from a year ago

Ten-

score: 0.8070521052124007
--------------------------------------------------------------------------------

text: Non-GAAP

($ in millions, except earnings per share) Q1 FY25 Q4 FY24 Q1 FY24 Q/Q Y/Y Revenue $26,044 $22,103 $7,192 Up 18% Up 262% Gross margin 78.9% 76.7% 66.8% Up 2.2 pts Operating expenses $2,501 $2,210 $1,750 Up 13% Up 43% Operating income $18,05
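
The first two hits above are the same chunk with an identical score. One possible cause, which is an assumption since the notebook does not show it, is that the on-disk collection under `./db` was populated by more than one run of `from_documents`. Whatever the cause, duplicates can be filtered before they reach the LLM. The sketch below uses plain `(text, score)` tuples with placeholder values rather than real `Document` objects; `dedupe_hits` is a hypothetical helper.

```python
def dedupe_hits(hits):
    """Keep the first occurrence of each distinct text, preserving rank order."""
    seen = set()
    unique = []
    for text, score in hits:
        if text in seen:
            continue
        seen.add(text)
        unique.append((text, score))
    return unique


# Placeholder texts and scores, for illustration only
hits = [
    ("chunk A", 0.9),
    ("chunk A", 0.9),
    ("chunk B", 0.5),
]
unique_hits = dedupe_hits(hits)
print(unique_hits)
```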

### Re-ranking

In [33]:
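# Re-ranker backed by a FlashRank cross-encoder model
compressor = FlashrankRerank(model="ms-marco-MiniLM-L-12-v2")
# First stage: fetch the 5 nearest chunks from the vector store
retriever = qdrant.as_retriever(search_kwargs={"k": 5})
# Second stage: the compressor re-scores those candidates and keeps the best ones
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)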

In [34]:
reranked_docs = compression_retriever.invoke(query)
len(reranked_docs)

Running pairwise ranking..


3
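
The two-stage pattern used here (fetch 5 candidates, re-score them, keep 3) can be sketched without any model. In this minimal illustration a token-overlap score stands in for the cross-encoder, so it shows only the control flow, not FlashRank's scoring. `overlap_score`, `rerank`, and the candidate texts are hypothetical.

```python
def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that also appear in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(query, candidates, top_n=3):
    """Score every candidate against the query and keep the top_n best."""
    scored = [(overlap_score(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Stand-in for the 5 chunks returned by the first-stage retriever
candidates = [
    "dividend raised per share",
    "total revenue for the quarter",
    "stock split effective june",
    "revenue outlook for next quarter",
    "data center revenue record",
]
top = rerank("total revenue for the quarter", candidates, top_n=3)
for score, text in top:
    print(f"{score:.1f}  {text}")
```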

In [35]:
for doc in reranked_docs:
    print(f"id: {doc.metadata['_id']}\n")
    print(f"text: {doc.page_content[:256]}\n")
    print(f"score: {doc.metadata['relevance_score']}")
    print("-" * 80)
    print()

id: 85041674c9a1470e80b4468fe9e6d5ea

text: Non-GAAP

($ in millions, except earnings per share) Q1 FY25 Q4 FY24 Q1 FY24 Q/Q Y/Y Revenue $26,044 $22,103 $7,192 Up 18% Up 262% Gross margin 78.9% 76.7% 66.8% Up 2.2 pts Operating expenses $2,501 $2,210 $1,750 Up 13% Up 43% Operating income $18,059 $14,

score: 0.9967665076255798
--------------------------------------------------------------------------------

id: a0ab2cbebff8404dba277751c3d7ff37

text: Non-GAAP

($ in millions, except earnings per share) Q1 FY25 Q4 FY24 Q1 FY24 Q/Q Y/Y Revenue $26,044 $22,103 $7,192 Up 18% Up 262% Gross margin 78.9% 76.7% 66.8% Up 2.2 pts Operating expenses $2,501 $2,210 $1,750 Up 13% Up 43% Operating income $18,059 $14,

score: 0.9967665076255798
--------------------------------------------------------------------------------

id: 25dca60bbcf24c8ba1215622335cbeb9

text: NVIDIA Announces Financial Results for First Quarter Fiscal 2025

Record quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% 

## Q&A Over Document

In [37]:
llm = ChatGroq(temperature=0, model_name="llama-3.3-70b-versatile")

In [39]:
prompt_template = """
Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
 
Context: {context}
Question: {question}
 
Answer the question and provide additional helpful information,
based on the pieces of information, if applicable. Be succinct.
 
Responses should be properly formatted to be easily read.
"""
 
prompt = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

In [40]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # all retrieved documents are concatenated into a single context
    retriever=compression_retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt, "verbose": True},
)
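
The "stuff" strategy amounts to joining the retrieved chunks and substituting them into the prompt. Below is a minimal sketch using plain string formatting instead of the chain itself; the blank-line separator and the shortened template are assumptions made for illustration, and `stuff_documents` is a hypothetical helper. The chunk texts are excerpts from the report.

```python
def stuff_documents(chunks, separator="\n\n"):
    """Concatenate all retrieved chunks into one context string."""
    return separator.join(chunks)


# Shortened stand-in for the prompt template defined earlier
template = "Context: {context}\nQuestion: {question}"

retrieved_chunks = [
    "Gross margin 78.9% 76.7% 66.8%",
    "Revenue is expected to be $28.0 billion, plus or minus 2%.",
]
final_prompt = template.format(
    context=stuff_documents(retrieved_chunks),
    question="What is NVIDIA's gross margin for Q1 FY2025?",
)
print(final_prompt)
```

Because everything goes into one prompt, this approach only works while the combined chunks fit in the model's context window, which is one reason to re-rank and keep only a few chunks.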

In [41]:
response = qa.invoke("What is NVIDIA's gross margin for Q1 FY2025?")

Running pairwise ranking..


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
 
Context: Non-GAAP

($ in millions, except earnings per share) Q1 FY25 Q4 FY24 Q1 FY24 Q/Q Y/Y Revenue $26,044 $22,103 $7,192 Up 18% Up 262% Gross margin 78.9% 76.7% 66.8% Up 2.2 pts Operating expenses $2,501 $2,210 $1,750 Up 13% Up 43% Operating income $18,059 $14,749 $3,052 Up 22% Up 492% Net income $15,238 $12,839 $2,713 Up 19% Up 462% Diluted earnings per share $6.12 $5.16 $1.09 Up 19% Up 461%

Outlook

NVIDIA’s outlook for the second quarter of fiscal 2025 is as follows:

Revenue is expected to be $28.0 billion, plus or minus 2%.

GAAP and non-GAAP gross margins are expected to be 74.8% and 75.5%, respectively, plus or minus 50 basis points.

For the full year

In [43]:
response['result']

"**NVIDIA's Gross Margin for Q1 FY2025:**\n78.9%\n\n**Additional Information:**\nThis gross margin is based on non-GAAP figures and represents an increase of 2.2 percentage points from the previous quarter (Q4 FY24) and an increase of 12.1 percentage points from the same quarter last year (Q1 FY24)."

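The figures in this answer can be checked against the non-GAAP table row in the retrieved context. Only the 2.2-point quarter-over-quarter change is printed in the report; the 12.1-point year-over-year change was derived by the model, so it is worth recomputing. The sketch below is a quick sanity check with values copied from the table; the variable names are hypothetical.

```python
# Non-GAAP gross margin (%), copied from the retrieved table
gm_q1_fy25, gm_q4_fy24, gm_q1_fy24 = 78.9, 76.7, 66.8

# Percentage-point changes, rounded to one decimal to avoid float noise
qoq_points = round(gm_q1_fy25 - gm_q4_fy24, 1)
yoy_points = round(gm_q1_fy25 - gm_q1_fy24, 1)
print(qoq_points, yoy_points)

# Revenue ($ millions) from the same table, to confirm the growth percentages
rev_q1_fy25, rev_q4_fy24, rev_q1_fy24 = 26044, 22103, 7192
qoq_growth = round((rev_q1_fy25 / rev_q4_fy24 - 1) * 100)
yoy_growth = round((rev_q1_fy25 / rev_q1_fy24 - 1) * 100)
print(qoq_growth, yoy_growth)
```

Both margin deltas match the model's answer, and the revenue growth figures agree with the "Up 18%" and "Up 262%" columns.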
In [44]:
response = qa.invoke("What new partnerships did NVIDIA announce in the Data Center segment?")

Running pairwise ranking..


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
 
Context: NVIDIA Announces Financial Results for First Quarter Fiscal 2025

Record quarterly revenue of $26.0 billion, up 18% from Q4 and up 262% from a year ago

Record quarterly Data Center revenue of $22.6 billion, up 23% from Q4 and up 427% from a year ago

Ten-for-one forward stock split effective June 7, 2024

Quarterly cash dividend raised 150% to $0.01 per share on a post-split basis

NVIDIA (NASDAQ: NVDA) today reported revenue for the first quarter ended April 28, 2024, of $26.0 billion, up 18% from the previous quarter and up 262% from a year ago.

For the quarter, GAAP earnings per diluted share was $5.98, up 21% from the previous quarter and up 629% fr

In [53]:
response["source_documents"]

[Document(page_content='Non-GAAP\n\n($ in millions, except earnings per share) Q1 FY25 Q4 FY24 Q1 FY24 Q/Q Y/Y Revenue $26,044 $22,103 $7,192 Up 18% Up 262% Gross margin 78.9% 76.7% 66.8% Up 2.2 pts Operating expenses $2,501 $2,210 $1,750 Up 13% Up 43% Operating income $18,059 $14,749 $3,052 Up 22% Up 492% Net income $15,238 $12,839 $2,713 Up 19% Up 462% Diluted earnings per share $6.12 $5.16 $1.09 Up 19% Up 461%\n\nOutlook\n\nNVIDIA’s outlook for the second quarter of fiscal 2025 is as follows:\n\nRevenue is expected to be $28.0 billion, plus or minus 2%.\n\nGAAP and non-GAAP gross margins are expected to be 74.8% and 75.5%, respectively, plus or minus 50 basis points.\n\nFor the full year, gross margins are expected to be in the mid-70% range.\n\nGAAP and non-GAAP operating expenses are expected to be approximately $4.0 billion and $2.8 billion, respectively.\n\nFull-year operating expenses are expected to grow in the low-40% range.\n\nGAAP and non-GAAP other income and expense are e

In [45]:
print_response(response)

**New Partnerships in the Data Center Segment:**

NVIDIA announced the following new partnerships in the Data Center segment:

* Expanded collaborations with AWS, Google Cloud, Microsoft, and Oracle to advance generative AI
innovation.
* Worked with Johnson & Johnson MedTech to bring AI capabilities to support surgery.
* TSMC and Synopsys are going into production with NVIDIA cuLitho to accelerate computational
lithography.
* Nine new supercomputers worldwide are using Grace Hopper Superchips to ignite a new era of AI
supercomputing.

These partnerships aim to accelerate the adoption of AI and generative AI in various industries,
including healthcare, cloud computing, and semiconductor manufacturing.
