###  Checking for a GPU

In [None]:
!nvidia-smi

Tue Jan  2 18:26:39 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   75C    P8              13W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
!pip install -Uqqq pip --progress-bar off
!pip install -qqq torch==2.0.1 --progress-bar off
!pip install -qqq transformers==4.31.0 --progress-bar off
!pip install -qqq langchain==0.0.266 --progress-bar off
!pip install -qqq chromadb==0.4.5 --progress-bar off
!pip install -qqq pypdf==3.15.0 --progress-bar off
!pip install -qqq xformers==0.0.20 --progress-bar off
!pip install -qqq sentence_transformers==2.2.2 --progress-bar off
!pip install -qqq InstructorEmbedding==1.0.1 --progress-bar off
!pip install -qqq pdf2image==1.16.3 --progress-bar off

[0m[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.1.0+cu121 requires torch==2.1.0, but you have torch 2.0.1 which is incompatible.
torchdata 0.7.0 requires torch==2.1.0, but you have torch 2.0.1 which is incompatible.
torchtext 0.16.0 requires torch==2.1.0, but you have torch 2.0.1 which is incompatible.
torchvision 0.16.0+cu121 requires torch==2.1.0, but you have torch 2.0.1 which is incompatible.[0m[31m
[0m[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xformers 0.0.20 requires torch==2.0.1, but you have torch 2.1.0 which is incompatible.[0m[31m
[0m

In [None]:
!pip install torch==2.1.0 torchdata==0.7.0 torchtext==0.16.0 torchvision==0.16.0 torchaudio==2.1.0+cu121

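Because the installs above pull torch in two directions (2.0.1 for xformers, 2.1.0 for the torch* packages), it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is an illustrative assumption, not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print the versions of the packages involved in the conflict messages.
for name, ver in installed_versions(
    ["torch", "torchvision", "torchaudio", "torchtext", "torchdata", "xformers"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

If xformers still reports a torch 2.0.1 requirement while torch 2.1.0 is installed, the mismatch flagged by pip is still present.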

In [None]:
!pip install -qqq auto_gptq-0.4.1+cu118-cp310-cp310-linux_x86_64.whl --progress-bar off


In [None]:
!sudo apt-get install poppler-utils

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
poppler-utils is already the newest version (22.02.0-2ubuntu0.3).
0 upgraded, 0 newly installed, 0 to remove and 24 not upgraded.


In [None]:
import torch
from auto_gptq import AutoGPTQForCausalLM
from langchain import HuggingFacePipeline, PromptTemplate
from langchain.chains import RetrievalQA
from langchain.document_loaders import PyPDFDirectoryLoader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from pdf2image import convert_from_path
from transformers import AutoTokenizer, TextStreamer, pipeline

# Ask PyTorch (the framework that runs the models) whether a CUDA GPU is visible.
DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"

###  Creating a directory to save the PDFs

In [None]:
!mkdir -p pdfs

###  Downloading the PDFs

In [None]:
!gdown 1v-Rn1FVU1pLTAQEgm0N9oB6cExMoebZr -O pdfs/tesla-earnings-report.pdf
!gdown 1Xc890jrQvCExAkryVWAttsv1DBLdVefN -O pdfs/nvidia-earnings-report.pdf
!gdown 1Epz-SQ3idPpoz75GlTzzomag8gplzLv8 -O pdfs/meta-earnings-report.pdf


In [None]:
meta_images = convert_from_path("pdfs/meta-earnings-report.pdf", dpi=88)
meta_images[0]

In [None]:
nvidia_images = convert_from_path("pdfs/nvidia-earnings-report.pdf", dpi=88)
nvidia_images[0]

In [None]:
tesla_images = convert_from_path("pdfs/tesla-earnings-report.pdf", dpi=88)
tesla_images[0]

###  Creating the Vector Database to store the data

In [None]:
!rm -rf "db"

In [None]:
loader = PyPDFDirectoryLoader("pdfs")
docs = loader.load()
len(docs)

100

In [None]:
embeddings = HuggingFaceInstructEmbeddings(
    model_name="hkunlp/instructor-large", model_kwargs={"device": DEVICE}
)


load INSTRUCTOR_Transformer
max_seq_length  512


In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=64)
texts = text_splitter.split_documents(docs)
len(texts)

355
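To give a feel for what `chunk_size=1024` and `chunk_overlap=64` mean, here is a simplified sketch of fixed-size chunking with overlap. It is an illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on paragraph, line, and word boundaries, so its chunks will not match these exactly. The function name `chunk_text` is an assumption made for this example.

```python
def chunk_text(text, chunk_size=1024, chunk_overlap=64):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2000))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means the last 64 characters of one chunk reappear at the start of the next, which reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.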

###  Storing the chunks in a ChromaDB vector database

In [None]:
%%time
db = Chroma.from_documents(texts, embeddings, persist_directory="db")

CPU times: user 20.1 s, sys: 333 ms, total: 20.5 s
Wall time: 23.9 s
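Conceptually, the vector database stores one embedding per chunk and, at query time, returns the chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with cosine similarity over plain Python lists. It is a toy model under stated assumptions: the names `cosine_similarity` and `top_k` are made up for this example, and Chroma's actual index structure and distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional "embeddings"; real instructor-large embeddings are much larger.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k([1.0, 0.0], doc_vecs, k=2))
```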


###  Using the Llama 2 13B Chat GPTQ model

In [None]:
model_name_or_path = "TheBloke/Llama-2-13B-chat-GPTQ"
model_basename = "model"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    revision="gptq-4bit-128g-actorder_True",
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    inject_fused_attention=False,
    device=DEVICE,
    quantize_config=None,
)

In [None]:
# Check GPU memory usage after loading the model
!nvidia-smi

In [None]:
DEFAULT_SYSTEM_PROMPT = """
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
""".strip()


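def generate_prompt(prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    # Llama 2 chat format: the system prompt is wrapped in <<SYS>> ... <</SYS>>
    # inside the [INST] ... [/INST] instruction block.
    return f"""
[INST] <<SYS>>
{system_prompt}
<</SYS>>

{prompt} [/INST]
""".strip()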
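Llama 2 chat models expect the system prompt wrapped in `<<SYS>>` and `<</SYS>>` tags inside the `[INST]` block. As a quick sanity check, the sketch below renders a short prompt so the final string can be inspected. The function name `render_llama2_prompt` and the sample strings are assumptions made for this illustration.

```python
def render_llama2_prompt(prompt: str, system_prompt: str) -> str:
    """Wrap a user prompt and a system prompt in the Llama 2 chat markers."""
    return f"""
[INST] <<SYS>>
{system_prompt}
<</SYS>>

{prompt} [/INST]
""".strip()


print(render_llama2_prompt("What is a GPU?", "Answer briefly."))
```

Printing the rendered string before wiring it into a chain makes it easy to spot missing or mangled tags.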

In [None]:

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)


In [None]:
text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=1024,
    temperature=0,
    top_p=0.95,
    repetition_penalty=1.15,
    streamer=streamer,
)

In [None]:
llm = HuggingFacePipeline(pipeline=text_pipeline, model_kwargs={"temperature": 0})

In [None]:
SYSTEM_PROMPT = "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer."

template = generate_prompt(
    """
{context}

Question: {question}
""",
    system_prompt=SYSTEM_PROMPT,
)


In [None]:
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

In [None]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)
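The `"stuff"` chain type takes the retrieved chunks (here `k=2`), concatenates them into a single context string, and substitutes that string and the question into the prompt template. The sketch below imitates that step in plain Python. It is an approximation: the name `stuff_prompt`, the shortened template, and the blank-line separator are assumptions for illustration, not LangChain's actual code.

```python
# Shortened stand-in for the notebook's template (an assumption for this sketch).
TEMPLATE = """[INST] <<SYS>>
Use the following pieces of context to answer the question at the end.
<</SYS>>

{context}

Question: {question} [/INST]"""


def stuff_prompt(chunks, question, template=TEMPLATE):
    """Join all retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


retrieved = ["Revenue $ 7,192 $ 8,288", "Net income $ 2,043 $ 1,618"]
print(stuff_prompt(retrieved, "What was the revenue?"))
```

Because every retrieved chunk is placed in one prompt, the total of `k` chunks of up to 1024 characters plus the question has to fit within the model's context window.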

###  Querying the PDFs

In [None]:
result = qa_chain("What is the per share revenue for Meta during 2023?")

In [None]:
len(result["source_documents"])

In [None]:
print(result["source_documents"][0].page_content)

In [None]:
result = qa_chain("What is the per share revenue for Tesla during 2023?")

 I can't determine the per share revenue for Tesla in 2023 based on the information provided. The financial statements only provide total revenues and do not break down the figures by shares outstanding. Therefore, I cannot calculate the per share revenue.


In [None]:
result = qa_chain("What is the per share revenue for Nvidia during 2023?")

 Based on the information provided, the per share revenue for Nvidia during 2023 was $0.83. This is calculated by dividing the total revenue for the period ($7,192 million) by the weighted average number of shares outstanding (basic and diluted) during the period (2,470 million).


In [None]:
print(result["source_documents"][1].page_content)

PART I. FINANCIAL INFORMATION
ITEM 1.  FINANCIAL STATEMENTS (UNAUDITED)
NVIDIA CORPORATION AND SUBSIDIARIES
CONDENSED CONSOLIDATED STATEMENTS OF INCOME
(In millions, except per share data)
(Unaudited)
 Three Months Ended
 April 30, May 1,
2023 2022
Revenue $ 7,192 $ 8,288 
Cost of revenue 2,544 2,857 
Gross profit 4,648 5,431 
Operating expenses   
Research and development 1,875 1,618 
Sales, general and administrative 633 592 
Acquisition termination cost — 1,353 
Total operating expenses 2,508 3,563 
Income from operations 2,140 1,868 
Interest income 150 18 
Interest expense (66) (68)
Other, net (15) (13)
Other income (expense), net 69 (63)
Income before income tax 2,209 1,805 
Income tax expense 166 187 
Net income $ 2,043 $ 1,618 
Net income per share:
Basic $ 0.83 $ 0.65 
Diluted $ 0.82 $ 0.64 
Weighted average shares used in per share computation:
Basic 2,470 2,506 
Diluted 2,490 2,537 
See accompanying Notes to Condensed Consolidated Financial Statements.
3
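The retrieved statement lists $0.83 as basic net income per share, not revenue per share. A quick check with the figures from the statement above (in millions) shows what each division actually yields; the variable names are chosen for this illustration.

```python
# Figures from the retrieved income statement, three months ended April 30, 2023.
revenue_musd = 7_192       # revenue, in millions of dollars
net_income_musd = 2_043    # net income, in millions of dollars
basic_shares_m = 2_470     # basic weighted average shares, in millions

revenue_per_share = revenue_musd / basic_shares_m
net_income_per_share = net_income_musd / basic_shares_m

print(f"Revenue per basic share:    ${revenue_per_share:.2f}")     # $2.91
print(f"Net income per basic share: ${net_income_per_share:.2f}")  # $0.83
```

So the model quoted the right numbers for its calculation but reported the net income per share figure; dividing revenue by basic shares gives roughly $2.91.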


In [34]:
result = qa_chain("What is the estimated YOY revenue for Meta during 2023?")

 Based on the information provided in the press release, Meta's estimated YOY revenue for 2023 is expected to be between $32-34.5 billion, which represents an increase of 11-16% over 2022.


In [35]:
result = qa_chain("What is the estimated YOY revenue for Tesla during 2023?")

 I can provide information on the estimated year-over-year (YOY) revenue for Tesla based on the provided financial statements.

To calculate the estimated YOY revenue for Tesla during 2023, we need to compare the revenues for the six months ended June 30, 2023, with those for the same period in 2022.

According to the financial statements, the total revenues for the six months ended June 30, 2023, were $24,927 million, while the total revenues for the same period in 2022 were $16,934 million. This represents a year-over-year increase of $7,993 million or approximately 47%.

Therefore, based on this analysis, the estimated YOY revenue for Tesla during 2023 is approximately 47% higher than in 2022.
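Taking the two revenue figures quoted in the answer at face value, the year-over-year arithmetic can be verified in a few lines. The helper name `yoy_growth_percent` is an assumption for this sketch, and the check says nothing about whether the model picked the right reporting period.

```python
def yoy_growth_percent(current, previous):
    """Percentage change from the previous period to the current one."""
    return (current - previous) / previous * 100


current_musd = 24_927   # total revenues quoted for 2023, in millions of dollars
previous_musd = 16_934  # total revenues quoted for 2022, in millions of dollars

increase = current_musd - previous_musd
growth = yoy_growth_percent(current_musd, previous_musd)
print(f"Increase: ${increase:,} million ({growth:.1f}%)")
```

The result agrees with the $7,993 million increase and the roughly 47% growth stated in the answer.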


In [None]:
result = qa_chain("What is the estimated YOY revenue for Nvidia during 2023?")

In [None]:
result = qa_chain(
    "Which company is more profitable during 2023 Meta, Nvidia or Tesla and why?"
)

 I cannot determine which company is more profitable during 2023 between Meta, Nvidia, and Tesla based on the information provided. The passage only discusses Nvidia's financial condition and results of operations but does not provide any information about Meta or Tesla's financial performance during 2023. Additionally, it is important to note that profitability can be influenced by various factors such as 

In [None]:
result = qa_chain(
    "Choose one company to invest (Tesla, Nvidia or Meta) to maximize your profits for the long term (10+ years)?"
)