
In [None]:
!pip install transformers accelerate bitsandbytes
!pip install sentence-transformers
!pip install llama-index pypdf
!pip install llama-index-embeddings-huggingface
!pip install llama-index-llms-huggingface

In [3]:
import os

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    VectorStoreIndex,
)
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.response.pprint_utils import pprint_response
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

In [None]:
# 4-bit NF4 quantization with double quantization; compute runs in float16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model_id = "NousResearch/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",  # let accelerate place the weights on the available devices
)
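
A rough back-of-the-envelope estimate of why the model is loaded in 4-bit. This is only a sketch: the parameter count of about 8 billion is an assumption read off the "8B" in the model name, and the figure covers the weights alone (no quantization constants, activations, or KV cache).

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: roughly 8 billion parameters, taken from the "8B" in the model name.
NUM_PARAMS = 8e9

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # unquantized half precision
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the half-precision footprint, which is what makes a single Colab GPU plausible for this model.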

In [None]:
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-large-en-v1.5"
)

In [6]:
llm = HuggingFaceLLM(
    model=model,
    tokenizer=tokenizer
)

In [12]:
settings = Settings
settings.llm = llm
settings.embed_model = embed_model
settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)
settings.num_output = 50
settings.context_window = 3900
# Greedy decoding; `temperature` only takes effect when do_sample=True.
settings.generate_kwargs = {"do_sample": False, "temperature": 0.1}

In [9]:
documents = SimpleDirectoryReader("/content/data").load_data()

vector_index = VectorStoreIndex.from_documents(
    documents,
    embed_model=settings.embed_model,
    node_parser=settings.node_parser,
    show_progress=True,
)

Parsing nodes:   0%|          | 0/40 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/109 [00:00<?, ?it/s]

In [17]:
query_engine = vector_index.as_query_engine(
    llm=settings.llm,
    similarity_top_k=3,
    response_mode="compact",
    verbose=True,
    generate_kwargs=settings.generate_kwargs,
    context_window=settings.context_window,
    num_output=settings.num_output,
    show_progress=True,
)
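
To make `similarity_top_k=3` concrete, here is a minimal pure-Python sketch of top-k retrieval. It assumes chunks are ranked by cosine similarity between the query embedding and each chunk embedding. The helper names `cosine_similarity` and `top_k` and the toy 2-D vectors are illustrative only, not part of llama-index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy embeddings standing in for real model output.
query = [1.0, 0.0]
chunks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]

results = top_k(query, chunks, k=3)
for index, score in results:
    print(f"chunk {index}: score {score:.4f}")
```

Only the three highest-scoring chunks are passed on to the LLM as context, so a relevant table that ranks fourth never reaches the prompt.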

In [18]:
query_str = "What are the total expenses for Q2 2023?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: What are the total expenses for Q2 2023?
Response: 29,646

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.7110
Text (first 300 chars): INFOSYS LIMITED AND SUBSIDIARIES
(In ₹ crore, except equity share and per equity share data)
Note No.
2024 2023 2024 2023 
Revenue from operations 2.16                             37,923                             37,441                           153,670                           146,767 
Other inc...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6753
Text (first 300 chars): Particulars
2024 2023 2024 2023 
Employee benefit expenses
Salaries including bonus                  19,527                  19,526                       79,315                 75,239 
Contribution to provident and other funds                       529                       547                      ...
--------------------------------------------------
Chunk 3:
Relevance Score: 0.6666
Text (first 300 chars): (In ₹ crore)
Part
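
The retrieved chunks above come from the splitter configured with `chunk_size=512` and `chunk_overlap=20`. The sketch below shows the general idea of overlapping chunks using a simplified fixed-size sliding window over a token list. It is not the sentence-aware logic that `SentenceSplitter` itself applies, and `chunk_with_overlap` is a hypothetical helper.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Small numbers so the overlap is easy to see; the notebook uses 512 and 20.
tokens = list(range(10))
chunks = chunk_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window repeats the tail of the previous one, so a value that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.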

In [19]:
query_str = "What is the gross profit for Q3 2024?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: What is the gross profit for Q3 2024?
Response: 153,670

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.6552
Text (first 300 chars): The changes in the carrying value of property, plant and equipment for the year ended March 31, 2024 are as follows:
(In ₹ crore)
Particulars Land - 
Freehold
Buildings 
(1)
Plant and 
machinery 
Office 
Equipment 
Computer 
equipment
Furniture and 
fixtures
Leasehold 
Improvements
Vehicles Total
Gr...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6492
Text (first 300 chars): (In ₹ crore)
Particulars
2024 2023 2024 2023 
Revenue from software services                   36,064                    35,199               145,285                 137,575 
Revenue from products and platforms                     1,859                      2,242                   8,385             ...
--------------------------------------------------
Chunk 3:
Relevance Score: 0.6440
Text (first 300 chars): 921)        (2,630)
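
Every query cell repeats the same post-processing: keep the first line of the answer, then print each source chunk with its score. One way this could be factored into a helper is sketched below. `FakeNode` and `FakeResponse` are hypothetical stand-ins that mirror only the attributes the cells use (`response`, `source_nodes`, `score`, `text`), so the sketch runs without a model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node (only .text and .score)."""
    text: str
    score: float


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a query response."""
    response: str
    source_nodes: list = field(default_factory=list)


def first_line(text):
    """Keep only the first line of the generated answer, as the cells do."""
    return text.split("\n")[0].strip()


def format_result(query_str, response, preview_chars=300):
    """Build the same report the query cells print, as a single string."""
    lines = [
        f"Original Query: {query_str}",
        f"Response: {first_line(response.response)}",
        "",
        "Retrieved Document Chunks:",
    ]
    for i, node in enumerate(response.source_nodes, 1):
        lines.append(f"Chunk {i}:")
        lines.append(f"Relevance Score: {node.score:.4f}")
        lines.append(f"Text (first {preview_chars} chars): {node.text[:preview_chars]}...")
        lines.append("-" * 50)
    return "\n".join(lines)


# Toy data, not taken from the report.
demo = FakeResponse(
    response=" first line \nsecond line",
    source_nodes=[FakeNode(text="example chunk text", score=0.711)],
)
report = format_result("example question?", demo)
print(report)
```

In the notebook, the object returned by `await query_engine.aquery(query_str)` would be passed in place of `demo`.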

In [20]:
query_str = "How do the net income and operating expenses compare for Q1 2024?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: How do the net income and operating expenses compare for Q1 2024?
Response: The net income for Q1 2024 was ₹ 7,975 crore and the operating expenses for Q1 2024 were ₹ 30,412 crore. The operating expenses are higher than the net income. The net income is ₹ 7,975 crore, which is approximately 26.2% of the operating expenses ₹ 30,412 crore. The operating expenses are significantly higher than the net income, indicating that the company has a substantial operating cost base. However, the net income is still a substantial amount, indicating that the company is profitable. The operating expenses are higher than the net income, indicating that the company has a high operating cost base, but the net income is still a significant amount, indicating that the company is profitable. The operating expenses are higher than the net income, indicating that the company has a high operating cost base, but the net income is still a significant amount, indicating that the company is profit
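
The percentage quoted in the response above can be checked directly. This sketch only verifies the model's division, using the two figures exactly as they appear in the generated answer. It does not confirm that those figures match the source filing.

```python
# Figures copied from the generated response (in ₹ crore).
net_income = 7_975
operating_expenses = 30_412

ratio_pct = net_income / operating_expenses * 100
print(f"Net income as a share of operating expenses: {ratio_pct:.1f}%")
```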

In [21]:
query_str = "Show the operating margin for the past 6 months."
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: Show the operating margin for the past 6 months.
Response: 35.13%

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.6523
Text (first 300 chars): (In ₹ crore)
Particulars
Financial 
Services (1)*
Retail (2) Communic
ation (3)
Energy, 
Utilities, 
Resources 
and Services 
Manufacturing Hi-Tech Life 
Sciences (4)
All other 
segments (5)
Total
Revenue from operations         42,158         22,504         17,991          20,035               22,2...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6451
Text (first 300 chars): 2.23
Business Segments
(In ₹ crore)
Particulars
Financial 
Services (1)*
Retail (2) Communic
ation (3)
Energy, 
Utilities, 
Resources 
and Services 
Manufacturing Hi-Tech Life 
Sciences (4)
All other 
segments (5)
Total
Revenue from operations         10,010           5,429           4,666          ...
--------------------------------------------------
Chunk 3:
Relevance Score: 0.6371
Text (first 300 chars): 418)
Tran

In [22]:
query_str = "What is the depreciation and amortization expenses for Q2 2023?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: What is the depreciation and amortization expenses for Q2 2023?
Response: 1,322

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.7011
Text (first 300 chars): 2022                —         (4,100)        (2,344)         (1,150)        (6,034)             (1,779)                 (856)                         (37)                   (16,300)
Depreciation                —            (434)           (273)            (121)        (1,322)                (236)   ...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6934
Text (first 300 chars): 807)              (1,131)                         (42)                   (17,898)
Depreciation                —            (111)             (63)              (32)           (336)                  (58)                   (46)                           —                        (646)
Accumulated deprec...
--------------------------------------------------
Chunk 3:
Relevance Score: 0.6810
Text (first 300 cha

In [24]:
query_str = "What is the employee benefit expenses for Q2 2024?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: What is the employee benefit expenses for Q2 2024?
Response: 19,527

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.6723
Text (first 300 chars): INFOSYS LIMITED AND SUBSIDIARIES
(In ₹ crore, except equity share and per equity share data)
Note No.
2024 2023 2024 2023 
Revenue from operations 2.16                             37,923                             37,441                           153,670                           146,767 
Other inc...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6610
Text (first 300 chars): net*                 —                 —                 —                —                —                —                  —                —                —                        (7)                            —                     —                       —                     (7)            ...
--------------------------------------------------
Chunk 3:
Relevance Score: 0.6608
Text (first 300 chars): Particu

In [25]:
query_str = "What is the tax expense for Q2 2023?"
response = await query_engine.aquery(query_str)

print("Original Query:", query_str)
cleaned_response = response.response.split('\n')[0].strip()
print("Response:", cleaned_response)

print("\nRetrieved Document Chunks:")
for i, node in enumerate(response.source_nodes, 1):
    print(f"Chunk {i}:")
    print(f"Relevance Score: {node.score:.4f}")
    print(f"Text (first 300 chars): {node.text[:300]}...")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Original Query: What is the tax expense for Q2 2023?
Response: 2,260 (Note: The question is not clear. It seems like it is asking for the tax expense for Q2 2023, but the provided text is for the year 2024 and 2023. Assuming the question is asking for the tax expense for the year 2023, the answer would be 2,260, which is the current tax for the year 2023 mentioned in the text. If the question is asking for the tax expense for Q2 2023, the answer would be 1,173, which is the current tax for Q2 2024 mentioned in the text.)

Retrieved Document Chunks:
Chunk 1:
Relevance Score: 0.7019
Text (first 300 chars): INFOSYS LIMITED AND SUBSIDIARIES
(In ₹ crore, except equity share and per equity share data)
Note No.
2024 2023 2024 2023 
Revenue from operations 2.16                             37,923                             37,441                           153,670                           146,767 
Other inc...
--------------------------------------------------
Chunk 2:
Relevance Score: 0.6650
