In [3]:
# Load environment variables from a local .env file
import os
from dotenv import load_dotenv
load_dotenv()

True

In [4]:
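# load_dotenv() has already exported OPENAI_API_KEY; fail early with a clear message if it is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")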

In [5]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("../datasets").load_data()
documents

[Document(id_='adc02026-0286-445c-b983-6e24a7787edc', embedding=None, metadata={'page_label': '1', 'file_name': 'attention-is-all-you-need-paper.pdf', 'file_path': '/home/laluprasadmahato@ADCNST.COM/Desktop/Learning/llamaindex-sandbox/experiments/../datasets/attention-is-all-you-need-paper.pdf', 'file_type': 'application/pdf', 'file_size': 569417, 'creation_date': '2025-02-08', 'last_modified_date': '2025-02-01'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text_resource=MediaResource(embeddings=None, data=None, text='Attention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob U

In [6]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents, show_progress=True)
index

Parsing nodes: 100%|██████████| 71/71 [00:00<00:00, 1049.98it/s]
Generating embeddings: 100%|██████████| 72/72 [00:02<00:00, 30.51it/s]


<llama_index.core.indices.vector_store.base.VectorStoreIndex at 0x78c609fdb400>
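The progress bars above show 71 items parsed but 72 embeddings generated. This is consistent with documents being split into nodes (chunks) before embedding, where a long page can yield more than one node. The sketch below illustrates the general idea with a naive word-based splitter. `chunk_words`, its parameters, and the sizes are hypothetical; it does not reproduce LlamaIndex's own sentence-aware, token-based splitting.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most chunk_size words; neighbours share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks

# A short page fits in one chunk; a longer page needs two (sizes are illustrative).
short_page = "word " * 150
long_page = "word " * 350
print(len(chunk_words(short_page)), len(chunk_words(long_page)))
```

The overlap means consecutive chunks repeat a few words, so a sentence cut at a chunk boundary still appears whole in at least one of them.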

In [7]:
query_engine = index.as_query_engine()
query_engine

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x78c64f558430>

In [8]:
response = query_engine.query("What is transformer?")
print(response)

The Transformer is a network architecture based solely on attention mechanisms, eliminating the need for recurrence and convolutions. It consists of an encoder and a decoder connected through an attention mechanism, making it highly parallelizable and faster to train compared to models based on recurrent or convolutional neural networks. The Transformer has shown superior performance in quality on machine translation tasks and has achieved state-of-the-art results on tasks like English-to-German and English-to-French translation.


In [9]:
response = query_engine.query("What is the GDP rate of India?")
print(response)

The GDP rate of India is 4.8 percent.


### Format Response

In [10]:
from llama_index.core.response.pprint_utils import pprint_response

pprint_response(response)

Final Response: The GDP rate of India is 4.8 percent.


In [11]:
pprint_response(response, show_source=True)

Final Response: The GDP rate of India is 4.8 percent.
______________________________________________________________________
Source Node 1/2
Node ID: 9357ed2d-9073-4958-86df-2705a43248f6
Similarity: 0.8119578771711398
Text: 2     grown in this period. We see the next five years as a
unique opportunity to  realize ‘Sabka Vikas’, stimulating balanced
growth of all regions.  5. The great Telugu poet and playwright
Gurajada Appa Rao had said,  ‘Desamante Matti Kaadoi, Desamante
Manushuloi’; meaning, ‘A country is not  just its soil, a country is
its people.’ In line wi...
______________________________________________________________________
Source Node 2/2
Node ID: 9348abcd-5a04-4798-bf1c-38537875c285
Similarity: 0.8085952024844144
Text: 18     year. The objective is to strengthen trust-based economic
governance and take  transformational measures to enhance ‘ease of
doing business’, especially in  matters of inspections and
compliances.  States will be encouraged to join in  this endeavo
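
`pprint_response` is convenient for reading, but the same information can also be read from the response object for programmatic checks. A minimal sketch, assuming the response exposes `source_nodes` whose items carry `score`, `node.node_id`, and `node.get_content()` as in recent LlamaIndex releases. `summarize_sources` is a hypothetical helper, and the stand-in objects exist only so the snippet runs without an index or an API key.

```python
from types import SimpleNamespace

def summarize_sources(response, preview_chars: int = 60) -> list[dict]:
    """Return one dict per retrieved source: node id, similarity score, text preview."""
    rows = []
    for item in response.source_nodes:
        text = item.node.get_content()
        rows.append({
            "node_id": item.node.node_id,
            "score": item.score,
            # Collapse whitespace so the preview fits on one line
            "preview": " ".join(text.split())[:preview_chars],
        })
    return rows

# Stand-in response for demonstration; in the notebook, pass the real `response` instead.
fake_node = SimpleNamespace(
    node_id="node-1",
    get_content=lambda: "States will be\nencouraged to join",
)
fake_response = SimpleNamespace(
    source_nodes=[SimpleNamespace(node=fake_node, score=0.81)]
)

for row in summarize_sources(fake_response):
    print(row["node_id"], round(row["score"], 2), row["preview"])
```

Reading the previews this way makes it easier to check whether an answer, such as the GDP figure above, is actually supported by the retrieved text.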

In [12]:
response = query_engine.query("What is attention all you need?")

pprint_response(response, show_source=True)

Final Response: The attention mechanism is a key component in the
proposed Transformer network architecture, which eliminates the need
for complex recurrent or convolutional neural networks traditionally
used in sequence transduction models. The Transformer model relies
solely on attention mechanisms for connecting the encoder and decoder,
resulting in improved quality, increased parallelizability, and
reduced training time compared to existing models.
______________________________________________________________________
Source Node 1/2
Node ID: 1b6109eb-f1e0-49d5-8bf6-d254c7a44048
Similarity: 0.8230618961326017
Text: Attention Is All You Need Ashish Vaswani∗ Google Brain
avaswani@google.com Noam Shazeer∗ Google Brain noam@google.com Niki
Parmar∗ Google Research nikip@google.com Jakob Uszkoreit∗ Google
Research usz@google.com Llion Jones∗ Google Research llion@google.com
Aidan N. Gomez∗† University of Toronto aidan@cs.toronto.edu Łukasz
Kaiser ∗ Google Brain ...
______________________

### Create query engine from retriever

In [13]:
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

retriever = VectorIndexRetriever(index=index, similarity_top_k=4)

query_engine = RetrieverQueryEngine(retriever=retriever)

response = query_engine.query("What is attention all you need?")
pprint_response(response, show_source=True)

Final Response: Attention Is All You Need refers to a network
architecture proposed in the provided context that is based solely on
attention mechanisms, eliminating the need for recurrent or
convolutional neural networks. This architecture connects the encoder
and decoder through an attention mechanism, allowing for improved
quality, parallelizability, and reduced training time compared to
traditional models.
______________________________________________________________________
Source Node 1/4
Node ID: 1b6109eb-f1e0-49d5-8bf6-d254c7a44048
Similarity: 0.8230618961326017
Text: Attention Is All You Need Ashish Vaswani∗ Google Brain
avaswani@google.com Noam Shazeer∗ Google Brain noam@google.com Niki
Parmar∗ Google Research nikip@google.com Jakob Uszkoreit∗ Google
Research usz@google.com Llion Jones∗ Google Research llion@google.com
Aidan N. Gomez∗† University of Toronto aidan@cs.toronto.edu Łukasz
Kaiser ∗ Google Brain ...
_________________________________________________________________
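
Setting `similarity_top_k=4` makes the retriever return four source nodes instead of the two seen in the earlier outputs. Conceptually, the retriever ranks node embeddings by similarity to the query embedding and keeps the first k. The sketch below shows that ranking with cosine similarity, a common choice, on toy 3-dimensional vectors. `cosine` and `top_k` are hypothetical helpers and the vectors are made up, not real embeddings.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], nodes: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k (node_id, score) pairs most similar to the query, best first."""
    scored = [(node_id, cosine(query, vec)) for node_id, vec in nodes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy vectors standing in for embeddings (made up for illustration).
nodes = {
    "abstract": [0.9, 0.1, 0.0],
    "table-3": [0.6, 0.4, 0.1],
    "budget-p2": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

for node_id, score in top_k(query, nodes, k=2):
    print(node_id, round(score, 3))
```

A larger k gives the LLM more context to synthesize from, at the cost of more tokens per query and a higher chance of including loosely related nodes.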

In [14]:
import os.path
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage

# Check if storage already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # Load the documents and create index
    documents = SimpleDirectoryReader("../datasets").load_data()
    index = VectorStoreIndex.from_documents(documents)
    
    # Store it for later
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

# Either way we can now query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is transformers?")
print(response)

Transformers are a network architecture based solely on attention mechanisms, eliminating the need for recurrence and convolutions. They are used for sequence transduction tasks and have shown superior quality, parallelizability, and reduced training time compared to traditional models.
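
One caveat with this pattern: once `./storage` exists, the index is always loaded from disk, so files added to `../datasets` later are not picked up. A possible workaround is to store a fingerprint of the dataset directory next to the index and rebuild when it changes. The sketch below uses only the standard library. `dataset_fingerprint`, `needs_rebuild`, `record_fingerprint`, and the `fingerprint.txt` file name are hypothetical and not part of LlamaIndex.

```python
import hashlib
from pathlib import Path

def dataset_fingerprint(data_dir: str) -> str:
    """Hash the relative path, size, and modification time of every file under data_dir."""
    digest = hashlib.sha256()
    root = Path(data_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        stat = path.stat()
        digest.update(f"{path.relative_to(root)}|{stat.st_size}|{stat.st_mtime_ns}\n".encode())
    return digest.hexdigest()

def needs_rebuild(data_dir: str, persist_dir: str) -> bool:
    """True if no fingerprint is stored yet or the dataset changed since it was written."""
    marker = Path(persist_dir) / "fingerprint.txt"
    return not marker.exists() or marker.read_text() != dataset_fingerprint(data_dir)

def record_fingerprint(data_dir: str, persist_dir: str) -> None:
    """Save the current fingerprint; intended to be called right after persisting the index."""
    Path(persist_dir).mkdir(parents=True, exist_ok=True)
    (Path(persist_dir) / "fingerprint.txt").write_text(dataset_fingerprint(data_dir))
```

With these helpers, the `os.path.exists(PERSIST_DIR)` check in the cell above could be replaced by `needs_rebuild("../datasets", PERSIST_DIR)`, with `record_fingerprint("../datasets", PERSIST_DIR)` called after `persist`.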


In [15]:
response = query_engine.query("What is transformers?")
pprint_response(response, show_source=True)

Final Response: Transformers are a network architecture based solely
on attention mechanisms, eliminating the need for recurrence and
convolutions. They are used in sequence transduction models and have
shown superior quality in tasks like machine translation. Transformers
are highly parallelizable, require less training time, and have
achieved state-of-the-art results in various natural language
processing tasks.
______________________________________________________________________
Source Node 1/2
Node ID: 0a7f25fa-44ad-42ff-9851-f9fab4bc43f0
Similarity: 0.791933734052169
Text: Table 3: Variations on the Transformer architecture. Unlisted
values are identical to those of the base model. All metrics are on
the English-to-German translation development set, newstest2013.
Listed perplexities are per-wordpiece, according to our byte-pair
encoding, and should not be compared to per-word perplexities. N d
model dff h d k dv ...
______________________________________________________________

In [17]:
response = query_engine.query("How much income tax I need to pay if my annual income is 12 Lack/Year?")
pprint_response(response, show_source=True)

Final Response: You do not need to pay any income tax if your annual
income is 12 lakh per year.
______________________________________________________________________
Source Node 1/2
Node ID: 32829307-9cef-40de-930a-a7e35a829f79
Similarity: 0.82596283290879
Text: 28     159. To tax payers upto ` 12 lakh of normal income (other
than special rate  income such as capital gains ) tax rebate is being
provided in addition to the  benefit due to slab rate reduction in
such a manner that there is no tax payable  by them. The total tax
benefit of slab rate changes and rebate at different  income levels
can be ill...
______________________________________________________________________
Source Node 2/2
Node ID: 8e2d5c6b-5ff7-4ed3-ab6f-dfff4cf18954
Similarity: 0.8109137801112181
Text: 46     Annexure to Part B  Amendments relating to Direct Taxes
(i) Personal Income-tax reforms with special focus on middle class  1.
Substantial relief is proposed under the new tax regime with new slabs
and tax r