## Steps in the RAG System ##

1. Data ingestion
2. Indexing
3. Retriever
4. Response synthesizer
5. Querying

In [9]:
%%capture
!pip install -r requirements.txt

In [10]:
import os
from dotenv import load_dotenv

In [11]:
load_dotenv(".env")

True
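
A `True` result from `load_dotenv` does not by itself confirm that the specific key needed by the OpenAI calls later in the notebook is present. Below is a minimal sketch of an early check. The helper name `require_env` is hypothetical (it is not part of `python-dotenv`), and the variable name `OPENAI_API_KEY` in the usage note is an assumption about what the `.env` file contains.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail early instead of hitting an authentication error mid-pipeline.
        raise RuntimeError(f"Environment variable {name!r} is missing or empty")
    return value
```

In this notebook it could be called as `require_env("OPENAI_API_KEY")` right after `load_dotenv(".env")`, so a missing key surfaces before any embedding or LLM request is made.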

## Step-1: Data Ingestion

In [15]:
from llama_index.core import SimpleDirectoryReader

# Load the PDF; here the reader yields one Document per page (see 'page_label' in the metadata).
documents = SimpleDirectoryReader(input_files=["data/transformers.pdf"]).load_data()
len(documents)

15

In [16]:
documents[0]

Document(id_='47ba02e2-20c4-444c-a6b7-a8672f3d5f9e', embedding=None, metadata={'page_label': '1', 'file_name': 'transformers.pdf', 'file_path': 'data/transformers.pdf', 'file_type': 'application/pdf', 'file_size': 2215244, 'creation_date': '2024-07-27', 'last_modified_date': '2024-03-27'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research

In [17]:
documents[0].metadata

{'page_label': '1',
 'file_name': 'transformers.pdf',
 'file_path': 'data/transformers.pdf',
 'file_type': 'application/pdf',
 'file_size': 2215244,
 'creation_date': '2024-07-27',
 'last_modified_date': '2024-03-27'}

In [19]:
print(documents[0].text)

Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experime

## Embedding Model

In [20]:
from llama_index.embeddings.openai import OpenAIEmbedding
embed_model = OpenAIEmbedding(model="text-embedding-3-large")

## LLM

In [21]:
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4")

## Step-2: Indexing

In [22]:
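from llama_index.core import VectorStoreIndex

# Split the documents into nodes, embed them with embed_model, and store the vectors in an in-memory index.
index = VectorStoreIndex.from_documents(documents=documents, embed_model=embed_model)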

## Step-3: Retrieval

In [23]:
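# Embed the query and fetch the most similar nodes from the index.
retriever = index.as_retriever()
retrieved_nodes = retriever.retrieve("what is self attention?")
# Metadata of the top-ranked node, which shows the page it came from.
retrieved_nodes[0].metadata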

{'page_label': '4',
 'file_name': 'transformers.pdf',
 'file_path': 'data/transformers.pdf',
 'file_type': 'application/pdf',
 'file_size': 2215244,
 'creation_date': '2024-07-27',
 'last_modified_date': '2024-03-27'}
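
The retriever ranks nodes by how similar their embeddings are to the query embedding. The idea can be sketched in plain Python with hand-made vectors. The three-dimensional vectors, the node ids, and the `top_k` helper below are illustrative assumptions, not real embeddings or LlamaIndex internals.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, node_vecs, k=2):
    """Return the k (node_id, score) pairs most similar to the query, best first."""
    scored = [(node_id, cosine(query_vec, vec)) for node_id, vec in node_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" keyed by a made-up node id.
node_vecs = {
    "page-3": [1.0, 0.0, 0.0],
    "page-4": [0.2, 0.9, 0.1],
    "page-7": [0.0, 0.2, 0.9],
}
query_vec = [0.1, 1.0, 0.0]

results = top_k(query_vec, node_vecs, k=2)
for node_id, score in results:
    print(node_id, round(score, 3))  # "page-4" ranks first for this query
```

In the notebook itself, each item in `retrieved_nodes` also exposes a `score` attribute, and the number of returned nodes can be changed with `index.as_retriever(similarity_top_k=...)`.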

## Step-4: Response Synthesizer

In [24]:
from llama_index.core import get_response_synthesizer
response_synthesizer = get_response_synthesizer(llm=llm)
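
Conceptually, a response synthesizer takes the query and the retrieved text chunks and turns them into one or more LLM prompts. The sketch below shows the simplest version of that idea, which places all chunks into a single prompt. The `build_prompt` helper and the template wording are assumptions for illustration and do not reproduce LlamaIndex's actual prompt templates.

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble one prompt from the query and the non-empty retrieved chunks."""
    context = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Answer the query using only the context above.\n"
        f"Query: {query}\n"
        "Answer:"
    )


# Placeholder chunk texts; in the notebook these would come from the retrieved nodes.
prompt = build_prompt(
    "What is self attention?",
    ["chunk from page 4", "   ", "chunk from page 5"],
)
print(prompt)
```

When the retrieved text does not fit into one context window, a synthesizer has to split the work across several LLM calls, which is why the notebook delegates this step to `get_response_synthesizer` rather than formatting a prompt by hand.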

## Step-5: Query Engine

In [25]:
# Combine retrieval and response synthesis behind a single query() call.
query_engine = index.as_query_engine(llm=llm, response_synthesizer=response_synthesizer)
response = query_engine.query("What is self attention?")

In [26]:
response.response

'Self-attention is a mechanism used in sequence transduction tasks that connects all positions within a sequence with a constant number of sequentially executed operations. It is faster than recurrent layers when the sequence length is large. This mechanism is beneficial for learning long-range dependencies in a sequence, as it allows for shorter paths between any combination of positions in the input and output sequences.'