# Query Engine and Chat Engine

### Setup

In [1]:
import os

In [2]:
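from dotenv import load_dotenv

# Load environment variables (including OPENAI_API_KEY) from a local .env file;
# adjust the path to wherever your .env file lives.
load_dotenv('D:/.env')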

In [None]:
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']
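
Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier check follows; `require_env` is a hypothetical helper introduced here for illustration, not part of the notebook or of python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error.

    Hypothetical helper: it treats an empty or whitespace-only value as missing.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell."
        )
    return value
```

With this helper, the cell above could read `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`, which fails early with a readable message if the `.env` path is wrong.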

### Download Data

In [2]:
!mkdir -p data
!wget "https://arxiv.org/pdf/1706.03762" -O 'data/transformers.pdf'

--2024-06-11 12:31:22--  https://arxiv.org/pdf/1706.03762
Resolving arxiv.org (arxiv.org)... 151.101.3.42, 151.101.67.42, 151.101.131.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.3.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2215244 (2.1M) [application/pdf]
Saving to: ‘data/transformers.pdf’


2024-06-11 12:31:23 (4.07 MB/s) - ‘data/transformers.pdf’ saved [2215244/2215244]



In [5]:
from pathlib import Path
from llama_index.readers.file import PDFReader

In [6]:
loader = PDFReader()

In [7]:
# Load the PDF; PDFReader returns one Document per page, tagged with a page_label
documents = loader.load_data(file=Path('./data/transformers.pdf'))

In [8]:
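from llama_index.core import VectorStoreIndex

# Build an in-memory vector index: documents are chunked and embedded
# with the configured embedding model (OpenAI by default)
index = VectorStoreIndex.from_documents(documents)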

In [9]:
# Configure retriever (shown for reference; the query engine below builds its own retriever from the index)
retriever = index.as_retriever()
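
To make the retriever's role concrete, here is a toy sketch of top-k similarity search, the basic idea behind a vector retriever. It is an illustration under stated assumptions, not LlamaIndex's implementation: the three-dimensional vectors and the chunk ids are made up, whereas real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Return the k (chunk_id, score) pairs most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]


# Hypothetical embeddings for three chunks (values invented for illustration).
toy_chunks = {
    "page-1": [1.0, 0.0, 0.0],
    "page-6": [0.0, 1.0, 0.2],
    "page-7": [0.1, 0.9, 0.0],
}

hits = top_k([0.0, 1.0, 0.0], toy_chunks, k=2)
for chunk_id, score in hits:
    print(chunk_id, round(score, 3))
```

In the notebook itself, the equivalent step would be calling `retriever.retrieve(...)` with a question string, which returns scored nodes like the ones inspected through `response.source_nodes` later on.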

In [10]:
# Configure response synthesizer: "compact" packs retrieved chunks into as few LLM calls as possible
from llama_index.core import get_response_synthesizer

response_synthesizer = get_response_synthesizer(response_mode="compact")
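
The idea behind the `"compact"` response mode can be sketched in plain Python: retrieved chunks are concatenated so that each LLM call carries as much context as fits. The sketch below is a simplified illustration, not the library's code. It budgets by characters (the library budgets by tokens), and `compact_chunks` and `max_chars` are hypothetical names.

```python
def compact_chunks(chunks, max_chars):
    """Greedily merge consecutive chunks so each batch stays within max_chars.

    A single chunk longer than max_chars is kept as its own batch here;
    a real implementation would split it further.
    """
    batches = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + "\n\n" + chunk
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The batch is full: close it and start a new one with this chunk.
            batches.append(current)
            current = chunk
    if current:
        batches.append(current)
    return batches


batches = compact_chunks(["aaaa", "bbbb", "cccc"], max_chars=10)
print(len(batches))  # 2: the first two chunks share a batch, the third gets its own
```

Fewer batches mean fewer LLM calls per query, which is the main reason to prefer compaction over sending each chunk separately.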

### Query Engine

In [11]:
query_engine = index.as_query_engine(response_synthesizer=response_synthesizer)

In [12]:
response = query_engine.query("Give me the authors of transformers paper")
print(response)

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin.


In [13]:
response.source_nodes

[NodeWithScore(node=TextNode(id_='0282e113-9099-4381-af6c-1adec861ac96', embedding=None, metadata={'page_label': '1', 'file_name': 'transformers.pdf'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='6857e802-c0a5-46ab-9bc0-c1b5b152523a', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '1', 'file_name': 'transformers.pdf'}, hash='8ee1a6b74c8119da4f2f22ebad7855b1ccacdbd9dfe663d653cdae769c33a0bb')}, text='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@

In [14]:
response.source_nodes[0].dict()

{'node': {'id_': '0282e113-9099-4381-af6c-1adec861ac96',
  'embedding': None,
  'metadata': {'page_label': '1', 'file_name': 'transformers.pdf'},
  'excluded_embed_metadata_keys': [],
  'excluded_llm_metadata_keys': [],
  'relationships': {<NodeRelationship.SOURCE: '1'>: {'node_id': '6857e802-c0a5-46ab-9bc0-c1b5b152523a',
    'node_type': <ObjectType.DOCUMENT: '4'>,
    'metadata': {'page_label': '1', 'file_name': 'transformers.pdf'},
    'hash': '8ee1a6b74c8119da4f2f22ebad7855b1ccacdbd9dfe663d653cdae769c33a0bb',
    'class_name': 'RelatedNodeInfo'}},
  'text': 'Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllio

In [15]:
response.source_nodes[0].text

'Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurrence and con

In [16]:
print(response.source_nodes[0].get_content())

Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experime

In [17]:
response = query_engine.query("What is the use of positional encoding?")
print(response)

Positional encoding is used in the model to provide information about the relative or absolute position of tokens in a sequence. This is necessary because the model does not have recurrence or convolution, so positional encodings help the model understand the order of tokens in the sequence.


In [18]:
response = query_engine.query("What is the use of positional encoding? Answer in approx 250 characters.")
print(response)

Positional encoding is used in models without recurrence or convolution to provide information about the position of tokens in a sequence. It helps the model understand the order of tokens by injecting relative or absolute positional information into input embeddings.


In [19]:
print(response.get_formatted_sources())

> Source (Doc id: a7cf020e-a842-49f3-8336-fee1188febef): Table 1: Maximum path lengths, per-layer complexity and minimum number of sequential operations
f...

> Source (Doc id: 989dc1a9-e249-42aa-94b7-05e833496152): length nis smaller than the representation dimensionality d, which is most often the case with
se...


In [20]:
response.metadata

{'a7cf020e-a842-49f3-8336-fee1188febef': {'page_label': '6',
  'file_name': 'transformers.pdf'},
 '989dc1a9-e249-42aa-94b7-05e833496152': {'page_label': '7',
  'file_name': 'transformers.pdf'}}
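
Since `response.metadata` is a plain dictionary keyed by node id, the cited pages can be summarized with a few lines of Python. The helper name `cited_pages` is hypothetical; the sample dictionary is copied from the output above.

```python
def cited_pages(metadata):
    """Group the page labels found in a response's metadata by file name."""
    pages = {}
    for info in metadata.values():
        file_name = info.get("file_name", "unknown")
        pages.setdefault(file_name, set()).add(info.get("page_label", "?"))
    # Page labels are strings, so they are sorted as strings.
    return {name: sorted(labels) for name, labels in pages.items()}


sample_metadata = {
    "a7cf020e-a842-49f3-8336-fee1188febef": {"page_label": "6", "file_name": "transformers.pdf"},
    "989dc1a9-e249-42aa-94b7-05e833496152": {"page_label": "7", "file_name": "transformers.pdf"},
}

print(cited_pages(sample_metadata))  # {'transformers.pdf': ['6', '7']}
```

In the notebook, `cited_pages(response.metadata)` would give the same summary for any query.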

In [21]:
len(response.response)

268
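
The answer came back at 268 characters rather than exactly 250, which suggests the model treats a length instruction as approximate. A small check like the following makes "approximately" explicit; the 10% tolerance is an arbitrary assumption chosen for illustration, and `within_tolerance` is a hypothetical helper.

```python
def within_tolerance(text: str, target: int = 250, tolerance: float = 0.10) -> bool:
    """Return True if len(text) is within `tolerance` (as a fraction) of `target`."""
    return abs(len(text) - target) <= target * tolerance


# A 268-character answer deviates by 18 characters, inside the 25-character allowance.
print(within_tolerance("x" * 268))  # True
```

In the notebook this would be called as `within_tolerance(response.response)`.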

### Chat Engine

In [22]:
chat_engine = index.as_chat_engine(response_synthesizer=response_synthesizer)

In [24]:
response = chat_engine.chat("Give me the authors of transformers")
print(response)

The authors of Transformers are Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin.


In [25]:
response = chat_engine.chat("What is the use of positional encoding?")
print(response)

Positional encoding in Transformers is crucial for the model to understand the sequential order of the input tokens. It allows the Transformer to differentiate between tokens based on their position in the sequence, as the model does not inherently understand the order of tokens like a recurrent neural network would. This positional encoding is added to the input embeddings at the beginning of the model, providing information about the position of tokens in the sequence.
