# Query Engine

# Install the llama-index library

In [6]:
!pip install llama_index



# Set the OpenAI API key

In [7]:
import os
os.environ["OPENAI_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
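Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key is either already exported in the shell or typed in at a prompt. `ensure_api_key` is a hypothetical helper name, not part of llama_index or the OpenAI client:

```python
import os
from getpass import getpass


def ensure_api_key(name="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key stored in os.environ[name], prompting once if it is missing.

    `ensure_api_key` is a hypothetical helper for illustration only.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        # Ask interactively so the secret never appears in the notebook source.
        key = prompt_fn(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = key
    return key
```

Calling `ensure_api_key()` once before building the index would take the place of the literal assignment.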

# Create a data directory, then load documents

In [27]:
!mkdir data
# optionally download the required documents into the data folder here, e.g. with wget

mkdir: cannot create directory ‘data’: File exists
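The message above appears because `mkdir` was run again after the folder already existed. A small sketch of an idempotent alternative using only the standard library (the `ensure_dir` name is illustrative); `!mkdir -p data` in the shell should behave similarly:

```python
import tempfile
from pathlib import Path


def ensure_dir(path):
    """Create `path` (and any missing parents) if needed; do nothing if it exists."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Demonstrate inside a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_dir(Path(tmp) / "data")
    second = ensure_dir(Path(tmp) / "data")  # second call is a no-op, no error
    created_ok = first.is_dir() and first == second
```

In this notebook, `ensure_dir("data")` could be re-run safely every time the cell executes.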


In [9]:
from pathlib import Path
from llama_index import download_loader

In [28]:
# PDF reader for loading the document
PDFReader = download_loader("PDFReader")
loader = PDFReader()
# documents = loader.load_data(file=Path("./data/11230024851101_POLICY_DOC.pdf"))
documents = loader.load_data(file=Path("./data/TextEmbeddingswithLLM.pdf"))

In [29]:
len(documents)

15

# Indexing

In [20]:
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)

# Configuring Retriever

In [21]:
from llama_index.retrievers import VectorIndexRetriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
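`similarity_top_k=10` asks the retriever for the ten chunks whose embeddings are most similar to the query embedding. The toy sketch below illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. It is not llama_index's implementation; `top_k_by_cosine` and the sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query_vec, chunk_vecs, k):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), cid) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:k]]


# Hypothetical embeddings for three chunks of a document.
chunks = {
    "intro": [1.0, 0.0, 0.0],
    "methods": [0.7, 0.7, 0.0],
    "references": [0.0, 0.0, 1.0],
}
print(top_k_by_cosine([1.0, 0.1, 0.0], chunks, k=2))  # ['intro', 'methods']
```

A larger `k` gives the synthesizer more context to work with, at the cost of longer prompts.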

# Configuring Response Synthesizer

In [22]:
from llama_index.response_synthesizers import get_response_synthesizer
response_synthesizer = get_response_synthesizer(
    response_mode="compact"
)
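Roughly speaking, the `compact` mode packs as many retrieved chunks as fit into each prompt, so fewer LLM calls are needed than when sending one chunk at a time. A rough sketch of that packing idea, assuming a simple character budget in place of a real token count. `pack_chunks` and `max_chars` are hypothetical names, and this is not the library's code:

```python
def pack_chunks(chunks, max_chars, separator="\n\n"):
    """Greedily merge consecutive chunks into prompts of at most max_chars characters.

    A chunk longer than max_chars is kept as its own prompt rather than split.
    """
    prompts = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + separator + chunk
        if current and len(candidate) > max_chars:
            # Adding this chunk would overflow the budget: close the current prompt.
            prompts.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        prompts.append(current)
    return prompts


retrieved = ["aaaa", "bbbb", "cccc", "dd"]
packed = pack_chunks(retrieved, max_chars=10, separator=" ")
print(len(retrieved), "chunks ->", len(packed), "prompts")  # 4 chunks -> 2 prompts
```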

# Query Engine

In [23]:
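# Note: in the llama_index release used here, as_query_engine() appears to build
# its own retriever, so the `retriever` passed below may be replaced by a default
# one (the source-node count printed later in this notebook is 2, not 10).
query_engine = index.as_query_engine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)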

In [31]:
response = query_engine.query(
    "Give me who are the authors for this document"
)
print(response)

The authors for this document are Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, and others.


In [33]:
response.source_nodes[0].text

'Association for Computational Linguistics. doi: 10.18653/v1/N18-1074. URL https:\n//aclanthology.org/N18-1074 .\n[42] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei,\nNikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open\nfoundation and fine-tuned chat models. ArXiv preprint , abs/2307.09288, 2023. URL https:\n//arxiv.org/abs/2307.09288 .\n[43] Kexin Wang, Nandan Thakur, Nils Reimers, and Iryna Gurevych. GPL: Generative pseudo\nlabeling for unsupervised domain adaptation of dense retrieval. In Proceedings of the 2022\nConference of the North American Chapter of the Association for Computational Linguistics:\nHuman Language Technologies , pages 2345–2360, Seattle, United States, 2022. Association\nfor Computational Linguistics. doi: 10.18653/v1/2022.naacl-main.168. URL https://\naclanthology.org/2022.naacl-main.168 .\n[44] Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan\nMaju

In [25]:
response = query_engine.query(
    "Give me title of document"
)
print(response)

TextEmbeddingswithLLM


In [26]:
response = query_engine.query(
    "Give me what document main objective"
)
print(response)

The main objective of the document is to present a novel approach to train state-of-the-art text embeddings by exploiting the latest advances of Language Model Models (LLMs) and synthetic data.


In [30]:
response = query_engine.query(
    "Give me what are the llm models author speaking in this document"
)
print(response)

The llm models mentioned in this document are LLaMA-2, Mistral, RepLLaMA, SGPT, GTR, and Udever.


In [34]:
response = query_engine.query(
    "Give me what is the llama-2"
)
print(response)

Llama-2 is an open foundation and fine-tuned chat model. It is a notable effort in bridging the gap between proprietary and open-source Language Model (LLM) systems. Llama-2 aims to enhance the capabilities of LLMs by incorporating external information and improving text embeddings.


In [36]:
len(response.source_nodes)

2

In [37]:
response = query_engine.query(
    "summarize document using 1000 words "
)
print(response)

The document provides instructions for various tasks and datasets used for evaluation. It also mentions the performance of different models on these tasks. However, there is no specific information about summarizing a document using 1000 words.


This is the limitation of a query engine: it does not keep track of earlier turns in the conversation, so every query is answered on its own.

## **To address this, the chat engine comes into the picture**

# Chat Engine

In [38]:
chat_engine = index.as_chat_engine(
    retriever=retriever,  # fixed keyword typo (was `retrievr`)
    response_synthesizer=response_synthesizer,
)
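A chat engine differs from a query engine in that it keeps the earlier turns and passes them along with each new message. A toy sketch of that bookkeeping, with a stand-in `answer_fn` instead of a real LLM. The class and function names are hypothetical and do not mirror llama_index internals:

```python
class ToyChatEngine:
    """Keeps a running message history and hands it to answer_fn on every turn."""

    def __init__(self, answer_fn):
        self.answer_fn = answer_fn
        self.history = []  # list of (role, text) tuples

    def chat(self, message):
        self.history.append(("user", message))
        reply = self.answer_fn(list(self.history))
        self.history.append(("assistant", reply))
        return reply

    def reset(self):
        """Forget the conversation so far."""
        self.history.clear()


def count_user_turns(history):
    """Stand-in for an LLM: reports how many user messages it can see."""
    turns = sum(1 for role, _ in history if role == "user")
    return f"I can see {turns} user message(s)."


engine = ToyChatEngine(count_user_turns)
print(engine.chat("Which models are mentioned?"))  # I can see 1 user message(s).
print(engine.chat("Are there any others?"))        # I can see 2 user message(s).
```

Because the second call sees both user messages, a follow-up question can refer back to the first one, which is what the chat cells below rely on.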

In [40]:
response = chat_engine.chat(
    "what are the llm models author spoke in this document"
)
print(response)

The author mentioned two LLM models in the document: LLaMA-2 and Mistral.


In [42]:
response = chat_engine.chat(
    "what are the llm models author spoke in this document, i want all llm models, u gave response is Based on the document, the author mentioned two LLM models: LLaMA-2 and Mistral. but what are all these  RepLLaMA, SGPT, GTR, and Udever. "
)
print(response)

In addition to LLaMA-2 and Mistral, the author also mentioned other LLM models in the document. These include:

1. GPT-4: GPT-4 is a highly advanced LLM model, but it is proprietary and has limited technical details disclosed.

2. GPL: GPL is another LLM model, but specific details about it are not provided in the document.

3. ChatGPT: ChatGPT is an LLM model that excels in conversational tasks and natural language understanding.

4. RepLLaMA: RepLLaMA is a model that enhances text embeddings based on LLMs.

5. SGPT: SGPT is another model that builds upon LLMs to improve text embeddings.

6. GTR: GTR is an LLM model that focuses on generating text responses.

7. Udever: Udever is an LLM model, but no specific details about it are mentioned in the document.

These are the LLM models mentioned in the document.


In [45]:
response = chat_engine.chat(
    "summarize document using 1000 words "
)
print(response)

The document discusses experiments conducted using synthetic data and the fine-tuning and evaluation of a model. The model achieved the highest average score on the MTEB benchmark, surpassing the previous state-of-the-art model by 2.4 points. The experiments involved using generated synthetic data and a collection of public datasets for training the model. The evaluation of the model on the MTEB benchmark took approximately 3 days using 8V100 GPUs. Notably, the model's performance remained competitive even when trained solely on synthetic data without any labeled data.
