# RAG Retrieval Optimization - Query Transformation (HyDE) using Amazon Bedrock and LlamaIndex

HyDE (Hypothetical Document Embeddings) is a query transformation technique that improves the relevance of retrieved documents. Instead of using the original query directly for retrieval, HyDE asks a language model to generate a hypothetical document that captures the intent of the query. This hypothetical document is then converted into an embedding, which is used to search the knowledge base for similar real documents. The underlying idea is that the hypothetical document may sit closer in embedding space to relevant passages than the original query does, which can lead to more accurate and contextually appropriate retrievals. HyDE has shown promising results in improving RAG performance, especially in zero-shot settings, and it has been reported to be effective across various languages and tasks.
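The idea can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: the bag-of-words `embed` function stands in for a real embedding model, the hard-coded `hypothetical` string stands in for text an LLM would generate, and the two-document corpus is invented. It only shows why a hypothetical answer can land closer to the relevant passage than the raw question does.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(search_text: str, corpus: list[str]) -> str:
    """Return the corpus document most similar to the search text."""
    vector = embed(search_text)
    return max(corpus, key=lambda doc: cosine(vector, embed(doc)))


# Invented two-document corpus: one relevant passage, one distractor.
corpus = [
    "pandemic related supply chain disruptions increased fulfillment costs and reduced operating income",
    "how to contact customer service about your order",
]

query = "how may covid impact the business"
# In real HyDE this text is produced by an LLM; it is hard-coded here.
hypothetical = "the pandemic caused supply chain disruptions that increased fulfillment costs"

naive_hit = retrieve(query, corpus)         # search with the raw question
hyde_hit = retrieve(hypothetical, corpus)   # search with the hypothetical answer
print("naive:", naive_hit)
print("hyde: ", hyde_hit)
```

In this toy setup the raw question shares only the word "how" with the distractor, so naive search picks it, while the hypothetical answer shares vocabulary with the relevant passage.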


This example is built on the reference LlamaIndex [documentation](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/query_transformations/HyDEQueryTransformDemo.ipynb).

Here are the components we used:

- Vector Database (LlamaIndex default in-memory vector store / local)
- LLM (Amazon Bedrock - Amazon Nova Pro)
- Embedding Model (Amazon Bedrock - Titan Text Embeddings v2.0)
- Dataset (Amazon's SEC 10-K statements)

## Pre-req
You must run the [`workshop_setup.ipynb`](../lab00-setup/workshop_setup.ipynb) notebook in `lab00-setup` before starting this lab.

In [None]:
import warnings
warnings.warn("Warning: if you did not run lab00-setup, please go back and run the lab00 notebook") 

### > Setup

We start by importing the necessary LlamaIndex libraries.

In [None]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.core import Settings
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
from IPython.display import Markdown, display, HTML
from termcolor import colored

In [None]:
from llama_index.llms.bedrock_converse import BedrockConverse

# Set your AWS profile name
profile_name = "default"

# Simple completion call
resp = BedrockConverse(
    model="us.amazon.nova-pro-v1:0",
    profile_name=profile_name,
).complete("Paul Graham is ")
print(resp)

We select Amazon Nova Pro as our LLM and Amazon Titan Text Embeddings v2.0 as the embedding model. The chunk size is set to 512 for this example.

In [None]:
import json
from typing import Sequence, List
from llama_index.core.settings import Settings
from llama_index.llms.bedrock_converse import BedrockConverse
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

profile_name = "default"

# define the LLM
llm = BedrockConverse(
    model="us.amazon.nova-pro-v1:0",
    profile_name=profile_name,
)

# define the embedding model
embed_model = BedrockEmbedding(model = "amazon.titan-embed-text-v2:0")

Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
import nest_asyncio
nest_asyncio.apply()
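`Settings.chunk_size` caps how large each document chunk may be before it is embedded. As a rough sketch of that idea only, the hypothetical helper `chunk_words` below splits text into fixed-size groups of whitespace-separated words. This is a simplification under stated assumptions: LlamaIndex measures chunk size in tokens and also takes sentence boundaries and overlap into account, none of which is modeled here.

```python
def chunk_words(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most `chunk_size` words (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


sample = "one two three four five six seven"
chunks = chunk_words(sample, 3)
print(chunks)
```

Smaller chunks give more focused retrieval hits but less surrounding context per hit; larger chunks do the opposite.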


### > Document Ingestion

We ingest the documents and use the [Titan Text Embeddings v2.0 model](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html) to create an embedding for each document chunk. The `amazon` folder contains SEC 10-K filings from 2022 and 2023.

In [None]:
# load data
amazon_secfiles = SimpleDirectoryReader(input_dir="../data/lab03/amazon/").load_data()

vector_index = VectorStoreIndex.from_documents(
    amazon_secfiles,
    use_async=True,
)

Define the query engine from the index.

In [None]:
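query_engine = vector_index.as_query_engine(
    # retrieve only the single most similar chunk;
    # the retriever's parameter is `similarity_top_k` (a bare `top_k` is not applied)
    similarity_top_k=1
)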
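The query engine answers from the highest-scoring chunks returned by the retriever. A minimal sketch of that selection step, assuming similarity scores have already been computed: the chunk names and scores are invented, and `top_k` is a hypothetical helper, not a LlamaIndex function.

```python
import heapq


def top_k(scores: dict[str, float], k: int) -> list[tuple[str, float]]:
    """Return the k (chunk_id, score) pairs with the highest similarity scores."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Invented similarity scores between one query and three chunks.
scores = {"chunk-a": 0.12, "chunk-b": 0.87, "chunk-c": 0.55}

best_one = top_k(scores, 1)
best_two = top_k(scores, 2)
print(best_one)
print(best_two)
```

With a single retrieved chunk, the answer depends entirely on which chunk ranks first, which makes the difference between naive and HyDE retrieval easy to see.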

### > Test Using Naive RAG

In [None]:
query = "How may Covid impact Amazon's business model and financial outlook over the next decade?"

In [None]:
raw_response = query_engine.query(query)
print(colored(raw_response, "green"))

### > Test Using HyDEQueryTransform to Generate a Hypothetical Document

`HyDEQueryTransform` is a LlamaIndex module that uses a language model to generate a hypothetical document based on the query. That document is then embedded and used for similarity search to retrieve relevant documents, which can improve retrieval accuracy in RAG systems.

In [None]:
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, hyde)
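With `include_original=True`, the original query is kept alongside the hypothetical document as text to embed. To our understanding, LlamaIndex then combines the resulting embeddings by averaging them; treat that as an assumption and check the library source for your installed version. The sketch below shows only the averaging step, with invented three-dimensional vectors and a hypothetical helper `mean_embedding`.

```python
def mean_embedding(vectors: list[list[float]]) -> list[float]:
    """Element-wise mean of equally sized vectors (hypothetical helper)."""
    if not vectors:
        raise ValueError("at least one vector is required")
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


# Invented embeddings; real ones come from the embedding model.
hypothetical_doc_vec = [1.0, 0.0, 0.0]
original_query_vec = [0.0, 0.5, 0.5]

combined = mean_embedding([hypothetical_doc_vec, original_query_vec])
print(combined)
```

Keeping the original query in the mix can limit the damage when the generated document drifts away from what was actually asked.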

In [None]:
hyde_response = hyde_query_engine.query(query)
print(colored(hyde_response, "green"))

Here is the hypothetical answer generated by HyDE:

In [None]:
query_bundle = hyde(query)
hyde_doc = query_bundle.embedding_strs[0]
print(colored(hyde_doc, "green"))

### > Display the results side-by-side 

Notice that the hypothetical response generated by HyDE helps retrieve context that is more focused on Amazon's business model and financial outlook, rather than other information that is less relevant to the question.

In [None]:
import pandas as pd

# Build a comparison table with one column per approach
df = pd.DataFrame({
    'Naive RAG': [query, raw_response],
    'RAG w/ HyDE': [query, hyde_response]
})

output=""
output += df.style.hide().set_table_attributes("style='display:inline'")._repr_html_()
output += "&nbsp;"

display(HTML(output))