# Query Reformulation Supported by Knowledge Bases on Amazon Bedrock

Quality, cost, and latency are among the most important factors to optimize when developing RAG-based GenAI applications. Input queries to a Foundation Model (FM) are often complex, containing several questions and intricate relationships between them. With such queries, the embedding step may mask or dilute important components, so the retrieved chunks may not provide context for every aspect of the query. This can produce a less-than-desirable response from your RAG application.

With query reformulation, we can take a complex input prompt and break it down into multiple sub-queries. Each sub-query then goes through its own retrieval step to find relevant chunks. The resulting chunks are pooled and ranked together before being passed to the FM to generate a response. Query reformulation is another tool that can help increase accuracy for the complex queries your application may face in production.
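The decompose, retrieve, pool, and rank flow described above can be illustrated with a toy sketch. This is not how Knowledge Bases for Amazon Bedrock implements it internally (the service performs decomposition and ranking for you). The in-memory `CORPUS`, the keyword-overlap `toy_retrieve` function, and the hard-coded sub-queries are all hypothetical stand-ins chosen only to make the idea concrete.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory corpus standing in for a vector store.
CORPUS = [
    "Octank Tower is the company headquarters.",
    "The whistleblower scandal hurt the company reputation.",
    "Foreign exchange risk affects investments.",
]


def toy_retrieve(query: str) -> List[Dict]:
    """Hypothetical retriever: scores chunks by keyword overlap with the query."""
    q_words = set(query.lower().replace("?", "").split())
    results = []
    for text in CORPUS:
        t_words = set(text.lower().replace(".", "").split())
        overlap = len(q_words & t_words)
        if overlap:
            results.append({"text": text, "score": overlap / len(q_words)})
    return results


def pool_and_rank(
    sub_queries: List[str],
    retrieve: Callable[[str], List[Dict]],
    top_k: int = 5,
) -> List[Dict]:
    """Retrieve per sub-query, pool the chunks, keep the best score per chunk, rank."""
    best: Dict[str, Dict] = {}
    for sub_query in sub_queries:
        for chunk in retrieve(sub_query):
            text = chunk["text"]
            if text not in best or chunk["score"] > best[text]["score"]:
                best[text] = chunk
    ranked = sorted(best.values(), key=lambda c: c["score"], reverse=True)
    return ranked[:top_k]


# Sub-queries are written by hand here; in the real feature they are generated for you.
pooled = pool_and_rank(
    ["What is octank tower?", "How did the whistleblower scandal hurt the company?"],
    toy_retrieve,
)
for chunk in pooled:
    print(round(chunk["score"], 2), chunk["text"])
```

Because each sub-query is scored on its own, a chunk that matches only one part of the original question can still rank highly in the pooled result.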


# Notebook setup
Follow the steps below with a compatible role and compute environment to get started.

In [1]:
%pip install --force-reinstall -q -r utils/requirements.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sagemaker 2.215.0 requires attrs<24,>=23.1.0, but you have attrs 24.2.0 which is incompatible.
sagemaker-datawrangler 0.4.3 requires ipywidgets<8.0.0, but you have ipywidgets 8.1.5 which is incompatible.
sagemaker-datawrangler 0.4.3 requires sagemaker-data-insights==0.4.0, but you have sagemaker-data-insights 0.3.3 which is incompatible.
sparkmagic 0.20.4 requires nest-asyncio==1.5.5, but you have nest-asyncio 1.6.0 which is incompatible.
sphinx 7.2.6 requires docutils<0.21,>=0.18.1, but you have docutils 0.16 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


In [2]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [3]:
%store -r kb_id

In [4]:
import boto3
import botocore
import os
import json
import logging

# confirm we are at boto3 version 1.34.143 or above
print(boto3.__version__)

1.35.16
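Rather than checking the printed version by eye, the comparison can be done in code. The following is a minimal sketch that assumes a plain numeric `major.minor.patch` version string; the helper names `version_tuple` and `meets_minimum` are hypothetical and not part of boto3.

```python
def version_tuple(version: str) -> tuple:
    """Convert 'major.minor.patch' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split(".")[:3])


def meets_minimum(installed: str, minimum: str = "1.34.143") -> bool:
    """Return True if the installed version is at or above the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


print(meets_minimum("1.35.16"))
# In the notebook you would pass the real value: meets_minimum(boto3.__version__)
```

Comparing integer tuples avoids the pitfall of comparing version strings lexicographically, where "1.9.0" would sort after "1.34.143".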


In [5]:
#Clients
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
session = boto3.session.Session()
region =  session.region_name
account_id = sts_client.get_caller_identity()["Account"]
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime') 
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)
region, account_id

('us-west-2', '511021163672')


## Pre-requisites

In this notebook, we will use an existing knowledge base built from the Octank Financial 10K document available [here](../synthetic_dataset) as the text corpus for Q&A.

Before exploring this notebook further, make sure that you have created a knowledge base in Knowledge Bases for Amazon Bedrock and ingested your documents into it.

For more details on how to create the knowledge base and ingest your documents, please refer to this [notebook](../01-rag-concepts/01_create_ingest_documents_test_kb_multi_ds.ipynb).

Note the Knowledge Base ID.



In [6]:
# kb_id = "<<knowledge_base_id>>" # Replace with your knowledge base id here.

# Define FM to be used for generations 
foundation_model ='anthropic.claude-3-sonnet-20240229-v1:0'  # we will be using Anthropic Claude 3 Sonnet throughout the notebook

# Query Reformulation in Action

In this notebook, we will investigate a complex query that could benefit from query reformulation and see how reformulation affects the generated response.

##  Complex prompt

To demonstrate the functionality, let's look at a query that asks several things about information contained in the Octank 10K financial document. The asks in this query are not semantically related. When the query is embedded during the retrieval step, some aspects of it may become diluted, so the chunks returned may not address all of its components.

To query our knowledge base and generate a response, we will use the __retrieve_and_generate__ API call. To use the query reformulation feature, we add the following to our knowledge base configuration:

```
'orchestrationConfiguration': {
    'queryTransformationConfiguration': {
        'type': 'QUERY_DECOMPOSITION'
    }
}
```

__Note:__ The output response structure is the same as a normal __retrieve_and_generate__ without query reformulation.
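Since the two calls in this notebook differ only in the `orchestrationConfiguration` key, the configuration can be built by a small helper. This is a sketch rather than part of the original notebook: `build_rag_config` is a hypothetical function name, and the knowledge base ID and model ARN passed in the example are placeholders.

```python
def build_rag_config(kb_id, model_arn, number_of_results=5, decompose=False):
    """Build the retrieveAndGenerateConfiguration dict, optionally with query decomposition."""
    kb_config = {
        "knowledgeBaseId": kb_id,
        "modelArn": model_arn,
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    if decompose:
        # The only difference between the two calls in this notebook.
        kb_config["orchestrationConfiguration"] = {
            "queryTransformationConfiguration": {"type": "QUERY_DECOMPOSITION"}
        }
    return {"type": "KNOWLEDGE_BASE", "knowledgeBaseConfiguration": kb_config}


# Placeholder values, for illustration only.
config = build_rag_config("EXAMPLEKBID", "arn:aws:bedrock:us-west-2::foundation-model/example-model", decompose=True)
print(config["knowledgeBaseConfiguration"]["orchestrationConfiguration"])

# In the notebook, the result would be passed as:
# bedrock_agent_runtime_client.retrieve_and_generate(
#     input={"text": query}, retrieveAndGenerateConfiguration=config)
```

Keeping the toggle in one place makes it easier to run the same query with and without decomposition and compare the results.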

#### Without Query Reformulation

Let's see what the generated result looks like for the following query without query reformulation:

"What is octank tower and how does the whistleblower scandal hurt the company and its image?"

In [7]:
query = "What is octank tower and how does the whistleblower scandal hurt the company and its image?"

In [8]:
response_ret = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": "arn:aws:bedrock:{}::foundation-model/{}".format(region, foundation_model),
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            }
        }
    }
)


# generated text output

print(response_ret['output']['text'],end='\n'*2)

Octank Tower is the iconic headquarters of Octank Financial, designed by a renowned architect. It is an 800,000 square foot office building housing over 3,500 employees across various departments. The building is equipped with cutting-edge technology and has received LEED Platinum certification for its green initiatives and eco-friendly amenities. The whistleblower scandal involving the former Chief Financial Officer Person X, who was accused of insider trading and other illegal activities, has significantly hurt Octank Financial's reputation and image. The company's stock price plummeted after the scandal came to light, causing substantial losses for shareholders. The scandal has also led to a decline in employee morale and created a sense of uncertainty about the company's future. Octank has taken steps to address the situation, such as launching an internal investigation, implementing new policies to prevent illegal activities, and establishing a whistleblower hotline. However, the 

In [9]:
response_without_qr = response_ret['citations'][0]['retrievedReferences']
print("# of citations or chunks used to generate the response: ", len(response_without_qr))
def citations_rag_print(retrieved_references):
    # 'retrievedReferences' is a list of chunks; each chunk has content, location, and metadata
    for num, chunk in enumerate(retrieved_references, 1):
        print(f'Chunk {num}: ', chunk['content']['text'], end='\n'*2)
        print(f'Chunk {num} Location: ', chunk['location'], end='\n'*2)
        print(f'Chunk {num} Metadata: ', chunk['metadata'], end='\n'*2)

citations_rag_print(response_without_qr)

# of citations or chunks used to generate the response:  1
Chunk 1:  This iconic structure, designed by renowned architect PersonA, is a symbol of our commitment to innovation and sustainability. Octank Tower boasts 800,000 square feet of office space, housing more than 3,500 employees across various departments, including research and development, marketing, finance, and human resources.   The building is equipped with cutting-edge technology, including a sophisticated building management system that optimizes energy efficiency and indoor air quality. Octank Tower has received several awards for its green initiatives, including LEED Platinum certification, and features a range of eco-friendly amenities, such as solar panels, rainwater harvesting systems, and electric vehicle charging stations.   **Regional Offices: A Network of Strategic Locations**   Octank Financial has a strong presence in major cities across the globe, with 15 regional offices in North America, Europe, Asia, and O

As the citation above shows, retrieval for the complex query did not return context for every part of the query: the single chunk printed describes Octank Tower, and nothing in it addresses the whistleblower incident. Note that this cell prints only the first citation (`response_ret['citations'][0]`).

This may indicate that embedding the full query diluted the semantics of some of its parts.
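The cell above inspects only `citations[0]`, but the response carries a list of citations, each with its own `retrievedReferences`. To review every chunk the response cites, the references can be flattened across all citations. The sketch below uses a hand-written mock response that mirrors only the fields used in this notebook; `all_retrieved_references` is a hypothetical helper name.

```python
def all_retrieved_references(response):
    """Flatten retrievedReferences across every citation, dropping duplicate chunk texts."""
    seen = set()
    references = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            text = ref["content"]["text"]
            if text not in seen:
                seen.add(text)
                references.append(ref)
    return references


# Hand-written mock response for illustration only; not real service output.
mock_response = {
    "citations": [
        {"retrievedReferences": [{"content": {"text": "chunk about the tower"}}]},
        {"retrievedReferences": [
            {"content": {"text": "chunk about the scandal"}},
            {"content": {"text": "chunk about the tower"}},
        ]},
    ]
}

refs = all_retrieved_references(mock_response)
print(len(refs), [r["content"]["text"] for r in refs])
```

In the notebook, the result of `all_retrieved_references(response_ret)` could be passed to `citations_rag_print` in place of the first citation's references.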

#### With Query Reformulation

Now let's see how query reformulation can produce better-aligned context retrieval, which in turn enhances the accuracy of the generated response.

In [10]:
response_ret = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": "arn:aws:bedrock:{}::foundation-model/{}".format(region, foundation_model),
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            },
            'orchestrationConfiguration': {
                'queryTransformationConfiguration': {
                    'type': 'QUERY_DECOMPOSITION'
                }
            }
        }
    }
)


# generated text output

print(response_ret['output']['text'],end='\n'*2)

Octank Tower is the global headquarters of Octank Financial, a 40-story state-of-the-art building located in downtown Metropolis. It is an iconic structure designed by renowned architect PersonA and symbolizes the company's commitment to innovation and sustainability. The 800,000 square foot building houses over 3,500 employees and has received LEED Platinum certification for its green initiatives like solar panels and rainwater harvesting systems. The whistleblower scandal involving the former Chief Financial Officer Person X has significantly hurt Octank Financial's reputation and image. Person X was accused of insider trading and other illegal activities, causing the company's stock price to plummet and resulting in substantial losses for shareholders. The scandal has also led to a decline in employee morale and created a sense of unease about the company's future. Octank's reputation and ability to maintain ethical standards have been called into question.



Let's take a look at the chunks retrieved with query reformulation.

In [11]:
response_with_qr = response_ret['citations'][0]['retrievedReferences']
print("# of citations or chunks used to generate the response: ", len(response_with_qr))


citations_rag_print(response_with_qr)

# of citations or chunks used to generate the response:  2
Chunk 1:  For instance, we are currently involved in a legal dispute with a former employee who claims that she was wrongfully terminated.   ### Environmental, Social, and Governance (ESG) Risk   Octank Financial is exposed to ESG risk due to the possibility that our investments may be negatively affected by environmental, social, or governance issues. We have an ESG risk management program to minimize this risk, but we cannot eliminate it entirely. For example, our investment in a coal-fired power plant has exposed us to reputational and financial risks due to growing concerns about climate change.   ### Foreign Exchange Risk   Octank Financial is exposed to foreign exchange risk due to the possibility that changes in exchange rates could negatively impact the value of our investments or our financial results. We have a foreign exchange risk management program to minimize this risk, but we cannot eliminate it entirely. For ins

With query reformulation turned on, the first citation now draws on two chunks instead of one, and the generated response covers both the whistleblower scandal and Octank Tower, including details such as the tower's location that were missing from the earlier response.

### Observing prompt decomposition using CloudWatch Logs

Before performing retrieval, the complex query is broken down into multiple sub-queries. For the example query above, this can be seen by isolating the invocation for the decomposition action in the logs: __standalone_question__ holds our original query, and the resulting sub-queries appear between __\<query\>__ tags.

__Note__: You must enable model invocation logging in Bedrock for the logs to be viewed in CloudWatch. Please refer to the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-invocation-logging.html) for details.


```
<generated_queries>

<standalone_question>
What is octank tower and how does the whistleblower scandal hurt the company and its image?
</standalone_question>

<query>
What is octank tower?
</query>

<query>
What is the whistleblower scandal involving Octank company?
</query>

<query>
How did the whistleblower scandal affect Octank company's reputation and public image?
</query>

</generated_queries>
```
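If you copy this text out of a log entry, the sub-queries can be pulled out with a regular expression. The sketch below assumes the tag layout shown above and that the text has already been extracted from the log record; how the record itself is structured in CloudWatch is not covered here. `extract_sub_queries` is a hypothetical helper name.

```python
import re

LOG_TEXT = """
<generated_queries>
<standalone_question>
What is octank tower and how does the whistleblower scandal hurt the company and its image?
</standalone_question>
<query>
What is octank tower?
</query>
<query>
What is the whistleblower scandal involving Octank company?
</query>
<query>
How did the whistleblower scandal affect Octank company's reputation and public image?
</query>
</generated_queries>
"""


def extract_sub_queries(log_text):
    """Return the text found between each <query>...</query> pair, in order."""
    matches = re.findall(r"<query>(.*?)</query>", log_text, flags=re.DOTALL)
    return [m.strip() for m in matches]


sub_queries = extract_sub_queries(LOG_TEXT)
for sub_query in sub_queries:
    print(sub_query)
```

The pattern matches only the exact `<query>` tag, so the enclosing `<generated_queries>` and `<standalone_question>` elements are ignored.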


<div class="alert alert-block alert-warning">
<b>Note:</b> Remember to delete the knowledge base, the OpenSearch Serverless (OSS) index, and the related IAM roles and policies to avoid incurring any charges.
</div>

Now that we have seen how query reformulation works and how it can improve responses to complex queries, we invite you to dive deeper and experiment with this technique to optimize your RAG workflow.