# Query Reformulation Supported by Amazon Bedrock Knowledge Bases

Quality, cost, and latency are some of the most important factors to optimize when developing RAG-based GenAI applications. Input queries to a foundation model (FM) are often complex, containing several questions and intricate relationships. With such queries, the embedding step may mask or dilute important components of the query, so the retrieved chunks may not provide context for every aspect of it. This can produce a less than desirable response from your RAG application.

With query reformulation, we can take a complex input prompt and break it down into multiple sub-queries. Each sub-query goes through its own retrieval step to find relevant chunks. The resulting chunks are then pooled and ranked together before being passed to the FM to generate a response. Query reformulation is another tool that can help increase accuracy for the complex queries your application may face in production.
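The decompose, retrieve, pool, and rank flow described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Amazon Bedrock implements it internally: the `toy_retrieve` function, its keyword-overlap score, and the tiny corpus are all hypothetical stand-ins for a real vector search.

```python
from typing import Callable, Dict, List, Tuple

# (chunk_id, relevance score); both fields are illustrative assumptions
ScoredChunk = Tuple[str, float]

# Hypothetical toy corpus standing in for an indexed knowledge base
TOY_CORPUS: Dict[str, str] = {
    "tower": "octank tower headquarters building",
    "scandal": "whistleblower scandal hurt reputation",
    "offices": "regional offices worldwide",
}


def toy_retrieve(query: str) -> List[ScoredChunk]:
    """Score chunks by keyword overlap (a stand-in for vector similarity)."""
    words = set(query.lower().split())
    results = []
    for chunk_id, text in TOY_CORPUS.items():
        overlap = len(words & set(text.lower().split()))
        if overlap:
            results.append((chunk_id, float(overlap)))
    return results


def pool_and_rank(
    sub_queries: List[str],
    retrieve: Callable[[str], List[ScoredChunk]],
    top_k: int = 5,
) -> List[ScoredChunk]:
    """Run retrieval per sub-query, pool the chunks, and rank them together."""
    best: Dict[str, float] = {}
    for sub_query in sub_queries:
        for chunk_id, score in retrieve(sub_query):
            # Keep the best score seen for a chunk across all sub-queries
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


pooled = pool_and_rank(["octank tower", "whistleblower scandal"], toy_retrieve)
print(pooled)
```

Because each sub-query is retrieved on its own, a chunk only needs to match one part of the original question to be included in the pooled context.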


# Notebook setup
Follow the steps below with a compatible role and compute environment to get started.

In [1]:
%pip install --upgrade pip --quiet
%pip install -r ../requirements.txt --no-deps --quiet
%pip install -r ../requirements.txt --upgrade --quiet

Note: you may need to restart the kernel to use updated packages.


In [2]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [3]:
import os
import sys
import time
import boto3
import logging
import pprint
import json

# Set the path to import module
from pathlib import Path
current_path = Path().resolve()
current_path = current_path.parent
if str(current_path) not in sys.path:
    sys.path.append(str(current_path))
# Print sys.path to verify
# print(sys.path)

from utils.knowledge_base import BedrockKnowledgeBase

In [4]:
#Clients
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
session = boto3.session.Session()
region =  session.region_name
account_id = sts_client.get_caller_identity()["Account"]
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime') 
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)
region, account_id

('us-west-2', '183631345587')

In [5]:
import time

# Get the current timestamp
current_time = time.time()

# Format the timestamp as a string
timestamp_str = time.strftime("%Y%m%d%H%M%S", time.localtime(current_time))[-7:]
# Create the suffix using the timestamp
suffix = f"{timestamp_str}"
knowledge_base_name_standard = 'standard-kb'
knowledge_base_description = "Octank 10k KB"
bucket_name = f'{knowledge_base_name_standard}-{suffix}'

foundation_model = "anthropic.claude-3-sonnet-20240229-v1:0"

data_source=[{"type": "S3", "bucket_name": bucket_name}]

## 2 - Create knowledge bases with fixed chunking strategy

In [6]:
knowledge_base_standard = BedrockKnowledgeBase(
    kb_name=f'{knowledge_base_name_standard}-{suffix}',
    kb_description=knowledge_base_description,
    data_sources=data_source,
    chunking_strategy = "FIXED_SIZE", 
    suffix = f'{suffix}-f'
)

Step 1 - Creating or retrieving S3 bucket(s) for Knowledge Base documents
['standard-kb-4172007']
buckets_to_check:  ['standard-kb-4172007']
Creating bucket standard-kb-4172007
Step 2 - Creating Knowledge Base Execution Role (AmazonBedrockExecutionRoleForKnowledgeBase_4172007-f) and Policies
Step 3a - Creating OSS encryption, network and data access policies
Step 3b - Creating OSS Collection (this step takes a couple of minutes to complete)
{ 'ResponseMetadata': { 'HTTPHeaders': { 'connection': 'keep-alive',
                                         'content-length': '320',
                                         'content-type': 'application/x-amz-json-1.0',
                                         'date': 'Thu, 04 Dec 2025 17:20:10 '
                                                 'GMT',
                                         'x-amzn-requestid': '1e6959ef-09c6-418c-b6ae-e908489a5fa1'},
                        'HTTPStatusCode': 200,
                        'RequestId': '1e6959ef-09c

[2025-12-04 17:21:41,115] p7152 {base.py:258} INFO - PUT https://jj6hc6vdcqff04judabh.us-west-2.aoss.amazonaws.com:443/bedrock-sample-rag-index-4172007-f [status:200 request:0.430s]



Creating index:
{ 'acknowledged': True,
  'index': 'bedrock-sample-rag-index-4172007-f',
  'shards_acknowledged': True}
Step 4 - Will create Lambda Function if chunking strategy selected as CUSTOM
Not creating lambda function as chunking strategy is FIXED_SIZE
Step 5 - Creating Knowledge Base
{ 'createdAt': datetime.datetime(2025, 12, 4, 17, 22, 41, 221928, tzinfo=tzlocal()),
  'description': 'Octank 10k KB',
  'knowledgeBaseArn': 'arn:aws:bedrock:us-west-2:183631345587:knowledge-base/ZOOU3JIT3K',
  'knowledgeBaseConfiguration': { 'type': 'VECTOR',
                                  'vectorKnowledgeBaseConfiguration': { 'embeddingModelArn': 'arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-embed-text-v2:0'}},
  'knowledgeBaseId': 'ZOOU3JIT3K',
  'name': 'standard-kb-4172007',
  'roleArn': 'arn:aws:iam::183631345587:role/AmazonBedrockExecutionRoleForKnowledgeBase_4172007-f',
  'status': 'CREATING',
  'storageConfiguration': { 'opensearchServerlessConfiguration': { 'collectionArn

## 2.1 Upload the dataset to Amazon S3
Now that we have created the knowledge base, let's populate it with the `Octank financial 10K` report dataset. The Knowledge Base data source expects the data to be available in the S3 bucket connected to it, and changes to the data can be synchronized to the knowledge base using the `StartIngestionJob` API call. In this example we will use the [boto3 abstraction](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent/client/start_ingestion_job.html) of the API, via our helper class.
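For readers who want to see what the helper class wraps, here is a minimal sketch that calls the `bedrock-agent` client directly and polls until the job reaches a terminal state. The function name `run_ingestion_job`, the polling interval, and the timeout are assumptions made for illustration; the helper class in `utils.knowledge_base` may be implemented differently.

```python
import time
from typing import Any, Dict

# Statuses after which the ingestion job will not change any further
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def run_ingestion_job(
    client: Any,
    knowledge_base_id: str,
    data_source_id: str,
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
) -> Dict[str, Any]:
    """Start an ingestion job and poll until it finishes (hypothetical helper)."""
    started = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    job = started["ingestionJob"]
    deadline = time.monotonic() + timeout_seconds

    while job["status"] not in TERMINAL_STATUSES:
        if time.monotonic() > deadline:
            raise TimeoutError(f"Ingestion job {job['ingestionJobId']} did not finish in time")
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job


# Example usage (requires AWS credentials and existing resources):
# import boto3
# job = run_ingestion_job(boto3.client("bedrock-agent"), "<kb-id>", "<data-source-id>")
# print(job["status"], job.get("failureReasons"))
```

Polling on the job status avoids relying on a fixed sleep, and the returned job description can be checked for `failureReasons` when some files are skipped.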

Let's first upload the data available in the `synthetic_dataset` folder to S3.

In [7]:
import os

def upload_directory(path, bucket_name):
    for root, dirs, files in os.walk(path):
        for file in files:
            file_to_upload = os.path.join(root, file)
            if file not in ["LICENSE", "NOTICE", "README.md"]:
                print(f"uploading file {file_to_upload} to {bucket_name}")
                s3_client.upload_file(file_to_upload, bucket_name, file)
            else:
                print(f"Skipping file {file_to_upload}")

upload_directory("../synthetic_dataset", bucket_name)


Skipping file ../synthetic_dataset/LICENSE
Skipping file ../synthetic_dataset/NOTICE
Skipping file ../synthetic_dataset/README.md
uploading file ../synthetic_dataset/bda.m4v to standard-kb-4172007
uploading file ../synthetic_dataset/octank_financial_10K.pdf to standard-kb-4172007
uploading file ../synthetic_dataset/podcastdemo.mp3 to standard-kb-4172007


In [8]:
# ensure that the kb is available
time.sleep(30)
# sync knowledge base
knowledge_base_standard.start_ingestion_job()

job 1 started successfully

{ 'dataSourceId': 'NXBKUHBDLU',
  'failureReasons': [ '["Encountered error: Ignored 1 files as their file '
                      'format was not supported. [Files: '
                      's3://standard-kb-4172007/podcastdemo.mp3]. Call to '
                      'Customer Source did not succeed.","Encountered error: '
                      'Ignored 1 files as their file format was not supported. '
                      '[Files: s3://standard-kb-4172007/bda.m4v]. Call to '
                      'Customer Source did not succeed."]'],
  'ingestionJobId': '7Y01CPFODA',
  'knowledgeBaseId': 'ZOOU3JIT3K',
  'startedAt': datetime.datetime(2025, 12, 4, 17, 23, 14, 286091, tzinfo=tzlocal()),
  'statistics': { 'numberOfDocumentsDeleted': 0,
                  'numberOfDocumentsFailed': 2,
                  'numberOfDocumentsScanned': 3,
                  'numberOfMetadataDocumentsModified': 0,
                  'numberOfMetadataDocumentsScanned': 0,
                 

In [9]:
kb_id = knowledge_base_standard.get_knowledge_base_id()

'ZOOU3JIT3K'


# Query Reformulation in Action

In this notebook, we will investigate a complex query that could benefit from query reformulation and see how the feature affects the generated response.

##  Complex prompt

To demonstrate the functionality, let's take a look at a query that makes several asks about information contained in the Octank 10K financial document. These asks are not semantically related. When this query is embedded during the retrieval step, some aspects of it may become diluted, so the chunks returned may not address all components of this complex query.

To query our Knowledge Base and generate a response we will use the __retrieve_and_generate__ API call. To use the query reformulation feature, we will include in our knowledge base configuration the additional information as shown below:

```
'orchestrationConfiguration': {
        'queryTransformationConfiguration': {
            'type': 'QUERY_DECOMPOSITION'
        }
    }
```

__Note:__ The output response structure is the same as a normal __retrieve_and_generate__ without query reformulation.
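Since the two calls in this notebook differ only in the `orchestrationConfiguration` block, one way to keep them in sync is to build the configuration with a small function. The sketch below is a convenience wrapper under our own naming (`build_rag_config` and its `decompose` flag are hypothetical, not part of boto3); the dictionary keys are the same ones used in the cells that follow.

```python
from typing import Any, Dict


def build_rag_config(
    kb_id: str,
    model_arn: str,
    number_of_results: int = 5,
    decompose: bool = False,
) -> Dict[str, Any]:
    """Build a retrieveAndGenerateConfiguration dict (hypothetical helper)."""
    kb_config: Dict[str, Any] = {
        "knowledgeBaseId": kb_id,
        "modelArn": model_arn,
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    if decompose:
        # Turn on query reformulation (query decomposition)
        kb_config["orchestrationConfiguration"] = {
            "queryTransformationConfiguration": {"type": "QUERY_DECOMPOSITION"}
        }
    return {"type": "KNOWLEDGE_BASE", "knowledgeBaseConfiguration": kb_config}


# Example usage with the clients and variables defined earlier in the notebook:
# model_arn = "arn:aws:bedrock:{}::foundation-model/{}".format(region, foundation_model)
# response = bedrock_agent_runtime_client.retrieve_and_generate(
#     input={"text": query},
#     retrieveAndGenerateConfiguration=build_rag_config(kb_id, model_arn, decompose=True),
# )
```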

#### Without Query Reformulation

Let's see what the generated result looks like for the following query without using query reformulation:

"What is octank tower and how does the whistleblower scandal hurt the company and its image?"

In [10]:
query = "What is octank tower and how does the whistleblower scandal hurt the company and its image?"

In [11]:
response_ret = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": "arn:aws:bedrock:{}::foundation-model/{}".format(region, foundation_model),
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            }
        }
    }
)


# generated text output

print(response_ret['output']['text'],end='\n'*2)

The search results do not mention anything called "Octank Tower". However, they do provide details on a whistleblower scandal involving Octank Financial, a company that provides financial services. The scandal involved allegations that the company's former Chief Financial Officer, Person X, engaged in insider trading and other illegal activities. This scandal has significantly hurt Octank Financial's reputation and image in several ways: The company's stock price plummeted after news of the investigation broke, causing substantial losses for shareholders. Many began questioning Octank's ability to maintain ethical standards and protect investors. The scandal also negatively impacted employee morale, with workers feeling demoralized and uncertain about the company's future. Octank's reputation took a major hit as a result of the scandal.



In [12]:
response_without_qr = response_ret['citations'][0]['retrievedReferences']
print("# of citations or chunks used to generate the response: ", len(response_without_qr))

def citations_rag_print(retrieved_references):
    # Each retrieved reference carries the chunk content, its location, and its metadata
    for num, chunk in enumerate(retrieved_references, 1):
        print(f'Chunk {num}: ', chunk['content']['text'], end='\n'*2)
        print(f'Chunk {num} Location: ', chunk['location'], end='\n'*2)
        print(f'Chunk {num} Metadata: ', chunk['metadata'], end='\n'*2)

citations_rag_print(response_without_qr)

# of citations or chunks used to generate the response:  2
Chunk 1:  The impact of the scandal on Octank has been significant. The company's stock price plummeted following the news of the investigation, causing substantial losses for shareholders. Additionally, the company's reputation has taken a major hit, with many questioning its ability to maintain ethical standards and protect investors.     To make matters worse, the scandal has also had a ripple effect on Octank's employees. Morale has taken a hit, with many feeling demoralized and uncertain about the future of the company. Furthermore, the investigation has created a sense of unease and uncertainty, as employees worry about the potential fallout and impact on their jobs.     In response to the scandal, Octank has taken swift action to address the situation. The company has launched an internal investigation and has cooperated fully with the SEC's investigation. Person X has been placed on administrative leave and has since re

As seen from the above citations, our retrieval with the complex query did not return any chunks relevant to the building, instead focusing on chunks whose embeddings were most similar to the whistleblower incident.

This may indicate the embedding of the query resulted in some dilution of the semantics of that part of the query.
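Note that the cell above only inspects the references attached to the first citation. A response can contain several citations, each with its own `retrievedReferences` list, so a fuller comparison may want to gather all of them. The helper below is a small sketch of that idea; the function name and the choice to de-duplicate on chunk text are our own assumptions.

```python
from typing import Any, Dict, List


def collect_retrieved_references(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Gather unique retrieved references across every citation in a response."""
    references: List[Dict[str, Any]] = []
    seen_texts = set()
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            text = ref.get("content", {}).get("text", "")
            if text in seen_texts:
                # The same chunk can back more than one part of the answer
                continue
            seen_texts.add(text)
            references.append(ref)
    return references


# Example usage with a response from retrieve_and_generate:
# all_refs = collect_retrieved_references(response_ret)
# print(len(all_refs))
```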

#### With Query Reformulation

Now let's see how query reformulation can lead to better-aligned context retrieval, which in turn enhances the accuracy of response generation.

In [13]:
response_ret = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": "arn:aws:bedrock:{}::foundation-model/{}".format(region, foundation_model),
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            },
            'orchestrationConfiguration': {
                'queryTransformationConfiguration': {
                    'type': 'QUERY_DECOMPOSITION'
                }
            }
        }
    }
)


# generated text output

print(response_ret['output']['text'],end='\n'*2)

Octank Tower is the company's iconic headquarters building designed by renowned architect PersonA. It is an 800,000 square foot office space housing over 3,500 employees across various departments like research and development, marketing, finance, and human resources. The building has received awards for its green initiatives and eco-friendly amenities like solar panels and rainwater harvesting systems. The whistleblower scandal involving the former Chief Financial Officer Person X engaging in insider trading and other illegal activities has significantly hurt Octank's reputation and image. The company's stock price plummeted after the news broke, causing substantial losses for shareholders. Many are questioning Octank's ability to maintain ethical standards and protect investors. Employee morale has also taken a hit, with feelings of demoralization and uncertainty about the company's future.



Let's take a look at the retrieved chunks with query reformulation.

In [14]:
response_with_qr = response_ret['citations'][0]['retrievedReferences']
print("# of citations or chunks used to generate the response: ", len(response_with_qr))


citations_rag_print(response_with_qr)

# of citations or chunks used to generate the response:  2
Chunk 1:  Octank Tower boasts 800,000 square feet of office space, housing more than 3,500 employees across various departments, including research and development, marketing, finance, and human resources.     The building is equipped with cutting-edge technology, including a sophisticated building management system that optimizes energy efficiency and indoor air quality. Octank Tower has received several awards for its green initiatives, including LEED Platinum certification, and features a range of eco-friendly amenities, such as solar panels, rainwater harvesting systems, and electric vehicle charging stations.     **Regional Offices: A Network of Strategic Locations**     Octank Financial has a strong presence in major cities across the globe, with 15 regional offices in North America, Europe, Asia, and Oceania. These offices range from 50,000 to 200,000 square feet and accommodate between 250 and 1,000 employees, depending

We can see that with query reformulation turned on, the retrieved chunks now provide context for both the whistleblower scandal and the Octank Tower components of the query.

### Observing prompt decomposition using CloudWatch Logs

Before performing retrieval, the complex query is broken down into multiple subqueries. We can see this for the example query above by isolating the invocation for the decomposition action, where __standalone_question__ is our original query and the resulting subqueries are shown between __\<query\>__ tags.

__Note__: You must enable invocation logging in Bedrock for the logs to be viewable in CloudWatch. See the [model invocation logging documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-invocation-logging.html) for details.


```
<generated_queries>

<standalone_question>
What is octank tower and how does the whistleblower scandal hurt the company and its image?
</standalone_question>

<query>
What is octank tower?
</query>

<query>
What is the whistleblower scandal involving Octank company?
</query>

<query>
How did the whistleblower scandal affect Octank company's reputation and public image?
</query>

</generated_queries>
```
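If you copy such a log entry out of CloudWatch, the subqueries can be pulled out programmatically. The sketch below assumes the logged text follows the tag layout shown above; the function name `extract_sub_queries` is hypothetical, and the exact log format may differ in your account.

```python
import re
from typing import List, Optional, Tuple


def extract_sub_queries(log_text: str) -> Tuple[Optional[str], List[str]]:
    """Return the standalone question and the list of decomposed subqueries."""
    standalone = re.search(
        r"<standalone_question>\s*(.*?)\s*</standalone_question>",
        log_text,
        re.DOTALL,
    )
    # Matches only <query>...</query> pairs, not the outer <generated_queries> tag
    queries = re.findall(r"<query>\s*(.*?)\s*</query>", log_text, re.DOTALL)
    return (standalone.group(1) if standalone else None), queries


sample_log = """
<generated_queries>
<standalone_question>
What is octank tower and how does the whistleblower scandal hurt the company and its image?
</standalone_question>
<query>
What is octank tower?
</query>
<query>
What is the whistleblower scandal involving Octank company?
</query>
</generated_queries>
"""

question, sub_queries = extract_sub_queries(sample_log)
print(question)
print(sub_queries)
```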


<div class="alert alert-block alert-warning">
<b>Note:</b> Remember to delete KB, OSS index and related IAM roles and policies to avoid incurring any charges.
</div>

In [15]:
print("===============================Knowledge base with fixed chunking==============================\n")
knowledge_base_standard.delete_kb(delete_s3_bucket=True, delete_iam_roles_and_policies=True)


Deleted data source NXBKUHBDLU
Found bucket standard-kb-4172007
Deleted all objects in bucket standard-kb-4172007
Deleted bucket standard-kb-4172007
Found role AmazonBedrockExecutionRoleForKnowledgeBase_4172007-f
 [{'PolicyName': 'AmazonBedrockFoundationModelPolicyForKnowledgeBase_4172007-f', 'PolicyArn': 'arn:aws:iam::183631345587:policy/AmazonBedrockFoundationModelPolicyForKnowledgeBase_4172007-f'}, {'PolicyName': 'AmazonBedrockOSSPolicyForKnowledgeBase_4172007-f', 'PolicyArn': 'arn:aws:iam::183631345587:policy/AmazonBedrockOSSPolicyForKnowledgeBase_4172007-f'}, {'PolicyName': 'AmazonBedrockCloudWatchPolicyForKnowledgeBase_4172007-f', 'PolicyArn': 'arn:aws:iam::183631345587:policy/AmazonBedrockCloudWatchPolicyForKnowledgeBase_4172007-f'}, {'PolicyName': 'AmazonBedrockS3PolicyForKnowledgeBase_4172007-f', 'PolicyArn': 'arn:aws:iam::183631345587:policy/AmazonBedrockS3PolicyForKnowledgeBase_4172007-f'}]
Detached policy AmazonBedrockFoundationModelPolicyForKnowledgeBase_4172007-f from ro

Now that we have seen how query reformulation works and how it can improve responses to complex queries, we invite you to dive deeper and experiment with this technique to optimize your RAG workflow.