# Module 2 - Build RAG-powered Q&A Application with **RetrieveAndGenerate API**

----

This notebook provides sample code and step-by-step instructions for building a fully-managed question-answering (Q&A) application using the **RetrieveAndGenerate API** of Amazon Bedrock Knowledge Bases.

----

### Introduction

In the previous notebook, we demonstrated how to create a Knowledge Base in Amazon Bedrock — including setting up an S3 data source, configuring an Amazon OpenSearch Serverless (AOSS) vector index, and ingesting documents for retrieval-augmented generation (RAG).

In this notebook, we take the next step: building a Q&A application that can query that Knowledge Base using the `RetrieveAndGenerate` API. This API allows you to retrieve the most relevant content from your Knowledge Base based on a user’s query and automatically pass that information to a foundation model (FM) to generate a grounded, context-aware response.

This is a classic example of the RAG pattern — where external data is dynamically retrieved at query time and incorporated into the model’s prompt to improve relevance, accuracy, and transparency. In this solution, retrieved knowledge base content comes with source attribution, helping end users understand the origin of the response and minimizing the risk of model hallucinations.

![BKB illustration](./images/retrieve_and_generate_api.png)

### Pre-requisites

In order to run this notebook, you should have successfully completed the previous notebook lab:
- [1_create-kb-and-ingest-documents.ipynb](./1\_create-kb-and-ingest-documents.ipynb).

Also, make sure that you have enabled access to the following models in the _Amazon Bedrock Console_:
- `Amazon Nova Micro`
- `Amazon Titan Text Embeddings V2`


## 1. Setup

### 1.1 Import the required libraries

In [7]:
# Standard library imports
import os
import sys
import json
import time

# Third-party imports
import boto3
from botocore.client import Config
from botocore.exceptions import ClientError

# Local imports
import utility

# Print SDK versions
print(f"Python version: {sys.version.split()[0]}")
print(f"Boto3 SDK version: {boto3.__version__}")

Python version: 3.12.2
Boto3 SDK version: 1.36.1


### 1.2 Initial setup for clients and global variables

In [8]:
%store -r bedrock_kb_id

In [9]:
# Create boto3 session and set AWS region
boto_session = boto3.Session()
aws_region = boto_session.region_name

# Create boto3 clients for Bedrock
bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_agent_client = boto_session.client('bedrock-agent-runtime', config=bedrock_config)

# Set the Bedrock model to use for text generation
model_id = 'amazon.nova-micro-v1:0'
model_arn = f'arn:aws:bedrock:{aws_region}::foundation-model/{model_id}'

# Print configurations
print("AWS Region:", aws_region)
print("Bedrock Knowledge Base ID:", bedrock_kb_id)

AWS Region: us-east-1
Bedrock Knowledge Base ID: WCVT3MPU5K


## 2. Fully-managed RAG with **RetrieveAndGenerate API**

The `RetrieveAndGenerate` API provides a fully managed way to implement the Retrieval-Augmented Generation (RAG) pattern with Amazon Bedrock Knowledge Bases.

When a user submits a query, the API automatically converts the query into vector embeddings, performs a similarity search against the Knowledge Base, and retrieves the most relevant document chunks. These search results are then injected into the foundation model's prompt as additional context, enabling the model to generate more accurate and grounded responses.
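The number of chunks returned by the similarity search can be set explicitly through the optional `retrievalConfiguration` block of the request. The following is a minimal sketch, not part of the original notebook: the helper name `build_kb_config` and the placeholder ID and ARN values are assumptions for illustration only.

```python
def build_kb_config(kb_id, model_arn, number_of_results=5):
    """Build a retrieveAndGenerateConfiguration dict with an explicit top-k.

    Hypothetical helper for illustration; only the dict keys come from the API.
    """
    return {
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': kb_id,
            'modelArn': model_arn,
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    # How many chunks the similarity search should return
                    'numberOfResults': number_of_results
                }
            }
        }
    }

# Placeholder values, not real resources
example_config = build_kb_config(
    kb_id='XXXXXXXXXX',
    model_arn='arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-micro-v1:0',
    number_of_results=3,
)
```

The resulting dictionary can be passed as the `retrieveAndGenerateConfiguration` argument of `retrieve_and_generate`. Fewer results keep the prompt short, while more results give the model broader context.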

For multi-turn conversations, Knowledge Bases also maintain short-term conversational memory — allowing the API to return more contextually relevant answers across a dialogue.
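Conversational memory is tied to a session: the response carries a `sessionId`, and passing that value back on the next call continues the same conversation. Below is a minimal sketch under that assumption; the helper name `ask` is hypothetical and is not defined elsewhere in this notebook.

```python
def ask(client, kb_id, model_arn, query, session_id=None):
    """Send one turn to RetrieveAndGenerate; return (answer_text, session_id).

    Hypothetical helper: `client` is expected to be a 'bedrock-agent-runtime' client.
    """
    request = {
        'input': {'text': query},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': model_arn,
            },
        },
    }
    # Only follow-up turns carry a sessionId; the first call omits it
    if session_id is not None:
        request['sessionId'] = session_id
    response = client.retrieve_and_generate(**request)
    return response['output']['text'], response.get('sessionId')

# Usage sketch (requires the client and variables from the Setup section):
# answer, session_id = ask(bedrock_agent_client, bedrock_kb_id, model_arn, "First question")
# follow_up, _ = ask(bedrock_agent_client, bedrock_kb_id, model_arn, "Follow-up question", session_id)
```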

The output of the `RetrieveAndGenerate` API includes:

- The **generated response** from the foundation model

- **Source attribution** metadata for the retrieved content

- The **actual retrieved text chunks** from the Knowledge Base

This makes it easy to build RAG-powered applications with trusted, explainable answers — without having to manage retrieval pipelines or prompt construction yourself.

### 2.1 Retrieve and Generate Example

Let’s now see the `RetrieveAndGenerate` API in action and showcase a fully managed RAG workflow in Amazon Bedrock.

In this example, we’ll use the Knowledge Base built in the previous lab — containing Amazon Shareholder Letters — to demonstrate how the API retrieves relevant information and generates a grounded response to a user query.

In [10]:
user_query = "How does Amazon use technology to better serve its customers?"

response = bedrock_agent_client.retrieve_and_generate(
    input={
        'text': user_query
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': bedrock_kb_id,
            'modelArn': model_arn
        }
    }
)

print("Final reply:\n", response['output']['text'])

Final reply:
 Amazon employs technology extensively to enhance customer service and experience. Here are some key areas where Amazon uses technology to better serve its customers:

1. **Fulfillment Network and Delivery Speed**: Amazon has been innovating in its fulfillment network for decades, constantly trying to shorten the time to get items to customers. The regionalization redesign of its fulfillment network, new placement algorithms, and innovative same-day fulfillment centers have significantly improved delivery speeds. Amazon Prime now offers tens of millions of items available in one day or better, and an increasing number of deliveries happen same day. Amazon is also planning to use drones (Prime Air) to deliver items inside an hour.

2. **Customer and Seller Service Productivity Apps**: Amazon is building a substantial number of GenAI applications across every Amazon consumer business. These range from AI-powered shopping assistants like Rufus to customer and seller service p

In [11]:
user_query = "Why should we keep our uniqueness for the Amazon identity?"

response = bedrock_agent_client.retrieve_and_generate(
    input={
        'text': user_query
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': bedrock_kb_id,
            'modelArn': model_arn
        }
    }
)

print("Final reply:\n", response['output']['text'])

Final reply:
 It's important to maintain Amazon's unique identity because it sets the company apart from its competitors and helps to create a strong brand image. Amazon's distinctiveness is rooted in its commitment to customers, its relentless focus on innovation, and its culture of continuous improvement. By staying true to its core values and principles, Amazon can continue to attract and retain customers, employees, and partners who share its vision and mission.

The world often tries to make companies more typical and conform to the status quo, but Amazon has always resisted this pressure and instead embraced its uniqueness. This has allowed Amazon to stay ahead of the curve and continue to disrupt industries through its innovative approach. By maintaining its distinctiveness, Amazon can continue to challenge the norms and push the boundaries of what's possible.

However, it's important to note that maintaining Amazon's unique identity doesn't mean that the company should be resis

In [14]:
user_query = "Can you summarize the 2024 shareholder letter?"

response = bedrock_agent_client.retrieve_and_generate(
    input={
        'text': user_query
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': bedrock_kb_id,
            'modelArn': model_arn
        }
    }
)

print("Final reply:\n", response['output']['text'])

Final reply:
 The 2024 shareholder letter from Amazon CEO Andy Jassy focuses on the company's growth, innovation, and future opportunities. Here are the key points:

1. **Growth and Innovation**: Jassy highlights Amazon's growth over the past 25 years, from a books-only retailer to a global e-commerce giant with a vibrant third-party marketplace, Kindle, Alexa, and Amazon Web Services (AWS). He emphasizes the company's continuous innovation and its role in enabling business continuity during the pandemic.

2. **Pandemic Impact**: The letter discusses the significant role Amazon played during the pandemic, particularly in providing essential goods and services. AWS's cloud services were crucial in helping businesses adapt to remote work and changing demand.

3. **AWS and Consumer Revenue**: The letter notes the different demand trajectories of AWS and Consumer businesses during the pandemic. AWS saw slower growth initially but accelerated as companies moved to the cloud. Consumer revenu

In [15]:
user_query = "Can you tell me the differences between the 2023 and 2022 shareholder letters specifically to Andy Jassy and Jeff Bezos style of writing?"

response = bedrock_agent_client.retrieve_and_generate(
    input={
        'text': user_query
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': bedrock_kb_id,
            'modelArn': model_arn
        }
    }
)

print("Final reply:\n", response['output']['text'])

Final reply:
 The differences between the 2023 and 2022 shareholder letters from Andy Jassy and Jeff Bezos can be observed in their style of writing:

1. **Tone and Optimism**: In the 2023 letter, Andy Jassy adopts a more optimistic and energized tone, emphasizing the potential for growth and innovation despite the challenges faced in 2022. In contrast, Jeff Bezos' 2022 letter reflects a more measured and reflective tone, acknowledging the difficulties but still highlighting the progress made.

2. **Focus on Innovation**: Jassy's 2023 letter places a greater emphasis on innovation and the future of Amazon, discussing how the company has innovated in its largest businesses and the importance of long-term investments. Bezos' 2022 letter, while also discussing innovation, focuses more on the historical context and the achievements of Amazon over the years.

3. **Personal Anecdotes**: In the 2022 letter, Bezos includes personal anecdotes and reflections on the journey of Amazon, such as th

### 2.2 Understanding Citations

Citations play a critical role in retrieval-augmented generation (RAG) systems by helping users verify the accuracy of a response and providing transparency into the source of information. Let's now look at the `citations` part of the Knowledge Base response:

In [16]:
print("Citations:\n", json.dumps(response["citations"], indent=2, default=str))

Citations:
 [
  {
    "generatedResponsePart": {
      "textResponsePart": {
        "span": {
          "end": 327,
          "start": 0
        },
        "text": "The differences between the 2023 and 2022 shareholder letters from Andy Jassy and Jeff Bezos can be observed in their style of writing:\n\n1. **Tone and Optimism**: In the 2023 letter, Andy Jassy adopts a more optimistic and energized tone, emphasizing the potential for growth and innovation despite the challenges faced in 2022"
      }
    },
    "retrievedReferences": [
      {
        "content": {
          "text": "Dear shareholders:     As I sit down to write my second annual shareholder letter as CEO, I find myself optimistic and energized by what lies ahead for Amazon. Despite 2022 being one of the harder macroeconomic years in recent memory, and with some of our own operating challenges to boot, we still found a way to grow demand (on top of the unprecedented growth we experienced in the first half of the pandemic)

Here, each entry in `citations` contains a `generatedResponsePart` field, which holds one span of the model's natural language answer (its text plus `start` and `end` character offsets). Each `generatedResponsePart` is paired with `retrievedReferences`, which lists the specific pieces of content from the knowledge base that were used to ground that part of the response. These references include the original source text (`content.text`), as well as metadata like the source URI and page number, so users can trace information back to its original document. This structure makes answers both helpful and verifiable, allowing users to explore the source material directly and build trust in the response.
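To make this structure easier to scan, the citations can be flattened into pairs of answer span and source URIs. The sketch below is illustrative only: the helper name `summarize_citations` is hypothetical, the sample data is hand-written rather than real output, and it assumes S3-backed sources that expose their URI under `location.s3Location.uri`.

```python
def summarize_citations(citations):
    """Flatten citations into (answer_span_text, [source_uris]) pairs."""
    summary = []
    for citation in citations:
        part = citation.get('generatedResponsePart', {}).get('textResponsePart', {})
        uris = []
        for ref in citation.get('retrievedReferences', []):
            # Assumption: S3 data source, so the URI lives under location.s3Location
            uri = ref.get('location', {}).get('s3Location', {}).get('uri')
            if uri:
                uris.append(uri)
        summary.append((part.get('text', ''), uris))
    return summary

# Hand-written sample in the same shape as response["citations"];
# the text and URI are placeholders, not real output
sample_citations = [
    {
        'generatedResponsePart': {
            'textResponsePart': {
                'span': {'start': 0, 'end': 18},
                'text': 'Sample answer span',
            }
        },
        'retrievedReferences': [
            {
                'content': {'text': 'Sample source chunk'},
                'location': {
                    'type': 'S3',
                    's3Location': {'uri': 's3://example-bucket/letter.pdf'},
                },
            }
        ],
    }
]

for span_text, uris in summarize_citations(sample_citations):
    print(span_text, '->', uris)
```

In the notebook, the same helper could be called on `response["citations"]` from the previous cell.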

## 3. Conclusions and Next Steps

In this notebook, we built a fully-managed RAG-powered Q&A application using the `RetrieveAndGenerate` API from Amazon Bedrock Knowledge Bases.

We demonstrated how this API simplifies the RAG workflow by automatically retrieving relevant content from a knowledge base and generating grounded, context-aware responses using a foundation model. The responses also include source references, allowing users to easily trace answers back to the original documents.

This approach enables you to quickly build reliable, transparent Q&A solutions without managing the complexity of prompt engineering or retrieval logic manually.

### Next Steps

If you are looking for more flexibility and control over your RAG workflow, Amazon Bedrock Knowledge Bases also provides a `Retrieve` API. This API allows you to perform semantic and/or keyword search over your knowledge base and retrieve the most relevant document chunks, which you can then use to build custom prompts or workflows tailored to your application needs.
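As a rough preview, the sketch below retrieves chunks with the `Retrieve` API and assembles them into a custom prompt. It is a minimal illustration, not the next notebook's implementation: the helper names `retrieve_chunks` and `build_prompt` and the prompt wording are assumptions.

```python
def retrieve_chunks(client, kb_id, query, number_of_results=3):
    """Return the text of the top chunks for a query.

    Hypothetical helper: `client` is expected to be a 'bedrock-agent-runtime' client.
    """
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {'numberOfResults': number_of_results}
        },
    )
    return [result['content']['text'] for result in response.get('retrievalResults', [])]


def build_prompt(query, chunks):
    """Combine retrieved chunks and the user query into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Usage sketch (requires the client and variables from the Setup section):
# chunks = retrieve_chunks(bedrock_agent_client, bedrock_kb_id, "How does Amazon use technology?")
# prompt = build_prompt("How does Amazon use technology?", chunks)
```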

To explore this approach, check out the next notebook:

&nbsp; **NEXT ▶** [3_customized-rag-with-retrieve-api](./3\_customized-rag-with-retrieve-api.ipynb)