# Module 2 - Build RAG-powered Q&A Application with **RetrieveAndGenerate API**

----

This notebook provides sample code and step-by-step instructions for building a fully-managed question-answering (Q&A) application using the **RetrieveAndGenerate API** of Amazon Bedrock Knowledge Bases.

----

### Introduction

In the previous notebook, we demonstrated how to create a Knowledge Base in Amazon Bedrock — including setting up an S3 data source, configuring an Amazon OpenSearch Serverless (AOSS) vector index, and ingesting documents for retrieval-augmented generation (RAG).

In this notebook, we take the next step: building a Q&A application that can query that Knowledge Base using the `RetrieveAndGenerate` API. This API allows you to retrieve the most relevant content from your Knowledge Base based on a user’s query and automatically pass that information to a foundation model (FM) to generate a grounded, context-aware response.

This is a classic example of the RAG pattern — where external data is dynamically retrieved at query time and incorporated into the model’s prompt to improve relevance, accuracy, and transparency. In this solution, retrieved knowledge base content comes with source attribution, helping end users understand the origin of the response and minimizing the risk of model hallucinations.

![BKB illustration](./images/retrieve_and_generate_api.png)

### Pre-requisites

In order to run this notebook, you should have successfully completed the previous notebook lab:
- [1_create-kb-and-ingest-documents.ipynb](./1\_create-kb-and-ingest-documents.ipynb).

Also, make sure that you have enabled access to the following models in the _Amazon Bedrock console_:
- `Amazon Nova Micro`
- `Amazon Titan Text Embeddings V2`


## 1. Setup

### 1.1 Import the required libraries

In [None]:
# Standard library imports
import os
import sys
import json
import time

# Third-party imports
import boto3
from botocore.client import Config
from botocore.exceptions import ClientError

# Local imports
import utility

# Print SDK versions
print(f"Python version: {sys.version.split()[0]}")
print(f"Boto3 SDK version: {boto3.__version__}")

### 1.2 Initial setup for clients and global variables

In [None]:
%store -r bedrock_kb_id

In [None]:
# Create boto3 session and set AWS region
boto_session = boto3.Session()
aws_region = boto_session.region_name

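# Create the Bedrock Agent Runtime client from the same session, so it uses aws_region.
# Timeouts are generous for long generations; max_attempts=0 disables automatic retries.
bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_agent_client = boto_session.client('bedrock-agent-runtime', config=bedrock_config)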

# Set the Bedrock model to use for text generation
model_id = 'amazon.nova-micro-v1:0'
model_arn = f'arn:aws:bedrock:{aws_region}::foundation-model/{model_id}'

# Print configurations
print("AWS Region:", aws_region)
print("Bedrock Knowledge Base ID:", bedrock_kb_id)

## 2. Fully-managed RAG with **RetrieveAndGenerate API**

The `RetrieveAndGenerate` API provides a fully managed way to implement the Retrieval-Augmented Generation (RAG) pattern with Amazon Bedrock Knowledge Bases.

When a user submits a query, the API automatically converts the query into vector embeddings, performs a similarity search against the Knowledge Base, and retrieves the most relevant document chunks. These search results are then injected into the foundation model's prompt as additional context, enabling the model to generate more accurate and grounded responses.
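The number of chunks returned by the similarity search can be tuned per request. The sketch below assumes the `retrievalConfiguration` / `vectorSearchConfiguration` / `numberOfResults` fields of the `retrieve_and_generate` request. The helper name `build_rag_config` and the placeholder IDs are illustrative and are not part of this notebook.

```python
def build_rag_config(kb_id, model_arn, num_results=5):
    """Build a retrieveAndGenerateConfiguration dict (hypothetical helper)."""
    if num_results < 1:
        raise ValueError("num_results must be at least 1")
    return {
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': kb_id,
            'modelArn': model_arn,
            # Assumed field: how many chunks the vector search should return
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {'numberOfResults': num_results}
            },
        },
    }


# Placeholder values, for illustration only
example_config = build_rag_config('KB_ID_PLACEHOLDER', 'MODEL_ARN_PLACEHOLDER', num_results=3)
print(example_config['knowledgeBaseConfiguration']['retrievalConfiguration'])
```

The resulting dict could be passed as `retrieveAndGenerateConfiguration=` in a `retrieve_and_generate` call. Retrieving more chunks gives the model more context, at the cost of a longer prompt.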

For multi-turn conversations, Knowledge Bases also maintain short-term conversational memory: each response includes a `sessionId`, and passing it back on the next call lets the API take earlier turns into account when answering follow-up questions.
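One way to carry context across turns is to pass the `sessionId` returned by one call into the next call. The following is a minimal sketch under that assumption. The helper `ask` is hypothetical, and the client is passed in as an argument so the function does not depend on any particular setup.

```python
def ask(client, kb_id, model_arn, query, session_id=None):
    """Send one conversation turn; return (answer_text, session_id).

    Hypothetical helper: `client` is expected to expose retrieve_and_generate(),
    as the 'bedrock-agent-runtime' boto3 client does.
    """
    request = {
        'input': {'text': query},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': model_arn,
            },
        },
    }
    # Only send sessionId on follow-up turns; the first call starts a new session
    if session_id is not None:
        request['sessionId'] = session_id
    response = client.retrieve_and_generate(**request)
    return response['output']['text'], response.get('sessionId')


# Usage with the client and variables from section 1.2 (not executed here):
# answer, sid = ask(bedrock_agent_client, bedrock_kb_id, model_arn, "First question")
# follow_up, sid = ask(bedrock_agent_client, bedrock_kb_id, model_arn, "Follow-up", session_id=sid)
```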

The output of the `RetrieveAndGenerate` API includes:

- The **generated response** from the foundation model

- **Source attribution** metadata for the retrieved content

- The **actual retrieved text chunks** from the Knowledge Base

This makes it easy to build RAG-powered applications with trusted, explainable answers — without having to manage retrieval pipelines or prompt construction yourself.

### 2.1 Retrieve and Generate Example

Let’s now see the `RetrieveAndGenerate` API in action and showcase a fully managed RAG workflow in Amazon Bedrock.

In this example, we’ll use the Knowledge Base built in the previous lab — containing Amazon Shareholder Letters — to demonstrate how the API retrieves relevant information and generates a grounded response to a user query.

In [None]:
user_query = "How does Amazon use technology to better serve its customers?"

response = bedrock_agent_client.retrieve_and_generate(
    input={
        'text': user_query
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': bedrock_kb_id,
            'modelArn': model_arn
        }
    }
)

print("Final reply:\n", response['output']['text'])

### 2.2 Understanding Citations

Citations play a critical role in retrieval-augmented generation (RAG) systems by helping users verify the accuracy of a response and providing transparency into the source of information. Let's now look at the `citations` part of the Knowledge Base response:

In [None]:
print("Citations:\n", json.dumps(response["citations"], indent=2, default=str))

Here, each entry in `citations` pairs a `generatedResponsePart` with a list of `retrievedReferences`. The `generatedResponsePart` holds the portion of the generated answer that the citation covers, and `retrievedReferences` lists the specific pieces of content from the knowledge base that were used to ground that part of the response. These references include the original source text (`content.text`), as well as the source location (such as the S3 URI) and metadata like the page number where available, so users can easily trace information back to its original document. This structure ensures that answers are both helpful and verifiable, allowing users to explore the source material directly and build trust in the response.
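To make this structure concrete, the sketch below walks the `citations` list and collects the source URIs behind each part of the answer. It assumes an S3 data source (as set up in the previous notebook). The helper name `summarize_citations` and the sample response are made up for illustration and only mimic the shape of a real response.

```python
def summarize_citations(response):
    """Return a list of (answer_part_text, [source_uris]) pairs (hypothetical helper)."""
    summary = []
    for citation in response.get('citations', []):
        part = citation.get('generatedResponsePart', {}).get('textResponsePart', {})
        uris = []
        for ref in citation.get('retrievedReferences', []):
            # Assumes S3-backed documents; other source types use different location keys
            uri = ref.get('location', {}).get('s3Location', {}).get('uri')
            if uri and uri not in uris:
                uris.append(uri)
        summary.append((part.get('text', ''), uris))
    return summary


# Made-up response with the same shape as a real one, for illustration only
sample_response = {
    'output': {'text': 'Example answer.'},
    'citations': [{
        'generatedResponsePart': {'textResponsePart': {'text': 'Example answer.'}},
        'retrievedReferences': [
            {'content': {'text': 'Example chunk A.'},
             'location': {'type': 'S3', 's3Location': {'uri': 's3://example-bucket/doc-a.pdf'}}},
            {'content': {'text': 'Example chunk B.'},
             'location': {'type': 'S3', 's3Location': {'uri': 's3://example-bucket/doc-b.pdf'}}},
            {'content': {'text': 'Example chunk C.'},
             'location': {'type': 'S3', 's3Location': {'uri': 's3://example-bucket/doc-a.pdf'}}},
        ],
    }],
}

for text, uris in summarize_citations(sample_response):
    print(text, '<-', uris)
```

Calling `summarize_citations(response)` on the response from section 2.1 would give a compact view of which documents back each part of the answer.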

## 3. Conclusions and Next Steps

In this notebook, we built a fully-managed RAG-powered Q&A application using the `RetrieveAndGenerate` API from Amazon Bedrock Knowledge Bases.

We demonstrated how this API simplifies the RAG workflow by automatically retrieving relevant content from a knowledge base and generating grounded, context-aware responses using a foundation model. The responses also include source references, allowing users to easily trace answers back to the original documents.

This approach enables you to quickly build reliable, transparent Q&A solutions without managing the complexity of prompt engineering or retrieval logic manually.

### Next Steps

If you are looking for more flexibility and control over your RAG workflow, Amazon Bedrock Knowledge Bases also provides a `Retrieve` API. This API allows you to perform semantic and/or keyword search over your knowledge base and retrieve the most relevant document chunks, which you can then use to build custom prompts or workflows tailored to your application needs.
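As a rough preview, a custom workflow built on the `Retrieve` API might look like the sketch below. It assumes the `retrieve` operation of the `bedrock-agent-runtime` client and its `retrievalResults` response field. The helper name `build_prompt_from_kb` and the prompt wording are hypothetical and are not taken from the next notebook.

```python
def build_prompt_from_kb(client, kb_id, query, num_results=3):
    """Retrieve chunks for `query` and assemble a custom prompt (hypothetical helper)."""
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {'numberOfResults': num_results}
        },
    )
    # Each retrieval result carries the chunk text under content.text
    chunks = [r['content']['text'] for r in response.get('retrievalResults', [])]
    context = "\n\n".join(chunks)
    # Illustrative prompt template; adapt the wording to your application
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Usage with the client from section 1.2 (not executed here):
# prompt = build_prompt_from_kb(bedrock_agent_client, bedrock_kb_id, "Your question")
```

The returned prompt can then be sent to any foundation model of your choice, which is the main difference from the fully managed flow.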

To explore this approach, check out the next notebook:

&nbsp; **NEXT ▶** [3_customized-rag-with-retrieve-api](./3\_customized-rag-with-retrieve-api.ipynb)