# Retrieval Augmented Question & Answering with Knowledge Bases for Amazon Bedrock & Amazon OpenSearch Serverless

> *This notebook should work well with the **`SageMaker Distribution 1.2`** image in SageMaker Studio JupyterLab.*

### Introduction
Q&A assistants powered by generative AI are designed to hold natural conversations and answer questions on a wide range of topics.
They use an LLM foundation model to understand questions and generate relevant, helpful responses. With generative AI capabilities, a Q&A assistant can create unique responses instead of pulling from a database of pre-written responses. Overall, the goal is to have more human-like conversations that educate, assist, and help improve user productivity.

While Q&A assistants powered by generative AI are helpful across general topics, they struggle to provide information or assistance that involves domain-specific knowledge, such as enterprise data that the model was not exposed to during training. To make the Q&A assistant understand enterprise data and provide useful responses, two approaches are generally used:

1. Fine-tune the LLM with enterprise data;
2. Integrate the LLM with enterprise knowledge through external databases (e.g. a vector database). This approach is also referred to as RAG (Retrieval Augmented Generation).

In this lab, we'll focus on building a Q&A assistant using the RAG approach mentioned above. In particular, we'll explore a feature within Amazon Bedrock called [Knowledge Bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/) to help us quickly set up a vector database using [Amazon OpenSearch Serverless](https://aws.amazon.com/opensearch-service/features/serverless/) and integrate it with an Amazon Bedrock foundation model without managing any infrastructure.

### Use Case
A typical enterprise knowledge base involves a large volume of data. In this lab, we'll use a sample gaming dataset provided by [IGDB](https://www.igdb.com/) as the source of the knowledge base. The data contains information about game titles and summaries. The Q&A chatbot will integrate with the knowledge base to provide accurate answers based on the user's questions.

These documents explain topics such as:
- Storyline of the game
- Game identification
- Finding contextually similar games

#### Persona
Let's assume a persona of a game user who's looking for information / guidance about the games available in the knowledge base repository.
The model will try to answer from the documents in natural language.


## Implementation
In order to follow the RAG approach this notebook integrates with Knowledge Bases for Amazon Bedrock. Specifically, we will be using the following tools:

- **LLM (Large Language Model)**: Anthropic Claude V2 available through Amazon Bedrock

- **Embeddings Model**: Amazon Titan Embeddings available through Amazon Bedrock

- **Vector Store**: Amazon OpenSearch Serverless available through Knowledge Bases for Amazon Bedrock 

- **Knowledge Base data**: Game dataset in CSV format stored in an S3 bucket

## Setup

Before running the rest of this notebook, run the cells below to ensure the necessary libraries are installed and to connect to Bedrock.

For more details on how the setup works and ⚠️ **whether you might need to make any changes**, refer to the [Bedrock boto3 setup notebook](../00_Intro/bedrock_boto3_setup.ipynb).

In [None]:
%pip install --no-build-isolation --force-reinstall \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57"

In [None]:
import warnings
warnings.filterwarnings('ignore')

## Data Preparation
Let's first extract some of the files to build our knowledge base store. For this example we will be using the CSV file included in the data folder.

In [None]:
import sagemaker
sess = sagemaker.Session()
bucket = sess.default_bucket() # Set a default S3 bucket
prefix = 'bedrock/knowledgebase/'

### Download and Extract Data
Now, let's extract the games csv data, and upload the csv file to S3 bucket so that we could ingest the data into Amazon OpenSearch Serverless via Knowledge Bases for Bedrock.

1. Your instructor will provide a dataset URL.
2. `cd 03_QuestionAnswering`
3. `mkdir data`
4. `cd data`
5. `wget "[the URL given to you]" -O games.tar.gz` (make sure to wrap the URL in double quotes `""`)


In [None]:
!cd data && tar -xvzf games.tar.gz && aws s3 cp games.csv s3://{bucket}/{prefix}

**Note:** Make a note of the S3 URI where the file is uploaded; we'll use it in the next section.
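
The S3 URI requested later in the console can be derived from the `bucket` and `prefix` values defined earlier. Here is a minimal sketch, assuming the upload command above placed `games.csv` directly under the prefix. The helper name `s3_uri` and the bucket name in the example are hypothetical.

```python
def s3_uri(bucket, prefix, filename):
    """Join a bucket name, key prefix, and file name into an s3:// URI."""
    # Drop empty parts and stray slashes so the key never contains '//'.
    key = "/".join(part.strip("/") for part in (prefix, filename) if part.strip("/"))
    return f"s3://{bucket}/{key}"


# Hypothetical bucket name, for illustration only; in the notebook you
# would pass the `bucket` and `prefix` variables defined earlier.
print(s3_uri("my-example-bucket", "bedrock/knowledgebase/", "games.csv"))
```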

# Create a Knowledge Base using Amazon Bedrock
The following section describes the steps to take in order to create a knowledge base in Bedrock.
For simplicity of the workshop, we are going to use the Amazon Bedrock console to configure all required components. 
You can also use the AWS SDK to achieve the same results. For information about using the AWS SDK for Agents, please refer to this [link](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent.html).
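
For readers who prefer the SDK route mentioned above, the sketch below builds the request payloads for the `create_knowledge_base` and `create_data_source` operations of the boto3 `bedrock-agent` client. It is a rough outline, not a drop-in replacement for the console steps. All ARNs, names, and the field-mapping values (`vector`, `text`, `metadata`) are hypothetical placeholders, and it assumes the OpenSearch Serverless collection and vector index already exist, which the console's quick-create option otherwise handles for you.

```python
def build_kb_request(name, role_arn, embedding_model_arn, collection_arn, index_name):
    """Keyword arguments for bedrock-agent create_knowledge_base (sketch)."""
    return {
        "name": name,
        "roleArn": role_arn,
        "knowledgeBaseConfiguration": {
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": embedding_model_arn,
            },
        },
        "storageConfiguration": {
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": collection_arn,
                "vectorIndexName": index_name,
                # Assumed field names; they must match the index you created.
                "fieldMapping": {
                    "vectorField": "vector",
                    "textField": "text",
                    "metadataField": "metadata",
                },
            },
        },
    }


def build_data_source_request(kb_id, name, bucket_arn, prefix):
    """Keyword arguments for bedrock-agent create_data_source (sketch)."""
    return {
        "knowledgeBaseId": kb_id,
        "name": name,
        "dataSourceConfiguration": {
            "type": "S3",
            "s3Configuration": {
                "bucketArn": bucket_arn,
                "inclusionPrefixes": [prefix],
            },
        },
    }


# Usage (not executed here; requires AWS credentials and existing resources):
#   import boto3
#   client = boto3.client("bedrock-agent")
#   kb = client.create_knowledge_base(**build_kb_request(...))
#   client.create_data_source(**build_data_source_request(...))
```

Keeping the payload construction separate from the API call makes the configuration easy to inspect before anything is created in your account.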

## How it works
Knowledge Bases for Amazon Bedrock helps you take advantage of Retrieval Augmented Generation (RAG), a popular technique that draws information from a data store to augment the responses generated by Large Language Models (LLMs). With this approach, your application can query the knowledge base and get back the most relevant information it contains, answering the query either with direct quotations from sources or with natural-language responses generated from the query results.

There are 2 main processes involved in carrying out RAG functionality via Knowledge Bases for Bedrock:

1. Pre-processing - Ingest source data, create embeddings for the data and populate the embeddings into a vector database.
2. Runtime Execution - Query the vector database for documents similar to the user query and return the top-k documents as the basis for the LLM to provide a response.

The following diagrams illustrate schematically how RAG is carried out. Knowledge Bases for Amazon Bedrock simplifies the setup and implementation of RAG by automating several steps in this process.

### Preprocessing Stage

![kb-architecture](images/kb-architecture-diagram-ingest.png)

### Runtime Execution Stage

![kb-architecture](images/kb-architecture-diagram-runtime.png)
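
To make the two stages concrete, here is a toy, in-memory sketch of the same flow. It is purely illustrative and is not how Knowledge Bases for Amazon Bedrock is implemented: the bag-of-words `embed` function stands in for a real embeddings model such as Titan Embeddings, a Python list stands in for the vector database, and the three sample documents are made up.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# 1. Pre-processing: embed each source document and store the vectors.
documents = [
    "Dance Dance Revolution is a rhythm game played on a dance pad",
    "Chess Master is a strategy board game",
    "Just Dance is a rhythm game where players copy dance moves",
]
index = [(doc, embed(doc)) for doc in documents]


# 2. Runtime execution: embed the query, rank by similarity, keep the top k.
def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


for doc in retrieve("rhythm dance game"):
    print(doc)
```

In a real RAG system the top-k documents returned by `retrieve` would then be passed to the LLM as context for generating the answer.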





## Step by Step Instructions

1. Navigate to [Bedrock Console](https://console.aws.amazon.com/bedrock):

<img src="images/bedrock-console.png" alt="bedrock_console" style="width: 600px;"/>

2. Select Knowledge base from the left pane:

<img src="images/bedrock-console-kb.png" alt="bedrock_kb" style="width: 200px;"/>

3. Create a new Knowledge base:

<img src="images/bedrock-kb-create.png" alt="create_kb" style="width: 800px;"/>

4. Provide knowledge base details as follows:

**Name**: [your name]-genai-workshop-kb

**Description**: A sample knowledge base for gen AI workshop

**IAM permissions**: Create and use a new service role

<img src="images/bedrock-kb-detail.png" alt="bedrock_kb_detail" style="width: 700px;"/>

5. Add a new data source with the following details:

**Name**: [your name]-genai-workshop-kb-data-source

**S3 URI**: [ The S3 URI where the ``games.csv`` file was uploaded. (You can find the S3 URI in the previous cell) ]

**Advanced Settings**:

**KMS Key For transient data storage**: Use default KMS key

**Chunking Strategy**: Default Chunking

<img src="images/bedrock-kb-ds-detail.png" alt="bedrock_kb_ds_detail" style="width: 700px;"/>

6. Set up a vector store:

**Embeddings Model**: Titan Embeddings G1 - Text v1.2

**Vector Database**: Quick create a new vector store

<img src="images/bedrock-kb-vector-db-detail.png" alt="bedrock_kb_v_detail" style="width: 700px;"/>


7. Review and Create the Knowledge Base

<img src="images/bedrock-kb-create-final.png" alt="bedrock_kb_create_final" style="width: 700px;"/>

A status banner appears at the top of the page. This should take a few seconds:

<img src="images/bedrock-kb-create-status.png" alt="bedrock_kb_create_final" style="width: 700px;"/>

When the vector DB is created successfully, the status bar turns green. Click the 'Sync' button to sync the data source with the vector DB.

<img src="images/bedrock-kb-sync.png" alt="bedrock_kb_sync" style="width: 700px;"/>

The sync process could take a while depending on the volume of data. For our lab, it should take about 10 minutes. 

While waiting for the sync process, this might be a good time to take a break and resume when the sync is complete! 
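
If you would rather check the sync from code than watch the console, the sketch below polls an ingestion job until it finishes. It assumes the `get_ingestion_job` operation of the boto3 `bedrock-agent` client and treats `COMPLETE` and `FAILED` as the terminal statuses. The function name `wait_for_ingestion`, the polling interval, and the IDs you pass in are assumptions; the data source and job IDs are not defined anywhere in this notebook and would need to be looked up first.

```python
import time


def wait_for_ingestion(client, kb_id, data_source_id, job_id,
                       poll_seconds=30, max_polls=60):
    """Poll an ingestion job until it reaches a terminal status (sketch)."""
    for _ in range(max_polls):
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=job_id,
        )["ingestionJob"]
        if job["status"] in ("COMPLETE", "FAILED"):
            return job["status"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"Ingestion job {job_id} did not finish after {max_polls} polls")


# Usage (not executed here; requires AWS credentials and real IDs):
#   import boto3
#   client = boto3.client("bedrock-agent")
#   print(wait_for_ingestion(client, "<kb id>", "<data source id>", "<job id>"))
```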


## Knowledge Base Retrieval
We can use the Bedrock Agent SDK to perform similarity search to process a query and return the chunks of text without any LLM generating the response. 

First, let's retrieve the knowledge base ID so we can use it with the SDK. You can find the Knowledge Base ID on the overview page of the Knowledge Base you created. Here's a screenshot that shows where the ID is located:

<img src="images/bedrock-kb-overview.png" alt="bedrock_kb_overview" style="width: 450px;"/>

Define a Bedrock agent runtime client:

In [None]:
import boto3

agent_runtime_client = boto3.client('bedrock-agent-runtime')
knowledgebase_id = "D4S2OCBIBI"  # Replace with your own knowledge base ID from the console (see the previous step).

In [None]:
response = agent_runtime_client.retrieve(
    knowledgeBaseId=knowledgebase_id,
    retrievalQuery={
        'text': 'Dance dance revolution'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 3  # Shows the top 3 results
        }
    }
)

Print the top 3 matching documents:

In [None]:
for text, score in [ (x['content']['text'], x['score']) for x in response['retrievalResults'] ]:
    print(f"==> Document Text: {text}, Score: {score}")
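
Each retrieval result also carries the location of the source document, which is handy for checking where a chunk came from. The helper below is a small sketch that turns the `retrievalResults` list into compact rows sorted by score. The function name `summarize_results` and the sample data are hypothetical; the field names follow the response shape used in the cell above, plus the `location` / `s3Location` / `uri` fields that S3-backed results are assumed to include.

```python
def summarize_results(retrieval_results, max_chars=80):
    """Return score, source URI, and a short snippet for each result."""
    rows = []
    for result in retrieval_results:
        uri = result.get("location", {}).get("s3Location", {}).get("uri", "unknown")
        rows.append({
            "score": round(result["score"], 3),
            "source": uri,
            "snippet": result["content"]["text"][:max_chars],
        })
    # Highest-scoring chunks first.
    return sorted(rows, key=lambda row: row["score"], reverse=True)


# Made-up sample mimicking response['retrievalResults'].
sample_results = [
    {
        "content": {"text": "Dance Dance Revolution is a rhythm game."},
        "location": {"type": "S3", "s3Location": {"uri": "s3://my-example-bucket/games.csv"}},
        "score": 0.41234,
    },
    {"content": {"text": "Another chunk without location metadata."}, "score": 0.9},
]

for row in summarize_results(sample_results):
    print(row)
```

In the notebook you would call `summarize_results(response['retrievalResults'])` instead of using the sample list.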

## Generative Question Answering
In generative question answering (GQA), we pass our question to Claude 2 but instruct it to base the answer on the information returned from our knowledge base.
Typically, to integrate a knowledge base with an LLM for a chatbot application, you would need to set up, build, and manage a QA retriever that connects both components. With Knowledge Bases for Amazon Bedrock, you simply send the question through the Bedrock API; Bedrock handles the connectivity between the LLM and the knowledge base, orchestrates the interactions, and returns the results. This helps improve developer productivity because there is no infrastructure to manage.
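
To illustrate what "instructing the model to base the answer on retrieved information" can look like when done by hand, here is a sketch that assembles a grounded prompt from retrieval results. This is not the prompt Bedrock uses internally, which is not shown in this notebook. The function name `build_grounded_prompt`, the instruction wording, and the `<document>` tags are assumptions; the `Human:` / `Assistant:` framing follows the Claude 2 text-completion prompt format.

```python
def build_grounded_prompt(question, retrieval_results):
    """Assemble a Claude-style prompt that restricts the answer to retrieved text."""
    context = "\n\n".join(
        f"<document>\n{result['content']['text']}\n</document>"
        for result in retrieval_results
    )
    return (
        "\n\nHuman: Answer the question using only the documents below. "
        "If the answer is not in the documents, say you don't know.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
        "\n\nAssistant:"
    )


# Made-up retrieval result, for illustration only.
example_results = [{"content": {"text": "Dance Dance Revolution is a rhythm game."}}]
print(build_grounded_prompt("What kind of game is Dance Dance Revolution?", example_results))
```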

First let's list all the model IDs available to find the Claude-2 model ARN. We'll need it for invoking the agent and knowledge base.  

In [None]:
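# Despite the variable name, this is the Bedrock control-plane client ("bedrock"),
# used here only to list the available foundation models.
bedrock_agent = boto3.client("bedrock")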

In [None]:
for model in bedrock_agent.list_foundation_models()['modelSummaries']:
    print(f"model ID: {model['modelId']}, model name: {model['modelName']}, model_arn: {model['modelArn']}")

In [None]:
modelId = "anthropic.claude-v2"
claude_v2_model_arn = list(filter(lambda x: x['modelId'] == modelId, bedrock_agent.list_foundation_models()['modelSummaries']))[0]['modelArn']
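
The lookup above raises an unhelpful `IndexError` if the model ID is not in the list, for example when the model is not offered in your region. A slightly more defensive variant is sketched below. The helper name `find_model_arn` is hypothetical, and the sample summaries are made-up data that only mimic the `modelSummaries` shape.

```python
def find_model_arn(model_summaries, model_id):
    """Return the ARN for model_id, or raise a descriptive error if it is missing."""
    match = next((m for m in model_summaries if m["modelId"] == model_id), None)
    if match is None:
        raise ValueError(f"Model '{model_id}' was not found in the model list")
    return match["modelArn"]


# Made-up sample mimicking list_foundation_models()['modelSummaries'].
sample_summaries = [
    {"modelId": "example.model-a", "modelArn": "arn:example:model-a"},
    {"modelId": "anthropic.claude-v2", "modelArn": "arn:example:claude-v2"},
]
print(find_model_arn(sample_summaries, "anthropic.claude-v2"))
```

In the notebook you would pass `bedrock_agent.list_foundation_models()['modelSummaries']` as the first argument.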

In [None]:
response = agent_runtime_client.retrieve_and_generate(
    input={
        'text': 'show me similar games like "Dance dance revolution" '
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledgebase_id,
            'modelArn': claude_v2_model_arn
        }
    }
)

Here's the response

In [None]:
print(response['output']['text'])
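
Besides the generated text, the `retrieve_and_generate` response is expected to include a `citations` list that links parts of the answer to the retrieved chunks. The sketch below pairs each generated passage with the S3 URIs of its sources. The helper name `extract_citations` and the sample response are hypothetical, and the field names are an assumption about the response shape, so missing keys are tolerated rather than treated as errors.

```python
def extract_citations(response):
    """Pair each generated text part with the URIs of its retrieved sources."""
    pairs = []
    for citation in response.get("citations", []):
        text = (
            citation.get("generatedResponsePart", {})
            .get("textResponsePart", {})
            .get("text", "")
        )
        uris = [
            ref.get("location", {}).get("s3Location", {}).get("uri", "unknown")
            for ref in citation.get("retrievedReferences", [])
        ]
        pairs.append((text, uris))
    return pairs


# Made-up response fragment, for illustration only.
sample_response = {
    "output": {"text": "Just Dance is similar."},
    "citations": [
        {
            "generatedResponsePart": {"textResponsePart": {"text": "Just Dance is similar."}},
            "retrievedReferences": [
                {
                    "content": {"text": "Just Dance is a rhythm game."},
                    "location": {"type": "S3", "s3Location": {"uri": "s3://my-example-bucket/games.csv"}},
                }
            ],
        }
    ],
}

for text, uris in extract_citations(sample_response):
    print(f"{text} <- {uris}")
```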

## Conclusion
Congratulations on completing this module on retrieval augmented generation! This is an important technique that combines the power of large language models with the precision of retrieval methods. By augmenting generation with relevant retrieved examples, the responses we received become more coherent, consistent and grounded. You should feel proud of learning this innovative approach. I'm sure the knowledge you've gained will be very useful for building creative and engaging language generation systems. Well done!

In the above implementation of RAG-based question answering, we explored the following concepts and how to implement them using Knowledge Bases for Amazon Bedrock:

- Creating a knowledge base using Knowledge Bases for Amazon Bedrock
- Loading documents and generating embeddings to create a vector store (Amazon OpenSearch Serverless) managed by Knowledge Bases for Amazon Bedrock
- Retrieving documents similar to the question
- Using the Bedrock agent runtime SDK to retrieve and generate a human-friendly response based on the user's question

# Thank You