# Fully managed RAG with Amazon Bedrock Knowledge Bases
> *This notebook should work well with the **`conda_python3`** kernel in SageMaker Studio on an `ml.t3.medium` instance.*

In the prior notebook, we built a RAG solution from scratch using Amazon OpenSearch Service and abstractions provided by the `langchain` library. In this notebook, we will look at a fully managed alternative that handles the entire pipeline for us, including:
- **Document ingestion**: Sourcing documents from S3 and ingesting them into a vector database
- **Document processing**: Extracting text from documents and chunking them into smaller pieces
- **Vectorization**: Converting text into vector embeddings
- **Retrieval**: Searching for relevant documents based on a query
- **Generation**: Generating a response based on the retrieved documents
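
To make the document processing step more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is only an illustration: the helper name `chunk_text` and the sizes are hypothetical, and it does not describe how Bedrock Knowledge Bases chunks documents internally.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.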

In [None]:
import sys
import os
module_path = "../.."
sys.path.append(os.path.abspath(module_path))
from utils.environment_validation import validate_environment, validate_model_access
validate_environment()

In [None]:
required_models = [
    "amazon.titan-embed-text-v1",
    "amazon.titan-embed-text-v2:0",
    "us.anthropic.claude-3-5-haiku-20241022-v1:0",
    "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
]
validate_model_access(required_models)

In [None]:
from pathlib import Path
import boto3
from rag_utils.kb_utils import upload_document, create_kb, create_data_source, get_collection_data
from rich import print as rprint
from rich.markdown import Markdown

boto3_session = boto3.Session()
REGION = boto3_session.region_name
BEDROCK_AGENT_CLIENT = boto3.client("bedrock-agent-runtime")

Before the documents can be ingested, they must first be uploaded to an S3 bucket. The following code uploads the documents to S3 and runs an `aws s3 ls` command to confirm the upload.


In [None]:
model_governance_docs_path = Path("../data/model_risk")
s3_docs_path = upload_document(doc_path=model_governance_docs_path, s3_prefix="model-risk-docs")
!aws s3 ls {s3_docs_path} --recursive --human-readable --summarize
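
`upload_document` is a workshop helper whose implementation is not shown in this notebook. As a rough sketch of what a helper like it might do, the function below walks a local folder and uploads each file with the boto3 S3 client's `upload_file` method. The name `upload_directory` and the bucket argument are hypothetical, and the client is passed in by the caller.

```python
from pathlib import Path


def upload_directory(s3_client, doc_path, bucket, s3_prefix):
    """Upload every file under doc_path to s3://bucket/s3_prefix/ and return that URI."""
    doc_path = Path(doc_path)
    for file_path in sorted(doc_path.rglob("*")):
        if file_path.is_file():
            # Preserve the relative folder layout under the prefix.
            key = f"{s3_prefix}/{file_path.relative_to(doc_path).as_posix()}"
            s3_client.upload_file(str(file_path), bucket, key)
    return f"s3://{bucket}/{s3_prefix}/"
```

With real credentials, this could be called as `upload_directory(boto3.client("s3"), "../data/model_risk", "<your-bucket>", "model-risk-docs")`, where the bucket name is a placeholder.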

Before a Knowledge Base can be created, we need to configure a vector database (Vector DB) to store and retrieve the document embeddings. The supported options are listed in the [Knowledge Base setup documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html). For our purposes, the workshop already includes a pre-provisioned [Amazon OpenSearch Serverless Collection](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-overview.html) that we can use. The function below looks up the endpoint and ARN of the OpenSearch collection and returns them.

In [None]:
collection_endpoint, collection_arn = get_collection_data()
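
If you are curious how such a lookup could work outside the workshop, the sketch below uses the `batch_get_collection` call of the boto3 `opensearchserverless` client. The function name `lookup_collection` and the collection name are hypothetical; this is not necessarily how `get_collection_data` is implemented.

```python
def lookup_collection(aoss_client, collection_name):
    """Return (endpoint, arn) for a named OpenSearch Serverless collection."""
    response = aoss_client.batch_get_collection(names=[collection_name])
    details = response["collectionDetails"]
    if not details:
        raise LookupError(f"collection {collection_name!r} not found")
    return details[0]["collectionEndpoint"], details[0]["arn"]
```

A call might look like `lookup_collection(boto3.client("opensearchserverless"), "<collection-name>")`.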

We can now create the Knowledge Base and ingest the documents.

In [None]:
kb_id = create_kb(
    collection_arn=collection_arn,
    collection_endpoint=collection_endpoint,
    index_name="model-risk-docs",
    kb_name="model-risk-docs",
    kb_description="Model Risk Documents",
)
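
For reference, a helper like `create_kb` most likely ends in a `create_knowledge_base` call on the boto3 `bedrock-agent` client. The sketch below only builds the request for that call, so it can be inspected without touching AWS. The function name `build_kb_request`, the role ARN, and the index field names are assumptions and need to match your own IAM role and vector index mapping.

```python
def build_kb_request(kb_name, kb_description, role_arn, embedding_model_arn,
                     collection_arn, index_name):
    """Build keyword arguments for bedrock-agent's create_knowledge_base call."""
    return {
        "name": kb_name,
        "description": kb_description,
        "roleArn": role_arn,
        "knowledgeBaseConfiguration": {
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {"embeddingModelArn": embedding_model_arn},
        },
        "storageConfiguration": {
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": collection_arn,
                "vectorIndexName": index_name,
                # Field names are assumptions; they must match the index mapping.
                "fieldMapping": {
                    "vectorField": "vector",
                    "textField": "text",
                    "metadataField": "metadata",
                },
            },
        },
    }
```

The resulting dictionary could then be passed as `boto3.client("bedrock-agent").create_knowledge_base(**request)`, and the new ID read from `["knowledgeBase"]["knowledgeBaseId"]` in the response.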

In [None]:
# create a data source for the uploaded documents
data_source_id = create_data_source(kb_id=kb_id, s3_path=s3_docs_path)
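
Documents only become searchable once an ingestion job has synced the data source. The workshop helper may already take care of this; in case you need to trigger or monitor a sync yourself, here is a minimal sketch using the `start_ingestion_job` and `get_ingestion_job` calls of the boto3 `bedrock-agent` client. The function name `run_ingestion` and the polling interval are hypothetical.

```python
import time


def run_ingestion(agent_client, kb_id, data_source_id, poll_seconds=10):
    """Start an ingestion job and block until it reaches a terminal status."""
    job = agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=data_source_id
    )["ingestionJob"]
    while job["status"] not in ("COMPLETE", "FAILED", "STOPPED"):
        time.sleep(poll_seconds)
        job = agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]
```

A call such as `run_ingestion(boto3.client("bedrock-agent"), kb_id, data_source_id)` returns the final status, so the notebook can stop early if the sync failed.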

We'll create a helper function to query the KB, along with a second function that formats the output to return the generated response together with specific references to the documents that were used to generate it.


In [None]:
def query_knowledge_base(kb_id, query_text, session_id=None):
    """Query the KB with RetrieveAndGenerate; pass session_id to continue a conversation."""
    params = {
        'input': {'text': query_text},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': f'arn:aws:bedrock:{REGION}::foundation-model/anthropic.claude-3-5-haiku-20241022-v1:0'
            }
        }
    }
    # Reuse an existing session so follow-up questions keep their context.
    if session_id is not None:
        params['sessionId'] = session_id

    response = BEDROCK_AGENT_CLIENT.retrieve_and_generate(**params)
    return response

def print_response(response):
    """Print the generated answer with inline [n] citation markers and a reference list."""
    ref_number = 0
    # Span offsets refer to the original text. ref_offset tracks how far earlier
    # insertions have shifted it, starting at 1 so markers land after the span's last character.
    ref_offset = 1
    output = response["output"]["text"]
    references = []

    for citation in response["citations"]:
        citation_end = citation["generatedResponsePart"]["textResponsePart"]["span"]["end"]
        ref_numbers = []

        for reference in citation["retrievedReferences"]:
            ref_location = reference["location"]["s3Location"]["uri"]
            # The page-number key may be missing for some sources, so don't assume it exists.
            ref_page_number = reference["metadata"].get("x-amz-bedrock-kb-document-page-number")
            ref_number += 1
            page = f" (page {int(ref_page_number)})" if ref_page_number is not None else ""
            references.append(f"[{ref_number}]: {ref_location}{page}")
            ref_numbers.append(str(ref_number))

        citation_string = "[" + ", ".join(ref_numbers) + "]"
        insert_at = citation_end + ref_offset
        output = output[:insert_at] + citation_string + output[insert_at:]
        ref_offset += len(citation_string)

    reference_str = "\n".join(references)
    rprint(output)
    rprint(reference_str)
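
The offset bookkeeping in `print_response` is easier to follow on a toy input. The sketch below isolates just that step, assuming span end offsets are inclusive (which is what the initial offset of 1 implies) and arrive in ascending order. The helper name `insert_markers` is hypothetical.

```python
def insert_markers(text, span_ends):
    """Insert [1], [2], ... after each inclusive end offset, in ascending order."""
    shift = 1  # +1 because each end offset points at the last cited character
    for number, end in enumerate(span_ends, start=1):
        marker = f"[{number}]"
        position = end + shift
        text = text[:position] + marker + text[position:]
        shift += len(marker)  # later offsets still refer to the unmodified text
    return text


print(insert_markers("Cats purr. Dogs bark.", [9, 20]))
# → Cats purr.[1] Dogs bark.[2]
```

Without the running `shift`, the second marker would be placed three characters too early, inside the second sentence.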


In [None]:
query_text = "What are the best practices for model governance?"
response = query_knowledge_base(kb_id=kb_id, query_text=query_text)
session_id = response["sessionId"]
print_response(response)

In [None]:
query_text = "What are some example risks that I should document for a credit risk model?"
response = query_knowledge_base(kb_id=kb_id, query_text=query_text, session_id=session_id)

print_response(response)

In [None]:
query_text = "How about for a fraud detection model?"
response = query_knowledge_base(kb_id=kb_id, query_text=query_text, session_id=session_id)

print_response(response)