<h2>How to create an index and ingest documents in an Amazon Bedrock Knowledge Base</h2>

This notebook provides sample code for creating an empty OpenSearch Serverless (OSS) index for an Amazon Bedrock Knowledge Base and then ingesting documents into it.


<h5>Notebook Walkthrough</h5>

In this notebook we create a data pipeline that ingests documents (typically stored in Amazon S3) into a knowledge base, that is, a vector database such as Amazon OpenSearch Serverless (AOSS), so that they are available for lookup when a question is received.
<ul>
<li>Load the documents into the knowledge base by connecting your S3 bucket (the data source).</li>
<li>Ingestion: the knowledge base splits the documents into smaller chunks (based on the selected chunking strategy), generates embeddings, and stores them in the associated vector store.</li>
</ul>


<img src="./assets/images/data_ingestion.png" alt="Data Ingestion in Index of Knowledge Base" style="margin:auto">

<h5>Steps:</h5>
<ul>
<li>Create an Amazon Bedrock Knowledge Base execution role with the policies needed to read data from S3 and write embeddings into OSS.</li>
<li>Create an empty OpenSearch Serverless index.</li>
<li>Download the documents.</li>
<li>Create an Amazon Bedrock knowledge base.</li>
<li>Create a data source within the knowledge base that connects to Amazon S3.</li>
<li>Start an ingestion job using the KB APIs, which reads data from S3, chunks it, converts the chunks into embeddings using the Cohere Embed Multilingual model, and stores these embeddings in AOSS, all without having to build, deploy, and manage a data pipeline.</li>
</ul>

Once the data is available in the Bedrock Knowledge Base, a question answering application can be built using the Knowledge Base APIs provided by Amazon Bedrock, as demonstrated by the other notebooks in the same folder.

<div class="alert alert-block alert-info">
<b>Note:</b> This notebook has been tested in <strong>Mumbai (ap-south-1)</strong> with <strong>Python 3.10.14</strong>.
</div>

<h5>Pre-requisites</h5>

This notebook requires permissions to:
<ul>
<li>create and delete Amazon IAM roles</li>
<li>create, update and delete Amazon S3 buckets</li>
<li>access Amazon Bedrock</li>
<li>access Amazon OpenSearch Serverless</li>
</ul>

If running on SageMaker Studio, you should add the following managed policies to your role:
<ul>
<li>IAMFullAccess</li>
<li>AWSLambda_FullAccess</li>
<li>AmazonS3FullAccess</li>
<li>AmazonBedrockFullAccess</li>
<li>Custom policy for Amazon OpenSearch Serverless such as:
<code>
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "aoss:*",
            "Resource": "*"
        }
    ]
}
</code></li>
</ul>

<div class="alert alert-block alert-info">
<b>Note:</b> Please make sure to enable `Cohere Embed Multilingual`, `Anthropic Claude 3 Sonnet`, and `Anthropic Claude 3 Haiku` model access in the Amazon Bedrock console. The notebook uses Cohere Embed Multilingual to create the embeddings, and Anthropic Claude 3 Sonnet and Claude 3 Haiku to test the knowledge base once it's created.
</div>


<h3>Setup</h3>
Before running the rest of this notebook, run the cells below to make sure the necessary libraries are installed and to connect to Bedrock.

In [1]:
%pip install -U opensearch-py==2.7.1
%pip install -U boto3==1.34.162
%pip install -U retrying==1.3.4

Collecting opensearch-py==2.7.1
  Using cached opensearch_py-2.7.1-py3-none-any.whl.metadata (6.9 kB)
Collecting Events (from opensearch-py==2.7.1)
  Using cached Events-0.5-py3-none-any.whl.metadata (3.9 kB)
Using cached opensearch_py-2.7.1-py3-none-any.whl (325 kB)
Using cached Events-0.5-py3-none-any.whl (6.8 kB)
Installing collected packages: Events, opensearch-py
Successfully installed Events-0.5 opensearch-py-2.7.1
Note: you may need to restart the kernel to use updated packages.
Collecting boto3==1.34.162
  Using cached boto3-1.34.162-py3-none-any.whl.metadata (6.6 kB)
Collecting botocore<1.35.0,>=1.34.162 (from boto3==1.34.162)
  Using cached botocore-1.34.162-py3-none-any.whl.metadata (5.7 kB)
Using cached boto3-1.34.162-py3-none-any.whl (139 kB)
Using cached botocore-1.34.162-py3-none-any.whl (12.5 MB)
Installing collected packages: botocore, boto3
  Attempting uninstall: botocore
    Found existing installation: botocore 1.35.12
    Uninstalling botocore-1.35.12:
      Succe

In [2]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [3]:
import warnings
warnings.filterwarnings('ignore')

In [4]:
import json
import os
import boto3
from botocore.exceptions import ClientError
import pprint
from utility import create_bedrock_execution_role, create_oss_policy_attach_bedrock_execution_role, create_policies_in_oss, interactive_sleep
import random
from retrying import retry


In [5]:
suffix = random.randrange(200, 900)

sts_client = boto3.client('sts')
boto3_session = boto3.session.Session()
region_name = boto3_session.region_name
bedrock_agent_client = boto3_session.client('bedrock-agent', region_name=region_name)
service = 'aoss'
s3_client = boto3.client('s3')
account_id = sts_client.get_caller_identity()["Account"]
s3_suffix = f"{region_name}-{account_id}"
bucket_name = f'bedrock-kb-{s3_suffix}' # replace it with your bucket name.
pp = pprint.PrettyPrinter(indent=2)

In [6]:
# Check whether the bucket exists; if not, create the S3 bucket for the knowledge base data source
try:
    s3_client.head_bucket(Bucket=bucket_name)
    print(f'Bucket {bucket_name} Exists')
except ClientError as e:
    print(f'Creating bucket {bucket_name}')
    # us-east-1 is the default location, so it is created without a LocationConstraint
    if region_name == "us-east-1":
        s3bucket = s3_client.create_bucket(Bucket=bucket_name)
    else:
        s3bucket = s3_client.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={'LocationConstraint': region_name}
        )

Creating bucket bedrock-kb-ap-south-1-874163252636
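The region check above exists because a bucket in us-east-1 is created without a `CreateBucketConfiguration`, while every other region needs a `LocationConstraint`. As a minimal sketch, that branch can be factored into a small helper so it can be checked without calling AWS. The name `create_bucket_kwargs` and the example bucket name are hypothetical and not part of the notebook or boto3.

```python
def create_bucket_kwargs(bucket_name: str, region_name: str) -> dict:
    """Build keyword arguments for s3_client.create_bucket (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location, so no LocationConstraint is sent for it
    if region_name != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region_name}
    return kwargs


# Example with a made-up bucket name; no AWS call is made here
print(create_bucket_kwargs("bedrock-kb-example", "ap-south-1"))
```

In the notebook this could be used as `s3_client.create_bucket(**create_bucket_kwargs(bucket_name, region_name))`.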


In [7]:
%store bucket_name

Stored 'bucket_name' (str)


<h3>Create a vector store - OpenSearch Serverless index</h3>

<h4>Step 1 - Create OSS policies and collection</h4>
First of all, we have to create a vector store. In this section we will use <strong>Amazon OpenSearch Serverless</strong>.

Amazon OpenSearch Serverless is a serverless option in Amazon OpenSearch Service. As a developer, you can use OpenSearch Serverless to run petabyte-scale workloads without configuring, managing, and scaling OpenSearch clusters. You get the same interactive millisecond response times as OpenSearch Service with the simplicity of a serverless environment. Pay only for what you use by automatically scaling resources to provide the right amount of capacity for your application—without impacting data ingestion.

In [8]:
import boto3
import time
vector_store_name = f'bedrock-sample-rag-{suffix}'
index_name = f"bedrock-sample-index-{suffix}"
aoss_client = boto3_session.client('opensearchserverless')
bedrock_kb_execution_role = create_bedrock_execution_role(bucket_name=bucket_name)
bedrock_kb_execution_role_arn = bedrock_kb_execution_role['Role']['Arn']

In [9]:
# create security, network and data access policies within OSS
encryption_policy, network_policy, access_policy = create_policies_in_oss(vector_store_name=vector_store_name,
                       aoss_client=aoss_client,
                       bedrock_kb_execution_role_arn=bedrock_kb_execution_role_arn)
collection = aoss_client.create_collection(name=vector_store_name,type='VECTORSEARCH')
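`create_policies_in_oss` comes from the notebook's `utility` module, whose source is not shown here. As a rough illustration only, the three OpenSearch Serverless policies for a collection generally have JSON shapes like the ones below. The function name `sketch_oss_policy_documents`, the choice of public network access, the AWS-owned key, and the exact permission list are assumptions for this example and may differ from what the utility actually creates.

```python
import json


def sketch_oss_policy_documents(collection_name: str, principal_arns: list) -> dict:
    """Return illustrative encryption, network and data access policy documents."""
    collection_resource = f"collection/{collection_name}"
    # Encryption policy: a single JSON object (assumes an AWS-owned key)
    encryption = {
        "Rules": [{"Resource": [collection_resource], "ResourceType": "collection"}],
        "AWSOwnedKey": True,
    }
    # Network policy: a JSON list (assumes public access, for illustration only)
    network = [{
        "Rules": [{"Resource": [collection_resource], "ResourceType": "collection"}],
        "AllowFromPublic": True,
    }]
    # Data access policy: who may act on the collection and on its indexes
    data_access = [{
        "Rules": [
            {
                "Resource": [collection_resource],
                "Permission": [
                    "aoss:CreateCollectionItems",
                    "aoss:DeleteCollectionItems",
                    "aoss:UpdateCollectionItems",
                    "aoss:DescribeCollectionItems",
                ],
                "ResourceType": "collection",
            },
            {
                "Resource": [f"index/{collection_name}/*"],
                "Permission": [
                    "aoss:CreateIndex",
                    "aoss:DeleteIndex",
                    "aoss:UpdateIndex",
                    "aoss:DescribeIndex",
                    "aoss:ReadDocument",
                    "aoss:WriteDocument",
                ],
                "ResourceType": "index",
            },
        ],
        "Principal": principal_arns,
    }]
    # The OpenSearch Serverless APIs take each policy as a JSON string
    return {
        "encryption": json.dumps(encryption),
        "network": json.dumps(network),
        "data_access": json.dumps(data_access),
    }


# Made-up collection name and role ARN, for illustration only
policy_documents = sketch_oss_policy_documents(
    "bedrock-sample-rag-example",
    ["arn:aws:iam::111122223333:role/ExampleKnowledgeBaseRole"],
)
print(sorted(policy_documents))
```

The data access policy is the piece that lets the knowledge base execution role and your own identity create the index and write documents into it.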

In [10]:
pp.pprint(collection)

{ 'ResponseMetadata': { 'HTTPHeaders': { 'connection': 'keep-alive',
                                         'content-length': '315',
                                         'content-type': 'application/x-amz-json-1.0',
                                         'date': 'Thu, 05 Sep 2024 15:19:34 '
                                                 'GMT',
                                         'x-amzn-requestid': '01ad6cd7-e0d6-431d-b8dc-68a75228dd78'},
                        'HTTPStatusCode': 200,
                        'RequestId': '01ad6cd7-e0d6-431d-b8dc-68a75228dd78',
                        'RetryAttempts': 0},
  'createCollectionDetail': { 'arn': 'arn:aws:aoss:ap-south-1:874163252636:collection/qucjcmpe10mr2kv6jaf2',
                              'createdDate': 1725549573961,
                              'id': 'qucjcmpe10mr2kv6jaf2',
                              'kmsKeyArn': 'auto',
                              'lastModifiedDate': 1725549573961,
                            

In [11]:
%store encryption_policy network_policy access_policy collection

Stored 'encryption_policy' (dict)
Stored 'network_policy' (dict)
Stored 'access_policy' (dict)
Stored 'collection' (dict)


In [12]:
# Get the OpenSearch serverless collection URL
collection_id = collection['createCollectionDetail']['id']
host = collection_id + '.' + region_name + '.aoss.amazonaws.com'
print(host)

qucjcmpe10mr2kv6jaf2.ap-south-1.aoss.amazonaws.com


In [13]:
# Wait for collection creation
# This can take a couple of minutes to finish
response = aoss_client.batch_get_collection(names=[vector_store_name])
# Periodically check collection status
while (response['collectionDetails'][0]['status']) == 'CREATING':
    print('Creating collection...')
    interactive_sleep(30)
    response = aoss_client.batch_get_collection(names=[vector_store_name])
print('\nCollection successfully created:')
pp.pprint(response["collectionDetails"])

Creating collection...
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
Creating collection...........
..............................
Collection successfully created:
[ { 'arn': 'arn:aws:aoss:ap-south-1:874163252636:collection/qucjcmpe10mr2kv6jaf2',
    'collectionEndpoint': 'https://qucjcmpe10mr2kv6jaf2.ap-south-1.aoss.amazonaws.com',
    'createdDate': 1725549573961,
    'dashboardEndpoint': 'https://qucjcmpe10mr2kv6jaf2.ap-south-1.aoss.amazonaws.com/_dashboards',
    'id': 'qucjcmpe10mr2kv6jaf2',
    'kmsKeyArn': 'auto',
    'lastModifiedDate': 1725549870683,
    'name': 'bedrock-sample-rag-416',
    'standbyReplicas': 'ENABLED',
    'status': 'ACTIVE',
    'type': 'VECTORSEARCH'}]
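The loop above keeps polling while the status is `CREATING` and treats any other status as success. A slightly more defensive sketch adds a timeout and returns the final status so the caller can check it. The helper `wait_for_status` and its parameters are hypothetical; the demonstration uses a fake status source rather than a real AWS call.

```python
import time


def wait_for_status(fetch_status, pending=("CREATING",), interval_seconds=30,
                    timeout_seconds=900, sleep=time.sleep):
    """Call fetch_status() until it returns a non-pending status or the timeout elapses."""
    waited = 0
    status = fetch_status()
    while status in pending:
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still {status} after {waited} seconds")
        sleep(interval_seconds)
        waited += interval_seconds
        status = fetch_status()
    return status


# Demonstration with a fake status sequence and a no-op sleep
fake_statuses = iter(["CREATING", "CREATING", "ACTIVE"])
final_status = wait_for_status(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final_status)
```

In the notebook, `fetch_status` could be `lambda: aoss_client.batch_get_collection(names=[vector_store_name])['collectionDetails'][0]['status']`, followed by a check that the returned status is `ACTIVE` rather than `FAILED`.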


In [14]:
# Create an OpenSearch Serverless access policy and attach it to the Bedrock execution role
try:
    create_oss_policy_attach_bedrock_execution_role(collection_id=collection_id,
                                                    bedrock_kb_execution_role=bedrock_kb_execution_role)
    # It can take up to a minute for data access rules to be enforced
    interactive_sleep(60)
except Exception as e:
    print("Policy already exists")
    pp.pprint(e)

Opensearch serverless arn:  arn:aws:iam::874163252636:policy/AmazonBedrockOSSPolicyForKnowledgeBase_653
............................................................

<h4>Step 2 - Create vector index</h4>

In [15]:
# Create the vector index in OpenSearch Serverless with a knn_vector field mapping, specifying the dimension, method name, and engine.
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth, RequestError
credentials = boto3.Session().get_credentials()
awsauth = AWSV4SignerAuth(credentials, region_name, service)

index_name = f"bedrock-sample-index-{suffix}"
body_json = {
   "settings": {
      "index.knn": "true",
       "number_of_shards": 1,
       "knn.algo_param.ef_search": 512,
       "number_of_replicas": 0,
   },
   "mappings": {
      "properties": {
         "vector": {
            "type": "knn_vector",
            "dimension": 1024,
             "method": {
                 "name": "hnsw",
                 "engine": "faiss",
                 "space_type": "l2"
             },
         },
         "text": {
            "type": "text"
         },
         "text-metadata": {
            "type": "text"         
         }
      }
   }
}

# Build the OpenSearch client
oss_client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300
)
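The `dimension` in the index mapping has to match the length of the vectors produced by the embedding model; 1024 is the output size of Cohere Embed Multilingual v3, which this notebook uses. As a minimal sketch, the body can be built from the model ID so the two cannot drift apart. `EMBEDDING_DIMENSIONS` and `build_index_body` are hypothetical names, and the dimension table should be verified against the model documentation.

```python
# Output vector sizes for a few Bedrock embedding models
# (illustrative table; verify against the model documentation)
EMBEDDING_DIMENSIONS = {
    "cohere.embed-multilingual-v3": 1024,
    "cohere.embed-english-v3": 1024,
    "amazon.titan-embed-text-v1": 1536,
}


def build_index_body(model_id, vector_field="vector", text_field="text",
                     metadata_field="text-metadata"):
    """Return an index body whose knn_vector dimension matches the model."""
    if model_id not in EMBEDDING_DIMENSIONS:
        raise ValueError(f"Unknown embedding dimension for model {model_id!r}")
    return {
        # Same settings as the notebook's body_json
        "settings": {
            "index.knn": "true",
            "number_of_shards": 1,
            "knn.algo_param.ef_search": 512,
            "number_of_replicas": 0,
        },
        "mappings": {
            "properties": {
                vector_field: {
                    "type": "knn_vector",
                    "dimension": EMBEDDING_DIMENSIONS[model_id],
                    "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
                },
                text_field: {"type": "text"},
                metadata_field: {"type": "text"},
            }
        },
    }


body = build_index_body("cohere.embed-multilingual-v3")
print(body["mappings"]["properties"]["vector"]["dimension"])
```

The same field names must later be passed in the knowledge base `fieldMapping`, so keeping them as function parameters makes that dependency explicit.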


In [16]:
# Create index
try:
    response = oss_client.indices.create(index=index_name, body=json.dumps(body_json))
    print('\nCreating index:')
    pp.pprint(response)

    # index creation can take up to a minute
    interactive_sleep(60)
except RequestError as e:
    # you can delete the index if it already exists
    # oss_client.indices.delete(index=index_name)
    print(f'Error while trying to create the index, with error {e.error}\nyou may uncomment the delete call above to delete and recreate the index')
    


Creating index:
{ 'acknowledged': True,
  'index': 'bedrock-sample-index-416',
  'shards_acknowledged': True}
............................................................

<h3>Download data to ingest into our knowledge base</h3>

In [17]:
# Download and prepare dataset
!mkdir -p ./assets/data

from urllib.request import urlretrieve
urls = [
    'https://s2.q4cdn.com/299287126/files/doc_financials/2023/ar/2022-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2022/ar/2021-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2021/ar/Amazon-2020-Shareholder-Letter-and-1997-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2020/ar/2019-Shareholder-Letter.pdf'
]

filenames = [
    'AMZN-2022-Shareholder-Letter.pdf',
    'AMZN-2021-Shareholder-Letter.pdf',
    'AMZN-2020-Shareholder-Letter.pdf',
    'AMZN-2019-Shareholder-Letter.pdf'
]

data_root = "./assets/data/"

for idx, url in enumerate(urls):
    file_path = data_root + filenames[idx]
    urlretrieve(url, file_path)


<h4>Upload data to S3 bucket data source</h4>

In [18]:
# Upload data to the S3 bucket that will be configured as the data source for the knowledge base
s3_client = boto3.client("s3")
def uploadDirectory(path, bucket_name):
    for root, dirs, files in os.walk(path):
        for file in files:
            # the object key is the bare file name, so subfolders are flattened
            s3_client.upload_file(os.path.join(root, file), bucket_name, file)

uploadDirectory(data_root, bucket_name)
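`uploadDirectory` uses the bare file name as the S3 key, which is fine for the flat folder used here, but two files with the same name in different subfolders would overwrite each other. A possible variant, sketched below, derives the key from the path relative to the upload root. `iter_upload_targets` is a hypothetical helper, and the demonstration runs on a throwaway directory without touching S3.

```python
import os
import tempfile


def iter_upload_targets(path: str):
    """Yield (local_path, s3_key) pairs, using the path relative to `path` as the key."""
    for root, _dirs, files in os.walk(path):
        for file_name in sorted(files):
            local_path = os.path.join(root, file_name)
            relative = os.path.relpath(local_path, path)
            # S3 keys use forward slashes regardless of the local OS
            yield local_path, relative.replace(os.sep, "/")


# Demonstration on a temporary directory tree with a name collision
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "2022"))
    for rel in ("letter.pdf", os.path.join("2022", "letter.pdf")):
        with open(os.path.join(tmp, rel), "w") as handle:
            handle.write("placeholder")
    keys = sorted(key for _path, key in iter_upload_targets(tmp))
print(keys)
```

With the notebook's client, each pair could then be uploaded with `s3_client.upload_file(local_path, bucket_name, key)`.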

<h3>Create Knowledge Base</h3>

Steps:
<ul>
<li>Initialize the OpenSearch Serverless configuration, which includes the collection ARN, index name, vector field, text field, and metadata field.</li>
<li>Initialize the chunking strategy, based on which the KB splits the documents into chunks of at most the number of tokens set in <strong>chunkingStrategyConfiguration</strong>.</li>
<li>Initialize the S3 configuration, which will be used to create the data source object later.</li>
<li>Initialize the Cohere Embed Multilingual embeddings model ARN, which will be used to create the embeddings for each of the text chunks.</li>
</ul>
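To build intuition for fixed-size chunking with overlap, here is a simplified sketch using the values configured in this notebook (`maxTokens` of 512 and `overlapPercentage` of 20). It is not the algorithm Bedrock runs: the function `fixed_size_chunks` is hypothetical, and it works on a plain list of tokens, whereas the service uses its own tokenizer and boundary handling.

```python
def fixed_size_chunks(tokens, max_tokens=512, overlap_percentage=20):
    """Split a token list into windows of at most max_tokens with overlapping edges."""
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in the range 0 to 99")
    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the document
        start += step
    return chunks


# 1000 placeholder tokens standing in for a document
tokens = [f"t{i}" for i in range(1000)]
chunks = fixed_size_chunks(tokens)
print([len(chunk) for chunk in chunks])  # [512, 512, 180]
```

Each chunk repeats the last 102 tokens of the previous one, which helps a retrieved chunk keep some of the context that would otherwise be cut at the boundary.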

In [19]:
opensearchServerlessConfiguration = {
            "collectionArn": collection["createCollectionDetail"]['arn'],
            "vectorIndexName": index_name,
            "fieldMapping": {
                "vectorField": "vector",
                "textField": "text",
                "metadataField": "text-metadata"
            }
        }

# Ingest strategy - How to ingest data from the data source
chunkingStrategyConfiguration = {
    "chunkingStrategy": "FIXED_SIZE",
    "fixedSizeChunkingConfiguration": {
        "maxTokens": 512,
        "overlapPercentage": 20
    }
}

# The data source to ingest documents from, into the OpenSearch serverless knowledge base index
s3Configuration = {
    "bucketArn": f"arn:aws:s3:::{bucket_name}",
    # "inclusionPrefixes":["*.*"] # you can use this if you want to create a KB using data within s3 prefixes.
}

# The embedding model used by Bedrock to embed ingested documents, and realtime prompts
embeddingModelArn = f"arn:aws:bedrock:{region_name}::foundation-model/cohere.embed-multilingual-v3"

name = f"bedrock-sample-knowledge-base-{suffix}"
description = "Amazon shareholder letter knowledge base."
roleArn = bedrock_kb_execution_role_arn


Provide the above configurations as input to the `create_knowledge_base` method, which creates the knowledge base.

In [20]:
# Create a KnowledgeBase
from retrying import retry

@retry(wait_random_min=1000, wait_random_max=2000,stop_max_attempt_number=7)
def create_knowledge_base_func():
    create_kb_response = bedrock_agent_client.create_knowledge_base(
        name = name,
        description = description,
        roleArn = roleArn,
        knowledgeBaseConfiguration = {
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": embeddingModelArn
            }
        },
        storageConfiguration = {
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration":opensearchServerlessConfiguration
        }
    )
    return create_kb_response["knowledgeBase"]
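The `@retry` decorator from the `retrying` package calls the function again whenever it raises, waiting a random 1000 to 2000 milliseconds between attempts and giving up after 7 attempts. A plausible reason for using it here (an assumption, since the notebook does not say) is that a freshly created role and access policy can take a short while to become usable. The standard-library sketch below mimics that behavior; `retry_with_random_wait` and the `flaky` function are hypothetical and exist only for illustration.

```python
import random
import time


def retry_with_random_wait(func, max_attempts=7, wait_min_s=1.0, wait_max_s=2.0,
                           sleep=time.sleep):
    """Call func() until it succeeds, re-raising the last error after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            # Random wait between attempts, like wait_random_min / wait_random_max
            sleep(random.uniform(wait_min_s, wait_max_s))


# A stand-in for an API call that fails twice before succeeding
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("not ready yet")
    return "created"


result = retry_with_random_wait(flaky, sleep=lambda seconds: None)
print(result, calls["count"])
```

Because every exception triggers a retry, a permanent error such as a validation failure is also retried until the attempts run out, which is worth keeping in mind when reading the error printed by the next cell.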

In [21]:
try:
    kb = create_knowledge_base_func()
except Exception as err:
    print(f"{err=}, {type(err)=}")

In [22]:
pp.pprint(kb)

{ 'createdAt': datetime.datetime(2024, 9, 5, 15, 26, 39, 640513, tzinfo=tzlocal()),
  'description': 'Amazon shareholder letter knowledge base.',
  'knowledgeBaseArn': 'arn:aws:bedrock:ap-south-1:874163252636:knowledge-base/KV0AYQHWPO',
  'knowledgeBaseConfiguration': { 'type': 'VECTOR',
                                  'vectorKnowledgeBaseConfiguration': { 'embeddingModelArn': 'arn:aws:bedrock:ap-south-1::foundation-model/cohere.embed-multilingual-v3'}},
  'knowledgeBaseId': 'KV0AYQHWPO',
  'name': 'bedrock-sample-knowledge-base-416',
  'roleArn': 'arn:aws:iam::874163252636:role/AmazonBedrockExecutionRoleForKnowledgeBase_653',
  'status': 'CREATING',
  'storageConfiguration': { 'opensearchServerlessConfiguration': { 'collectionArn': 'arn:aws:aoss:ap-south-1:874163252636:collection/qucjcmpe10mr2kv6jaf2',
                                                                   'fieldMapping': { 'metadataField': 'text-metadata',
                                                                

In [23]:
# Get KnowledgeBase 
get_kb_response = bedrock_agent_client.get_knowledge_base(knowledgeBaseId = kb['knowledgeBaseId'])

Next we need to create a data source, which will be associated with the knowledge base created above. Once the data source is ready, we can then start to ingest the documents.

In [24]:
# Create a DataSource in KnowledgeBase 
create_ds_response = bedrock_agent_client.create_data_source(
    name = name,
    description = description,
    knowledgeBaseId = kb['knowledgeBaseId'],
    dataSourceConfiguration = {
        "type": "S3",
        "s3Configuration":s3Configuration
    },
    vectorIngestionConfiguration = {
        "chunkingConfiguration": chunkingStrategyConfiguration
    }
)
ds = create_ds_response["dataSource"]
pp.pprint(ds)

{ 'createdAt': datetime.datetime(2024, 9, 5, 15, 26, 40, 575472, tzinfo=tzlocal()),
  'dataDeletionPolicy': 'DELETE',
  'dataSourceConfiguration': { 's3Configuration': { 'bucketArn': 'arn:aws:s3:::bedrock-kb-ap-south-1-874163252636'},
                               'type': 'S3'},
  'dataSourceId': 'QSPUTYFUTO',
  'description': 'Amazon shareholder letter knowledge base.',
  'knowledgeBaseId': 'KV0AYQHWPO',
  'name': 'bedrock-sample-knowledge-base-416',
  'status': 'AVAILABLE',
  'updatedAt': datetime.datetime(2024, 9, 5, 15, 26, 40, 575472, tzinfo=tzlocal()),
  'vectorIngestionConfiguration': { 'chunkingConfiguration': { 'chunkingStrategy': 'FIXED_SIZE',
                                                               'fixedSizeChunkingConfiguration': { 'maxTokens': 512,
                                                                                                   'overlapPercentage': 20}}}}


In [25]:
# Get DataSource 
bedrock_agent_client.get_data_source(knowledgeBaseId = kb['knowledgeBaseId'], dataSourceId = ds["dataSourceId"])

{'ResponseMetadata': {'RequestId': '347bc56b-54dc-4f44-91dd-2d6a0ae59f84',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Thu, 05 Sep 2024 15:26:40 GMT',
   'content-type': 'application/json',
   'content-length': '604',
   'connection': 'keep-alive',
   'x-amzn-requestid': '347bc56b-54dc-4f44-91dd-2d6a0ae59f84',
   'x-amz-apigw-id': 'do0ToEJnhcwEKDQ=',
   'x-amzn-trace-id': 'Root=1-66d9cdb0-246bf56070b1f22904afba54'},
  'RetryAttempts': 0},
 'dataSource': {'createdAt': datetime.datetime(2024, 9, 5, 15, 26, 40, 575472, tzinfo=tzlocal()),
  'dataDeletionPolicy': 'DELETE',
  'dataSourceConfiguration': {'s3Configuration': {'bucketArn': 'arn:aws:s3:::bedrock-kb-ap-south-1-874163252636'},
   'type': 'S3'},
  'dataSourceId': 'QSPUTYFUTO',
  'description': 'Amazon shareholder letter knowledge base.',
  'knowledgeBaseId': 'KV0AYQHWPO',
  'name': 'bedrock-sample-knowledge-base-416',
  'status': 'AVAILABLE',
  'updatedAt': datetime.datetime(2024, 9, 5, 15, 26, 40, 575472, tzinfo=tzlocal()),

<h4>Start ingestion job</h4>

Once the KB and data source are created, we can start the ingestion job.

During the ingestion job, the KB fetches the documents from the data source, pre-processes them to extract text, chunks the text based on the configured chunk size, creates an embedding for each chunk, and writes the embeddings to the vector database, in this case OSS.

In [27]:
# Start an ingestion job
start_job_response = bedrock_agent_client.start_ingestion_job(knowledgeBaseId = kb['knowledgeBaseId'], dataSourceId = ds["dataSourceId"])

In [28]:
job = start_job_response["ingestionJob"]
pp.pprint(job)

{ 'dataSourceId': 'QSPUTYFUTO',
  'ingestionJobId': 'GGKHXD6XUE',
  'knowledgeBaseId': 'KV0AYQHWPO',
  'startedAt': datetime.datetime(2024, 9, 5, 15, 30, 24, 145767, tzinfo=tzlocal()),
  'statistics': { 'numberOfDocumentsDeleted': 0,
                  'numberOfDocumentsFailed': 0,
                  'numberOfDocumentsScanned': 0,
                  'numberOfMetadataDocumentsModified': 0,
                  'numberOfMetadataDocumentsScanned': 0,
                  'numberOfModifiedDocumentsIndexed': 0,
                  'numberOfNewDocumentsIndexed': 0},
  'status': 'STARTING',
  'updatedAt': datetime.datetime(2024, 9, 5, 15, 30, 24, 145767, tzinfo=tzlocal())}


In [29]:
# Poll the ingestion job until it completes; also stop on FAILED so the loop cannot run forever
while job['status'] not in ('COMPLETE', 'FAILED'):
    get_job_response = bedrock_agent_client.get_ingestion_job(
        knowledgeBaseId = kb['knowledgeBaseId'],
        dataSourceId = ds["dataSourceId"],
        ingestionJobId = job["ingestionJobId"]
    )
    job = get_job_response["ingestionJob"]

    interactive_sleep(30)

pp.pprint(job)

{ 'dataSourceId': 'QSPUTYFUTO',
  'ingestionJobId': 'GGKHXD6XUE',
  'knowledgeBaseId': 'KV0AYQHWPO',
  'startedAt': datetime.datetime(2024, 9, 5, 15, 30, 24, 145767, tzinfo=tzlocal()),
  'statistics': { 'numberOfDocumentsDeleted': 0,
                  'numberOfDocumentsFailed': 0,
                  'numberOfDocumentsScanned': 4,
                  'numberOfMetadataDocumentsModified': 0,
                  'numberOfMetadataDocumentsScanned': 0,
                  'numberOfModifiedDocumentsIndexed': 0,
                  'numberOfNewDocumentsIndexed': 4},
  'status': 'COMPLETE',
  'updatedAt': datetime.datetime(2024, 9, 5, 15, 30, 40, 332743, tzinfo=tzlocal())}
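The `statistics` block is the quickest way to confirm that ingestion did what was expected: here 4 documents were scanned and 4 were indexed, with no failures. A small sketch of a check over that dictionary follows. `summarize_ingestion_job` is a hypothetical helper, and the sample dictionary is a trimmed copy of the output above.

```python
def summarize_ingestion_job(job: dict) -> str:
    """Return a one-line summary of an ingestion job, raising if the job failed."""
    status = job["status"]
    if status == "FAILED":
        reasons = "; ".join(job.get("failureReasons", [])) or "no reason reported"
        raise RuntimeError(f"Ingestion job {job['ingestionJobId']} failed: {reasons}")
    stats = job.get("statistics", {})
    scanned = stats.get("numberOfDocumentsScanned", 0)
    indexed = (stats.get("numberOfNewDocumentsIndexed", 0)
               + stats.get("numberOfModifiedDocumentsIndexed", 0))
    failed = stats.get("numberOfDocumentsFailed", 0)
    return f"{status}: scanned={scanned}, indexed={indexed}, failed={failed}"


# Trimmed copy of the job printed above
sample_job = {
    "ingestionJobId": "GGKHXD6XUE",
    "status": "COMPLETE",
    "statistics": {
        "numberOfDocumentsFailed": 0,
        "numberOfDocumentsScanned": 4,
        "numberOfModifiedDocumentsIndexed": 0,
        "numberOfNewDocumentsIndexed": 4,
    },
}
summary = summarize_ingestion_job(sample_job)
print(summary)  # COMPLETE: scanned=4, indexed=4, failed=0
```

A non-zero `failed` count with a `COMPLETE` status usually means some documents were skipped, so it is worth checking before querying the knowledge base.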


In [30]:
# Print the knowledge base ID in Bedrock, which corresponds to the OpenSearch index in the collection we created earlier; we will use it for invocation later
kb_id = kb["knowledgeBaseId"]
pp.pprint(kb_id)

'KV0AYQHWPO'


In [31]:
# keep the kb_id for invocation later in the invoke request
%store kb_id

Stored 'kb_id' (str)


<h3>Test the knowledge base</h3>

<h4>Note: If you plan to run any of the other notebooks in the current folder, you can skip this section.</h4>

<h4>Using RetrieveAndGenerate API</h4>

Behind the scenes, RetrieveAndGenerate API converts queries into embeddings, searches the knowledge base, and then augments the foundation model prompt with the search results as context information and returns the FM-generated response to the question. For multi-turn conversations, Knowledge Bases manage short-term memory of the conversation to provide more contextual results.

The output of the RetrieveAndGenerate API includes the generated response, source attribution as well as the retrieved text chunks.

In [31]:
# Try out the KB using the RetrieveAndGenerate API
bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime", region_name=region_name)
# Let's see how different Anthropic Claude 3 models respond to the input text we provide
claude_model_ids = [ ["Claude 3 Sonnet", "anthropic.claude-3-sonnet-20240229-v1:0"], ["Claude 3 Haiku", "anthropic.claude-3-haiku-20240307-v1:0"]]

In [32]:
def ask_bedrock_llm_with_knowledge_base(query: str, model_arn: str, kb_id: str) -> dict:
    response = bedrock_agent_runtime_client.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': model_arn
            }
        },
    )

    return response

In [33]:
query = "What is Amazon doing in the field of generative AI?"

for model_id in claude_model_ids:
    model_arn = f'arn:aws:bedrock:{region_name}::foundation-model/{model_id[1]}'
    response = ask_bedrock_llm_with_knowledge_base(query, model_arn, kb_id)
    generated_text = response['output']['text']
    citations = response["citations"]
    contexts = []
    for citation in citations:
        retrievedReferences = citation["retrievedReferences"]
        for reference in retrievedReferences:
            contexts.append(reference["content"]["text"])
    print(f"---------- Generated using {model_id[0]}:")
    pp.pprint(generated_text )
    print(f'---------- The citations for the response generated by {model_id[0]}:')
    pp.pprint(contexts)
    print()

---------- Generated using Claude 3 Sonnet:
('Amazon has been investing heavily in large language models (LLMs) and '
 'generative AI, which it believes will transform and improve virtually every '
 'customer experience across its consumer, seller, brand, and creator '
 'offerings. Amazon has been working on its own LLMs for a while and plans to '
 'continue investing substantially in these models. For its cloud computing '
 'service AWS, Amazon is offering machine learning chips like Trainium and '
 'Inferentia that provide cost-effective training and inference for LLMs, '
 'allowing companies of all sizes to leverage generative AI. AWS is also '
 'delivering applications like CodeWhisperer that use generative AI to improve '
 'developer productivity by generating code suggestions in real time.')
---------- The citations for the response generated by Claude 3 Sonnet:
[ 'The customer reaction to what we’ve shared thus far about Kuiper has been '
  'very positive, and we believe Kuiper 
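The same retrieved chunk can appear under more than one citation, so the `contexts` list built above may contain duplicates. The sketch below collects each chunk once, together with its source URI. `extract_contexts` is a hypothetical helper, and the sample response is a hand-written stand-in that only mirrors the keys used in this notebook plus the `location` entry of each reference.

```python
def extract_contexts(response: dict) -> list:
    """Collect unique retrieved chunks and their S3 URIs, in order of appearance."""
    seen = set()
    contexts = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            text = reference["content"]["text"]
            if text in seen:
                continue  # skip chunks already cited earlier in the answer
            seen.add(text)
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            contexts.append({"text": text, "source": uri})
    return contexts


# Hand-written stand-in for a RetrieveAndGenerate response
sample_response = {
    "output": {"text": "example answer"},
    "citations": [
        {"retrievedReferences": [
            {"content": {"text": "chunk A"},
             "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.pdf"}}},
        ]},
        {"retrievedReferences": [
            {"content": {"text": "chunk A"},
             "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.pdf"}}},
            {"content": {"text": "chunk B"},
             "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/b.pdf"}}},
        ]},
    ],
}
unique_contexts = extract_contexts(sample_response)
print([context["source"] for context in unique_contexts])
```

Keeping the source URI next to each chunk makes it easier to show users which document an answer was grounded in.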

<h4>Retrieve API</h4>

The Retrieve API converts user queries into embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom workflows on top of the semantic search results. The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the relevance scores of the retrievals.

In [34]:
# retrieve api for fetching only the relevant context.
relevant_documents = bedrock_agent_runtime_client.retrieve(
    retrievalQuery= {
        'text': query
    },
    knowledgeBaseId=kb_id,
    retrievalConfiguration= {
        'vectorSearchConfiguration': {
            'numberOfResults': 3 # fetch the top 3 chunks that most closely match the query
        }
    }
)

In [35]:
pp.pprint(relevant_documents["retrievalResults"])

[ { 'content': { 'text': 'Generative AI is based on very Large Language Models '
                         '(trained on up to hundreds of billions of '
                         'parameters, and growing), across expansive datasets, '
                         'and has radically general and broad recall and '
                         'learning capabilities. We have been working on our '
                         'own LLMs for a while now, believe it will transform '
                         'and improve virtually every customer experience, and '
                         'will continue to invest substantially in these '
                         'models across all of our consumer, seller, brand, '
                         'and creator experiences. Additionally, as we’ve done '
                         'for years in AWS, we’re democratizing this '
                         'technology so companies of all sizes can leverage '
                         'Generative AI. AWS is offering the most '
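Since each retrieval result carries a relevance score, a custom workflow can drop weak matches before passing context to a model. The sketch below shows one way to do that. `filter_by_score` is a hypothetical helper, the sample results are made up, and the threshold of 0.5 is an arbitrary illustrative value, because score ranges depend on the vector store and distance metric.

```python
def filter_by_score(retrieval_results, min_score=0.5):
    """Keep results scoring at least min_score, ordered from best to worst."""
    kept = [result for result in retrieval_results if result.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda result: result["score"], reverse=True)


# Made-up results with the same keys as entries of relevant_documents["retrievalResults"]
sample_results = [
    {"content": {"text": "a"}, "score": 0.42},
    {"content": {"text": "b"}, "score": 0.71},
    {"content": {"text": "c"}, "score": 0.55},
]
strong_matches = filter_by_score(sample_results, min_score=0.5)
print([result["content"]["text"] for result in strong_matches])  # ['b', 'c']
```

In the notebook this would be applied as `filter_by_score(relevant_documents["retrievalResults"])`, after inspecting a few real scores to pick a sensible threshold.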
  

<h3>Next Steps</h3>

Proceed to the next labs to learn how to use Bedrock Knowledge Bases with open source libraries.

<h3>Clean Up</h3>


<div class="alert alert-block alert-warning">
When you are done with the labs and sample code, remember to clean up the resources at the end of your session by following <a href="./3_clean_up.ipynb">3_clean_up.ipynb</a>.
</div>