# Create and Invoke Agent via Boto3 SDK (Connecting Knowledge Base with Agent)

> *This notebook should work well with the **`Data Science 3.0`** kernel in SageMaker Studio*

## Introduction

In this notebook we show you how to use the `bedrock-agent` and the `bedrock-agent-runtime` boto3 clients to:
- create an agent
- create an action group
- associate the agent with the action group and prepare the agent
- create a knowledge base
- associate the knowledge base with the agent
- create an agent alias
- invoke the agent

We will use Anthropic's Claude v2.1 model (`anthropic.claude-v2:1`) on Amazon Bedrock through the Boto3 API.

**Note:** *This notebook can be run inside or outside of an AWS environment.*

#### Pre-requisites
This notebook requires permissions to: 
- create and delete Amazon IAM roles
- create, update and invoke AWS Lambda functions 
- create, update and delete Amazon S3 buckets 
- access Amazon Bedrock 
- access Amazon OpenSearch Serverless

If running on SageMaker Studio, you should add the following managed policies to your role:
- IAMFullAccess
- AWSLambda_FullAccess
- AmazonS3FullAccess
- AmazonBedrockFullAccess
- Custom policy for Amazon OpenSearch Serverless such as:
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "aoss:*",
            "Resource": "*"
        }
    ]
}
```

#### Context
We will demonstrate how to create and invoke an agent for Amazon Bedrock using the Boto3 SDK. We will connect an Action Group and a Knowledge Base to the Agent and show how to combine the outputs of both to produce the output the customer requires.

#### Use case
For this notebook, we use an insurance claims use case to build our Agent. The agent helps the insurance provider check the open claims, identify the details of a specific claim, get the open documents for a claim, get the documents' requirements from a knowledge base, and send reminders to a claim's policyholder. The following diagram illustrates the sample process flow.

![sequence-flow-agent](images/93-agent-workflow.png)

#### Architecture
The following diagram depicts a high-level architecture of this solution.

![architecture-diagram](images/93-agent-architecture.png)

The Agent can handle the following tasks:
- Get Open Claims
- Get Claim Details
- Get Claim Outstanding Documents
- Send Claim reminder

## Notebook setup
Before starting, let's import the required packages and configure the supporting variables.

In [None]:
!pip install opensearch-py
!pip install requests-aws4auth
!pip install -U boto3
!pip install -U botocore
!pip install -U awscli

In [1]:
import logging
import boto3
import random
import time
import zipfile
from io import BytesIO
import json
import uuid
import pprint
import os
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

In [2]:
# setting logger
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [3]:
# getting boto3 clients for required AWS services
sts_client = boto3.client('sts')
iam_client = boto3.client('iam')
s3_client = boto3.client('s3')
lambda_client = boto3.client('lambda')
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime')
open_search_serverless_client = boto3.client('opensearchserverless')

[2024-06-05 07:56:07,190] p26383 {credentials.py:1075} INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


In [4]:
session = boto3.session.Session()
region = session.region_name
account_id = sts_client.get_caller_identity()["Account"]
region, account_id

('us-west-2', '630441275995')

In [27]:
# Build a suffix from the region and account id so that the IAM roles, the agent
# and the S3 bucket get unique names, then assign variables
suffix = f"{region}-{account_id}"
agent_name = "insurance-claims-agent-kb"
agent_alias_name = "workshop-alias"
bucket_name = f'{agent_name}-{suffix}'
bucket_arn = f"arn:aws:s3:::{bucket_name}"
schema_key = f'{agent_name}-schema.json'
schema_name = 'insurance_claims_agent_openapi_schema_with_kb.json'
schema_arn = f'arn:aws:s3:::{bucket_name}/{schema_key}'
bedrock_agent_bedrock_allow_policy_name = f"ica-bedrock-allow-{suffix}"
bedrock_agent_s3_allow_policy_name = f"ica-s3-allow-{suffix}"
bedrock_agent_kb_allow_policy_name = f"ica-kb-allow-{suffix}"
lambda_role_name = f'{agent_name}-lambda-role-{suffix}'
agent_role_name = f'AmazonBedrockExecutionRoleForAgents_ica'
lambda_code_path = "lambda_function.py"
lambda_name = f'{agent_name}-{suffix}'
kb_name = f'new-insurance-claims-kb-{suffix}'
data_source_name = f'insurance-claims-kb-docs-{suffix}'
kb_files_path = 'kb_documents'
kb_key = 'kb_documents'
kb_role_name = f'AmazonBedrockExecutionRoleForKnowledgeBase_icakb'
kb_bedrock_allow_policy_name = f"ica-kb-bedrock-allow-{suffix}"
kb_aoss_allow_policy_name = f"ica-kb-aoss-allow-{suffix}"
kb_s3_allow_policy_name = f"ica-kb-s3-allow-{suffix}"
kb_collection_name = f'ica-kbc-{suffix}'
# Select Amazon titan as the embedding model
embedding_model_arn = f'arn:aws:bedrock:{region}::foundation-model/amazon.titan-embed-text-v1'
kb_vector_index_name = "bedrock-knowledge-base-index"
kb_metadataField = 'bedrock-knowledge-base-metadata'
kb_textField = 'bedrock-knowledge-base-text'
kb_vectorField = 'bedrock-knowledge-base-vector'
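
Because every resource name embeds the region and account id, some names can exceed service limits in regions with longer names. The sketch below checks name lengths before any resource is created. The limits used here (63 characters for S3 buckets, 64 for IAM role names, 32 for OpenSearch Serverless collection names) are assumptions to verify against the current service documentation, and `check_name_lengths` is a hypothetical helper that is not part of this notebook.

```python
# Assumed maximum name lengths; verify against the current AWS documentation
NAME_LIMITS = {"s3_bucket": 63, "iam_role": 64, "aoss_collection": 32}


def check_name_lengths(names):
    """Return {kind: (name, length, limit)} for every name longer than its assumed limit."""
    too_long = {}
    for kind, name in names.items():
        limit = NAME_LIMITS[kind]
        if len(name) > limit:
            too_long[kind] = (name, len(name), limit)
    return too_long


# Example with the naming scheme of this notebook and placeholder region/account values
example_suffix = "ap-southeast-2-123456789012"
problems = check_name_lengths({
    "s3_bucket": f"insurance-claims-agent-kb-{example_suffix}",
    "iam_role": f"insurance-claims-agent-kb-lambda-role-{example_suffix}",
    "aoss_collection": f"ica-kbc-{example_suffix}",
})
for kind, (name, length, limit) in problems.items():
    print(f"{kind}: '{name}' has {length} characters, assumed limit is {limit}")
```

If a name is reported, shortening the fixed part of the name (for example the `agent_name` prefix) is usually simpler than changing the suffix.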

### Create S3 bucket and upload API Schema and Knowledge Base files

Agents require an API schema stored on Amazon S3. Let's create an S3 bucket and upload the API schema and the Knowledge Base files to it.

In [6]:
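# Create S3 bucket for Open API schema
# us-east-1 rejects an explicit LocationConstraint, so only pass it for other regions
if region == 'us-east-1':
    s3bucket = s3_client.create_bucket(Bucket=bucket_name)
else:
    s3bucket = s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={'LocationConstraint': region}
    )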

In [7]:
# Upload Open API schema to this s3 bucket
s3_client.upload_file(schema_name, bucket_name, schema_key)

In [8]:
# Upload Knowledge Base files to this s3 bucket
for f in os.listdir(kb_files_path):
    if f.endswith(".docx"):
        s3_client.upload_file(kb_files_path+'/'+f, bucket_name, kb_key+'/'+f)

### Create Lambda function for Action Group
Let's now create the Lambda function required by the agent action group. We first need to create the Lambda IAM role and its policy. After that, we package the Lambda function code as a ZIP archive and create the function.

In [9]:
# Create IAM Role for the Lambda function
try:
    # Trust policy that lets the Lambda service assume this role
    assume_role_policy_document = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "lambda.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }

    assume_role_policy_document_json = json.dumps(assume_role_policy_document)

    lambda_iam_role = iam_client.create_role(
        RoleName=lambda_role_name,
        AssumeRolePolicyDocument=assume_role_policy_document_json
    )

    # Pause to make sure role is created
    time.sleep(10)
except iam_client.exceptions.EntityAlreadyExistsException:
    # Reuse the role if it already exists from a previous run
    lambda_iam_role = iam_client.get_role(RoleName=lambda_role_name)

iam_client.attach_role_policy(
    RoleName=lambda_role_name,
    PolicyArn='arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
)

{'ResponseMetadata': {'RequestId': '94f74b49-c927-4222-aa84-f5289317687a',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 07:56:28 GMT',
   'x-amzn-requestid': '94f74b49-c927-4222-aa84-f5289317687a',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}

In [10]:
# Package up the lambda function code
s = BytesIO()
z = zipfile.ZipFile(s, 'w')
z.write(lambda_code_path)
z.close()
zip_content = s.getvalue()

# Create Lambda Function
lambda_function = lambda_client.create_function(
    FunctionName=lambda_name,
    Runtime='python3.12',
    Timeout=180,
    Role=lambda_iam_role['Role']['Arn'],
    Code={'ZipFile': zip_content},
    Handler='lambda_function.lambda_handler'
)
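
The notebook packages `lambda_function.py` without showing its contents. As a rough illustration of what such a file could contain, here is a minimal handler sketch for an action group defined with an API schema. It assumes the request fields (`actionGroup`, `apiPath`, `httpMethod`) and the response envelope that Agents for Amazon Bedrock use for API-schema action groups. The `/claims` path and the stub data are hypothetical and do not come from the workshop's schema.

```python
import json

# Hypothetical in-memory data standing in for a real claims system
_OPEN_CLAIMS = [{"claimId": "claim-001", "status": "Open"}]


def lambda_handler(event, context):
    """Route an agent request to a stub implementation and wrap the result."""
    api_path = event.get("apiPath")
    http_method = event.get("httpMethod")

    if api_path == "/claims" and http_method == "GET":  # hypothetical path
        status_code, body = 200, _OPEN_CLAIMS
    else:
        status_code = 404
        body = {"error": f"Unsupported operation {http_method} {api_path}"}

    # Echo the routing fields back so the agent can match the response to its request
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "apiPath": api_path,
            "httpMethod": http_method,
            "httpStatusCode": status_code,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }
```

Keeping the routing in one function makes it easy to call the handler locally with a hand-written event before deploying it.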

### Create Knowledge Base
We will now create the knowledge base used by the agent to gather the requirements for outstanding documents. We will use [Amazon OpenSearch Serverless](https://aws.amazon.com/opensearch-service/) as the vector database and index the files stored in the previously created S3 bucket.

#### Create Knowledge Base Role
Let's first create IAM policies to allow our Knowledge Base to access Bedrock Titan Embedding Foundation model, Amazon OpenSearch Serverless and the S3 bucket with the Knowledge Base Files.

Once the policies are ready, we will create the Knowledge Base role.

In [11]:
# Create IAM policies for KB to invoke embedding model
bedrock_kb_allow_fm_model_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockAgentBedrockFoundationModelPolicy",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                embedding_model_arn
            ]
        }
    ]
}

kb_bedrock_policy_json = json.dumps(bedrock_kb_allow_fm_model_policy_statement)

kb_bedrock_policy = iam_client.create_policy(
    PolicyName=kb_bedrock_allow_policy_name,
    PolicyDocument=kb_bedrock_policy_json
)

In [12]:
# Create IAM policies for KB to access OpenSearch Serverless
bedrock_kb_allow_aoss_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "aoss:APIAccessAll",
            "Resource": [
                f"arn:aws:aoss:{region}:{account_id}:collection/*"
            ]
        }
    ]
}


kb_aoss_policy_json = json.dumps(bedrock_kb_allow_aoss_policy_statement)

kb_aoss_policy = iam_client.create_policy(
    PolicyName=kb_aoss_allow_policy_name,
    PolicyDocument=kb_aoss_policy_json
)

In [13]:
kb_s3_allow_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowKBAccessDocuments",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}/*",
                f"arn:aws:s3:::{bucket_name}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": f"{account_id}"
                }
            }
        }
    ]
}


kb_s3_json = json.dumps(kb_s3_allow_policy_statement)
kb_s3_policy = iam_client.create_policy(
    PolicyName=kb_s3_allow_policy_name,
    PolicyDocument=kb_s3_json
)

In [14]:
# Create IAM Role for the Knowledge Base and attach IAM policies
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
          "Effect": "Allow",
          "Principal": {
            "Service": "bedrock.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
    }]
}

assume_role_policy_document_json = json.dumps(assume_role_policy_document)
kb_role = iam_client.create_role(
    RoleName=kb_role_name,
    AssumeRolePolicyDocument=assume_role_policy_document_json
)

# Pause to make sure role is created
time.sleep(10)
    
iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_bedrock_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_aoss_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_s3_policy['Policy']['Arn']
)

{'ResponseMetadata': {'RequestId': '20bc8425-389d-41a3-95b0-964b9be01930',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 07:56:54 GMT',
   'x-amzn-requestid': '20bc8425-389d-41a3-95b0-964b9be01930',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}

In [15]:
kb_role_arn = kb_role["Role"]["Arn"]
kb_role_arn

'arn:aws:iam::630441275995:role/AmazonBedrockExecutionRoleForKnowledgeBase_icakb'

#### Create Vector Database

First of all, we have to create a vector store. In this section we will use *Amazon OpenSearch Serverless*.

Amazon OpenSearch Serverless is a serverless option in Amazon OpenSearch Service. As a developer, you can use OpenSearch Serverless to run petabyte-scale workloads without configuring, managing, and scaling OpenSearch clusters. You get the same interactive millisecond response times as OpenSearch Service with the simplicity of a serverless environment. Pay only for what you use by automatically scaling resources to provide the right amount of capacity for your application—without impacting data ingestion.

In [16]:
# Create the encryption policy that must exist before the OpenSearch Serverless collection is created
security_policy_json = {
    "Rules": [
        {
            "ResourceType": "collection",
            "Resource":[
                f"collection/{kb_collection_name}"
            ]
        }
    ],
    "AWSOwnedKey": True
}
security_policy = open_search_serverless_client.create_security_policy(
    description='security policy of aoss collection',
    name=kb_collection_name,
    policy=json.dumps(security_policy_json),
    type='encryption'
)

In [17]:
network_policy_json = [
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "ResourceType": "dashboard"
      },
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "ResourceType": "collection"
      }
    ],
    "AllowFromPublic": True
  }
]

network_policy = open_search_serverless_client.create_security_policy(
    description='network policy of aoss collection',
    name=kb_collection_name,
    policy=json.dumps(network_policy_json),
    type='network'
)

In [18]:
response = sts_client.get_caller_identity()
current_role = response['Arn']
current_role

'arn:aws:sts::630441275995:assumed-role/bedrock-workshop-studio-SageMakerExecutionRole-H9qZjTPBmDwc/SageMaker'

In [19]:
data_policy_json = [
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "Permission": [
          "aoss:DescribeCollectionItems",
          "aoss:CreateCollectionItems",
          "aoss:UpdateCollectionItems",
          "aoss:DeleteCollectionItems"
        ],
        "ResourceType": "collection"
      },
      {
        "Resource": [
          f"index/{kb_collection_name}/*"
        ],
        "Permission": [
            "aoss:CreateIndex",
            "aoss:DeleteIndex",
            "aoss:UpdateIndex",
            "aoss:DescribeIndex",
            "aoss:ReadDocument",
            "aoss:WriteDocument"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
        kb_role_arn,
        f"arn:aws:sts::{account_id}:assumed-role/Admin/*",
        current_role
    ],
    "Description": ""
  }
]

data_policy = open_search_serverless_client.create_access_policy(
    description='data access policy for aoss collection',
    name=kb_collection_name,
    policy=json.dumps(data_policy_json),
    type='data'
)

In [20]:
opensearch_collection_response = open_search_serverless_client.create_collection(
    description='OpenSearch collection for Amazon Bedrock Knowledge Base',
    name=kb_collection_name,
    standbyReplicas='DISABLED',
    type='VECTORSEARCH'
)
opensearch_collection_response

{'createCollectionDetail': {'arn': 'arn:aws:aoss:us-west-2:630441275995:collection/e7u5knq0iaa3watc8rs3',
  'createdDate': 1717574231741,
  'description': 'OpenSearch collection for Amazon Bedrock Knowledge Base',
  'id': 'e7u5knq0iaa3watc8rs3',
  'kmsKeyArn': 'auto',
  'lastModifiedDate': 1717574231741,
  'name': 'ica-kbc-us-west-2-630441275995',
  'standbyReplicas': 'DISABLED',
  'status': 'CREATING',
  'type': 'VECTORSEARCH'},
 'ResponseMetadata': {'RequestId': '39e493f2-48a9-40db-be57-8582a392a18f',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '39e493f2-48a9-40db-be57-8582a392a18f',
   'date': 'Wed, 05 Jun 2024 07:57:11 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '395',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [21]:
collection_arn = opensearch_collection_response["createCollectionDetail"]["arn"]
collection_arn

'arn:aws:aoss:us-west-2:630441275995:collection/e7u5knq0iaa3watc8rs3'

In [22]:
# wait for collection creation
response = open_search_serverless_client.batch_get_collection(names=[kb_collection_name])
# Periodically check collection status
while (response['collectionDetails'][0]['status']) == 'CREATING':
    print('Creating collection...')
    time.sleep(30)
    response = open_search_serverless_client.batch_get_collection(names=[kb_collection_name])
print('\nCollection successfully created:')
print(response["collectionDetails"])
# Extract the collection endpoint from the response
host = (response['collectionDetails'][0]['collectionEndpoint'])
final_host = host.replace("https://", "")
final_host

Creating collection...

Collection successfully created:
[{'arn': 'arn:aws:aoss:us-west-2:630441275995:collection/e7u5knq0iaa3watc8rs3', 'collectionEndpoint': 'https://e7u5knq0iaa3watc8rs3.us-west-2.aoss.amazonaws.com', 'createdDate': 1717574231741, 'dashboardEndpoint': 'https://e7u5knq0iaa3watc8rs3.us-west-2.aoss.amazonaws.com/_dashboards', 'description': 'OpenSearch collection for Amazon Bedrock Knowledge Base', 'id': 'e7u5knq0iaa3watc8rs3', 'kmsKeyArn': 'auto', 'lastModifiedDate': 1717574255220, 'name': 'ica-kbc-us-west-2-630441275995', 'standbyReplicas': 'DISABLED', 'status': 'ACTIVE', 'type': 'VECTORSEARCH'}]


'e7u5knq0iaa3watc8rs3.us-west-2.aoss.amazonaws.com'

#### Create OpenSearch Index

Let's now create a vector index to index our data

In [23]:
credentials = boto3.Session().get_credentials()
service = 'aoss'
awsauth = AWS4Auth(
    credentials.access_key, 
    credentials.secret_key,
    region, 
    service, 
    session_token=credentials.token
)

# Build the OpenSearch client
open_search_client = OpenSearch(
    hosts=[{'host': final_host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300
)
# It can take up to a minute for data access rules to be enforced
time.sleep(45)
index_body = {
    "settings": {
        "index.knn": True,
        "number_of_shards": 1,
        "knn.algo_param.ef_search": 512,
        "number_of_replicas": 0,
    },
    "mappings": {
        "properties": {}
    }
}

index_body["mappings"]["properties"][kb_vectorField] = {
    "type": "knn_vector",
    "dimension": 1536,
    "method": {
         "name": "hnsw",
         "engine": "faiss"
    },
}

index_body["mappings"]["properties"][kb_textField] = {
    "type": "text"
}

index_body["mappings"]["properties"][kb_metadataField] = {
    "type": "text"
}

# Create index
response = open_search_client.indices.create(index=kb_vector_index_name, body=index_body)
print('\nCreating index:')
print(response)

[2024-06-05 07:58:15,388] p26383 {credentials.py:1075} INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
[2024-06-05 07:59:00,933] p26383 {base.py:258} INFO - PUT https://e7u5knq0iaa3watc8rs3.us-west-2.aoss.amazonaws.com:443/bedrock-knowledge-base-index [status:200 request:0.497s]



Creating index:
{'acknowledged': True, 'shards_acknowledged': True, 'index': 'bedrock-knowledge-base-index'}
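
The index body above is assembled in several steps. The sketch below gathers the same settings into one function so the relationship between the field names and the vector dimension is easier to see. `build_knn_index_body` is a hypothetical helper, not part of the notebook. The dimension has to match the size of the vectors produced by the embedding model, which is 1536 in the cell above for `amazon.titan-embed-text-v1`.

```python
def build_knn_index_body(vector_field, text_field, metadata_field, dimension):
    """Build an OpenSearch index body with one k-NN vector field and two text fields."""
    return {
        "settings": {
            "index.knn": True,
            "number_of_shards": 1,
            "knn.algo_param.ef_search": 512,
            "number_of_replicas": 0,
        },
        "mappings": {
            "properties": {
                vector_field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {"name": "hnsw", "engine": "faiss"},
                },
                text_field: {"type": "text"},
                metadata_field: {"type": "text"},
            }
        },
    }


# Same field names and dimension as in the notebook cell above
body = build_knn_index_body(
    "bedrock-knowledge-base-vector",
    "bedrock-knowledge-base-text",
    "bedrock-knowledge-base-metadata",
    dimension=1536,
)
```

The field names passed here must be the same ones given later in the knowledge base `fieldMapping`, otherwise ingestion cannot write to the index.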


In [24]:
storage_configuration = {
    'opensearchServerlessConfiguration': {
        'collectionArn': collection_arn, 
        'fieldMapping': {
            'metadataField': kb_metadataField,
            'textField': kb_textField,
            'vectorField': kb_vectorField
        },
        'vectorIndexName': kb_vector_index_name
    },
    'type': 'OPENSEARCH_SERVERLESS'
}

In [28]:
# Creating the knowledge base
try:
    # ensure the index is created and available
    time.sleep(45)
    kb_obj = bedrock_agent_client.create_knowledge_base(
        name=kb_name, 
        description='KB that contains information about documents requirements for insurance claims',
        roleArn=kb_role_arn,
        knowledgeBaseConfiguration={
            'type': 'VECTOR',
            'vectorKnowledgeBaseConfiguration': {
                'embeddingModelArn': embedding_model_arn
            }
        },
        storageConfiguration=storage_configuration
    )

    # Pretty print the response
    pprint.pprint(kb_obj)

except Exception as e:
    print(f"Error occurred: {e}")

{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '1050',
                                      'content-type': 'application/json',
                                      'date': 'Wed, 05 Jun 2024 08:04:17 GMT',
                                      'x-amz-apigw-id': 'Y4lQNH0oPHcEp4w=',
                                      'x-amzn-requestid': 'bf744198-05e0-4602-80c2-e07b8d283605',
                                      'x-amzn-trace-id': 'Root=1-66601c01-28663ac418d4b53740fe5160'},
                      'HTTPStatusCode': 202,
                      'RequestId': 'bf744198-05e0-4602-80c2-e07b8d283605',
                      'RetryAttempts': 0},
 'knowledgeBase': {'createdAt': datetime.datetime(2024, 6, 5, 8, 4, 17, 152855, tzinfo=tzlocal()),
                   'description': 'KB that contains information about '
                                  'documents requirements for insurance claims',
                   'knowle

#### Create a data source that you can attach to the recently created Knowledge Base

Let's create a data source for our Knowledge Base. Then we will ingest our data and convert it into embeddings.

In [29]:
# Define the S3 configuration for your data source
s3_configuration = {
    'bucketArn': bucket_arn,
    'inclusionPrefixes': [kb_key]  
}

# Define the data source configuration
data_source_configuration = {
    's3Configuration': s3_configuration,
    'type': 'S3'
}

knowledge_base_id = kb_obj["knowledgeBase"]["knowledgeBaseId"]
knowledge_base_arn = kb_obj["knowledgeBase"]["knowledgeBaseArn"]

chunking_strategy_configuration = {
    "chunkingStrategy": "FIXED_SIZE",
    "fixedSizeChunkingConfiguration": {
        "maxTokens": 512,
        "overlapPercentage": 20
    }
}

# Create the data source
try:
    # ensure that the KB is created and available
    time.sleep(45)
    data_source_response = bedrock_agent_client.create_data_source(
        knowledgeBaseId=knowledge_base_id,
        name=data_source_name,
        description='DataSource for the insurance claim documents requirements',
        dataSourceConfiguration=data_source_configuration,
        vectorIngestionConfiguration = {
            "chunkingConfiguration": chunking_strategy_configuration
        }
    )

    # Pretty print the response
    pprint.pprint(data_source_response)

except Exception as e:
    print(f"Error occurred: {e}")


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '685',
                                      'content-type': 'application/json',
                                      'date': 'Wed, 05 Jun 2024 08:05:08 GMT',
                                      'x-amz-apigw-id': 'Y4lYTFLVvHcESDA=',
                                      'x-amzn-requestid': '86d7b292-0ff0-4a9c-b41e-0af3c529e43f',
                                      'x-amzn-trace-id': 'Root=1-66601c34-510deaa47726dea65aed39d7'},
                      'HTTPStatusCode': 200,
                      'RequestId': '86d7b292-0ff0-4a9c-b41e-0af3c529e43f',
                      'RetryAttempts': 0},
 'dataSource': {'createdAt': datetime.datetime(2024, 6, 5, 8, 5, 8, 930036, tzinfo=tzlocal()),
                'dataDeletionPolicy': 'DELETE',
                'dataSourceConfiguration': {'s3Configuration': {'bucketArn': 'arn:aws:s3:::insurance-claims-agent-kb-us-west-2-630441275
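
To make the `FIXED_SIZE` chunking settings more concrete, the following sketch splits a list of tokens into windows of `maxTokens` that overlap by `overlapPercentage`. It is only an illustration of the idea under simplifying assumptions: the tokens are plain list items, and the tokenizer and boundary handling that the Knowledge Base applies internally are not shown in this notebook and may differ. `fixed_size_chunks` is a hypothetical helper.

```python
def fixed_size_chunks(tokens, max_tokens=512, overlap_percentage=20):
    """Split a token list into windows of max_tokens that overlap by the given percentage."""
    if max_tokens <= 0 or not 0 <= overlap_percentage < 100:
        raise ValueError("max_tokens must be positive and overlap_percentage in [0, 100)")

    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # how far the window moves each time

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        # Stop once the current window already reaches the end of the input
        if start + max_tokens >= len(tokens):
            break
    return chunks


# Integers stand in for tokens so the overlap is easy to inspect
chunks = fixed_size_chunks(list(range(1000)), max_tokens=512, overlap_percentage=20)
print([(chunk[0], chunk[-1], len(chunk)) for chunk in chunks])
```

A larger overlap repeats more context between neighbouring chunks, at the cost of storing and embedding more tokens overall.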

#### Start ingestion job
Once the Knowledge Base and Data Source are created, we can start the ingestion job.
During the ingestion job, the Knowledge Base fetches the documents in the data source, pre-processes them to extract text, chunks the text based on the configured chunk size, creates an embedding for each chunk and writes the embeddings to the vector database, in this case Amazon OpenSearch Serverless.

In [30]:
# Start an ingestion job
data_source_id = data_source_response["dataSource"]["dataSourceId"]
start_job_response = bedrock_agent_client.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id, 
    dataSourceId=data_source_id
)
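
`start_ingestion_job` returns while the job is still running, so the knowledge base may not be searchable yet. One possible way to wait for it is to poll `get_ingestion_job` until the status becomes `COMPLETE`. The sketch below assumes the `STARTING`, `IN_PROGRESS`, `COMPLETE` and `FAILED` status values. `wait_for_ingestion` and its polling defaults are hypothetical choices, not part of the notebook.

```python
import time


def wait_for_ingestion(agent_client, kb_id, ds_id, job_id,
                       poll_s=10, max_polls=60, sleep=time.sleep):
    """Poll an ingestion job until it completes, fails, or the polling budget runs out."""
    for _ in range(max_polls):
        job = agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job_id,
        )["ingestionJob"]
        if job["status"] == "COMPLETE":
            return job
        if job["status"] not in ("STARTING", "IN_PROGRESS"):
            raise RuntimeError(f"Ingestion job {job_id} ended with status {job['status']}")
        sleep(poll_s)
    raise TimeoutError(f"Ingestion job {job_id} did not finish after {max_polls} polls")
```

With the variables of this notebook, a call could look like `wait_for_ingestion(bedrock_agent_client, knowledge_base_id, data_source_id, start_job_response["ingestionJob"]["ingestionJobId"])`.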

### Create Agent
We will now create our agent. To do so, we first need to create the agent policies that allow Bedrock model invocation, S3 bucket access and Knowledge Base retrieval.

In [31]:
# Create IAM policies for agent
bedrock_agent_bedrock_allow_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockAgentBedrockFoundationModelPolicy",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                f"arn:aws:bedrock:{region}::foundation-model/anthropic.claude-v2:1"
            ]
        }
    ]
}

bedrock_policy_json = json.dumps(bedrock_agent_bedrock_allow_policy_statement)

agent_bedrock_policy = iam_client.create_policy(
    PolicyName=bedrock_agent_bedrock_allow_policy_name,
    PolicyDocument=bedrock_policy_json
)

In [32]:
bedrock_agent_s3_allow_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAgentAccessOpenAPISchema",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [
                schema_arn
            ]
        }
    ]
}


bedrock_agent_s3_json = json.dumps(bedrock_agent_s3_allow_policy_statement)
agent_s3_schema_policy = iam_client.create_policy(
    PolicyName=bedrock_agent_s3_allow_policy_name,
    Description="Policy to allow the agent to read the OpenAPI schema from S3.",
    PolicyDocument=bedrock_agent_s3_json
)

In [33]:
bedrock_agent_kb_retrival_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:Retrieve"
            ],
            "Resource": [
                knowledge_base_arn
            ]
        }
    ]
}
bedrock_agent_kb_json = json.dumps(bedrock_agent_kb_retrival_policy_statement)
agent_kb_schema_policy = iam_client.create_policy(
    PolicyName=bedrock_agent_kb_allow_policy_name,
    Description=f"Policy to allow agent to retrieve documents from knowledge base.",
    PolicyDocument=bedrock_agent_kb_json
)

In [34]:
# Create IAM Role for the agent and attach IAM policies
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
          "Effect": "Allow",
          "Principal": {
            "Service": "bedrock.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
    }]
}

assume_role_policy_document_json = json.dumps(assume_role_policy_document)
agent_role = iam_client.create_role(
    RoleName=agent_role_name,
    AssumeRolePolicyDocument=assume_role_policy_document_json
)

# Pause to make sure role is created
time.sleep(10)
    
iam_client.attach_role_policy(
    RoleName=agent_role_name,
    PolicyArn=agent_bedrock_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=agent_role_name,
    PolicyArn=agent_s3_schema_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=agent_role_name,
    PolicyArn=agent_kb_schema_policy['Policy']['Arn']
)

{'ResponseMetadata': {'RequestId': '91c37f8d-808f-43dc-9221-4aaee51299d3',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:05:34 GMT',
   'x-amzn-requestid': '91c37f8d-808f-43dc-9221-4aaee51299d3',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}

#### Creating Agent
Once the needed IAM role is created, we can use the Bedrock agent client to create a new agent. To do so we use the `create_agent` function. It requires an agent name, an underlying foundation model and an instruction. You can also provide an agent description. Note that the agent created is not yet prepared. We will prepare the agent later and then use it to invoke actions and other APIs.

In [35]:
# Create Agent
agent_instruction = """
You are an agent that can handle various tasks related to insurance claims, including looking up claim 
details, finding what paperwork is outstanding, and sending reminders. Only send reminders if you have been 
explicitly requested to do so. If a user asks about your functionality, provide guidance in natural language 
and do not include function names in the output."""

response = bedrock_agent_client.create_agent(
    agentName=agent_name,
    agentResourceRoleArn=agent_role['Role']['Arn'],
    description="Agent for handling insurance claims.",
    idleSessionTTLInSeconds=1800,
    foundationModel="anthropic.claude-v2:1",
    instruction=agent_instruction,
)

Looking at the created agent, we can see its status and agent id.

In [36]:
response

{'ResponseMetadata': {'RequestId': '17ae6f82-6f38-472c-b051-2b7f0caec609',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:05:55 GMT',
   'content-type': 'application/json',
   'content-length': '874',
   'connection': 'keep-alive',
   'x-amzn-requestid': '17ae6f82-6f38-472c-b051-2b7f0caec609',
   'x-amz-apigw-id': 'Y4lfgGljPHcEu7Q=',
   'x-amzn-trace-id': 'Root=1-66601c62-03f616205943c2cc7b4611d9'},
  'RetryAttempts': 0},
 'agent': {'agentArn': 'arn:aws:bedrock:us-west-2:630441275995:agent/J0ONUSMRVY',
  'agentId': 'J0ONUSMRVY',
  'agentName': 'insurance-claims-agent-kb',
  'agentResourceRoleArn': 'arn:aws:iam::630441275995:role/AmazonBedrockExecutionRoleForAgents_ica',
  'agentStatus': 'CREATING',
  'createdAt': datetime.datetime(2024, 6, 5, 8, 5, 55, 84009, tzinfo=tzlocal()),
  'description': 'Agent for handling insurance claims.',
  'foundationModel': 'anthropic.claude-v2:1',
  'idleSessionTTLInSeconds': 1800,
  'instruction': '\nYou are an agent that can ha

Let's now store the agent id in a local variable to use it in the next steps.

In [37]:
agent_id = response['agent']['agentId']
agent_id

'J0ONUSMRVY'

### Create Agent Action Group
We will now create an agent action group that uses the Lambda function and API schema file created before.
The `create_agent_action_group` function provides this functionality. We will use `DRAFT` as the agent version since we haven't yet created an agent version or alias. To inform the agent about what the action group can do, we will provide an action group description that lists its functionalities.

In [38]:
bucket_name, schema_key

('insurance-claims-agent-kb-us-west-2-630441275995',
 'insurance-claims-agent-kb-schema.json')

In [39]:
# Pause to make sure agent is created
time.sleep(30)
# Now, we can configure and create an action group here:
agent_action_group_response = bedrock_agent_client.create_agent_action_group(
    agentId=agent_id,
    agentVersion='DRAFT',
    actionGroupExecutor={
        'lambda': lambda_function['FunctionArn']
    },
    actionGroupName='ClaimManagementActionGroup',
    apiSchema={
        's3': {
            's3BucketName': bucket_name,
            's3ObjectKey': schema_key
        }
    },
    description='Actions for listing claims, identifying missing paperwork, sending reminders'
)

In [40]:
agent_action_group_response

{'ResponseMetadata': {'RequestId': '3fa4e75f-d4e7-40b8-950b-3ce0817181c2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:06:35 GMT',
   'content-type': 'application/json',
   'content-length': '628',
   'connection': 'keep-alive',
   'x-amzn-requestid': '3fa4e75f-d4e7-40b8-950b-3ce0817181c2',
   'x-amz-apigw-id': 'Y4llyEVoPHcEgew=',
   'x-amzn-trace-id': 'Root=1-66601c8b-689958436bdd5dbc02447667'},
  'RetryAttempts': 0},
 'agentActionGroup': {'actionGroupExecutor': {'lambda': 'arn:aws:lambda:us-west-2:630441275995:function:insurance-claims-agent-kb-us-west-2-630441275995'},
  'actionGroupId': 'WU4KVNKQGA',
  'actionGroupName': 'ClaimManagementActionGroup',
  'actionGroupState': 'ENABLED',
  'agentId': 'J0ONUSMRVY',
  'agentVersion': 'DRAFT',
  'apiSchema': {'s3': {'s3BucketName': 'insurance-claims-agent-kb-us-west-2-630441275995',
    's3ObjectKey': 'insurance-claims-agent-kb-schema.json'}},
  'createdAt': datetime.datetime(2024, 6, 5, 8, 6, 35, 526632, tzinfo

### Allowing Agent to invoke Action Group Lambda
Before using our action group, we need to allow our agent to invoke the Lambda function associated with the action group. This is done with a resource-based policy. Let's add the resource-based policy to the Lambda function we created.

In [41]:
# Create allow invoke permission on lambda
response = lambda_client.add_permission(
    FunctionName=lambda_name,
    StatementId='allow_bedrock',
    Action='lambda:InvokeFunction',
    Principal='bedrock.amazonaws.com',
    SourceArn=f"arn:aws:bedrock:{region}:{account_id}:agent/{agent_id}",
)
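
To confirm that the permission was attached, one option is to read the function's resource-based policy back with `lambda_client.get_policy` and look for the Bedrock statement. The sketch below assumes the policy document has the shape that `add_permission` usually produces (a `Statement` list whose entries carry a service principal and an `ArnLike` condition on `AWS:SourceArn`). The helper name `find_bedrock_statement` and the sample policy are hypothetical.

```python
import json


def find_bedrock_statement(policy_json, agent_arn):
    """Return the statement that lets Bedrock invoke the function for agent_arn, or None.

    policy_json is the JSON string found under the 'Policy' key of
    lambda_client.get_policy(FunctionName=...).
    """
    policy = json.loads(policy_json)
    for statement in policy.get("Statement", []):
        principal = statement.get("Principal", {})
        service = principal.get("Service") if isinstance(principal, dict) else None
        # Assumed condition layout: {"ArnLike": {"AWS:SourceArn": "<agent arn>"}}
        source_arn = (
            statement.get("Condition", {})
            .get("ArnLike", {})
            .get("AWS:SourceArn")
        )
        if (
            statement.get("Effect") == "Allow"
            and service == "bedrock.amazonaws.com"
            and statement.get("Action") == "lambda:InvokeFunction"
            and source_arn == agent_arn
        ):
            return statement
    return None


# Illustrative policy document (account ID and names are made up for this sketch)
sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "allow_bedrock",
        "Effect": "Allow",
        "Principal": {"Service": "bedrock.amazonaws.com"},
        "Action": "lambda:InvokeFunction",
        "Resource": "arn:aws:lambda:us-west-2:111122223333:function:example-fn",
        "Condition": {"ArnLike": {
            "AWS:SourceArn": "arn:aws:bedrock:us-west-2:111122223333:agent/EXAMPLEID"
        }},
    }],
})
```

In the notebook, the first argument would be `lambda_client.get_policy(FunctionName=lambda_name)['Policy']` and the second the same `SourceArn` string passed to `add_permission`.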

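### Associating the agent with a Knowledge Base
Next, we associate the knowledge base created earlier with the `DRAFT` version of the agent. The description tells the agent when it should consult the knowledge base.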

In [42]:
agent_kb_description = bedrock_agent_client.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    description=f'Use the information in the {kb_name} knowledge base to provide accurate responses to detail the requirements of each missing document in an insurance claim.',
    knowledgeBaseId=knowledge_base_id 
)

### Preparing Agent
Let's prepare the `DRAFT` version of the agent so that it can be used for internal testing.

In [43]:
agent_prepare = bedrock_agent_client.prepare_agent(agentId=agent_id)
agent_prepare

{'ResponseMetadata': {'RequestId': 'ca466eef-4c16-4597-8c2d-c145b3c71abf',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:06:52 GMT',
   'content-type': 'application/json',
   'content-length': '119',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'ca466eef-4c16-4597-8c2d-c145b3c71abf',
   'x-amz-apigw-id': 'Y4locHlwPHcECvQ=',
   'x-amzn-trace-id': 'Root=1-66601c9c-43a724922f3c93b12d088b5d'},
  'RetryAttempts': 0},
 'agentId': 'J0ONUSMRVY',
 'agentStatus': 'PREPARING',
 'agentVersion': 'DRAFT',
 'preparedAt': datetime.datetime(2024, 6, 5, 8, 6, 52, 174735, tzinfo=tzlocal())}
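
The response above reports `agentStatus` as `PREPARING`, so the agent is not ready the moment `prepare_agent` returns. As an alternative to a fixed pause, a minimal sketch of a polling loop is shown below. It assumes `get_agent` returns the status under `['agent']['agentStatus']`; the helper name `wait_for_agent_status` and its default timings are hypothetical.

```python
import time


def wait_for_agent_status(client, agent_id, target="PREPARED",
                          poll_seconds=5, timeout_seconds=300, sleep=time.sleep):
    """Poll get_agent until the agent reaches `target`; raise on FAILED or timeout."""
    waited = 0
    while True:
        status = client.get_agent(agentId=agent_id)["agent"]["agentStatus"]
        if status == target:
            return status
        if status == "FAILED":
            raise RuntimeError(f"agent {agent_id} entered FAILED state")
        if waited >= timeout_seconds:
            raise TimeoutError(f"agent {agent_id} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the notebook's variables this would be called as `wait_for_agent_status(bedrock_agent_client, agent_id)`. The same pattern could be applied to an alias by reading `agentAliasStatus` from `get_agent_alias`.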

### Create Agent alias
We will now create an alias of the agent that can be used to deploy the agent.

In [44]:
# Pause to make sure agent is prepared
time.sleep(30)
agent_alias = bedrock_agent_client.create_agent_alias(
    agentId=agent_id,
    agentAliasName=agent_alias_name
)
# Pause to make sure agent alias is ready
time.sleep(30)

In [45]:
agent_alias

{'ResponseMetadata': {'RequestId': '34759bd8-93d8-48fb-a021-06aa4f0281eb',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:07:28 GMT',
   'content-type': 'application/json',
   'content-length': '340',
   'connection': 'keep-alive',
   'x-amzn-requestid': '34759bd8-93d8-48fb-a021-06aa4f0281eb',
   'x-amz-apigw-id': 'Y4luDEcuPHcESDA=',
   'x-amzn-trace-id': 'Root=1-66601cc0-076539265750fd507c243c55'},
  'RetryAttempts': 0},
 'agentAlias': {'agentAliasArn': 'arn:aws:bedrock:us-west-2:630441275995:agent-alias/J0ONUSMRVY/DPR3GQOEQG',
  'agentAliasId': 'DPR3GQOEQG',
  'agentAliasName': 'workshop-alias',
  'agentAliasStatus': 'CREATING',
  'agentId': 'J0ONUSMRVY',
  'createdAt': datetime.datetime(2024, 6, 5, 8, 7, 28, 400462, tzinfo=tzlocal()),
  'routingConfiguration': [{}],
  'updatedAt': datetime.datetime(2024, 6, 5, 8, 7, 28, 400462, tzinfo=tzlocal())}}

### Invoke Agent
Now that we've created the agent, let's use the `bedrock-agent-runtime` client to invoke this agent and perform some tasks.

In [46]:
# Extract the agentAliasId from the response
agent_alias_id = agent_alias['agentAlias']['agentAliasId']

# Create a random ID for the session
session_id: str = str(uuid.uuid1())
enable_trace: bool = True
end_session: bool = False

# invoke the agent API
agentResponse = bedrock_agent_runtime_client.invoke_agent(
    inputText="send reminder to claim-006. Include the missing documents and their requirements",
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId=session_id,
    enableTrace=enable_trace,
    endSession=end_session
)

# Pretty-print the raw response (pprint.pprint returns None, so it is not passed to the logger)
pprint.pprint(agentResponse)


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-type': 'application/json',
                                      'date': 'Wed, 05 Jun 2024 08:08:06 GMT',
                                      'transfer-encoding': 'chunked',
                                      'x-amz-bedrock-agent-session-id': 'bd26f28a-2312-11ef-be88-0ae2b1fad1a9',
                                      'x-amzn-bedrock-agent-content-type': 'application/json',
                                      'x-amzn-requestid': '45681f86-7ded-4e54-b278-96a61e5c4843'},
                      'HTTPStatusCode': 200,
                      'RequestId': '45681f86-7ded-4e54-b278-96a61e5c4843',
                      'RetryAttempts': 0},
 'completion': <botocore.eventstream.EventStream object at 0x7f6b3cb96230>,
 'contentType': 'application/json',
 'sessionId': 'bd26f28a-2312-11ef-be88-0ae2b1fad1a9'}


In [47]:
%%time
event_stream = agentResponse['completion']
agent_answer = ""
try:
    for event in event_stream:
        if 'chunk' in event:
            # A chunk carries part of the final answer; accumulate in case it is split
            data = event['chunk']['bytes'].decode('utf8')
            logger.info(f"Final answer ->\n{data}")
            agent_answer += data
        elif 'trace' in event:
            logger.info(json.dumps(event['trace'], indent=2))
        else:
            raise Exception("unexpected event.", event)
except Exception as e:
    raise RuntimeError("failed while reading the agent response stream") from e

[2024-06-05 08:08:10,784] p26383 {<timed exec>:11} INFO - {
  "agentAliasId": "DPR3GQOEQG",
  "agentId": "J0ONUSMRVY",
  "agentVersion": "1",
  "sessionId": "bd26f28a-2312-11ef-be88-0ae2b1fad1a9",
  "trace": {
    "preProcessingTrace": {
      "modelInvocationInput": {
        "inferenceConfiguration": {
          "maximumLength": 2048,
          "stopSequences": [
            "\n\nHuman:"
          ],
          "temperature": 0.0,
          "topK": 250,
          "topP": 1.0
        },
        "text": "You are a classifying agent that filters user inputs into categories. Your job is to sort these inputs before they are passed along to our function calling agent. The purpose of our function calling agent is to call functions in order to answer user's questions.\n\nHere is the list of functions we are providing to our function calling agent. The agent is not allowed to call any other functions beside the ones listed here:\n<tools>\n    <tool_description>\n<tool_name>GET::ClaimManagement

CPU times: user 40.1 ms, sys: 5 ms, total: 45.1 ms
Wall time: 33 s
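
The trace log above is long, so it can help to reduce the stream to the answer text plus a count of trace steps by kind. The sketch below assumes each trace event nests one step under `event['trace']['trace']`, keyed by its kind (for example `preProcessingTrace`, as in the log above). The helper name `summarize_events` and the sample events are hypothetical, hand-written stand-ins rather than real agent output.

```python
from collections import Counter


def summarize_events(events):
    """Split invoke_agent stream events into the answer text and a count of trace steps."""
    answer_parts = []
    trace_counts = Counter()
    for event in events:
        if "chunk" in event:
            answer_parts.append(event["chunk"]["bytes"])
        elif "trace" in event:
            # event['trace']['trace'] holds one step, keyed by its kind
            for kind in event["trace"].get("trace", {}):
                trace_counts[kind] += 1
    # Join the raw bytes before decoding so a character split across chunks survives
    return b"".join(answer_parts).decode("utf8"), dict(trace_counts)


# Hand-written events that mimic the shapes logged above (not real agent output)
sample_events = [
    {"trace": {"agentId": "EXAMPLEID", "trace": {"preProcessingTrace": {}}}},
    {"trace": {"agentId": "EXAMPLEID", "trace": {"orchestrationTrace": {}}}},
    {"trace": {"agentId": "EXAMPLEID", "trace": {"orchestrationTrace": {}}}},
    {"chunk": {"bytes": b"Reminder "}},
    {"chunk": {"bytes": b"sent."}},
]
answer, counts = summarize_events(sample_events)
print(answer)
print(counts)
```

Note that the event stream returned by `invoke_agent` can only be iterated once, so a helper like this would take the place of the loop in the cell above rather than run after it.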


In [48]:
# And here is the response if you just want to see agent's reply
print(agent_answer)

Reminder sent to claim-006 with the requirements for submitting AccidentImages. The tracking ID is 50e8400-e29b-41d4-a716-446655440000.


### Clean up (optional)
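The next steps are optional and demonstrate how to delete our agent and the resources created for it. To do so we need to:
1. update the action group to disable it
2. delete the agent action group
3. delete the agent alias
4. delete the agent
5. delete the Lambda function
6. empty the created S3 bucket
7. delete the S3 bucket
8. delete the IAM roles and policies
9. delete the OpenSearch Serverless collection, its policies and the knowledge base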

In [49]:
# This step is not strictly needed: the agent can be deleted after deleting only its alias.
# If you do delete the action group explicitly, you need to disable it first.

action_group_id = agent_action_group_response['agentActionGroup']['actionGroupId']
action_group_name = agent_action_group_response['agentActionGroup']['actionGroupName']

response = bedrock_agent_client.update_agent_action_group(
    agentId=agent_id,
    agentVersion='DRAFT',
    actionGroupId= action_group_id,
    actionGroupName=action_group_name,
    actionGroupExecutor={
        'lambda': lambda_function['FunctionArn']
    },
    apiSchema={
        's3': {
            's3BucketName': bucket_name,
            's3ObjectKey': schema_key
        }
    },
    actionGroupState='DISABLED',
)

action_group_deletion = bedrock_agent_client.delete_agent_action_group(
    agentId=agent_id,
    agentVersion='DRAFT',
    actionGroupId= action_group_id
)

In [50]:
agent_alias_deletion = bedrock_agent_client.delete_agent_alias(
    agentId=agent_id,
    agentAliasId=agent_alias['agentAlias']['agentAliasId']
)

In [51]:
agent_deletion = bedrock_agent_client.delete_agent(
    agentId=agent_id
)

In [52]:
# Delete Lambda function
lambda_client.delete_function(
    FunctionName=lambda_name
)

{'ResponseMetadata': {'RequestId': 'f97b664a-4eca-4f4e-ad6c-e1574d72b7d5',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:08:54 GMT',
   'content-type': 'application/json',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'f97b664a-4eca-4f4e-ad6c-e1574d72b7d5'},
  'RetryAttempts': 0}}

In [53]:
# Empty and delete S3 Bucket

objects = s3_client.list_objects(Bucket=bucket_name)  
if 'Contents' in objects:
    for obj in objects['Contents']:
        s3_client.delete_object(Bucket=bucket_name, Key=obj['Key']) 
s3_client.delete_bucket(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': '73G47QNBR3FECC3Q',
  'HostId': 'akIMFJ6CtVrR6Ydc9eWO+yh/HfBhktffN0grIFafyzu93+LH3+KoPM2lbumBK1lVAcTJeHPY9Dk=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'akIMFJ6CtVrR6Ydc9eWO+yh/HfBhktffN0grIFafyzu93+LH3+KoPM2lbumBK1lVAcTJeHPY9Dk=',
   'x-amz-request-id': '73G47QNBR3FECC3Q',
   'date': 'Wed, 05 Jun 2024 08:09:02 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}
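
A single `list_objects` call returns at most 1,000 keys, which is enough for this notebook's bucket. For larger buckets, a minimal sketch using the `list_objects_v2` paginator and batched `delete_objects` calls is shown below. It assumes an unversioned bucket and ignores any per-key errors reported in the `delete_objects` response; the helper name `empty_bucket` is hypothetical.

```python
def empty_bucket(s3_client, bucket_name):
    """Delete every object in the bucket, one page (up to 1,000 keys) at a time."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # Pages of an empty bucket have no 'Contents' key
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

After `empty_bucket(s3_client, bucket_name)` returns, `s3_client.delete_bucket(Bucket=bucket_name)` can be called as in the cell above.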

In [54]:
agent_s3_schema_policy

{'Policy': {'PolicyName': 'ica-s3-allow-us-west-2-630441275995',
  'PolicyId': 'ANPAZFSJ2OJNTR3X7YRID',
  'Arn': 'arn:aws:iam::630441275995:policy/ica-s3-allow-us-west-2-630441275995',
  'Path': '/',
  'DefaultVersionId': 'v1',
  'AttachmentCount': 0,
  'PermissionsBoundaryUsageCount': 0,
  'IsAttachable': True,
  'CreateDate': datetime.datetime(2024, 6, 5, 8, 5, 22, tzinfo=tzlocal()),
  'UpdateDate': datetime.datetime(2024, 6, 5, 8, 5, 22, tzinfo=tzlocal())},
 'ResponseMetadata': {'RequestId': 'c51221b9-a4a9-4590-b72c-87ecdaa6dfc8',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 05 Jun 2024 08:05:22 GMT',
   'x-amzn-requestid': 'c51221b9-a4a9-4590-b72c-87ecdaa6dfc8',
   'content-type': 'text/xml',
   'content-length': '805'},
  'RetryAttempts': 0}}

In [55]:
# Delete IAM Roles and policies
for policy in [
    agent_bedrock_policy, 
    agent_s3_schema_policy, 
    agent_kb_schema_policy,
    kb_bedrock_policy,
    kb_aoss_policy,
    kb_s3_policy
]:
    response = iam_client.list_entities_for_policy(
        PolicyArn=policy['Policy']['Arn'],
        EntityFilter='Role'
    )

    for role in response['PolicyRoles']:
        iam_client.detach_role_policy(
            RoleName=role['RoleName'], 
            PolicyArn=policy['Policy']['Arn']
        )

    iam_client.delete_policy(
        PolicyArn=policy['Policy']['Arn']
    )

    
iam_client.detach_role_policy(RoleName=lambda_role_name, PolicyArn='arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole')

for role_name in [
    agent_role_name, 
    lambda_role_name, 
    kb_role_name
]:
    try: 
        iam_client.delete_role(
            RoleName=role_name
        )
    except Exception as e:
        print(e)
        print("couldn't delete role", role_name)
        
    
# Delete the OpenSearch Serverless collection and its policies, then the knowledge base
try:
    open_search_serverless_client.delete_collection(
        id=opensearch_collection_response["createCollectionDetail"]["id"]
    )

    open_search_serverless_client.delete_access_policy(
        name=kb_collection_name,
        type='data'
    )

    open_search_serverless_client.delete_security_policy(
        name=kb_collection_name,
        type='network'
    )

    open_search_serverless_client.delete_security_policy(
        name=kb_collection_name,
        type='encryption'
    )

    bedrock_agent_client.delete_knowledge_base(
        knowledgeBaseId=knowledge_base_id
    )
except Exception as e:
    print(e)
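
The last `try` block groups several deletions together, so the first one that fails skips the rest (including the knowledge base deletion). One possible alternative, sketched below, runs each step independently and reports what failed. The helper name `run_cleanup_steps` is hypothetical, and the step list in the trailing comment only illustrates how the notebook's calls might be wrapped.

```python
def run_cleanup_steps(steps):
    """Run each (label, callable) step; collect failures instead of stopping at the first."""
    failures = []
    for label, action in steps:
        try:
            action()
        except Exception as exc:  # broad on purpose: this is best-effort teardown
            failures.append((label, exc))
            print(f"couldn't complete step '{label}': {exc}")
    return failures


# Example wiring with the notebook's clients and variables (not executed here):
# failures = run_cleanup_steps([
#     ("delete collection", lambda: open_search_serverless_client.delete_collection(
#         id=opensearch_collection_response["createCollectionDetail"]["id"])),
#     ("delete knowledge base", lambda: bedrock_agent_client.delete_knowledge_base(
#         knowledgeBaseId=knowledge_base_id)),
# ])
```

Wrapping each call in a `lambda` defers it until the runner invokes it, which is what lets a failure in one step leave the others unaffected.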

## Conclusion
We have now experimented with using the `boto3` SDK to create, invoke and delete an agent.

### Takeaways
- Adapt this notebook to create new agents for your application

## Thank You