# 단일 Knowledge Base만 있는 Agent 만들기

이 노트북에서는 회사 데이터를 검색하고 작업을 완료하기 위해 Amazon Bedrock용 단일 Knowledge Base에 연결되는 Amazon Bedrock Agent를 만드는 방법을 배웁니다. 

이 노트북의 사용 사례는 PDF로 저장된 Amazon Bedrock 문서 페이지입니다.이 노트북을 사용하면 Amazon Bedrock에 대해 질문하고 지식창고에서 제공되는 문서를 기반으로 답변을 얻을 수 있습니다.

이 노트북을 완성하는 단계는 다음과 같습니다:

1. 필요한 라이브러리 import
1. S3 버킷 생성과 데이터 업로드
1. Knowledge Base for Amazon Bedrock 생성과 데이터를 Knowledge Base와 동기화
1. Agent for Amazon Bedrock 생성
1. Agent 테스트
1. 생성한 리소스 정리

<img src="./images/lab4-architecture.png" alt="Create Agent with a Single Knowledge Base" style="height: 400px; width:950px;"/>


## 1. 필요한 라이브러리 import

In [1]:
!pip install --upgrade -q opensearch-py
!pip install --upgrade -q requests-aws4auth
!pip install --upgrade -q boto3
!pip install --upgrade -q botocore
!pip install --upgrade -q awscli

[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
awscli 1.33.3 requires botocore==1.34.121, but you have botocore 1.34.122 which is incompatible.[0m[31m
[0m

In [2]:
import logging
import boto3
import time
import json
import uuid
import pprint
import os
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

In [3]:
# setting logger
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [4]:
# getting boto3 clients for required AWS services
sts_client = boto3.client('sts')
iam_client = boto3.client('iam')
s3_client = boto3.client('s3')
lambda_client = boto3.client('lambda')
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime')
open_search_serverless_client = boto3.client('opensearchserverless')

[2024-06-07 23:15:30,004] p25453 {credentials.py:1075} INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


In [5]:
session = boto3.session.Session()
region = session.region_name
account_id = sts_client.get_caller_identity()["Account"]
region, account_id

('us-west-2', '322537213286')

In [6]:
# Generate random prefix for unique IAM roles, agent name and S3 Bucket and 
# assign variables

from time import strftime
current_time = strftime("%m%d-%H%M%s")

suffix = f"{region}-{account_id}"
agent_name = "bedrock-docs-kb-agents"
agent_alias_name = "bedrock-docs-alias"
bucket_name = f'{agent_name}-{suffix}'
bucket_arn = f"arn:aws:s3:::{bucket_name}"
bedrock_agent_bedrock_allow_policy_name = f"bda-bedrock-allow-{current_time}"
bedrock_agent_s3_allow_policy_name = f"bda-s3-allow-{current_time}"
bedrock_agent_kb_allow_policy_name = f"bda-kb-allow-{current_time}"
agent_role_name = f'AmazonBedrockExecutionRoleForAgents_bedrock_docs'
kb_name = f'bedrock-docs-kb-{current_time}'
data_source_name = f'bedrock-docs-kb-docs-{current_time}'
kb_files_path = 'kb_documents'
kb_key = 'kb_documents'
kb_role_name = f'AmazonBedrockExecutionRoleForKnowledgeBase_bedrock_docs'
kb_bedrock_allow_policy_name = f"bd-kb-bedrock-allow-{current_time}"
kb_aoss_allow_policy_name = f"bd-kb-aoss-allow-{current_time}"
kb_s3_allow_policy_name = f"bd-kb-s3-allow-{current_time}"
kb_collection_name = f'bd-kbc-{current_time}'
# Select Amazon titan as the embedding model
embedding_model_arn = f'arn:aws:bedrock:{region}::foundation-model/amazon.titan-embed-text-v1'
kb_vector_index_name = "bedrock-knowledge-base-index"
kb_metadataField = 'bedrock-knowledge-base-metadata'
kb_textField = 'bedrock-knowledge-base-text'
kb_vectorField = 'bedrock-knowledge-base-vector'
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# agent configuration
agent_instruction = """
You are an agent that support users working with Amazon Bedrock. You have access to Bedrock's documentation in a Knowledge Base
and you can Answer questions from this documentation. Only answer questions based on the documentation and reply with 
"There is no information about your question on the Amazon Bedrock Documentation at the moment, sorry! Do you want to ask another question?" 
If the answer to the question is not available in the documentation
"""

## 2. S3 버킷 생성과 데이터 업로드
Knowledge Bases for Amazon Bedrock는 현재 데이터가 Amazon S3 버킷에 있어야 합니다. 이 섹션에서는 Amazon S3 버킷과 파일을 만들겠습니다.

### 2.1 Amazon S3 bucket 생성

In [7]:
if region != 'us-east-1':
    s3_client.create_bucket(
        Bucket=bucket_name.lower(),
        CreateBucketConfiguration={'LocationConstraint': region}
    )
else:
    s3_client.create_bucket(Bucket=bucket_name)

### 2.2 Amazon S3 bucket로 dataset 업로드

In [8]:
# Upload Knowledge Base files to this s3 bucket
for f in os.listdir(kb_files_path):
    if f.endswith(".pdf"):
        s3_client.upload_file(kb_files_path+'/'+f, bucket_name, kb_key+'/'+f)

## 3. Knowledge Base for Amazon Bedrock 생성

이 섹션에서는 Knowledge Base를 만들고 테스트하는 모든 단계를 살펴봅니다. 

완료해야 할 단계는 다음과 같습니다:
    
1. Knowledge Base 역할 및 해당 정책 생성
2. Vector Database 생성
3. OpenSearch Index 생성
4. Knowledge Base 생성
5. data source 생성 및 최근 생성된 Knowledge Base 추가
6. knowledge Base에 데이터 수집

### 3.1 Knowledge Base 역할 및 해당 정책 생성

먼저 Knowledge Base에서 Bedrock Titan Embedding 모델, Amazon OpenSearch Serverless 및 Knowledge Base 파일이 있는 S3 버킷에 액세스할 수 있도록 하는 IAM 정책을 만들어 보겠습니다.

정책이 준비되면 Knowledge Base 역할을 만듭니다.

In [9]:
# Create IAM policies for KB to invoke embedding model
bedrock_kb_allow_fm_model_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockAgentBedrockFoundationModelPolicy",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                embedding_model_arn
            ]
        }
    ]
}

kb_bedrock_policy_json = json.dumps(bedrock_kb_allow_fm_model_policy_statement)

kb_bedrock_policy = iam_client.create_policy(
    PolicyName=kb_bedrock_allow_policy_name,
    PolicyDocument=kb_bedrock_policy_json
)

In [10]:
# Create IAM policies for KB to access OpenSearch Serverless
bedrock_kb_allow_aoss_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "aoss:APIAccessAll",
            "Resource": [
                f"arn:aws:aoss:{region}:{account_id}:collection/*"
            ]
        }
    ]
}


kb_aoss_policy_json = json.dumps(bedrock_kb_allow_aoss_policy_statement)

kb_aoss_policy = iam_client.create_policy(
    PolicyName=kb_aoss_allow_policy_name,
    PolicyDocument=kb_aoss_policy_json
)

In [11]:
kb_s3_allow_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowKBAccessDocuments",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}/*",
                f"arn:aws:s3:::{bucket_name}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": f"{account_id}"
                }
            }
        }
    ]
}


kb_s3_json = json.dumps(kb_s3_allow_policy_statement)
kb_s3_policy = iam_client.create_policy(
    PolicyName=kb_s3_allow_policy_name,
    PolicyDocument=kb_s3_json
)

In [12]:
# Create IAM Role for the agent and attach IAM policies
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
          "Effect": "Allow",
          "Principal": {
            "Service": "bedrock.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
    }]
}

assume_role_policy_document_json = json.dumps(assume_role_policy_document)
kb_role = iam_client.create_role(
    RoleName=kb_role_name,
    AssumeRolePolicyDocument=assume_role_policy_document_json
)

# Pause to make sure role is created
time.sleep(10)
    
iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_bedrock_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_aoss_policy['Policy']['Arn']
)

iam_client.attach_role_policy(
    RoleName=kb_role_name,
    PolicyArn=kb_s3_policy['Policy']['Arn']
)

{'ResponseMetadata': {'RequestId': 'b2aa7ee9-6df2-4b26-8613-d00821db61de',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Fri, 07 Jun 2024 23:23:55 GMT',
   'x-amzn-requestid': 'b2aa7ee9-6df2-4b26-8613-d00821db61de',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}

In [13]:
kb_role_arn = kb_role["Role"]["Arn"]
kb_role_arn

'arn:aws:iam::322537213286:role/AmazonBedrockExecutionRoleForKnowledgeBase_bedrock_docs'

### 3.2 Vector Database 생성

우선 vector store를 만들어야 합니다. 이 섹션에서는 Amazon OpenSerach Serverless를 사용하겠습니다.

개발자는 OpenSearch 클러스터를 구성, 관리 및 확장할 필요 없이 OpenSearch Serverless를 사용하여 페타바이트 규모의 워크로드를 실행할 수 있습니다. 서버리스 환경의 간소함과 함께 OpenSearch Service와 동일한 대화형 밀리초 단위의 응답 시간을 얻을 수 있습니다. 데이터 수집에 영향을 주지 않고 애플리케이션에 적합한 용량을 제공하도록 리소스를 자동으로 확장하여 사용한 만큼만 비용을 지불하면 됩니다.


In [14]:
# Create OpenSearch Collection
security_policy_json = {
    "Rules": [
        {
            "ResourceType": "collection",
            "Resource":[
                f"collection/{kb_collection_name}"
            ]
        }
    ],
    "AWSOwnedKey": True
}
security_policy = open_search_serverless_client.create_security_policy(
    description='security policy of aoss collection',
    name=kb_collection_name,
    policy=json.dumps(security_policy_json),
    type='encryption'
)

In [15]:
network_policy_json = [
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "ResourceType": "dashboard"
      },
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "ResourceType": "collection"
      }
    ],
    "AllowFromPublic": True
  }
]

network_policy = open_search_serverless_client.create_security_policy(
    description='network policy of aoss collection',
    name=kb_collection_name,
    policy=json.dumps(network_policy_json),
    type='network'
)

In [16]:
response = sts_client.get_caller_identity()
current_role = response['Arn']
current_role

'arn:aws:sts::322537213286:assumed-role/AmazonSageMaker-ExecutionRole-20240103T094982/SageMaker'

In [17]:
data_policy_json = [
  {
    "Rules": [
      {
        "Resource": [
          f"collection/{kb_collection_name}"
        ],
        "Permission": [
          "aoss:DescribeCollectionItems",
          "aoss:CreateCollectionItems",
          "aoss:UpdateCollectionItems",
          "aoss:DeleteCollectionItems"
        ],
        "ResourceType": "collection"
      },
      {
        "Resource": [
          f"index/{kb_collection_name}/*"
        ],
        "Permission": [
            "aoss:CreateIndex",
            "aoss:DeleteIndex",
            "aoss:UpdateIndex",
            "aoss:DescribeIndex",
            "aoss:ReadDocument",
            "aoss:WriteDocument"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
        kb_role_arn,
        f"arn:aws:sts::{account_id}:assumed-role/Admin/*",
        current_role
    ],
    "Description": ""
  }
]

data_policy = open_search_serverless_client.create_access_policy(
    description='data access policy for aoss collection',
    name=kb_collection_name,
    policy=json.dumps(data_policy_json),
    type='data'
)


In [18]:
opensearch_collection_response = open_search_serverless_client.create_collection(
    description='OpenSearch collection for Amazon Bedrock Knowledge Base',
    name=kb_collection_name,
    standbyReplicas='DISABLED',
    type='VECTORSEARCH'
)
opensearch_collection_response

{'createCollectionDetail': {'arn': 'arn:aws:aoss:us-west-2:322537213286:collection/nasf1401qvjwlahsoith',
  'createdDate': 1717803087611,
  'description': 'OpenSearch collection for Amazon Bedrock Knowledge Base',
  'id': 'nasf1401qvjwlahsoith',
  'kmsKeyArn': 'auto',
  'lastModifiedDate': 1717803087611,
  'name': 'bd-kbc-0607-23171717802236',
  'standbyReplicas': 'DISABLED',
  'status': 'CREATING',
  'type': 'VECTORSEARCH'},
 'ResponseMetadata': {'RequestId': '1b2ef699-1b8b-4206-9dfc-c0ccaa0c6ff2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '1b2ef699-1b8b-4206-9dfc-c0ccaa0c6ff2',
   'date': 'Fri, 07 Jun 2024 23:31:27 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '391',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [19]:
collection_arn = opensearch_collection_response["createCollectionDetail"]["arn"]
collection_arn

'arn:aws:aoss:us-west-2:322537213286:collection/nasf1401qvjwlahsoith'

In [20]:
# wait for collection creation
response = open_search_serverless_client.batch_get_collection(names=[kb_collection_name])
# Periodically check collection status
while (response['collectionDetails'][0]['status']) == 'CREATING':
    print('Creating collection...')
    time.sleep(30)
    response = open_search_serverless_client.batch_get_collection(names=[kb_collection_name])
print('\nCollection successfully created:')
print(response["collectionDetails"])
# Extract the collection endpoint from the response
host = (response['collectionDetails'][0]['collectionEndpoint'])
final_host = host.replace("https://", "")
final_host

Creating collection...

Collection successfully created:
[{'arn': 'arn:aws:aoss:us-west-2:322537213286:collection/nasf1401qvjwlahsoith', 'collectionEndpoint': 'https://nasf1401qvjwlahsoith.us-west-2.aoss.amazonaws.com', 'createdDate': 1717803087611, 'dashboardEndpoint': 'https://nasf1401qvjwlahsoith.us-west-2.aoss.amazonaws.com/_dashboards', 'description': 'OpenSearch collection for Amazon Bedrock Knowledge Base', 'id': 'nasf1401qvjwlahsoith', 'kmsKeyArn': 'auto', 'lastModifiedDate': 1717803113598, 'name': 'bd-kbc-0607-23171717802236', 'standbyReplicas': 'DISABLED', 'status': 'ACTIVE', 'type': 'VECTORSEARCH'}]


'nasf1401qvjwlahsoith.us-west-2.aoss.amazonaws.com'

### 3.3 - OpenSearch Index 생성

Let's now create a vector index to index our data


In [21]:
credentials = boto3.Session().get_credentials()
service = 'aoss'
awsauth = AWS4Auth(
    credentials.access_key, 
    credentials.secret_key,
    region, 
    service, 
    session_token=credentials.token
)

# Build the OpenSearch client
open_search_client = OpenSearch(
    hosts=[{'host': final_host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300
)
# It can take up to a minute for data access rules to be enforced
time.sleep(45)
index_body = {
    "settings": {
        "index.knn": True,
        "number_of_shards": 1,
        "knn.algo_param.ef_search": 512,
        "number_of_replicas": 0,
    },
    "mappings": {
        "properties": {}
    }
}

index_body["mappings"]["properties"][kb_vectorField] = {
    "type": "knn_vector",
    "dimension": 1536,
    "method": {
         "name": "hnsw",
         "engine": "faiss"
    },
}

index_body["mappings"]["properties"][kb_textField] = {
    "type": "text"
}

index_body["mappings"]["properties"][kb_metadataField] = {
    "type": "text"
}

# Create index
response = open_search_client.indices.create(kb_vector_index_name, body=index_body)
print('\nCreating index:')
print(response)

[2024-06-07 23:33:53,247] p25453 {credentials.py:1075} INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
[2024-06-07 23:34:38,871] p25453 {base.py:258} INFO - PUT https://nasf1401qvjwlahsoith.us-west-2.aoss.amazonaws.com:443/bedrock-knowledge-base-index [status:200 request:0.578s]



Creating index:
{'acknowledged': True, 'shards_acknowledged': True, 'index': 'bedrock-knowledge-base-index'}


### 3.5 - Knowledge Base 생성
OpenSearch Serverless에서 Vector database를 사용할 수 있게 되었으니, Knowledge Base를 만들고 이를 OpenSearch DB와 연결해 보겠습니다.

In [22]:
storage_configuration = {
    'opensearchServerlessConfiguration': {
        'collectionArn': collection_arn, 
        'fieldMapping': {
            'metadataField': kb_metadataField,
            'textField': kb_textField,
            'vectorField': kb_vectorField
        },
        'vectorIndexName': kb_vector_index_name
    },
    'type': 'OPENSEARCH_SERVERLESS'
}

In [23]:
# Creating the knowledge base
try:
    # ensure the index is created and available
    time.sleep(45)
    kb_obj = bedrock_agent_client.create_knowledge_base(
        name=kb_name, 
        description='KB that contains the bedrock documentation',
        roleArn=kb_role_arn,
        knowledgeBaseConfiguration={
            'type': 'VECTOR',  # Corrected type
            'vectorKnowledgeBaseConfiguration': {
                'embeddingModelArn': embedding_model_arn
            }
        },
        storageConfiguration=storage_configuration
    )

    # Pretty print the response
    pprint.pprint(kb_obj)

except Exception as e:
    print(f"Error occurred: {e}")

{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '1010',
                                      'content-type': 'application/json',
                                      'date': 'Fri, 07 Jun 2024 23:37:33 GMT',
                                      'x-amz-apigw-id': 'ZBT1mFB1PHcELUw=',
                                      'x-amzn-requestid': 'd15ca029-34e2-4ed9-a925-04da3f64899e',
                                      'x-amzn-trace-id': 'Root=1-666399bd-7e715f3b79989f0b461cc3af'},
                      'HTTPStatusCode': 202,
                      'RequestId': 'd15ca029-34e2-4ed9-a925-04da3f64899e',
                      'RetryAttempts': 0},
 'knowledgeBase': {'createdAt': datetime.datetime(2024, 6, 7, 23, 37, 33, 197633, tzinfo=tzlocal()),
                   'description': 'KB that contains the bedrock documentation',
                   'knowledgeBaseArn': 'arn:aws:bedrock:us-west-2:322537213286:knowledge-base/YMU

In [24]:
# Define the S3 configuration for your data source
s3_configuration = {
    'bucketArn': bucket_arn,
    'inclusionPrefixes': [kb_key]  
}

# Define the data source configuration
data_source_configuration = {
    's3Configuration': s3_configuration,
    'type': 'S3'
}

knowledge_base_id = kb_obj["knowledgeBase"]["knowledgeBaseId"]
knowledge_base_arn = kb_obj["knowledgeBase"]["knowledgeBaseArn"]

chunking_strategy_configuration = {
    "chunkingStrategy": "FIXED_SIZE",
    "fixedSizeChunkingConfiguration": {
        "maxTokens": 512,
        "overlapPercentage": 20
    }
}

# Create the data source
try:
    # ensure that the KB is created and available
    time.sleep(45)
    data_source_response = bedrock_agent_client.create_data_source(
        knowledgeBaseId=knowledge_base_id,
        name=data_source_name,
        description='DataSource for the bedrock documentation',
        dataSourceConfiguration=data_source_configuration,
        vectorIngestionConfiguration = {
            "chunkingConfiguration": chunking_strategy_configuration
        }
    )

    # Pretty print the response
    pprint.pprint(data_source_response)

except Exception as e:
    print(f"Error occurred: {e}")


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '658',
                                      'content-type': 'application/json',
                                      'date': 'Fri, 07 Jun 2024 23:38:18 GMT',
                                      'x-amz-apigw-id': 'ZBT8uFTJPHcEvjA=',
                                      'x-amzn-requestid': '2d3b692d-0a45-4d7b-ae91-a4aa15560c12',
                                      'x-amzn-trace-id': 'Root=1-666399ea-4615c29353cfa8a824f4eb04'},
                      'HTTPStatusCode': 200,
                      'RequestId': '2d3b692d-0a45-4d7b-ae91-a4aa15560c12',
                      'RetryAttempts': 0},
 'dataSource': {'createdAt': datetime.datetime(2024, 6, 7, 23, 38, 18, 826834, tzinfo=tzlocal()),
                'dataDeletionPolicy': 'DELETE',
                'dataSourceConfiguration': {'s3Configuration': {'bucketArn': 'arn:aws:s3:::bedrock-docs-kb-agents-us-west-2-322537213

### 3.6 - ingestion job 시작

Knowledge Base와 Data Source가 만들어지면 ingestion job을 시작할 수 있습니다. ingestion job 중에 Knowledge Base는 데이터 소스에서 문서를 가져오고, 텍스트를 추출하도록 사전 처리하고, 제공된 청크 크기에 따라 청크를 만들고, 각 청크의 임베딩을 생성한 다음 벡터 데이터베이스(이 경우 Amazon OpenSource Serverless)에 기록합니다.

In [25]:
# Start an ingestion job
data_source_id = data_source_response["dataSource"]["dataSourceId"]
start_job_response = bedrock_agent_client.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id, 
    dataSourceId=data_source_id
)


## 4. Agent 생성

We will now create the Agent and associate the Knowledge Base to it. To do so we need to: 
1. Agent IAM 역할과 정책 생성
2. Agent 생성
3. Agent와 Knowledge Base 연계
4. Agent 준비

### 4.1 - Agent IAM 역할 및 정책 생성
First we need to create the agent policies that allow bedrock model invocation and Knowledge Base retrieval

In [26]:
# Create IAM policies for agent
bedrock_agent_bedrock_allow_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockAgentBedrockFoundationModelPolicy",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
            ]
        }
    ]
}

bedrock_policy_json = json.dumps(bedrock_agent_bedrock_allow_policy_statement)

agent_bedrock_policy = iam_client.create_policy(
    PolicyName=bedrock_agent_bedrock_allow_policy_name,
    PolicyDocument=bedrock_policy_json
)

In [27]:
bedrock_agent_kb_retrival_policy_statement = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:Retrieve"
            ],
            "Resource": [
                knowledge_base_arn
            ]
        }
    ]
}
bedrock_agent_kb_json = json.dumps(bedrock_agent_kb_retrival_policy_statement)
agent_kb_schema_policy = iam_client.create_policy(
    PolicyName=bedrock_agent_kb_allow_policy_name,
    Description=f"Policy to allow agent to retrieve documents from knowledge base.",
    PolicyDocument=bedrock_agent_kb_json
)


In [28]:

# Create IAM Role for the agent and attach IAM policies
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
          "Effect": "Allow",
          "Principal": {
            "Service": "bedrock.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
    }]
}

assume_role_policy_document_json = json.dumps(assume_role_policy_document)
agent_role = iam_client.create_role(
    RoleName=agent_role_name,
    AssumeRolePolicyDocument=assume_role_policy_document_json
)

# Pause to make sure role is created
time.sleep(10)
    
iam_client.attach_role_policy(
    RoleName=agent_role_name,
    PolicyArn=agent_bedrock_policy['Policy']['Arn']
)


iam_client.attach_role_policy(
    RoleName=agent_role_name,
    PolicyArn=agent_kb_schema_policy['Policy']['Arn']
)

{'ResponseMetadata': {'RequestId': '6da256f8-a5a9-4abc-93c9-dcf9940c7b1d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Fri, 07 Jun 2024 23:42:37 GMT',
   'x-amzn-requestid': '6da256f8-a5a9-4abc-93c9-dcf9940c7b1d',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}

### 4.2 - Agent 생성
필요한 IAM 역할이 만들어지면 bedrock agent 클라이언트를 사용하여 새 에이전트를 만들 수 있습니다. 이를 위해 create_agent 함수를 사용합니다. 여기에는 상담원 이름, 밑줄 foundation model과 instruction이 필요합니다. 에이전트 설명을 제공할 수도 있습니다. 생성된 에이전트는 아직 준비되지 않은 상태입니다. 에이전트를 준비한 다음 이를 사용하여 작업을 호출하고 다른 API를 사용하는 데 중점을 두겠습니다.

In [30]:
# Create Agent
response = bedrock_agent_client.create_agent(
    agentName=agent_name,
    agentResourceRoleArn=agent_role['Role']['Arn'],
    description="Agent supporting Amazon Bedrock Developers.",
    idleSessionTTLInSeconds=1800,
    foundationModel=model_id,
    instruction=agent_instruction,
)

Let's now store the agent id in a local variable to use it on the next steps

In [31]:
agent_id = response['agent']['agentId']
agent_id

'AC6YU1H6ZD'

### 4.3 - agent Knowledge Base 연계
다음으로 생성한 에이전트를 Bedrock 문서용 Knowledge Base와 연결해야 합니다.

In [32]:
agent_kb_description = bedrock_agent_client.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    description=f'Use the information in the {kb_name} knowledge base to provide accurate responses to the questions about Amazon Bedrock.',
    knowledgeBaseId=knowledge_base_id 
)

### 4.4 - Prepare Agent 준비

내부 테스트에 사용되는 Agent의 DRAFT 버전을 생성합니다.

In [33]:
agent_prepare = bedrock_agent_client.prepare_agent(agentId=agent_id)
agent_prepare

{'ResponseMetadata': {'RequestId': '9ca6afa7-c5a5-4f73-aef2-effdb97c0a90',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Sat, 08 Jun 2024 00:40:12 GMT',
   'content-type': 'application/json',
   'content-length': '119',
   'connection': 'keep-alive',
   'x-amzn-requestid': '9ca6afa7-c5a5-4f73-aef2-effdb97c0a90',
   'x-amz-apigw-id': 'ZBdA6G9gvHcEUSQ=',
   'x-amzn-trace-id': 'Root=1-6663a86b-110fc0700fff72f217096f4f'},
  'RetryAttempts': 0},
 'agentId': 'AC6YU1H6ZD',
 'agentStatus': 'PREPARING',
 'agentVersion': 'DRAFT',
 'preparedAt': datetime.datetime(2024, 6, 8, 0, 40, 12, 73508, tzinfo=tzlocal())}

## 5 - Agent 테스팅

이제 에이전트가 생겼으니 에이전트를 호출하여 Amazon Bedrock에 대한 올바른 정보를 제공하는지 테스트해 보겠습니다. 이를 위해 먼저 에이전트 alias을 만들어 보겠습니다.

In [34]:
# Pause to make sure agent is prepared
time.sleep(30)
agent_alias = bedrock_agent_client.create_agent_alias(
    agentId=agent_id,
    agentAliasName=agent_alias_name
)
# Pause to make sure agent alias is ready
time.sleep(30)

In [35]:
agent_alias

{'ResponseMetadata': {'RequestId': '52c2cea8-ad85-468a-a7cf-bb0dc1c742fa',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Sat, 08 Jun 2024 00:42:19 GMT',
   'content-type': 'application/json',
   'content-length': '344',
   'connection': 'keep-alive',
   'x-amzn-requestid': '52c2cea8-ad85-468a-a7cf-bb0dc1c742fa',
   'x-amz-apigw-id': 'ZBdU3GvvPHcEQyg=',
   'x-amzn-trace-id': 'Root=1-6663a8eb-7eb1f38c4f33ed672d916634'},
  'RetryAttempts': 0},
 'agentAlias': {'agentAliasArn': 'arn:aws:bedrock:us-west-2:322537213286:agent-alias/AC6YU1H6ZD/YWMXDQBOPH',
  'agentAliasId': 'YWMXDQBOPH',
  'agentAliasName': 'bedrock-docs-alias',
  'agentAliasStatus': 'CREATING',
  'agentId': 'AC6YU1H6ZD',
  'createdAt': datetime.datetime(2024, 6, 8, 0, 42, 19, 785101, tzinfo=tzlocal()),
  'routingConfiguration': [{}],
  'updatedAt': datetime.datetime(2024, 6, 8, 0, 42, 19, 785101, tzinfo=tzlocal())}}


이제 에이전트를 만들었으므로 bedrock-agent-runtime 클라이언트를 사용하여 이 에이전트를 호출하고 Knowledge base에서 정보를 가져와 보겠습니다.

In [48]:
# Extract the agentAliasId from the response
agent_alias_id = agent_alias['agentAlias']['agentAliasId']

## create a random id for session initiator id
session_id:str = str(uuid.uuid1())
enable_trace:bool = True
end_session:bool = False

# invoke the agent API
agentResponse = bedrock_agent_runtime_client.invoke_agent(
    inputText="Bedrock에서 모델을 평가하려면 어떻게 해야 하나요?",
    agentId=agent_id,
    agentAliasId=agent_alias_id, 
    sessionId=session_id,
    enableTrace=enable_trace, 
    endSession= end_session
)

logger.info(pprint.pprint(agentResponse))


[2024-06-08 01:26:22,201] p25453 {3878994997.py:19} INFO - None


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-type': 'application/json',
                                      'date': 'Sat, 08 Jun 2024 01:26:22 GMT',
                                      'transfer-encoding': 'chunked',
                                      'x-amz-bedrock-agent-session-id': '1d42a072-2536-11ef-acee-0a86a609e469',
                                      'x-amzn-bedrock-agent-content-type': 'application/json',
                                      'x-amzn-requestid': '591ff377-956c-43a2-a325-532d390b9523'},
                      'HTTPStatusCode': 200,
                      'RequestId': '591ff377-956c-43a2-a325-532d390b9523',
                      'RetryAttempts': 0},
 'completion': <botocore.eventstream.EventStream object at 0x7f7c5d58abc0>,
 'contentType': 'application/json',
 'sessionId': '1d42a072-2536-11ef-acee-0a86a609e469'}


In [49]:
%%time
event_stream = agentResponse['completion']
try:
    for event in event_stream:        
        if 'chunk' in event:
            data = event['chunk']['bytes']
            logger.info(f"Final answer ->\n{data.decode('utf8')}")
            agent_answer = data.decode('utf8')
            end_event_received = True
            # End event indicates that the request finished successfully
        elif 'trace' in event:
            logger.info(json.dumps(event['trace'], indent=2))
        else:
            raise Exception("unexpected event.", event)
except Exception as e:
    raise Exception("unexpected event.", e)

[2024-06-08 01:26:55,311] p25453 {<timed exec>:11} INFO - {
  "agentAliasId": "YWMXDQBOPH",
  "agentId": "AC6YU1H6ZD",
  "agentVersion": "1",
  "sessionId": "1d42a072-2536-11ef-acee-0a86a609e469",
  "trace": {
    "orchestrationTrace": {
      "rationale": {
        "text": "",
        "traceId": "591ff377-956c-43a2-a325-532d390b9523-0"
      }
    }
  }
}
[2024-06-08 01:26:55,312] p25453 {<timed exec>:11} INFO - {
  "agentAliasId": "YWMXDQBOPH",
  "agentId": "AC6YU1H6ZD",
  "agentVersion": "1",
  "sessionId": "1d42a072-2536-11ef-acee-0a86a609e469",
  "trace": {
    "orchestrationTrace": {
      "invocationInput": {
        "invocationType": "KNOWLEDGE_BASE",
        "knowledgeBaseLookupInput": {
          "knowledgeBaseId": "YMUCEHYCLK",
          "text": "Bedrock\uc5d0\uc11c \ubaa8\ub378\uc744 \ud3c9\uac00\ud558\ub824\uba74 \uc5b4\ub5bb\uac8c \ud574\uc57c \ud558\ub098\uc694?"
        },
        "traceId": "591ff377-956c-43a2-a325-532d390b9523-0"
      }
    }
  }
}
[2024-06-08 01:26:

CPU times: user 7.86 ms, sys: 0 ns, total: 7.86 ms
Wall time: 6.36 ms


In [38]:
# And here is the response if you just want to see agent's reply
print(agent_answer)

Bedrock에서 모델을 평가하려면 다음과 같은 단계를 거쳐야 합니다:

1. Amazon Bedrock 콘솔에 접속하여 "Model evaluation" 메뉴를 선택합니다.
2. "Create human-based evaluation"을 선택하여 평가 작업을 생성합니다.
3. 평가 작업의 이름과 설명을 입력하고, 평가할 모델을 선택합니다.
4. 평가 지표와 평가 방법을 선택하고, 프롬프트 데이터셋과 결과 저장 위치를 지정합니다.
5. 작업자를 위한 지침을 작성하고, 최종적으로 작업을 생성합니다.


In [44]:
def simple_agent_invoke(input_text, agent_id, agent_alias_id, session_id=None, enable_trace=False, end_session=False):
    if session_id is None:
        session_id:str = str(uuid.uuid1())

    agentResponse = bedrock_agent_runtime_client.invoke_agent(
        inputText="답변은 무조건 한국어로 해줘" + input_text,
        agentId=agent_id,
        agentAliasId=agent_alias_id, 
        sessionId=session_id,
        enableTrace=enable_trace, 
        endSession= end_session
    )
    logger.info(pprint.pprint(agentResponse))
    
    agent_answer = ''
    event_stream = agentResponse['completion']
    try:
        for event in event_stream:        
            if 'chunk' in event:
                data = event['chunk']['bytes']
                logger.info(f"Final answer ->\n{data.decode('utf8')}")
                agent_answer = data.decode('utf8')
                end_event_received = True
                # End event indicates that the request finished successfully
            elif 'trace' in event:
                logger.info(json.dumps(event['trace'], indent=2))
            else:
                raise Exception("unexpected event.", event)
    except Exception as e:
        raise Exception("unexpected event.", e)
    return agent_answer

In [50]:
simple_agent_invoke("Bedrock의 provisioned throughtput은 무엇인가요?", agent_id, agent_alias_id, session_id)

[2024-06-08 01:27:21,833] p25453 {1629308684.py:13} INFO - None


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-type': 'application/json',
                                      'date': 'Sat, 08 Jun 2024 01:27:21 GMT',
                                      'transfer-encoding': 'chunked',
                                      'x-amz-bedrock-agent-session-id': '1d42a072-2536-11ef-acee-0a86a609e469',
                                      'x-amzn-bedrock-agent-content-type': 'application/json',
                                      'x-amzn-requestid': 'e18b2af8-85dd-4065-bfa3-123485714b04'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'e18b2af8-85dd-4065-bfa3-123485714b04',
                      'RetryAttempts': 0},
 'completion': <botocore.eventstream.EventStream object at 0x7f7c5d58a920>,
 'contentType': 'application/json',
 'sessionId': '1d42a072-2536-11ef-acee-0a86a609e469'}


[2024-06-08 01:27:33,737] p25453 {1629308684.py:21} INFO - Final answer ->
Bedrock의 provisioned throughput은 모델의 입력 및 출력 처리 속도를 높이기 위해 구매할 수 있는 기능입니다. 사용자는 모델 단위(MU)의 수와 약정 기간을 선택하여 provisioned throughput을 구매할 수 있습니다. 이를 통해 모델의 처리 속도를 높일 수 있습니다.


'Bedrock의 provisioned throughput은 모델의 입력 및 출력 처리 속도를 높이기 위해 구매할 수 있는 기능입니다. 사용자는 모델 단위(MU)의 수와 약정 기간을 선택하여 provisioned throughput을 구매할 수 있습니다. 이를 통해 모델의 처리 속도를 높일 수 있습니다.'

In [52]:
simple_agent_invoke("Bedrock Guardrail의 구성 요소는 무엇인가요?", agent_id, agent_alias_id, session_id)

[2024-06-08 01:28:30,869] p25453 {1629308684.py:13} INFO - None


{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-type': 'application/json',
                                      'date': 'Sat, 08 Jun 2024 01:28:30 GMT',
                                      'transfer-encoding': 'chunked',
                                      'x-amz-bedrock-agent-session-id': '1d42a072-2536-11ef-acee-0a86a609e469',
                                      'x-amzn-bedrock-agent-content-type': 'application/json',
                                      'x-amzn-requestid': 'cc6a3137-b727-46bd-b62e-692013825e5f'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'cc6a3137-b727-46bd-b62e-692013825e5f',
                      'RetryAttempts': 0},
 'completion': <botocore.eventstream.EventStream object at 0x7f7c5d58aa10>,
 'contentType': 'application/json',
 'sessionId': '1d42a072-2536-11ef-acee-0a86a609e469'}


[2024-06-08 01:28:37,950] p25453 {1629308684.py:21} INFO - Final answer ->
Bedrock Guardrail의 구성 요소는 다음과 같습니다: - 주제(Topics)
- 콘텐츠 필터(Content filters)
- 거부된 주제(Denied topics)
- 단어 필터(Word filters)
- 민감한 정보 필터(Sensitive information filters)


'Bedrock Guardrail의 구성 요소는 다음과 같습니다: - 주제(Topics)\n- 콘텐츠 필터(Content filters)\n- 거부된 주제(Denied topics)\n- 단어 필터(Word filters)\n- 민감한 정보 필터(Sensitive information filters)'

## 6 - Clean up (Optional)

The next steps are optional and demonstrate how to delete our agent. To delete the agent we need to:

1. delete agent alias
1. delete agent
1. delete the knowledge base
1. delete the OpenSearch Serverless vector store
1. empty created s3 bucket
1. delete s3 bucket


In [53]:
agent_alias_deletion = bedrock_agent_client.delete_agent_alias(
    agentId=agent_id,
    agentAliasId=agent_alias['agentAlias']['agentAliasId']
)

In [54]:
agent_deletion = bedrock_agent_client.delete_agent(
    agentId=agent_id
)

In [55]:
# Empty and delete S3 Bucket

objects = s3_client.list_objects(Bucket=bucket_name)  
if 'Contents' in objects:
    for obj in objects['Contents']:
        s3_client.delete_object(Bucket=bucket_name, Key=obj['Key']) 
s3_client.delete_bucket(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': 'YWP210XABNDFDD4A',
  'HostId': '3TkDYfHoCcXp/xQ2GHrW8x+A4XAUgZAlxqjd7sFdaxXhHVORkKd3rxXnMLtLQxXZcZa3UozuvcI0vbe6xVgtQw==',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': '3TkDYfHoCcXp/xQ2GHrW8x+A4XAUgZAlxqjd7sFdaxXhHVORkKd3rxXnMLtLQxXZcZa3UozuvcI0vbe6xVgtQw==',
   'x-amz-request-id': 'YWP210XABNDFDD4A',
   'date': 'Sat, 08 Jun 2024 01:29:51 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

In [56]:
# Delete IAM Roles and policies and Knowledge Base files
for policy in [
    agent_bedrock_policy, 
    agent_kb_schema_policy,
    kb_bedrock_policy,
    kb_aoss_policy,
    kb_s3_policy
]:
    response = iam_client.list_entities_for_policy(
        PolicyArn=policy['Policy']['Arn'],
        EntityFilter='Role'
    )

    for role in response['PolicyRoles']:
        iam_client.detach_role_policy(
            RoleName=role['RoleName'], 
            PolicyArn=policy['Policy']['Arn']
        )

    iam_client.delete_policy(
        PolicyArn=policy['Policy']['Arn']
    )

    

for role_name in [
    agent_role_name, 
    kb_role_name
]:
    try: 
        iam_client.delete_role(
            RoleName=role_name
        )
    except Exception as e:
        print(e)
        print("couldn't delete role", role_name)
        
    
try:

    open_search_serverless_client.delete_collection(
        id=opensearch_collection_response["createCollectionDetail"]["id"]
    )

    open_search_serverless_client.delete_access_policy(
          name=kb_collection_name,
          type='data'
    )    

    open_search_serverless_client.delete_security_policy(
          name=kb_collection_name,
          type='network'
    )   

    open_search_serverless_client.delete_security_policy(
          name=kb_collection_name,
          type='encryption'
    )    
    bedrock_agent_client.delete_knowledge_base(
        knowledgeBaseId=knowledge_base_id
    )
except Exception as e:
    print(e)

## Conclusion

We have now experimented with using boto3 SDK to create, invoke and delete an agent having a single KB connected to it.
## Take aways

Adapt this notebook to create new agents for your application

## Thank You