# Create Sub-agents and Knowledge Base

Sub-agents play a crucial role in multi-agent collaboration with Large Language Models (LLMs), enhancing their problem-solving capabilities and efficiency. These specialized components within the larger system are designed to handle specific tasks or domains, allowing for a more modular and focused approach to complex problems. By dividing responsibilities among sub-agents, LLMs can tackle multifaceted challenges more effectively, leveraging the unique strengths and expertise of each sub-agent. Furthermore, sub-agents facilitate better scalability and adaptability, as new functionalities can be added or existing ones modified without overhauling the entire system. The interaction and coordination between sub-agents also promote emergent behaviors and solutions that may not be achievable by a single, monolithic agent. This collaborative approach mirrors human teamwork, where diverse skills and perspectives combine to achieve superior outcomes.

In this first lab you will be creating two agents and one Knowledge Base. The first agent will handle questions regarding existing mortgages. The second agent will handle applications for a new mortgage. The Knowledge Base you create will handle any general mortgage or refinancing related questions. This lab follows the below high-level steps:

* Create a Knowledge Base (KB) using [Knowledge Bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/)

* Create sample Knowledge Base documents and upload them to the Knowledge Base

* Create the agent for managing existing mortgages and supporting AWS Lambda function

* Create the agent for handling new mortgage applications and supporting AWS Lambda function

* Provide session context to the agents

* Test agent responses

* Measure latency of a single agent



The first step is to install the prerequisite packages. NOTE: You only need to do this if this is the first notebook you are running.

In [1]:
# !pip install --upgrade -q -r requirements.txt
# !pip install --upgrade boto3 botocore 

In [1]:
# restart kernel for packages to take effect
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [2]:
import boto3 
boto3.__version__

'1.34.143'

In [3]:
import os
import time
import boto3
import logging
import botocore
import json

%load_ext autoreload
%autoreload 2

from knowledge_base import BedrockKnowledgeBase
from agent import AgentsForAmazonBedrock

In [4]:
agents = AgentsForAmazonBedrock()

In [5]:
#Clients
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime')
bedrockruntime_client = boto3.client('bedrock-runtime')

logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [6]:
region = agents.get_region()
account_id = sts_client.get_caller_identity()["Account"]

suffix = f"{region}-{account_id}"
bucket_name = f'mac-workshop-{suffix}'
agent_foundation_models = [ 
    "anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-sonnet-20240229-v1:0"
    ]

## 1. Create Knowledge Base 
Let's start by creating a [Knowledge Base for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/) 
to provide knowledge about mortgages. Knowledge Bases allow you to integrate with different vector databases including [Amazon OpenSearch Serverless](https://aws.amazon.com/opensearch-service/features/serverless/), [Amazon Aurora](https://aws.amazon.com/rds/aurora/) and [Pinecone](http://app.pinecone.io/bedrock-integration). For this example, we will integrate the knowledge base with Amazon OpenSearch Serverless. To do so, we will use the helper class `BedrockKnowledgeBase` which creates the knowledge base and all of its prerequisites:
1. IAM roles and policies
2. S3 bucket
3. Amazon OpenSearch Serverless encryption, network and data access policies
4. Amazon OpenSearch Serverless collection
5. Amazon OpenSearch Serverless vector index
6. Knowledge Base
7. Knowledge Base data source

In [None]:
knowledge_base = BedrockKnowledgeBase(
    kb_name="general-mortgage-kb",
    kb_description="Useful for answering questions about mortgage refinancing and for questions comparing various mortgage types",
    data_bucket_name=bucket_name
)

### Create sample KB documents and upload the dataset to Amazon S3
Now that we have created the knowledge base, let's populate it with some mortgage knowledge. We will first use Anthropic Claude to create a sample document that describes refinancing and different types of mortgages. The Knowledge Base data source expects the data to be available in the S3 bucket connected to it, and changes to the data can be synchronized to the knowledge base using the `StartIngestionJob` API call. In this example we will use the [boto3 abstraction](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent/client/start_ingestion_job.html) of the API, via our helper class.

Let's create our sample document and then upload it to S3 so it can be ingested into the knowledge base. 

In [8]:
# create the prompt
prompt= "Create a document of about 4 pages long that discusses the topic of refinancing a mortgage and 15 vs 30 year mortgage. Please organize the subtopics to flow logically and have separate sections. Make sure each section includes all relevant details that may be important. Clearly illustrate the pros and cons of refinancing and also the differences between a 15 and 30 year mortgage. Only output the text and not any additional info like metadata about the document."

In [9]:
# modelId = 'amazon.titan-text-premier-v1:0' # Make sure Titan Text Premier is available in the account you are doing this workshop in before uncommenting!
modelId = "anthropic.claude-3-haiku-20240307-v1:0"

native_request = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 2000,
    "temperature": 0,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}],
        }
    ],
}

request = json.dumps(native_request)

try:
    # Invoke the model with the request.
    response = bedrockruntime_client.invoke_model(modelId=modelId, body=request)
except Exception as e:
    # Re-raise so the notebook stops at this cell instead of exiting the kernel.
    print(f"ERROR: Can't invoke '{modelId}'. Reason: {e}")
    raise

model_response = json.loads(response["body"].read())
response_text = model_response["content"][0]["text"]
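
The cell above reads only the first content block of the response. As a minimal sketch, assuming the response body keeps the Messages shape used above (a `content` list of typed blocks plus a `stop_reason` field), the hypothetical helper `extract_text` below joins every text block, and the `stop_reason` check flags a document that was cut off by the `max_tokens` limit. The sample dictionary is hand-written for illustration and is not real model output.

```python
def extract_text(model_response: dict) -> str:
    """Join the text of every 'text' block in a Messages-style response body."""
    blocks = model_response.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


# Hand-written sample shaped like the parsed response body (not real model output).
sample = {
    "content": [
        {"type": "text", "text": "Refinancing replaces "},
        {"type": "text", "text": "an existing loan."},
    ],
    "stop_reason": "max_tokens",
}

text = extract_text(sample)
# A stop_reason of "max_tokens" means the generated document was truncated.
truncated = sample.get("stop_reason") == "max_tokens"
```

If `truncated` is true for the real response, raising `max_tokens` in the request is one option to consider before writing the document to disk.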

In [None]:
import os

# Create the local dataset directory if it does not exist yet
os.makedirs("mortgage_dataset", exist_ok=True)

In [11]:
# The context manager closes the file automatically
with open('mortgage_dataset/kb.txt', 'w') as kb:
    kb.write(response_text)

In [None]:
def upload_directory(path, bucket_name):
    # Walk the directory tree and upload every file, using the file name as the S3 key
    for root, dirs, files in os.walk(path):
        for file in files:
            file_to_upload = os.path.join(root, file)
            print(f"uploading file {file_to_upload} to {bucket_name}")
            s3_client.upload_file(file_to_upload, bucket_name, file)

upload_directory("mortgage_dataset", bucket_name)
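
Note that `upload_directory` uses the bare file name as the S3 key, so two files with the same name in different sub-directories would overwrite each other. That is harmless for the single `kb.txt` used here. For larger datasets, one possible variation is sketched below: the hypothetical helper `plan_uploads` (not part of the workshop code) derives each key from the path relative to the dataset root. It only computes the keys and makes no AWS calls.

```python
import os
import tempfile


def plan_uploads(path, prefix=""):
    """Return (local_file, s3_key) pairs, keeping sub-directory structure in the key."""
    plan = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            local_file = os.path.join(root, name)
            # Key is the path relative to the dataset root, with forward slashes.
            key = os.path.relpath(local_file, path).replace(os.sep, "/")
            if prefix:
                key = prefix.strip("/") + "/" + key
            plan.append((local_file, key))
    return plan


# Demonstrate on a throwaway directory tree with two files named kb.txt.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "faq"))
    for rel in ("kb.txt", os.path.join("faq", "kb.txt")):
        with open(os.path.join(tmp, rel), "w") as fh:
            fh.write("sample")
    keys = sorted(key for _local, key in plan_uploads(tmp))
```

Each pair could then be passed to `s3_client.upload_file(local_file, bucket_name, key)`.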

Now we ingest the documents, which chunks the source documents and stores an embedding for each chunk in the underlying
knowledge base vector store. For a simple example like this one, ingestion takes a couple of minutes.

In [None]:
# ensure that the kb is available
time.sleep(30)
# sync knowledge base
knowledge_base.start_ingestion_job()
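
To make the chunking step more concrete, here is a rough illustration of fixed-size chunking with overlap. It is a simplified sketch only: it splits on words, whereas the managed ingestion job works on tokens and follows the chunking strategy configured on the data source. The function name `chunk_words` and the sizes are assumptions chosen for the example.

```python
def chunk_words(text, chunk_size=300, overlap=60):
    """Split text into windows of chunk_size words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


demo = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(demo, chunk_size=4, overlap=1)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some text twice.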

Finally, we collect the Knowledge Base ID so we can integrate it with our agent later on.

In [14]:
from knowledge_base import BedrockKnowledgeBaseHelper
helper = BedrockKnowledgeBaseHelper()
kb_id = helper.get_kb()
kb_arn = f"arn:aws:bedrock:{region}:{account_id}:knowledge-base/{kb_id}"

In [None]:
kb_id 

### Test the Knowledge Base
Now that the Knowledge Base is available, we can test it using the [**retrieve**](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent-runtime/client/retrieve.html) and [**retrieve_and_generate**](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent-runtime/client/retrieve_and_generate.html) functions.

#### Testing Knowledge Base with Retrieve and Generate API

Let's first test the knowledge base using the retrieve and generate API. With this API, Bedrock takes care of retrieving the necessary references from the knowledge base and generating the final answer using a Bedrock LLM.

In [None]:
response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": "compare and contrast 15-year vs 30-year mortgage type"
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": f"arn:aws:bedrock:{region}::foundation-model/{agent_foundation_models[0]}",
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            }
        }
    }
)

print(response['output']['text'],end='\n'*2)

As you can see, the retrieve and generate API returns the final response directly, and printing only the output text does not show the sources used to generate it. Let's now retrieve the source information from the knowledge base with the retrieve API.
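
Before moving on, note that the `retrieve_and_generate` response also carries a `citations` list next to `output`. The sketch below collects the distinct S3 URIs it references. The helper name `cited_sources` is hypothetical, and the sample dictionary is hand-written to follow the documented response shape rather than copied from a real call.

```python
def cited_sources(rag_response):
    """Collect the distinct S3 URIs referenced in a retrieve_and_generate response."""
    uris = []
    for citation in rag_response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in uris:
                uris.append(uri)
    return uris


# Hand-written sample in the documented response shape (not real service output).
sample_response = {
    "output": {"text": "A 15-year mortgage has higher monthly payments..."},
    "citations": [
        {"retrievedReferences": [
            {"content": {"text": "..."},
             "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/kb.txt"}}},
        ]},
        {"retrievedReferences": [
            {"content": {"text": "..."},
             "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/kb.txt"}}},
        ]},
    ],
}

sources = cited_sources(sample_response)
```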

#### Testing Knowledge Base with Retrieve API
If you need an extra layer of control, you can retrieve the chunks that best match your query using the retrieve API. In this setup, you can configure the desired number of results and control the final answer with your own application logic. The API then provides you with the matching content, its S3 location, the similarity score, and the chunk metadata.

In [None]:
response_ret = bedrock_agent_runtime_client.retrieve(
    knowledgeBaseId=kb_id,
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults":3,
        } 
    },
    retrievalQuery={
        'text': 'What are the cons of a 15-year mortgage?'
    }
)

def response_print(retrieve_resp):
    # structure of 'retrievalResults': a list of chunks, each with content, location, score, metadata
    for num, chunk in enumerate(retrieve_resp['retrievalResults'], 1):
        print('-----------------------------------------------------------------------------------------')
        print(f'Chunk {num}: ',chunk['content']['text'],end='\n'*2)
        print(f'Chunk {num} Location: ',chunk['location'],end='\n'*2)
        print(f'Chunk {num} Score: ',chunk['score'],end='\n'*2)
        print(f'Chunk {num} Metadata: ',chunk['metadata'],end='\n'*2)

response_print(response_ret)
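
As an example of the "own application logic" mentioned above, the following sketch keeps only chunks above a similarity threshold and joins them into a context string that could be placed in a prompt. The helper name `build_context`, the threshold of 0.5, and the sample results are illustrative assumptions; a suitable threshold depends on the embedding model and the data.

```python
def build_context(retrieval_results, min_score=0.5, max_chunks=3):
    """Keep the highest-scoring chunks at or above min_score and join them into one string."""
    kept = [r for r in retrieval_results if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return "\n\n".join(r["content"]["text"] for r in kept[:max_chunks])


# Hand-written results in the same shape as response_ret['retrievalResults'].
sample_results = [
    {"content": {"text": "Higher monthly payments."}, "score": 0.62},
    {"content": {"text": "Unrelated passage."}, "score": 0.31},
    {"content": {"text": "Less budget flexibility."}, "score": 0.71},
]

context = build_context(sample_results)
```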

## 2. Create the Agent for managing existing mortgages

In this section we will go through all the steps to create an Agent for Amazon Bedrock. 

These are the steps to complete:
    
1. Create the new agent (with the helper function taking care of IAM role creation)
2. Add an action group backed by a new Lambda function (with the helper function handling IAM role creation, Lambda function creation, adding the action group to the agent, and preparing the agent)

#### Create the Lambda function code
Here we create a source code file for a new Lambda function to implement the action group for the Existing Mortgage agent. Note the **TODO** section in the code below: in this example the function returns a hardcoded response, but in your environment this is where your business logic would reside.

In [None]:
%%writefile existing_mortgage_function.py
import json

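def get_named_parameter(event, name):
    # Return None when the parameter is absent so the handler's own check can report it
    for item in event.get('parameters', []):
        if item['name'] == name:
            return item['value']
    return None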
    
def populate_function_response(event, response_body):
    return {'response': {'actionGroup': event['actionGroup'], 'function': event['function'],
                'functionResponse': {'responseBody': {'TEXT': {'body': str(response_body)}}}}}

def get_mortgage_status(customer_id):
    # TODO: Implement real business logic to retrieve mortgage status
    return {
        "account_number": customer_id,
        "outstanding_principal": 150000.0,
        "interest_rate": 4.5,
        "maturity_date": "2030-06-30",
        "payments_remaining": 72,
        "last_payment_date": "2024-06-01",
        "next_payment_due": "2024-07-01",
        "next_payment_amount": 1250.0
    }

def lambda_handler(event, context):
    print(event)
    function = event['function']
    if function == 'get_mortgage_status':
        customer_id = get_named_parameter(event, 'customer_id')
        if not customer_id:
            raise Exception("Missing mandatory parameter: customer_id")
        result = get_mortgage_status(customer_id)
    else:
        result = f"Error, function '{function}' not recognized"

    response = populate_function_response(event, result)
    print(response)
    return response
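
Before deploying, it can help to exercise the helper functions locally. The sketch below repeats the two helpers so that it runs on its own and feeds them a hand-written event. The event is a hypothetical minimal example containing only the keys the handler reads (`actionGroup`, `function`, `parameters`); a real event from the agent carries additional fields.

```python
def get_named_parameter(event, name):
    # Parameter lookup in the spirit of the Lambda source above; returns None when absent.
    for item in event.get('parameters', []):
        if item['name'] == name:
            return item['value']
    return None


def populate_function_response(event, response_body):
    # Same response envelope as in the Lambda source above.
    return {'response': {'actionGroup': event['actionGroup'], 'function': event['function'],
                'functionResponse': {'responseBody': {'TEXT': {'body': str(response_body)}}}}}


# Hypothetical event, hand-written for local testing.
sample_event = {
    "actionGroup": "existing_mortgage_ag",
    "function": "get_mortgage_status",
    "parameters": [{"name": "customer_id", "type": "string", "value": "98991"}],
}

customer_id = get_named_parameter(sample_event, "customer_id")
reply = populate_function_response(sample_event, {"account_number": customer_id})
body = reply["response"]["functionResponse"]["responseBody"]["TEXT"]["body"]
```

Because the body is produced with `str()`, the agent receives the Python representation of the dictionary as plain text.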

Next, we create the function for getting the mortgage application document status. As before, note the **TODO** section in the code below: in this example the function returns a hardcoded response, but in your environment this is where your business logic would reside.

In [None]:
%%writefile mortgage_application_function.py
import json

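def get_named_parameter(event, name):
    # Return None when the parameter is absent so the handler's own check can report it
    for item in event.get('parameters', []):
        if item['name'] == name:
            return item['value']
    return None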
    
def populate_function_response(event, response_body):
    return {'response': {'actionGroup': event['actionGroup'], 'function': event['function'],
                'functionResponse': {'responseBody': {'TEXT': {'body': str(response_body)}}}}}

def get_mortgage_application_document_status(customer_id):
    # TODO: Implement the actual logic to retrieve the document status for the given customer ID
    return [
        {
            "type": "proof_of_income",
            "status": "COMPLETED"
        },
        {
            "type": "employment_information",
            "status": "MISSING"
        },
        {
            "type": "proof_of_assets",
            "status": "COMPLETED"
        },
        {
            "type": "credit_information",
            "status": "COMPLETED"
        }
    ]

def lambda_handler(event, context):
    function = event['function']

    if function == 'get_mortgage_application_document_status':
        customer_id = get_named_parameter(event, 'customer_id')
        if not customer_id:
            raise Exception("Missing mandatory parameter: customer_id")
        result = get_mortgage_application_document_status(customer_id)
    else:
        raise Exception(f"Unrecognized function: {function}")

    response = populate_function_response(event, result)
    print(response)
    return response

Now we create the agent itself, giving it a name, a brief description, and most importantly, a set of instructions.

In [27]:
existing_mortgage_agent_id = agents.create_agent("existing_mortgage_agent", 
"""
you are a mortgage bot. you can retrieve the latest details about an existing mortgage on behalf of customers.
""", 
"""
you are a mortgage bot. you can retrieve the latest details about a user's current mortgage. resist the temptation to ask the user for input. 
only do so after you have exhausted available actions. never ask the user for information that you already can retrieve yourself through 
available actions. for example, you have actions to retrieve details about the existing mortgage (interest rate, balance, number of payments, 
mortgage maturity date, last payment date, next payment date, etc.). 
so never ask the user for those details (it would be very annoying).
never make up information that you are unable to retrieve from your available actions. do not engage with users about topics other than
an existing mortgage. leave those other topics for other experts to handle. for example, do not respond to general questions about mortgages.
never make up the customer ID. always ask for it if you need it for using an available action, but don't yet have it. 
sending "UNKNOWN" as the customer ID to an action is not acceptable. confirm the ID with the user.
""",

                                                 agent_foundation_models)

Lastly, we add an action group to the new agent. The helper also creates the new Lambda function for us and prepares the agent.

NOTE: we can use a simple PartyRock app to quickly generate both the Function Definitions and the draft Lambda function code based on
a simple description of the action group APIs to support. For this example, we could simply describe the desired API as:

```
single api to get mortgage status. takes a customer id. returns an object with a number of fields: account number, outstanding principal, interest rate, maturity date, number of payments remaining, due date of next payment, amount of next payment.
```

In [28]:
resp = agents.add_action_group_with_lambda(
            "existing_mortgage_agent", 
            "existing_mortgage_ag", 
            "existing_mortgage_function.py", 
            [
                {
                    "name": "get_mortgage_status",
                    "description": """
                    Retrieves the mortgage status for a given customer ID. Returns an object containing the account number, 
                    outstanding principal, interest rate, maturity date, number of payments remaining, due date of next payment, 
                    and amount of next payment.""",
                    "parameters": {
                    "customer_id": {
                        "description": "The unique identifier for the customer whose mortgage status is being requested.",
                        "type": "string",
                        "required": True
                    }
                    }
                }
            ], 
            "existing_mortgage_actions", 
            "Set of functions for managing an existing mortgage"
            )
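
Function definitions like the one above are easy to get subtly wrong, for example by writing `"str"` instead of `"string"`. The following sketch is a small lint pass over definitions in this dictionary format. The helper `validate_function_defs` is hypothetical, the set of allowed types reflects our understanding of the Bedrock function schema, and the description and `required` checks are workshop conventions rather than service requirements.

```python
# Parameter types accepted in Bedrock function definitions, to the best of our knowledge.
ALLOWED_TYPES = {"string", "number", "integer", "boolean", "array"}


def validate_function_defs(function_defs):
    """Return a list of readable problems; an empty list means nothing was flagged."""
    problems = []
    for fn in function_defs:
        name = fn.get("name", "<unnamed>")
        if not fn.get("description", "").strip():
            problems.append(f"{name}: missing description")
        for pname, spec in fn.get("parameters", {}).items():
            if spec.get("type") not in ALLOWED_TYPES:
                problems.append(f"{name}.{pname}: unsupported type {spec.get('type')!r}")
            if not isinstance(spec.get("required"), bool):
                problems.append(f"{name}.{pname}: 'required' should be a bool")
    return problems


good = [{"name": "get_mortgage_status", "description": "Retrieves the mortgage status.",
         "parameters": {"customer_id": {"description": "Customer identifier.",
                                        "type": "string", "required": True}}}]
bad = [{"name": "get_mortgage_status", "description": "",
        "parameters": {"customer_id": {"type": "str", "required": "yes"}}}]

good_problems = validate_function_defs(good)
bad_problems = validate_function_defs(bad)
```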

In [None]:
print(agents.invoke("I'm customer 98991. when's my next payment due?", existing_mortgage_agent_id))

## 3. Create the Agent for managing an application for a new mortgage

In this section we will go through all the steps to create an Agent for Amazon Bedrock. 

These are the steps to complete:
    
1. Create the new agent (with the helper function taking care of IAM role creation)
2. Add an action group backed by a new Lambda function (with the helper function handling IAM role creation, Lambda function creation, adding the action group to the agent, and preparing the agent)

In [30]:
mortgage_application_agent_id = agents.create_agent("mortgage_application_agent", 
"""
you are a bot to create, manage, and complete an application for a new mortgage. you help customers know what documentation 
they already provided and which ones they still need to provide.
""", 
"""
you are a mortgage bot for creating, managing, and completing an application for a new mortgage. 
you can help customers know what documentation they have already provided and which ones they still need to provide.
never make up information that you are unable to retrieve from your available actions. do not engage with users about topics other than
an application for a new mortgage. leave those other topics for other experts to handle. for example, do not respond to general questions about mortgages.
""",

                                                 agent_foundation_models)

Lastly, we add an action group to the new agent. The helper also creates the new Lambda function for us and prepares the agent.

NOTE: we can use a simple PartyRock app to quickly generate both the Function Definitions and the draft Lambda function code based on
a simple description of the action group APIs to support. For this example, we could simply describe the desired API as:

```
single api to get the list of documents required to complete a mortgage application that is in process, and the status of each one. the api takes in a customer id, and gives back a list of objects, one for each required document. each object has the type of the required document, and the status (COMPLETED or MISSING). the required document types are proof of income, employment information, proof of assets, and credit information. the list contains one object for each of these document types.
```

In [31]:
resp = agents.add_action_group_with_lambda(
            "mortgage_application_agent", 
            "mortgage_application_ag", 
            "mortgage_application_function.py", 
            [
                {
                    "name": "get_mortgage_application_document_status",
                    "description": """
                    Retrieves the list of required documents for a mortgage application in process, 
                    along with their respective statuses (COMPLETED or MISSING). 
                    The function takes a customer ID as input and returns a list of objects, where each object represents 
                    a required document type. The required document types are proof of income, employment information, 
                    proof of assets, and credit information. Each object in the returned list contains the type of the 
                    required document and its corresponding status.""",
                    "parameters": {
                    "customer_id": {
                        "description": """
                    The unique identifier of the customer whose mortgage application document status is to be retrieved.""",
                        "type": "string",
                        "required": True
                    }
                    }
                }
            ], 
            "mortgage_application_actions", 
            "Set of functions for managing an application for a new mortgage"
            )

In [None]:
print(agents.invoke("I'm customer 98991. what docs do I still owe you?", 
                    mortgage_application_agent_id))

## 4. Use session attributes to provide context to the agent

In [None]:
from datetime import datetime
today = datetime.today().strftime('%b-%d-%Y')

session_state = {
    "promptSessionAttributes": {
        "customer_ID": "498",
        "today": today
    }
}
session_state
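
The `agents.invoke` helper presumably forwards this dictionary to the `invoke_agent` call of the `bedrock-agent-runtime` client as its `sessionState` argument. The sketch below shows how the keyword arguments for such a call could be assembled without calling AWS. The helper `build_invoke_kwargs`, the placeholder agent ID, and the use of the draft test alias `TSTALIASID` are assumptions for illustration, not the workshop helper's actual implementation.

```python
def build_invoke_kwargs(agent_id, session_id, text, session_state=None, alias_id="TSTALIASID"):
    """Assemble keyword arguments for bedrock-agent-runtime invoke_agent."""
    kwargs = {
        "agentId": agent_id,
        "agentAliasId": alias_id,  # assumption: the draft agent's test alias
        "sessionId": session_id,
        "inputText": text,
    }
    if session_state:
        # promptSessionAttributes are made available to the agent's prompt for this turn.
        kwargs["sessionState"] = session_state
    return kwargs


kwargs = build_invoke_kwargs(
    "AGENTID123",  # placeholder, not a real agent ID
    "123",
    "what docs do I still owe you?",
    session_state={"promptSessionAttributes": {"customer_ID": "498", "today": "Jun-15-2024"}},
)
# With live credentials this could be passed as:
# bedrock_agent_runtime_client.invoke_agent(**kwargs)
```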

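Session attributes let us pass context, such as the customer ID and today's date, along with each request. Here we ask the agent a question that requires a customer ID, and it succeeds without having to prompt the user.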

In [None]:
print(agents.invoke("what docs do I still owe you?", 
                    mortgage_application_agent_id, session_id="123",
                    session_state=session_state))

In [None]:
print(agents.invoke("how many years until my maturity date?", 
                    existing_mortgage_agent_id, session_id="123",
                    session_state=session_state))

In [None]:
print(agents.invoke("how many days until my next payment?", 
                    existing_mortgage_agent_id, session_id="123",
                    session_state=session_state))
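
The last two questions ask the model to do date arithmetic from the `today` attribute and the dates returned by the stubbed Lambda function, so it is worth checking the agent's answer against a value computed in code. A minimal sketch, using the hardcoded `next_payment_due` value from the stub and a fixed example date so the result is reproducible; `days_until` is a hypothetical helper.

```python
from datetime import datetime


def days_until(today_str, due_iso):
    """Days from `today` (in the '%b-%d-%Y' format used above) to an ISO 'YYYY-MM-DD' date."""
    today = datetime.strptime(today_str, "%b-%d-%Y").date()
    due = datetime.strptime(due_iso, "%Y-%m-%d").date()
    return (due - today).days


# Fixed example date; "2024-07-01" is the next_payment_due value hardcoded in the stub.
remaining = days_until("Jun-15-2024", "2024-07-01")
```

A negative result means the hardcoded due date is already in the past relative to `today`, which is expected when running the notebook after the stub's sample dates.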

## 5. Quick performance test

In [37]:
import uuid 
import time
import numpy as np

def query_loop_by_direct_invoke(query, agent_id, num_invokes):
    latencies = []
    for i in range(num_invokes):
        _session_id = str(uuid.uuid1())
        _start_time = time.time()
        resp = agents.invoke(query, agent_id, session_id=_session_id)
        _end_time = time.time()
        latencies.append(_end_time - _start_time)

    print(f'\n\nInvoked agent directly {num_invokes} times.')
    # get sum of total latencies
    total_time = sum(latencies)
    # get average latency
    avg_time = total_time / num_invokes
    # get p90 latency
    p90_time = np.percentile(latencies, 90)

    print(f'Average latency: {avg_time:.1f}s, P90 latency: {p90_time:.1f}s')

In [None]:
query_loop_by_direct_invoke("I am customer 999. how many years until the mortgage maturity date?", 
                            existing_mortgage_agent_id, 25)
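
With only 25 samples, a single slow invocation can shift the average noticeably, so it can be useful to report the median and maximum next to the P90. The sketch below uses only the standard library; its `percentile` function applies linear interpolation between neighboring ranks, which is the default method of `np.percentile`. The helper names and sample latencies are illustrative.

```python
import math


def percentile(values, pct):
    """Percentile with linear interpolation between the two nearest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (pct / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def summarize(latencies):
    """Return average, median, P90, and maximum latency in seconds."""
    return {
        "avg": sum(latencies) / len(latencies),
        "p50": percentile(latencies, 50),
        "p90": percentile(latencies, 90),
        "max": max(latencies),
    }


# Illustrative latencies in seconds, not measured values.
stats = summarize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
```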

## 6. Clean up

**NOTE: Don't execute the deletion cells if you are planning to proceed 
to subsequent notebooks, as those notebooks depend on the existence
of these agents and knowledge bases.**

In [6]:
agents.delete_lambda("existing_mortgage_ag")
agents.delete_agent("existing_mortgage_agent")

agents.delete_lambda("mortgage_application_ag")
agents.delete_agent("mortgage_application_agent")

In [None]:
knowledge_base.delete_kb()