# Multi-Agent Collaboration - Setup
To speed up the lab, we will set up some resources in advance. These include:

1. A Bedrock Knowledge Base: This will serve as our mock source of facts about companies, people, products, etc. In a real-world scenario, it would be replaced by API calls to internal and third-party information providers such as LexisNexis, Thomson Reuters Westlaw, Bloomberg Terminal, or Factiva.
2. A Bedrock Flow: This flow extracts entities from news facts and queries the Knowledge Base to gather research material about those entities. The resulting research should make it easier to write a comprehensive news article.

Running all the cells in this notebook takes around 10 minutes. Run all cells now and make sure they have all finished before opening the next notebook. To confirm this, scroll to the bottom and wait for the phrase "Setup Complete!" to appear.


First, we ensure that all Python packages required for this notebook are installed. Please ignore any errors.

In [None]:
%pip install -r ../../requirements.txt

Now we set up our SDK clients to communicate with various AWS services.

In [None]:
import boto3
import json
import os
import sys

sts_client = boto3.client('sts')
session = boto3.session.Session()

account_id = sts_client.get_caller_identity()["Account"]
region = session.region_name

s3_client = boto3.client('s3', region)
bedrock_client = boto3.client('bedrock-runtime', region)

Next, we import some helper functions that have been written to create:
1. Bedrock Knowledge Bases
2. Bedrock Agents
3. Bedrock Flow definitions

In [None]:
sys.path.insert(0, ".")
sys.path.insert(1, "..")
sys.path.insert(2, "../..")

from utils.bedrock_agent_helper import (
    AgentsForAmazonBedrock
)
from utils.knowledge_base_helper import (
    KnowledgeBasesForAmazonBedrock
)
from utils.flow_helper import (
    load_and_fill_json,
    make_connection
)
agents = AgentsForAmazonBedrock()
kb = KnowledgeBasesForAmazonBedrock()
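
The source of `load_and_fill_json` is not shown in this notebook. As a rough mental model only, a helper like it could substitute placeholders in a JSON template and then parse the result. The sketch below assumes a `{{name}}` placeholder syntax and works on a string rather than a file; both the function name `fill_json_template` and the placeholder style are hypothetical and may differ from what `utils.flow_helper` actually does.

```python
import json
import re


def fill_json_template(template_text, values):
    """Replace {{name}} placeholders (an assumed syntax) and parse the JSON."""
    filled = template_text
    for name, value in values.items():
        filled = filled.replace("{{" + name + "}}", str(value))

    # Fail loudly if a placeholder was not supplied, rather than sending
    # a half-filled definition to the API.
    leftover = re.findall(r"\{\{(\w+)\}\}", filled)
    if leftover:
        raise KeyError(f"Unfilled placeholders: {leftover}")
    return json.loads(filled)


# Hypothetical template text; the real JSON files in this lab may look different.
example_template = '{"knowledgeBaseId": "{{kb_id}}", "expression": "$.data"}'
example_node = fill_json_template(example_template, {"kb_id": "KB123"})
print(example_node)
```

Plain string replacement is used here instead of `string.Template` because flow definitions can contain expressions such as `$.data`, which `Template.substitute` would reject as an invalid placeholder.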

Let's start by defining the settings for the Knowledge Base we're about to create.

In [None]:
knowledge_base_name = 'lab7-mac-kb'
knowledge_base_description = "KB containing information about entities like companies, people, and products"
s3_bucket_name = f"labs-bucket-{region}-{account_id}"
bucket_prefix = "data/kb/mac/"

Let's create the Knowledge Base. This command can take a while to finish running.
Behind the scenes, Bedrock is spinning up:
1. OpenSearch Serverless Collection
2. OpenSearch Vector Index
3. Bedrock Knowledge Base

In [None]:
kb_id, ds_id = kb.create_or_retrieve_knowledge_base(
    knowledge_base_name,
    knowledge_base_description,
    s3_bucket_name,
    "amazon.titan-embed-text-v2:0",
    bucket_prefix
)

print(f"Knowledge Base ID: {kb_id}")
print(f"Data Source ID: {ds_id}")

We have synthetically generated mock information regarding companies, products, and people to simulate the output you may see from an information service. These have been stored in the `information_sources` directory.

We will now upload the mock information to the S3 bucket connected to our knowledge base.

In [None]:
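def upload_directory(path, bucket_name, bucket_prefix):
    """Upload every file under `path` to the bucket, keyed by prefix + file name."""
    for root, _dirs, files in os.walk(path):
        for file in files:
            file_to_upload = os.path.join(root, file)
            # Only the file name goes into the key, so subdirectories are flattened
            key = f"{bucket_prefix}{file}"
            print(f"uploading file {file_to_upload} to s3://{bucket_name}/{key}")
            s3_client.upload_file(file_to_upload, bucket_name, key)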

In [None]:
upload_directory("../information_sources", s3_bucket_name, bucket_prefix)
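
The upload function keys each object by its file name alone, so two files with the same name in different subdirectories would overwrite each other. That is fine if the mock data is flat or uses unique names. If it is not, one option is to keep the relative path in the key. The sketch below is a minimal illustration of that idea; `build_s3_key` and the example file names are hypothetical and are not part of the lab code.

```python
import os


def build_s3_key(base_dir, file_path, prefix):
    """Build an S3 key that keeps the file's path relative to base_dir."""
    relative = os.path.relpath(file_path, base_dir)
    # S3 keys use forward slashes regardless of the local OS separator.
    return prefix + relative.replace(os.sep, "/")


# Hypothetical local layout, used only to show the resulting keys.
local_file = os.path.join("information_sources", "companies", "acme.txt")
print(build_s3_key("information_sources", local_file, "data/kb/mac/"))
```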

It's time to sync that data and ingest it into the vector store.

In [None]:
kb.synchronize_data(kb_id, ds_id)
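
The internals of `kb.synchronize_data` are not shown here. With the plain `bedrock-agent` client, a sync of this kind is usually an ingestion job that is started and then polled until it reaches a terminal status. The sketch below assumes that pattern; `wait_for_ingestion` is a hypothetical name, and the polling interval is an arbitrary choice.

```python
import time

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def wait_for_ingestion(agent_client, kb_id, ds_id, poll_seconds=10, sleep=time.sleep):
    """Start an ingestion job and poll it until it reaches a terminal status."""
    job = agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=ds_id
    )["ingestionJob"]

    while job["status"] not in TERMINAL_STATUSES:
        sleep(poll_seconds)
        job = agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job


# Usage (requires AWS credentials, so it is left commented out):
# agent_client = boto3.client("bedrock-agent")
# job = wait_for_ingestion(agent_client, kb_id, ds_id)
# print(job["status"])
```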

Let's set up some variables which will be used later

In [None]:
kb_info = kb.get_kb(kb_id)
kb_arn = kb_info['knowledgeBase']['knowledgeBaseArn']
print(kb_id)
print(kb_info)

In [None]:
%store kb_id
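
Before wiring the Knowledge Base into a flow, it can be useful to check that ingestion produced searchable content. The sketch below is an optional sanity check, not part of the lab: it assumes a `bedrock-agent-runtime` client and uses its `retrieve` call. The function name `top_passages` and the example query are hypothetical.

```python
def top_passages(runtime_client, kb_id, query, max_results=3):
    """Return (score, text) pairs for the best-matching chunks in the Knowledge Base."""
    response = runtime_client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": max_results}
        },
    )
    return [
        (result.get("score"), result["content"]["text"])
        for result in response["retrievalResults"]
    ]


# Usage (requires AWS credentials, so it is left commented out):
# runtime_client = boto3.client("bedrock-agent-runtime", region)
# for score, text in top_passages(runtime_client, kb_id, "Tell me about a company"):
#     print(score, text[:120])
```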

Now we will create a Bedrock Flow.

This flow will query the Knowledge Base we created earlier and return research material on entities (companies, products, people) that are identified in the news facts.

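The first step is to create (or reuse) an IAM role that Bedrock can assume to run the flow, with an inline policy loaded from `bedrock-flow-policy.json`: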

In [None]:
iam = boto3.client('iam')

# Create or get IAM role
role_name = 'BedrockFlowsRole'
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "bedrock.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

# Load the policy from the JSON file
with open('bedrock-flow-policy.json', 'r') as file:
    bedrock_policy = json.load(file)


try:
    # Try to get existing role
    role_response = iam.get_role(RoleName=role_name)
    role_arn = role_response['Role']['Arn']
    print(f"Using existing role: {role_arn}")

except iam.exceptions.NoSuchEntityException:
    # Role doesn't exist, create it
    print(f"Creating new role: {role_name}")
    role_response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy)
    )
    role_arn = role_response['Role']['Arn']

    # Attach Bedrock policy
    policy_name = 'BedrockFlowsPolicy'
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(bedrock_policy)
    )

    print(f"Created role: {role_arn}")

except Exception as e:
    print(f"Error handling IAM role: {str(e)}")
    raise e
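
The trust policy above lets the Bedrock service assume the role from any context. Outside a lab, it is common to narrow this with condition keys so that only flows in your own account and Region can use the role. The sketch below shows one way to build such a policy; `build_scoped_trust_policy` is a hypothetical helper, and the account ID and Region in the example are placeholders.

```python
import json


def build_scoped_trust_policy(account_id, region):
    """Trust policy that limits role assumption to Bedrock flows in one account and Region."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
            "Condition": {
                # Only requests made on behalf of this account are allowed
                "StringEquals": {"aws:SourceAccount": account_id},
                # Only flow resources in this Region and account are allowed
                "ArnLike": {
                    "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:flow/*"
                },
            },
        }],
    }


# Placeholder values, used only to print an example document.
print(json.dumps(build_scoped_trust_policy("111122223333", "us-west-2"), indent=2))
```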

Now we create the actual Bedrock Flow using the API

In [None]:
client = boto3.client(service_name='bedrock-agent')

# Replace with the service role that you created. For more information, see
# https://docs.aws.amazon.com/bedrock/latest/userguide/flows-permissions.html
FLOWS_SERVICE_ROLE = role_arn

# Define each node

# The input node validates that the content of the InvokeFlow request
# is a string and passes it on through its "document" output.
input_node = {
    "type": "Input",
    "name": "FlowInputNode",
    "outputs": [
        {
            "name": "document",
            "type": "String"
        }
    ]
}


# Bedrock Flow node definitions can be long.
# To reduce the size of this notebook, the nodes we need have been off-loaded
# to JSON files.
# Please feel free to open the JSON files to have a look at them.
# Helper dictionary to populate the variables defined in the JSON files
helper_dict = {"region": region,
               "account_id": account_id,
               "kb_id": kb_id}

prompt_node_extract = load_and_fill_json('flow-extract-node.json', helper_dict)
prompt_node_research = load_and_fill_json('flow-research-node.json', helper_dict)
kb_node = load_and_fill_json('flow-kb-node.json', helper_dict)


# The output node validates that the output from the last node is a string and
# returns it as is. The name of its input must be "document".
output_node = {
    "type": "Output",
    "name": "Final_Output",
    "inputs": [
        {
            "name": "document",
            "type": "String",
            "expression": "$.data"
        }
    ]
}

# We now use a helper function to create connections between the nodes
connections = []
connections.append(make_connection(input_node, prompt_node_extract))
connections.append(make_connection(prompt_node_extract, prompt_node_research))
connections.append(make_connection(prompt_node_research, kb_node))
connections.append(make_connection(kb_node, output_node))


# Create the flow from the nodes and connections
response = client.create_flow(
    name="lab-7-flow",
    description="A flow that gets info from a knowledge base",
    executionRoleArn=FLOWS_SERVICE_ROLE,
    definition={
        "nodes": [input_node, prompt_node_extract,
                  prompt_node_research, kb_node, output_node],
        "connections": connections
    }
)

flow_id = response.get("id")
flow_arn = response.get("arn")

In [None]:
print(f"Flow ARN: {flow_arn}")
print(f"Flow ID: {flow_id}")
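
The `make_connection` helper hides the shape of a flow connection. In the Bedrock Flows API, a data connection names its source and target nodes and maps one source output to one target input. The sketch below is a guess at what such a helper could produce, assuming it links the first output of the source to the first input of the target; `sketch_connection` is a hypothetical name, and the real helper in `utils.flow_helper` may behave differently.

```python
def sketch_connection(source_node, target_node):
    """Build a Data connection from the first output of source to the first input of target."""
    source_output = source_node["outputs"][0]["name"]
    target_input = target_node["inputs"][0]["name"]
    return {
        "name": f"{source_node['name']}To{target_node['name']}",
        "source": source_node["name"],
        "target": target_node["name"],
        "type": "Data",
        "configuration": {
            "data": {"sourceOutput": source_output, "targetInput": target_input}
        },
    }


# Minimal stand-ins for the input and output nodes defined in this notebook.
demo_input = {"type": "Input", "name": "FlowInputNode",
              "outputs": [{"name": "document", "type": "String"}]}
demo_output = {"type": "Output", "name": "Final_Output",
               "inputs": [{"name": "document", "type": "String", "expression": "$.data"}]}

demo_connection = sketch_connection(demo_input, demo_output)
print(demo_connection)
```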

If you recall from our architectural diagram, the flow will be called by a Lambda function.

A Bedrock Flow needs to be prepared, and we need to create a version and an alias before it can be called by another service (Lambda in our case). 

In [None]:
client.prepare_flow(flowIdentifier=flow_id)
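
`prepare_flow` returns a status, and the flow may still be preparing when the next cell runs. If creating a version fails for that reason, polling `get_flow` until the status is `Prepared` is one way to make the notebook more robust. The sketch below assumes that approach; `wait_until_prepared`, the polling interval, and the attempt limit are all illustrative choices.

```python
import time


def wait_until_prepared(agent_client, flow_id, poll_seconds=2, max_attempts=30, sleep=time.sleep):
    """Poll get_flow until the flow is Prepared, or raise on failure or timeout."""
    for _ in range(max_attempts):
        status = agent_client.get_flow(flowIdentifier=flow_id)["status"]
        if status == "Prepared":
            return status
        if status == "Failed":
            raise RuntimeError(f"Flow {flow_id} failed to prepare")
        sleep(poll_seconds)
    raise TimeoutError(f"Flow {flow_id} was not prepared after {max_attempts} checks")


# Usage (requires AWS credentials, so it is left commented out):
# wait_until_prepared(client, flow_id)
```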

In [None]:
response = client.create_flow_version(flowIdentifier=flow_id)

flow_version = response.get("version")

In [None]:
response = client.create_flow_alias(
    flowIdentifier=flow_id,
    name="latest",
    description="Alias pointing to the latest version of the flow.",
    routingConfiguration=[
        {
            "flowVersion": flow_version
        }
    ]
)

flow_alias_arn = response.get("arn")

In [None]:
print(flow_alias_arn)
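
With an alias in place, the flow can be invoked through the `bedrock-agent-runtime` client. The Lambda function's code is not shown in this notebook, so the sketch below only illustrates the general call pattern it might follow: build the input for the `FlowInputNode` defined earlier, then read the output documents from the response event stream. The helper names `build_flow_inputs` and `collect_flow_output` are hypothetical.

```python
def build_flow_inputs(text, node_name="FlowInputNode", output_name="document"):
    """Wrap the input text in the structure that invoke_flow expects."""
    return [{
        "content": {"document": text},
        "nodeName": node_name,
        "nodeOutputName": output_name,
    }]


def collect_flow_output(event_stream):
    """Gather output documents and the completion reason from a flow response stream."""
    documents = []
    completion_reason = None
    for event in event_stream:
        if "flowOutputEvent" in event:
            documents.append(event["flowOutputEvent"]["content"]["document"])
        elif "flowCompletionEvent" in event:
            completion_reason = event["flowCompletionEvent"]["completionReason"]
    return documents, completion_reason


# Usage (requires AWS credentials, so it is left commented out):
# runtime_client = boto3.client("bedrock-agent-runtime")
# response = runtime_client.invoke_flow(
#     flowIdentifier=flow_id,
#     flowAliasIdentifier=flow_alias_arn,
#     inputs=build_flow_inputs("Some news facts to research"),
# )
# documents, reason = collect_flow_output(response["responseStream"])
```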

We now modify the Lambda function so that it can call this flow.

In [None]:
# Point the Lambda function at the flow through its environment variables
lambda_client = boto3.client('lambda')

call_flow_lambda_arn = ""
call_flow_lambda_name = ""

# todo: check if we can get this from the cloudformation stack output
# list_functions is paginated, so walk every page until the function is found
for page in lambda_client.get_paginator('list_functions').paginate():
    for function in page['Functions']:
        if 'CallFlowLambda' in function['FunctionName']:
            call_flow_lambda_name = function['FunctionName']
            call_flow_lambda_arn = function['FunctionArn']
            env_vars = {
                "FLOW_ARN": flow_arn,
                "FLOW_ALIAS_ARN": flow_alias_arn
            }
            # Note: this replaces any environment variables already set on the function
            lambda_client.update_function_configuration(
                FunctionName=call_flow_lambda_name,
                Environment={'Variables': env_vars}
            )
            break
    if call_flow_lambda_name:
        break

if not call_flow_lambda_name:
    print("Warning: no Lambda function with 'CallFlowLambda' in its name was found.")
print("Setup Complete!")

Please wait for the above output to finish before proceeding.

The last printed statement will be: Setup Complete!
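
One detail to be aware of: `update_function_configuration` replaces the function's whole set of environment variables with the ones you pass. That is harmless if the Lambda function has no other variables. If it does, a safer pattern is to read the current configuration first and merge the new values in. The sketch below assumes that pattern; `merged_environment` is a hypothetical helper and the example values are placeholders.

```python
def merged_environment(current_config, new_vars):
    """Merge new variables into the function's existing environment, new values winning."""
    existing = current_config.get("Environment", {}).get("Variables", {})
    return {"Variables": {**existing, **new_vars}}


# Placeholder configuration, shaped like a get_function_configuration response.
example_config = {"Environment": {"Variables": {"LOG_LEVEL": "INFO"}}}
print(merged_environment(example_config, {"FLOW_ARN": "example-flow-arn"}))

# Usage (requires AWS credentials, so it is left commented out):
# current = lambda_client.get_function_configuration(FunctionName=call_flow_lambda_name)
# lambda_client.update_function_configuration(
#     FunctionName=call_flow_lambda_name,
#     Environment=merged_environment(current, env_vars),
# )
```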

## Saving information
Let's store the variables that will be used in other notebooks:

In [None]:
%store call_flow_lambda_arn
%store call_flow_lambda_name