# Bedrock Agents with Lambda Functions for Actions (Experimental Approach)

This notebook demonstrates an experimental approach to using AWS Bedrock Agents where the agent's actions (Action Groups) are implemented as separate AWS Lambda functions. This contrasts with the `lamda-approach` directory where the core logic resides within the notebook itself.

**Goal:** Create a Bedrock Agent that can:
1.  **Research:** Use a Couchbase vector store (populated with data) to find relevant documents based on a user query (via `bedrock_agent_researcher` Lambda).
2.  **Write/Format:** Format the research findings using a Bedrock LLM (via `bedrock_agent_writer` Lambda).

**Steps:**

1.  **Configuration:** Set up environment variables and paths.
2.  **AWS & Couchbase Setup:** Initialize AWS clients and connect to Couchbase.
3.  **Couchbase Resources:** Create/verify the necessary Couchbase bucket, scope, collection, and search index.
4.  **Data Loading:** Load documents into the Couchbase vector store.
5.  **IAM Role:** Create an IAM role for the Bedrock Agent and Lambda functions.
6.  **Lambda Deployment:** Package and deploy the `researcher` and `writer` Lambda functions.
7.  **Agent Creation:** Define and create the Bedrock Agent.
8.  **Action Group Creation:** Create action groups linking the agent to the deployed Lambda functions using OpenAPI schemas.
9.  **Agent Preparation:** Prepare the agent, making it ready for invocation.
10. **Agent Invocation:** Test the agent with a sample prompt.
11. **Cleanup (Optional):** Provide steps to delete the created AWS resources.

## 1. Imports and Configuration

Import necessary libraries and load configuration from environment variables or a `.env` file. Ensure you have a `.env` file in the `awsbedrock-agents/lambda-experiments/` directory with your AWS credentials, Couchbase details, and AWS Account ID, or set these as environment variables.

In [1]:
import json
import logging
import os
import subprocess
import time
import traceback
import uuid
from datetime import timedelta
import shutil

import boto3
from botocore.exceptions import ClientError
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.exceptions import (
    BucketNotFoundException,
    CollectionNotFoundException,
    CouchbaseException,
    InternalServerFailureException,
    QueryIndexAlreadyExistsException,
    ScopeNotFoundException,
    ServiceUnavailableException,
    SearchIndexNotFoundException
)
from couchbase.management.buckets import (
    BucketSettings, BucketType,
    CreateBucketSettings
)
from couchbase.management.collections import CollectionSpec
from couchbase.management.search import SearchIndex, SearchIndexManager
from couchbase.options import ClusterOptions, QueryOptions
from dotenv import load_dotenv
from langchain_aws import BedrockEmbeddings
from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore
from botocore.config import Config
from botocore.waiter import WaiterModel, create_waiter_with_client

# --- Configuration ---
# Setup logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

# Load environment variables from notebook's directory .env
dotenv_path = os.path.join(os.getcwd(), '.env') # Assumes .env is where notebook runs
logger.info(f"Attempting to load .env file from: {dotenv_path}")
if os.path.exists(dotenv_path):
    load_dotenv(dotenv_path=dotenv_path)
    logger.info(".env file loaded successfully.")
else:
    logger.warning(f".env file not found at {dotenv_path}. Relying on environment variables.")

# Couchbase Configuration
CB_HOST = os.getenv("CB_HOST", "couchbase://localhost")
CB_USERNAME = os.getenv("CB_USERNAME", "Administrator")
CB_PASSWORD = os.getenv("CB_PASSWORD") # Ensure this is set in .env
CB_BUCKET_NAME = os.getenv("CB_BUCKET_NAME", "vector-search-exp")
SCOPE_NAME = os.getenv("SCOPE_NAME", "bedrock_exp")
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "docs_exp")
INDEX_NAME = os.getenv("INDEX_NAME", "vector_search_bedrock_exp")

# AWS Configuration
AWS_REGION = os.getenv("AWS_REGION", "us-east-1")
AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID") # Ensure this is set
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY") # Ensure this is set
AWS_ACCOUNT_ID = os.getenv("AWS_ACCOUNT_ID") # Ensure this is set

# Bedrock Model IDs
EMBEDDING_MODEL_ID = "amazon.titan-embed-text-v2:0"
AGENT_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0" # Using Sonnet for the agent

# Paths (relative to this notebook's location)
NOTEBOOK_DIR = os.getcwd()
SCHEMAS_DIR = os.path.join(NOTEBOOK_DIR, 'schemas')
LAMBDA_FUNCTIONS_DIR = os.path.join(NOTEBOOK_DIR, 'lambda_functions')
RESEARCHER_SCHEMA_PATH = os.path.join(SCHEMAS_DIR, 'researcher_schema.json')
WRITER_SCHEMA_PATH = os.path.join(SCHEMAS_DIR, 'writer_schema.json')
INDEX_JSON_PATH = os.path.join(NOTEBOOK_DIR, 'aws_index.json')
DOCS_JSON_PATH = os.path.join(NOTEBOOK_DIR, 'documents.json')

# --- Check Environment Variables ---
def check_environment_variables():
    """Check if required environment variables are set."""
    required_vars = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_ACCOUNT_ID", "CB_PASSWORD"]
    missing_vars = [var for var in required_vars if not os.getenv(var)]
    if missing_vars:
        logger.error(f"Missing required environment variables: {', '.join(missing_vars)}")
        logger.error("Please set these variables in your environment or .env file")
        return False
    logger.info("All required environment variables are set.")
    return True

if not check_environment_variables():
    raise EnvironmentError("Missing required environment variables. Please check configuration.")

2025-05-01 07:48:26,060 - INFO - Attempting to load .env file from: /Users/kaustavghosh/Desktop/vector-search-cookbook/awsbedrock-agents/lambda-experiments/.env
2025-05-01 07:48:26,062 - INFO - .env file loaded successfully.
2025-05-01 07:48:26,063 - INFO - All required environment variables are set.


## 2. Initialize AWS Clients and Connect to Couchbase

Create the necessary Boto3 clients for interacting with AWS services (Bedrock Runtime, IAM, Lambda, Bedrock Agent, Bedrock Agent Runtime) and establish a connection to the Couchbase cluster.

In [2]:
def initialize_aws_clients():
    """Initialize required AWS clients."""
    try:
        logger.info(f"Initializing AWS clients in region: {AWS_REGION}")
        session = boto3.Session(
            aws_access_key_id=AWS_ACCESS_KEY_ID,
            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
            region_name=AWS_REGION
        )
        # Use a config with longer timeouts for agent operations
        agent_config = Config(
            connect_timeout=120,
            read_timeout=600, # Agent preparation can take time
            retries={'max_attempts': 5, 'mode': 'adaptive'}
        )
        bedrock_runtime = session.client('bedrock-runtime', region_name=AWS_REGION)
        iam_client = session.client('iam', region_name=AWS_REGION)
        lambda_client = session.client('lambda', region_name=AWS_REGION)
        bedrock_agent_client = session.client('bedrock-agent', region_name=AWS_REGION, config=agent_config)
        bedrock_agent_runtime_client = session.client('bedrock-agent-runtime', region_name=AWS_REGION, config=agent_config)
        logger.info("AWS clients initialized successfully.")
        return bedrock_runtime, iam_client, lambda_client, bedrock_agent_client, bedrock_agent_runtime_client
    except Exception as e:
        logger.error(f"Error initializing AWS clients: {e}")
        raise

def connect_couchbase():
    """Connect to Couchbase cluster."""
    try:
        logger.info(f"Connecting to Couchbase cluster at {CB_HOST}...")
        auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
        options = ClusterOptions(auth)
        cluster = Cluster(CB_HOST, options)
        cluster.wait_until_ready(timedelta(seconds=10))
        logger.info("Successfully connected to Couchbase.")
        return cluster
    except CouchbaseException as e:
        logger.error(f"Couchbase connection error: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error connecting to Couchbase: {e}")
        raise

# Initialize clients
bedrock_runtime_client, iam_client, lambda_client, bedrock_agent_client, bedrock_agent_runtime_client = initialize_aws_clients()
cb_cluster = connect_couchbase()

2025-05-01 07:48:26,070 - INFO - Initializing AWS clients in region: us-east-1
2025-05-01 07:48:26,420 - INFO - AWS clients initialized successfully.
2025-05-01 07:48:26,421 - INFO - Connecting to Couchbase cluster at couchbases://cb.hlcup4o4jmjr55yf.cloud.couchbase.com...
2025-05-01 07:48:28,457 - INFO - Successfully connected to Couchbase.


## 3. Setup Couchbase Bucket, Scope, Collection, and Search Index

Define functions to create the Couchbase bucket, scope, and collection if they don't exist. Also, create or update the vector search index required by the `CouchbaseSearchVectorStore`.

In [3]:
def setup_collection(cluster, bucket_name, scope_name, collection_name):
    """Set up Couchbase collection."""
    logger.info(f"Setting up collection: {bucket_name}/{scope_name}/{collection_name}")
    try:
        # Check/Create Bucket
        try:
            bucket = cluster.bucket(bucket_name)
            logger.info(f"Bucket '{bucket_name}' exists.")
        except BucketNotFoundException:
            logger.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
            bucket_settings = BucketSettings(
                name=bucket_name,
                bucket_type=BucketType.COUCHBASE,
                ram_quota_mb=256,
                flush_enabled=True,
                num_replicas=0
            )
            try:
                 cluster.buckets().create_bucket(bucket_settings)
                 logger.info(f"Bucket '{bucket_name}' created. Waiting for ready state (10s)...")
                 time.sleep(10)
                 bucket = cluster.bucket(bucket_name)
            except Exception as create_e:
                 logger.error(f"Failed to create bucket '{bucket_name}': {create_e}")
                 raise
        except Exception as e:
             logger.error(f"Error getting bucket '{bucket_name}': {e}")
             raise

        bucket_manager = bucket.collections()

        # Check/Create Scope
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(s.name == scope_name for s in scopes)
        if not scope_exists:
            logger.info(f"Scope '{scope_name}' does not exist. Creating it...")
            try:
                 bucket_manager.create_scope(scope_name)
                 logger.info(f"Scope '{scope_name}' created. Waiting (2s)...")
                 time.sleep(2)
            except CouchbaseException as e:
                 if "already exists" in str(e).lower() or "scope_exists" in str(e).lower():
                      logger.info(f"Scope '{scope_name}' likely already exists.")
                 else:
                      logger.error(f"Failed to create scope '{scope_name}': {e}")
                      raise
        else:
             logger.info(f"Scope '{scope_name}' already exists.")

        # Check/Create Collection
        scopes = bucket_manager.get_all_scopes() # Re-fetch
        collection_exists = False
        for s in scopes:
             if s.name == scope_name:
                  if any(c.name == collection_name for c in s.collections):
                       collection_exists = True
                       break
        if not collection_exists:
            logger.info(f"Collection '{collection_name}' does not exist. Creating it...")
            try:
                collection_spec = CollectionSpec(collection_name, scope_name)
                bucket_manager.create_collection(collection_spec)
                logger.info(f"Collection '{collection_name}' created. Waiting (2s)...")
                time.sleep(2)
            except CouchbaseException as e:
                 if "already exists" in str(e).lower() or "collection_exists" in str(e).lower():
                     logger.info(f"Collection '{collection_name}' likely already exists.")
                 else:
                     logger.error(f"Failed to create collection '{collection_name}': {e}")
                     raise
        else:
            logger.info(f"Collection '{collection_name}' already exists.")

        # Ensure primary index exists
        try:
            logger.info(f"Ensuring primary index exists on `{bucket_name}`.`{scope_name}`.`{collection_name}`...")
            cluster.query(f"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`").execute()
            logger.info("Primary index present or created successfully.")
        except Exception as e:
            logger.error(f"Error creating primary index: {str(e)}")

        logger.info("Collection setup complete.")
        return cluster.bucket(bucket_name).scope(scope_name).collection(collection_name)

    except Exception as e:
        logger.error(f"Error setting up collection: {str(e)}")
        logger.error(traceback.format_exc())
        raise

def setup_search_index(cluster, index_name, bucket_name, scope_name, collection_name, index_definition_path):
    """Set up search indexes."""
    try:
        logger.info(f"Looking for index definition at: {index_definition_path}")
        if not os.path.exists(index_definition_path):
             raise FileNotFoundError(f"Index definition file not found: {index_definition_path}")

        with open(index_definition_path, 'r') as file:
            index_definition = json.load(file)
            index_definition['name'] = index_name
            index_definition['sourceName'] = bucket_name
            logger.info(f"Loaded index definition, ensuring name is '{index_name}' and source is '{bucket_name}'.")

    except Exception as e:
        logger.error(f"Error loading index definition: {str(e)}")
        raise

    try:
        search_index_manager = cluster.search_indexes()
        search_index = SearchIndex.from_json(index_definition)
        logger.info(f"Upserting search index '{index_name}'...")
        search_index_manager.upsert_index(search_index)
        logger.info(f"Index '{index_name}' upsert submitted. Waiting for indexing (10s)...")
        time.sleep(10)
        logger.info(f"Search index '{index_name}' setup complete.")
        
    except QueryIndexAlreadyExistsException:
        logger.warning(f"Search index '{index_name}' likely already existed (caught QueryIndexAlreadyExistsException, check if applicable). Upsert attempted.")

    except CouchbaseException as e:
        logger.error(f"Couchbase error during search index setup for '{index_name}': {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error during search index setup for '{index_name}': {e}")
        raise

def clear_collection(cluster, bucket_name, scope_name, collection_name):
    """Delete all documents from the specified collection."""
    try:
        logger.warning(f"Attempting to clear all documents from `{bucket_name}`.`{scope_name}`.`{collection_name}`...")
        query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
        result = cluster.query(query).execute()
        mutation_count = 0
        try:
             metrics_data = result.meta_data().metrics()
             if metrics_data:
                  mutation_count = metrics_data.mutation_count()
        except Exception as metrics_e:
             logger.warning(f"Could not retrieve mutation count after delete: {metrics_e}")
        logger.info(f"Successfully cleared documents (approx. {mutation_count} mutations).")
    except Exception as e:
        logger.error(f"Error clearing documents: {e}. Collection might be empty or index not ready.")

# Execute setup
cb_collection = setup_collection(cb_cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)
setup_search_index(cb_cluster, INDEX_NAME, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME, INDEX_JSON_PATH)

2025-05-01 07:48:28,476 - INFO - Setting up collection: vector-search-testing/shared/bedrock
2025-05-01 07:48:29,568 - INFO - Bucket 'vector-search-testing' exists.
2025-05-01 07:48:30,502 - INFO - Scope 'shared' already exists.
2025-05-01 07:48:31,415 - INFO - Collection 'bedrock' already exists.
2025-05-01 07:48:31,417 - INFO - Ensuring primary index exists on `vector-search-testing`.`shared`.`bedrock`...
2025-05-01 07:48:32,342 - INFO - Primary index present or created successfully.
2025-05-01 07:48:32,343 - INFO - Collection setup complete.
2025-05-01 07:48:32,344 - INFO - Looking for index definition at: /Users/kaustavghosh/Desktop/vector-search-cookbook/awsbedrock-agents/lamda-approach/aws_index.json
2025-05-01 07:48:32,348 - INFO - Loaded index definition, ensuring name is 'vector_search_bedrock' and source is 'vector-search-testing'.
2025-05-01 07:48:32,349 - INFO - Upserting search index 'vector_search_bedrock'...


## 4. Initialize Vector Store and Load Data

Clear any existing data from the collection, initialize the `CouchbaseSearchVectorStore` using the Bedrock embeddings client, and load documents from the `documents.json` file.

In [4]:
# Clear existing documents first
clear_collection(cb_cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)

try:
    logger.info(f"Initializing Bedrock Embeddings client with model: {EMBEDDING_MODEL_ID}")
    embeddings = BedrockEmbeddings(
        client=bedrock_runtime_client,
        model_id=EMBEDDING_MODEL_ID
    )
    logger.info("Successfully created Bedrock embeddings client.")

    logger.info(f"Initializing CouchbaseSearchVectorStore with index: {INDEX_NAME}")
    vector_store = CouchbaseSearchVectorStore(
        cluster=cb_cluster,
        bucket_name=CB_BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        embedding=embeddings,
        index_name=INDEX_NAME
    )
    logger.info("Successfully created Couchbase vector store.")

    # Load documents from JSON file
    logger.info(f"Looking for documents at: {DOCS_JSON_PATH}")
    if not os.path.exists(DOCS_JSON_PATH):
         raise FileNotFoundError(f"Documents file not found: {DOCS_JSON_PATH}")

    with open(DOCS_JSON_PATH, 'r') as f:
        data = json.load(f)
        documents_to_load = data.get('documents', [])
    logger.info(f"Loaded {len(documents_to_load)} documents from {DOCS_JSON_PATH}")

    # Add documents to vector store
    if documents_to_load:
        logger.info(f"Adding {len(documents_to_load)} documents to vector store...")
        texts = [doc.get('text', '') for doc in documents_to_load]
        metadatas = []
        for i, doc in enumerate(documents_to_load):
            metadata_raw = doc.get('metadata', {})
            if isinstance(metadata_raw, str):
                try:
                    metadata = json.loads(metadata_raw)
                    if not isinstance(metadata, dict):
                         logger.warning(f"Parsed metadata not a dict: {metadata}. Using empty.")
                         metadata = {}
                except json.JSONDecodeError:
                    logger.warning(f"Could not parse metadata string: {metadata_raw}. Using empty.")
                    metadata = {}
            elif isinstance(metadata_raw, dict):
                metadata = metadata_raw
            else:
                logger.warning(f"Metadata not string or dict: {metadata_raw}. Using empty.")
                metadata = {}
            metadatas.append(metadata)

        inserted_ids = vector_store.add_texts(texts=texts, metadatas=metadatas)
        logger.info(f"Successfully added {len(inserted_ids)} documents to the vector store.")
        logger.info("Waiting briefly for vector indexing...")
        time.sleep(5)
    else:
         logger.warning("No documents found in the JSON file to add.")

except FileNotFoundError as e:
     logger.error(f"Setup failed: {e}")
     raise
except Exception as e:
    logger.error(f"Error during vector store setup or data loading: {e}")
    logger.error(traceback.format_exc())
    raise

logger.info("--- Couchbase Setup and Data Loading Complete ---")

2025-05-01 07:48:33,274 - INFO - Successfully cleared documents (approx. 0 mutations).
2025-05-01 07:48:33,276 - INFO - Initializing Bedrock Embeddings client with model: amazon.titan-embed-text-v2:0
2025-05-01 07:48:33,277 - INFO - Successfully created Bedrock embeddings client.
2025-05-01 07:48:33,278 - INFO - Initializing CouchbaseSearchVectorStore with index: vector_search_bedrock
2025-05-01 07:48:36,745 - INFO - Successfully created Couchbase vector store.
2025-05-01 07:48:36,747 - INFO - Looking for documents at: /Users/kaustavghosh/Desktop/vector-search-cookbook/awsbedrock-agents/lamda-approach/documents.json
2025-05-01 07:48:36,749 - INFO - Loaded 7 documents from /Users/kaustavghosh/Desktop/vector-search-cookbook/awsbedrock-agents/lamda-approach/documents.json
2025-05-01 07:48:36,749 - INFO - Adding 7 documents to vector store...
2025-05-01 07:48:40,575 - INFO - Successfully added 7 documents to the vector store.
2025-05-01 07:48:40,575 - INFO - Waiting briefly for vector inde

## 5. Create IAM Role for Agent and Lambdas

Define and create an IAM role that grants necessary permissions for Bedrock Agents and Lambda functions to interact with each other and other AWS services (like CloudWatch Logs and Bedrock).

In [5]:
def create_agent_role(iam_client, role_name, aws_account_id):
    """Creates or gets the IAM role for the Bedrock Agent Lambda functions."""
    logger.info(f"Checking/Creating IAM role: {role_name}")
    assume_role_policy_document = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": [
                        "lambda.amazonaws.com",
                        "bedrock.amazonaws.com"
                    ]
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }

    role_arn = None
    try:
        get_role_response = iam_client.get_role(RoleName=role_name)
        role_arn = get_role_response['Role']['Arn']
        logger.info(f"IAM role '{role_name}' already exists. Updating trust policy.")
        iam_client.update_assume_role_policy(
            RoleName=role_name,
            PolicyDocument=json.dumps(assume_role_policy_document)
        )
        logger.info(f"Trust policy updated for role '{role_name}'.")

    except iam_client.exceptions.NoSuchEntityException:
        logger.info(f"IAM role '{role_name}' not found. Creating...")
        try:
            create_role_response = iam_client.create_role(
                RoleName=role_name,
                AssumeRolePolicyDocument=json.dumps(assume_role_policy_document),
                Description='IAM role for Bedrock Agent Lambda functions (Experiment)',
                MaxSessionDuration=3600
            )
            role_arn = create_role_response['Role']['Arn']
            logger.info(f"Created IAM role '{role_name}'. ARN: {role_arn}. Waiting 15s...")
            time.sleep(15)
        except ClientError as e:
            logger.error(f"Error creating IAM role '{role_name}': {e}")
            raise

    except ClientError as e:
        logger.error(f"Error getting/updating IAM role '{role_name}': {e}")
        raise

    # Attach basic execution policy
    try:
        logger.info(f"Attaching basic Lambda execution policy to role '{role_name}'...")
        iam_client.attach_role_policy(
            RoleName=role_name,
            PolicyArn='arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
        )
        logger.info("Attached basic Lambda execution policy.")
    except ClientError as e:
        logger.warning(f"Error attaching basic Lambda execution policy (may already exist): {e}")

    # Add inline policies for logging and Bedrock
    basic_inline_policy_name = "LambdaBasicLoggingPermissions"
    basic_inline_policy_doc = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": f"arn:aws:logs:{AWS_REGION}:{aws_account_id}:log-group:/aws/lambda/*:*"
            }
        ]
    }
    bedrock_policy_name = "BedrockAgentPermissions"
    bedrock_policy_doc = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:*"],
                "Resource": "*"
            }
        ]
    }
    try:
        logger.info(f"Putting inline policy '{basic_inline_policy_name}'...")
        iam_client.put_role_policy(
            RoleName=role_name,
            PolicyName=basic_inline_policy_name,
            PolicyDocument=json.dumps(basic_inline_policy_doc)
        )
        logger.info(f"Putting inline policy '{bedrock_policy_name}'...")
        iam_client.put_role_policy(
            RoleName=role_name,
            PolicyName=bedrock_policy_name,
            PolicyDocument=json.dumps(bedrock_policy_doc)
        )
        logger.info("Inline policies updated. Waiting 10s...")
        time.sleep(10)
    except ClientError as e:
        logger.error(f"Error putting inline policy: {e}")

    if not role_arn:
         raise Exception(f"Failed to create or retrieve ARN for role {role_name}")

    return role_arn

# Execute role creation
agent_role_name = "bedrock_agent_lambda_exp_role"
try:
    agent_role_arn = create_agent_role(iam_client, agent_role_name, AWS_ACCOUNT_ID)
    logger.info(f"Agent IAM Role ARN: {agent_role_arn}")
except Exception as e:
    logger.error(f"Failed to create/verify IAM role: {e}")
    logger.error(traceback.format_exc())
    raise

2025-05-01 07:48:45,615 - INFO - Checking/Creating IAM role: bedrock_agent_lambda_exp_role
2025-05-01 07:48:46,733 - INFO - IAM role 'bedrock_agent_lambda_exp_role' already exists. Updating trust policy.
2025-05-01 07:48:47,060 - INFO - Trust policy updated for role 'bedrock_agent_lambda_exp_role'.
2025-05-01 07:48:47,060 - INFO - Attaching basic Lambda execution policy to role 'bedrock_agent_lambda_exp_role'...
2025-05-01 07:48:47,402 - INFO - Attached basic Lambda execution policy.
2025-05-01 07:48:47,404 - INFO - Putting inline policy 'LambdaBasicLoggingPermissions'...
2025-05-01 07:48:47,729 - INFO - Putting inline policy 'BedrockAgentPermissions'...
2025-05-01 07:48:48,057 - INFO - Inline policies updated. Waiting 10s...
2025-05-01 07:48:58,061 - INFO - Agent IAM Role ARN: arn:aws:iam::598307997273:role/bedrock_agent_lambda_exp_role


## 6. Package and Deploy Lambda Functions

Define functions to:
1.  **Package:** Use the `Makefile` in the `lambda_functions` directory to install dependencies and create deployment zip files for each Lambda function.
2.  **Upload (if needed):** Upload the zip file to an S3 bucket if it exceeds the direct upload size limit.
3.  **Deploy:** Create or update the AWS Lambda functions using the packaged code and the previously created IAM role.
4.  **Delete (Helper):** A function to delete existing Lambda functions before deployment.

**Note:** Running `make` requires the `make` utility to be installed in your environment. The `Makefile` handles dependency installation into a `package_dir` and zipping.

In [6]:
def delete_lambda_function(lambda_client, function_name):
    """Delete Lambda function if it exists."""
    logger.info(f"Attempting to delete Lambda function: {function_name}...")
    try:
        statement_id = f"AllowBedrockInvokeBasic-{function_name}"
        try:
            logger.info(f"Attempting to remove permission {statement_id}...")
            lambda_client.remove_permission(FunctionName=function_name, StatementId=statement_id)
            logger.info(f"Removed permission {statement_id}.")
            time.sleep(2)
        except lambda_client.exceptions.ResourceNotFoundException:
            logger.info(f"Permission {statement_id} not found. Skipping removal.")
        except ClientError as perm_e:
            logger.warning(f"Error removing permission {statement_id}: {str(perm_e)}")

        lambda_client.get_function(FunctionName=function_name)
        logger.info(f"Function {function_name} exists. Deleting...")
        lambda_client.delete_function(FunctionName=function_name)
        logger.info(f"Waiting for {function_name} deletion...")
        time.sleep(10)
        logger.info(f"Function {function_name} deletion initiated.")
        return True
    except lambda_client.exceptions.ResourceNotFoundException:
        logger.info(f"Lambda function '{function_name}' does not exist.")
        return False
    except Exception as e:
        logger.error(f"Error deleting Lambda function '{function_name}': {str(e)}")
        return False

def upload_to_s3(zip_file, region, bucket_name=None):
    """Upload zip file to S3 and return S3 location."""
    logger.info(f"Preparing to upload {zip_file} to S3 in region {region}...")
    config = Config(connect_timeout=60, read_timeout=300, retries={'max_attempts': 3, 'mode': 'adaptive'})
    s3_client = boto3.client('s3', region_name=region, config=config)
    sts_client = boto3.client('sts', region_name=region, config=config)

    if bucket_name is None:
        try:
            account_id = sts_client.get_caller_identity().get('Account')
            timestamp = int(time.time())
            bucket_name = f"lambda-deployment-{account_id}-{timestamp}"
            logger.info(f"Generated S3 bucket name: {bucket_name}")
        except Exception as e:
            fallback_id = uuid.uuid4().hex[:12]
            bucket_name = f"lambda-deployment-{fallback_id}"
            logger.warning(f"Error getting account ID ({e}). Using fallback: {bucket_name}")

    try:
        s3_client.head_bucket(Bucket=bucket_name)
        logger.info(f"Using existing S3 bucket: {bucket_name}")
    except ClientError as e:
        error_code = int(e.response['Error']['Code'])
        if error_code == 404:
            logger.info(f"Creating S3 bucket: {bucket_name}...")
            try:
                if region == 'us-east-1':
                    s3_client.create_bucket(Bucket=bucket_name)
                else:
                    s3_client.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': region})
                logger.info(f"Created S3 bucket: {bucket_name}. Waiting...")
                waiter = s3_client.get_waiter('bucket_exists')
                waiter.wait(Bucket=bucket_name, WaiterConfig={'Delay': 5, 'MaxAttempts': 12})
                logger.info(f"Bucket {bucket_name} is available.")
            except Exception as create_e:
                logger.error(f"Error creating bucket '{bucket_name}': {create_e}")
                raise
        else:
            logger.error(f"Error checking bucket '{bucket_name}': {e}")
            raise

    s3_key = f"lambda/{os.path.basename(zip_file)}-{uuid.uuid4().hex[:8]}"
    try:
        logger.info(f"Uploading {zip_file} to s3://{bucket_name}/{s3_key}...")
        file_size = os.path.getsize(zip_file)
        if file_size > 100 * 1024 * 1024:
            logger.info("Using multipart upload...")
            transfer_config = boto3.s3.transfer.TransferConfig(multipart_threshold=10*1024*1024, max_concurrency=10, multipart_chunksize=10*1024*1024, use_threads=True)
            s3_transfer = boto3.s3.transfer.S3Transfer(client=s3_client, config=transfer_config)
            s3_transfer.upload_file(zip_file, bucket_name, s3_key)
        else:
            with open(zip_file, 'rb') as f:
                s3_client.put_object(Bucket=bucket_name, Key=s3_key, Body=f)
        logger.info(f"Successfully uploaded to s3://{bucket_name}/{s3_key}")
        return {'S3Bucket': bucket_name, 'S3Key': s3_key}
    except Exception as upload_e:
        logger.error(f"S3 upload failed: {upload_e}")
        raise

def package_function(function_name, source_dir, build_dir):
    """Package Lambda function using Makefile found in source_dir."""
    makefile_path = os.path.join(source_dir, 'Makefile')
    temp_package_dir = os.path.join(source_dir, 'package_dir')
    source_req_path = os.path.join(source_dir, 'requirements.txt')
    source_func_script_path = os.path.join(source_dir, f'{function_name}.py')
    target_func_script_path = os.path.join(source_dir, 'lambda_function.py') # Target name for make
    make_output_zip = os.path.join(source_dir, 'lambda_package.zip') # Output from make
    final_zip_path = os.path.join(build_dir, f'{function_name}.zip') # Final location

    logger.info(f"--- Packaging function {function_name} ---")
    logger.info(f"Source Dir (Makefile location): {source_dir}")
    logger.info(f"Build Dir (Final zip location): {build_dir}")

    if not os.path.exists(source_func_script_path):
        raise FileNotFoundError(f"Source script not found: {source_func_script_path}")
    if not os.path.exists(source_req_path):
        raise FileNotFoundError(f"Requirements file not found: {source_req_path}")
    if not os.path.exists(makefile_path):
        raise FileNotFoundError(f"Makefile not found: {makefile_path}")

    if os.path.exists(target_func_script_path):
        logger.warning(f"Removing existing target script: {target_func_script_path}")
        os.remove(target_func_script_path)

    try:
        logger.info(f"Copying {source_func_script_path} to {target_func_script_path}")
        shutil.copy(source_func_script_path, target_func_script_path)

        make_command = ['make', '-f', makefile_path, 'clean', 'package']
        logger.info(f"Running make command: {' '.join(make_command)} (in {source_dir})")
        # Use shell=True cautiously or ensure make path is absolute/in PATH
        # Redirect stdout to avoid cluttering notebook output
        process = subprocess.run(make_command, cwd=source_dir, capture_output=True, text=True, check=True)
        logger.info("Make command completed successfully.")
        # logger.debug(f"Make stdout:\n{process.stdout}") # Uncomment for debugging make output

        if not os.path.exists(make_output_zip):
            raise FileNotFoundError(f"Makefile did not produce expected output: {make_output_zip}")

        logger.info(f"Moving and renaming {make_output_zip} to {final_zip_path}")
        if os.path.exists(final_zip_path):
             logger.warning(f"Removing existing final zip: {final_zip_path}")
             os.remove(final_zip_path)
        os.rename(make_output_zip, final_zip_path)
        logger.info(f"Zip file ready: {final_zip_path}")

        return final_zip_path

    except subprocess.CalledProcessError as e:
        logger.error(f"Error running Makefile for {function_name}: {e}")
        logger.error(f"Make stderr:\n{e.stderr}")
        raise
    except Exception as e:
        logger.error(f"Error packaging function {function_name}: {str(e)}")
        logger.error(traceback.format_exc())
        raise
    finally:
        if os.path.exists(target_func_script_path):
            logger.info(f"Cleaning up temporary script: {target_func_script_path}")
            os.remove(target_func_script_path)
        if os.path.exists(make_output_zip):
            logger.warning(f"Cleaning up intermediate zip: {make_output_zip}")
            os.remove(make_output_zip)

def create_lambda_function(lambda_client, function_name, handler, role_arn, zip_file, region):
    """Create or update Lambda function."""
    logger.info(f"Deploying Lambda function {function_name} from {zip_file}...")
    config = Config(connect_timeout=120, read_timeout=300, retries={'max_attempts': 5, 'mode': 'adaptive'})
    lambda_client_local = boto3.client('lambda', region_name=region, config=config)

    zip_size_mb = 0
    try:
        zip_size_bytes = os.path.getsize(zip_file)
        zip_size_mb = zip_size_bytes / (1024 * 1024)
        logger.info(f"Zip file size: {zip_size_mb:.2f} MB")
    except OSError as e:
         logger.error(f"Could not get size of zip file {zip_file}: {e}")
         raise

    use_s3 = zip_size_mb > 45
    s3_location = None
    zip_content = None

    if use_s3:
        logger.info(f"Package size requires S3 deployment.")
        s3_location = upload_to_s3(zip_file, region)
        if not s3_location:
             raise Exception("Failed to upload Lambda package to S3.")
    else:
         logger.info("Deploying package directly.")
         try:
             with open(zip_file, 'rb') as f:
                 zip_content = f.read()
         except OSError as e:
              logger.error(f"Could not read zip file {zip_file}: {e}")
              raise

    common_args = {
        'FunctionName': function_name,
        'Runtime': 'python3.9',
        'Role': role_arn,
        'Handler': handler,
        'Timeout': 180,
        'MemorySize': 1536,
        'Environment': {
            'Variables': {
                'CB_HOST': os.getenv('CB_HOST', 'couchbase://localhost'),
                'CB_USERNAME': os.getenv('CB_USERNAME', 'Administrator'),
                'CB_PASSWORD': os.getenv('CB_PASSWORD'), # Passed from notebook env
                'CB_BUCKET_NAME': os.getenv('CB_BUCKET_NAME', 'vector-search-exp'),
                'SCOPE_NAME': os.getenv('SCOPE_NAME', 'bedrock_exp'),
                'COLLECTION_NAME': os.getenv('COLLECTION_NAME', 'docs_exp'),
                'INDEX_NAME': os.getenv('INDEX_NAME', 'vector_search_bedrock_exp'),
                'EMBEDDING_MODEL_ID': os.getenv('EMBEDDING_MODEL_ID', EMBEDDING_MODEL_ID),
                'AGENT_MODEL_ID': os.getenv('AGENT_MODEL_ID', AGENT_MODEL_ID),
            }
        }
    }

    if use_s3:
        code_arg = {'S3Bucket': s3_location['S3Bucket'], 'S3Key': s3_location['S3Key']}
    else:
        code_arg = {'ZipFile': zip_content}

    max_retries = 3
    base_delay = 10
    for attempt in range(1, max_retries + 1):
        try:
            logger.info(f"Creating function '{function_name}' (attempt {attempt})...")
            create_args = common_args.copy()
            create_args['Code'] = code_arg
            create_args['Publish'] = True

            create_response = lambda_client_local.create_function(**create_args)
            function_arn = create_response['FunctionArn']
            logger.info(f"Created function '{function_name}'. ARN: {function_arn}")

            time.sleep(5)
            statement_id = f"AllowBedrockInvokeBasic-{function_name}"
            try:
                logger.info(f"Adding invoke permission ({statement_id})...")
                lambda_client_local.add_permission(
                    FunctionName=function_name, StatementId=statement_id,
                    Action='lambda:InvokeFunction', Principal='bedrock.amazonaws.com'
                )
                logger.info(f"Added permission {statement_id}.")
            except lambda_client_local.exceptions.ResourceConflictException:
                 logger.info(f"Permission {statement_id} already exists.")
            except ClientError as perm_e:
                logger.warning(f"Failed to add permission {statement_id}: {perm_e}")

            logger.info(f"Waiting for function '{function_name}' to become active...")
            waiter = lambda_client_local.get_waiter('function_active_v2')
            waiter.wait(FunctionName=function_name, WaiterConfig={'Delay': 5, 'MaxAttempts': 24})
            logger.info(f"Function '{function_name}' is active.")
            return function_arn

        except lambda_client_local.exceptions.ResourceConflictException:
             logger.warning(f"Function '{function_name}' already exists. Updating code...")
             try:
                 update_args = {'FunctionName': function_name, 'Publish': True}
                 if use_s3:
                     update_args['S3Bucket'] = s3_location['S3Bucket']
                     update_args['S3Key'] = s3_location['S3Key']
                 else:
                     update_args['ZipFile'] = zip_content
                 update_response = lambda_client_local.update_function_code(**update_args)
                 function_arn = update_response['FunctionArn']
                 logger.info(f"Updated function code for '{function_name}'. New version ARN: {function_arn}")

                 try:
                      logger.info(f"Updating configuration for '{function_name}'...")
                      lambda_client_local.update_function_configuration(**common_args)
                      logger.info(f"Configuration updated.")
                 except ClientError as conf_e:
                      logger.warning(f"Could not update configuration: {conf_e}")

                 time.sleep(5)
                 statement_id = f"AllowBedrockInvokeBasic-{function_name}"
                 try:
                     logger.info(f"Verifying/Adding invoke permission ({statement_id}) after update...")
                     lambda_client_local.add_permission(
                         FunctionName=function_name, StatementId=statement_id,
                         Action='lambda:InvokeFunction', Principal='bedrock.amazonaws.com'
                     )
                     logger.info(f"Permission {statement_id} added/verified.")
                 except lambda_client_local.exceptions.ResourceConflictException:
                     logger.info(f"Permission {statement_id} already exists.")
                 except ClientError as perm_e:
                     logger.warning(f"Failed to add/verify permission {statement_id}: {perm_e}")

                 logger.info(f"Waiting for function '{function_name}' update...")
                 waiter = lambda_client_local.get_waiter('function_updated_v2')
                 waiter.wait(FunctionName=function_name, WaiterConfig={'Delay': 5, 'MaxAttempts': 24})
                 logger.info(f"Function '{function_name}' update complete.")
                 return function_arn

             except ClientError as update_e:
                 logger.error(f"Failed to update function '{function_name}': {update_e}")
                 if attempt < max_retries:
                      delay = base_delay * (2 ** (attempt - 1))
                      logger.info(f"Retrying update in {delay} seconds...")
                      time.sleep(delay)
                 else:
                      logger.error("Max update retries reached.")
                      raise update_e

        except ClientError as e:
            error_code = e.response.get('Error', {}).get('Code')
            if error_code in ['ThrottlingException', 'ProvisionedConcurrencyConfigNotFoundException', 'EC2ThrottledException'] or 'Rate exceeded' in str(e):
                logger.warning(f"Retryable error on attempt {attempt}: {e}")
                if attempt < max_retries:
                    delay = base_delay * (2 ** (attempt - 1)) + (uuid.uuid4().int % 5)
                    logger.info(f"Retrying in {delay} seconds...")
                    time.sleep(delay)
                else:
                    logger.error("Max retries reached.")
                    raise e
            else:
                logger.error(f"Error creating/updating Lambda '{function_name}': {e}")
                logger.error(traceback.format_exc())
                raise e
        except Exception as e:
             logger.error(f"Unexpected error during Lambda deployment: {e}")
             logger.error(traceback.format_exc())
             raise e

    raise Exception(f"Failed to deploy Lambda function {function_name} after {max_retries} attempts.")

# --- Deploy Lambdas ---
researcher_lambda_name = "bedrock_agent_researcher_exp"
writer_lambda_name = "bedrock_agent_writer_exp"
lambda_build_dir = NOTEBOOK_DIR # Final zip ends up here

logger.info("--- Starting Lambda Deployment --- ")
researcher_lambda_arn = None
writer_lambda_arn = None
researcher_zip_path = None
writer_zip_path = None

try:
    # Delete existing functions first
    delete_lambda_function(lambda_client, researcher_lambda_name)
    delete_lambda_function(lambda_client, writer_lambda_name)

    # Package functions
    researcher_zip_path = package_function("bedrock_agent_researcher", LAMBDA_FUNCTIONS_DIR, lambda_build_dir)
    writer_zip_path = package_function("bedrock_agent_writer", LAMBDA_FUNCTIONS_DIR, lambda_build_dir)

    # Create/Update functions
    researcher_lambda_arn = create_lambda_function(
        lambda_client=lambda_client, function_name=researcher_lambda_name,
        handler='lambda_function.lambda_handler', role_arn=agent_role_arn,
        zip_file=researcher_zip_path, region=AWS_REGION
    )
    logger.info(f"Researcher Lambda Deployed: {researcher_lambda_arn}")

    writer_lambda_arn = create_lambda_function(
        lambda_client=lambda_client, function_name=writer_lambda_name,
        handler='lambda_function.lambda_handler', role_arn=agent_role_arn,
        zip_file=writer_zip_path, region=AWS_REGION
    )
    logger.info(f"Writer Lambda Deployed: {writer_lambda_arn}")

except FileNotFoundError as e:
     logger.error(f"Lambda packaging failed: Required file not found. {e}")
     raise
except Exception as e:
    logger.error(f"Lambda deployment failed: {e}")
    logger.error(traceback.format_exc())
    raise
finally:
    logger.info("Cleaning up deployment zip files...")
    if researcher_zip_path and os.path.exists(researcher_zip_path):
        try: os.remove(researcher_zip_path)
        except OSError as e: logger.warning(f"Could not remove zip {researcher_zip_path}: {e}")
    if writer_zip_path and os.path.exists(writer_zip_path):
         try: os.remove(writer_zip_path)
         except OSError as e: logger.warning(f"Could not remove zip {writer_zip_path}: {e}")

logger.info("--- Lambda Deployment Complete --- ")

2025-05-01 07:48:58,087 - INFO - --- Starting Lambda Deployment --- 
2025-05-01 07:48:58,088 - INFO - Attempting to delete Lambda function: bedrock_agent_researcher_exp...
2025-05-01 07:48:58,088 - INFO - Attempting to remove permission AllowBedrockInvokeBasic-bedrock_agent_researcher_exp...
2025-05-01 07:48:58,886 - INFO - Permission AllowBedrockInvokeBasic-bedrock_agent_researcher_exp not found. Skipping removal.
2025-05-01 07:48:59,213 - INFO - Lambda function 'bedrock_agent_researcher_exp' does not exist.
2025-05-01 07:48:59,215 - INFO - Attempting to delete Lambda function: bedrock_agent_writer_exp...
2025-05-01 07:48:59,217 - INFO - Attempting to remove permission AllowBedrockInvokeBasic-bedrock_agent_writer_exp...
2025-05-01 07:48:59,495 - INFO - Permission AllowBedrockInvokeBasic-bedrock_agent_writer_exp not found. Skipping removal.
2025-05-01 07:48:59,828 - INFO - Lambda function 'bedrock_agent_writer_exp' does not exist.
2025-05-01 07:48:59,829 - INFO - --- Packaging function

## 7. Define Agent Creation and Deletion Functions

Define helper functions to manage the Bedrock Agent lifecycle:
*   `get_agent_by_name`: Find an existing agent's ID.
*   `delete_action_group`: Delete a specific action group.
*   `delete_agent_and_resources`: Delete an agent and all its associated action groups (useful for cleanup).
*   `create_agent`: Create a new Bedrock Agent with a specified foundation model and instruction.

In [7]:
def get_agent_by_name(agent_client, agent_name):
    """Find an agent ID by its name using list_agents."""
    logger.info(f"Attempting to find agent by name: {agent_name}")
    try:
        paginator = agent_client.get_paginator('list_agents')
        for page in paginator.paginate():
            for agent_summary in page.get('agentSummaries', []):
                if agent_summary.get('agentName') == agent_name:
                    agent_id = agent_summary.get('agentId')
                    logger.info(f"Found agent '{agent_name}' with ID: {agent_id}")
                    return agent_id
        logger.info(f"Agent '{agent_name}' not found.")
        return None
    except ClientError as e:
        logger.error(f"Error listing agents to find '{agent_name}': {e}")
        return None

def delete_action_group(agent_client, agent_id, action_group_id):
    """Deletes a specific action group for an agent."""
    logger.info(f"Attempting to delete action group {action_group_id} for agent {agent_id} DRAFT...")
    try:
        agent_client.delete_agent_action_group(
            agentId=agent_id,
            agentVersion='DRAFT',
            actionGroupId=action_group_id,
            skipResourceInUseCheck=True
        )
        logger.info(f"Successfully deleted action group {action_group_id}.")
        time.sleep(5)
        return True
    except agent_client.exceptions.ResourceNotFoundException:
        logger.info(f"Action group {action_group_id} not found. Skipping deletion.")
        return False
    except ClientError as e:
        error_code = e.response.get('Error', {}).get('Code')
        if error_code == 'ConflictException':
            logger.warning(f"Conflict deleting action group {action_group_id}. Retrying once...")
            time.sleep(15)
            try:
                agent_client.delete_agent_action_group(
                    agentId=agent_id, agentVersion='DRAFT', actionGroupId=action_group_id, skipResourceInUseCheck=True
                )
                logger.info(f"Successfully deleted action group {action_group_id} after retry.")
                return True
            except Exception as retry_e:
                 logger.error(f"Error deleting action group {action_group_id} on retry: {retry_e}")
                 return False
        else:
            logger.error(f"Error deleting action group {action_group_id}: {e}")
            return False

def delete_agent_and_resources(agent_client, agent_name):
    """Deletes the agent and its associated action groups."""
    agent_id = get_agent_by_name(agent_client, agent_name)
    if not agent_id:
        logger.info(f"Agent '{agent_name}' not found, no deletion needed.")
        return

    logger.warning(f"--- Deleting Agent Resources for '{agent_name}' (ID: {agent_id}) ---")
    try:
        logger.info(f"Listing action groups for agent {agent_id} DRAFT...")
        action_groups = agent_client.list_agent_action_groups(
            agentId=agent_id,
            agentVersion='DRAFT'
        ).get('actionGroupSummaries', [])

        if action_groups:
            logger.info(f"Found {len(action_groups)} action groups to delete.")
            for ag in action_groups:
                delete_action_group(agent_client, agent_id, ag['actionGroupId'])
        else:
            logger.info("No action groups found to delete.")
    except ClientError as e:
        logger.error(f"Error listing/deleting action groups for agent {agent_id}: {e}")

    try:
        logger.info(f"Attempting to delete agent {agent_id} ('{agent_name}')...")
        agent_client.delete_agent(agentId=agent_id, skipResourceInUseCheck=True)
        logger.info(f"Waiting up to 2 minutes for agent {agent_id} deletion...")
        deleted = False
        for _ in range(24):
            try:
                agent_client.get_agent(agentId=agent_id)
                time.sleep(5)
            except agent_client.exceptions.ResourceNotFoundException:
                logger.info(f"Agent {agent_id} successfully deleted.")
                deleted = True
                break
            except ClientError as e:
                 error_code = e.response.get('Error', {}).get('Code')
                 if error_code == 'ThrottlingException':
                     logger.warning("Throttled checking deletion status, waiting...")
                     time.sleep(10)
                 else:
                     logger.error(f"Error checking deletion status: {e}")
                     break
        if not deleted:
             logger.warning(f"Agent {agent_id} deletion confirmation timed out.")
    except agent_client.exceptions.ResourceNotFoundException:
        logger.info(f"Agent {agent_id} ('{agent_name}') already deleted or not found.")
    except ClientError as e:
        logger.error(f"Error deleting agent {agent_id}: {e}")

    logger.info(f"--- Agent Resource Deletion Complete for '{agent_name}' ---")

def create_agent(agent_client, agent_name, agent_role_arn, foundation_model_id):
    """Creates a new Bedrock Agent."""
    logger.info(f"--- Creating Agent: {agent_name} ---")
    try:
        instruction = (
            "You are a multi-step research assistant. Your goal is to answer user questions based on documents retrieved using your tools. "
            "Follow these steps precisely: "
            "1. If the user asks an informational question, you MUST first use the `search_documents` function available in the `ResearcherActionGroup` to find relevant documents. Provide the user's core question as the 'query' parameter. "
            "2. Examine the text returned by `search_documents`. This is the research result. "
            "3. If the user's original request included specific formatting instructions (e.g., 'summarize', 'bullet points', 'list the key points'), you MUST then use the `format_content` function available in the `WriterActionGroup`. Pass the research result (the text returned by `search_documents`) as the 'content' parameter and the requested format (e.g., 'bullet points') as the 'style' parameter. "
            "4. If no specific formatting was requested, present the research result directly. "
            "5. Only use the tools provided and follow this exact sequence. Do not answer informational questions from memory or make up information. Always start by searching."
        )
        response = agent_client.create_agent(
            agentName=agent_name,
            agentResourceRoleArn=agent_role_arn,
            foundationModel=foundation_model_id,
            instruction=instruction,
            idleSessionTTLInSeconds=1800,
            description=f"Experimental agent: Couchbase search & format ({foundation_model_id})"
        )
        agent_info = response.get('agent')
        agent_id = agent_info.get('agentId')
        agent_arn = agent_info.get('agentArn')
        agent_status = agent_info.get('agentStatus')
        logger.info(f"Agent creation initiated. ID: {agent_id}, ARN: {agent_arn}, Status: {agent_status}")

        logger.info(f"Waiting for agent {agent_id} to reach initial state...")
        for _ in range(12):
             current_status = agent_client.get_agent(agentId=agent_id)['agent']['agentStatus']
             logger.info(f"Agent {agent_id} status: {current_status}")
             if current_status != 'CREATING':
                  break
             time.sleep(5)

        final_status = agent_client.get_agent(agentId=agent_id)['agent']['agentStatus']
        if final_status == 'FAILED':
            raise Exception(f"Agent creation failed for {agent_name}")
        elif final_status != 'NOT_PREPARED':
            logger.warning(f"Agent {agent_id} reached unexpected status '{final_status}'.")
        else:
             logger.info(f"Agent {agent_id} created (Status: {final_status}).")
        return agent_id, agent_arn

    except ClientError as e:
        logger.error(f"Error creating agent '{agent_name}': {e}")
        raise

## 8. Create Agent Action Groups

Define a function to create or update an action group. This function links an agent action (defined by an OpenAPI schema) to a specific Lambda function ARN. We will create two action groups:
*   `ResearcherActionGroup`: Linked to the `bedrock_agent_researcher_exp` Lambda.
*   `WriterActionGroup`: Linked to the `bedrock_agent_writer_exp` Lambda.

In [8]:
def create_action_group(agent_client, agent_id, action_group_name, function_arn, schema_path):
    """Creates or updates an action group for the agent using a schema file."""
    logger.info(f"--- Creating/Updating Action Group: {action_group_name} for Agent: {agent_id} ---")
    logger.info(f"Lambda ARN: {function_arn}")
    logger.info(f"Schema Path: {schema_path}")

    if not os.path.exists(schema_path):
        raise FileNotFoundError(f"Action group schema file not found: {schema_path}")

    try:
        with open(schema_path, 'r') as f:
            schema_definition = f.read()

        # Check if Action Group already exists for the DRAFT version
        try:
             logger.info(f"Checking if action group '{action_group_name}' exists for DRAFT...")
             paginator = agent_client.get_paginator('list_agent_action_groups')
             existing_group = None
             for page in paginator.paginate(agentId=agent_id, agentVersion='DRAFT'):
                 for ag_summary in page.get('actionGroupSummaries', []):
                      if ag_summary.get('actionGroupName') == action_group_name:
                           existing_group = ag_summary
                           break
                 if existing_group:
                      break

             if existing_group:
                 ag_id = existing_group['actionGroupId']
                 logger.warning(f"Action Group '{action_group_name}' (ID: {ag_id}) exists. Updating.")
                 response = agent_client.update_agent_action_group(
                     agentId=agent_id, agentVersion='DRAFT', actionGroupId=ag_id,
                     actionGroupName=action_group_name, functionArn=function_arn,
                     actionGroupExecutor={'lambda': function_arn},
                     apiSchema={'payload': schema_definition},
                     actionGroupState='ENABLED'
                 )
                 ag_info = response.get('agentActionGroup')
                 logger.info(f"Updated Action Group '{action_group_name}' (ID: {ag_info.get('actionGroupId')}).")
                 time.sleep(5)
                 return ag_info.get('actionGroupId')
             else:
                  logger.info(f"Action group '{action_group_name}' does not exist. Creating new.")

        except ClientError as e:
             logger.error(f"Error checking existing action group '{action_group_name}': {e}. Proceeding with create.")

        # Create new action group
        response = agent_client.create_agent_action_group(
            agentId=agent_id, agentVersion='DRAFT',
            actionGroupName=action_group_name,
            actionGroupExecutor={'lambda': function_arn},
            apiSchema={'payload': schema_definition},
            actionGroupState='ENABLED'
        )
        ag_info = response.get('agentActionGroup')
        ag_id = ag_info.get('actionGroupId')
        logger.info(f"Created Action Group '{action_group_name}' with ID: {ag_id}")
        time.sleep(5)
        return ag_id

    except ClientError as e:
        logger.error(f"Error creating/updating action group '{action_group_name}': {e}")
        raise
    except FileNotFoundError as e:
         logger.error(f"Schema file error for '{action_group_name}': {e}")
         raise

## 9. Create and Prepare the Agent

Now, execute the agent creation and preparation steps:
1.  Delete any existing agent with the same name to ensure a clean state.
2.  Create the agent using the `create_agent` function.
3.  Create the two action groups using `create_action_group`, linking them to the respective Lambda ARNs and schema files.
4.  Prepare the agent using the `prepare_agent` function. This step compiles the agent configuration and makes it ready for use. This can take several minutes.

In [9]:
def prepare_agent(agent_client, agent_id):
    """Prepares the DRAFT version of the agent."""
    logger.info(f"--- Preparing Agent: {agent_id} ---")
    try:
        response = agent_client.prepare_agent(agentId=agent_id)
        agent_version = response.get('agentVersion')
        prepared_at = response.get('preparedAt')
        status = response.get('agentStatus')
        logger.info(f"Agent preparation initiated for version '{agent_version}'. Status: {status}. Prepared At: {prepared_at}")

        logger.info(f"Waiting for agent {agent_id} preparation (up to 10 minutes)...")
        waiter_config = {
            'version': 2,
            'waiters': {
                'AgentPrepared': {
                    'delay': 30,
                    'operation': 'GetAgent',
                    'maxAttempts': 20,
                    'acceptors': [
                        {'matcher': 'path', 'expected': 'PREPARED', 'argument': 'agent.agentStatus', 'state': 'success'},
                        {'matcher': 'path', 'expected': 'FAILED', 'argument': 'agent.agentStatus', 'state': 'failure'},
                        {'matcher': 'path', 'expected': 'UPDATING', 'argument': 'agent.agentStatus', 'state': 'retry'}
                    ]
                }
            }
        }
        waiter_model = WaiterModel(waiter_config)
        custom_waiter = create_waiter_with_client('AgentPrepared', waiter_model, agent_client)

        try:
             custom_waiter.wait(agentId=agent_id)
             logger.info(f"Agent {agent_id} successfully prepared.")
        except Exception as wait_e:
             logger.error(f"Agent {agent_id} preparation failed or timed out: {wait_e}")
             try:
                 final_status = agent_client.get_agent(agentId=agent_id)['agent']['agentStatus']
                 logger.error(f"Final agent status: {final_status}")
             except Exception as get_e:
                 logger.error(f"Could not get final agent status: {get_e}")
             raise Exception(f"Agent preparation failed for {agent_id}")

    except ClientError as e:
        logger.error(f"Error initiating agent preparation for {agent_id}: {e}")
        raise

# --- Execute Agent Setup ---
agent_name = "couchbase_research_writer_agent_exp" # Unique name
agent_id = None
agent_arn = None

# Ensure Lambda ARNs are valid
if not researcher_lambda_arn or not writer_lambda_arn:
    raise ValueError("Lambda ARNs not available. Deployment likely failed.")

try:
    # 1. Delete existing agent first
    delete_agent_and_resources(bedrock_agent_client, agent_name)

    # 2. Create the agent
    agent_id, agent_arn = create_agent(
        agent_client=bedrock_agent_client,
        agent_name=agent_name,
        agent_role_arn=agent_role_arn,
        foundation_model_id=AGENT_MODEL_ID
    )

    # 3. Create Action Groups
    researcher_ag_id = create_action_group(
        agent_client=bedrock_agent_client,
        agent_id=agent_id,
        action_group_name="ResearcherActionGroup",
        function_arn=researcher_lambda_arn,
        schema_path=RESEARCHER_SCHEMA_PATH
    )

    writer_ag_id = create_action_group(
        agent_client=bedrock_agent_client,
        agent_id=agent_id,
        action_group_name="WriterActionGroup",
        function_arn=writer_lambda_arn,
        schema_path=WRITER_SCHEMA_PATH
    )

    # 4. Prepare the agent
    prepare_agent(bedrock_agent_client, agent_id)

    logger.info(f"--- Bedrock Agent '{agent_name}' (ID: {agent_id}) Setup Complete ---")

except FileNotFoundError as e:
     logger.error(f"Agent setup failed: Schema file not found. {e}")
     raise
except Exception as e:
    logger.error(f"Bedrock Agent setup failed: {e}")
    logger.error(traceback.format_exc())
    logger.info("Attempting cleanup after agent setup failure...")
    delete_agent_and_resources(bedrock_agent_client, agent_name)
    raise

2025-05-01 07:50:50,295 - INFO - Attempting to find agent by name: couchbase_research_writer_agent_exp
2025-05-01 07:50:51,128 - INFO - Found agent 'couchbase_research_writer_agent_exp' with ID: ZJJNYBIBXY
2025-05-01 07:50:51,128 - INFO - Listing action groups for agent ZJJNYBIBXY DRAFT...
2025-05-01 07:50:51,417 - INFO - Found 2 action groups to delete.
2025-05-01 07:50:51,417 - INFO - Attempting to delete action group 5IZ2EPNM9R for agent ZJJNYBIBXY DRAFT...
2025-05-01 07:50:51,691 - INFO - Successfully deleted action group 5IZ2EPNM9R.
2025-05-01 07:50:56,696 - INFO - Attempting to delete action group H7YPF1NQDV for agent ZJJNYBIBXY DRAFT...
2025-05-01 07:50:56,976 - INFO - Successfully deleted action group H7YPF1NQDV.
2025-05-01 07:51:01,980 - INFO - Attempting to delete agent ZJJNYBIBXY ('couchbase_research_writer_agent_exp')...
2025-05-01 07:51:02,281 - INFO - Waiting up to 2 minutes for agent ZJJNYBIBXY deletion...
2025-05-01 07:51:08,163 - INFO - Agent ZJJNYBIBXY successfully de

## 10. Test Agent Invocation

Define a function to invoke the agent using the `bedrock-agent-runtime` client. Invoke the prepared agent using its ID and the special `TSTALIASID` alias (recommended for testing the DRAFT version). Provide a sample prompt and observe the agent's response and trace.

In [10]:
def test_agent_invocation(agent_runtime_client, agent_id, agent_alias_id, session_id, prompt):
    """Invokes the agent and prints the response stream and trace."""
    logger.info(f"--- Testing Agent Invocation (Agent ID: {agent_id}, Alias: {agent_alias_id}) ---")
    logger.info(f"Session ID: {session_id}")
    logger.info(f"Prompt: \"{prompt}\"")

    try:
        response = agent_runtime_client.invoke_agent(
            agentId=agent_id,
            agentAliasId=agent_alias_id,
            sessionId=session_id,
            inputText=prompt,
            enableTrace=True
        )

        logger.info("Agent invocation successful. Processing response stream...")
        completion_text = ""
        trace_events = []
        final_response_text = ""

        print("\n--- Agent Response Stream ---")
        for event in response.get('completion', []):
            if 'chunk' in event:
                data = event['chunk'].get('bytes', b'')
                decoded_chunk = data.decode('utf-8')
                print(decoded_chunk, end="") # Print stream in real-time
                final_response_text += decoded_chunk
            elif 'trace' in event:
                trace_part = event['trace'].get('trace')
                if trace_part:
                     trace_events.append(trace_part)
            # else: # Optional: log unhandled event types
                 # logger.warning(f"Unhandled event type: {event}")
        print("\n--- End of Stream ---")

        # Log trace summary
        if trace_events:
             logger.info("\n--- Invocation Trace Summary ---")
             for i, trace in enumerate(trace_events):
                  trace_type = trace.get('type')
                  step_type = trace.get('orchestration', {}).get('stepType')
                  rationale = trace.get('rationale', {}).get('text')
                  observation = trace.get('observation') # Includes function results
                  model_invocation_input = trace.get('modelInvocationInput') # Can be large

                  log_line = f"Trace {i+1}: Type={trace_type}, Step={step_type}"
                  if rationale: log_line += f", Rationale='{rationale[:100]}...'"
                  logger.info(log_line)

                  # Log observation details (e.g., function results)
                  if observation:
                      func_result = observation.get('finalResponse', {}).get('text')
                      if func_result:
                          logger.info(f"  Observation (Function Result): {func_result[:150]}...")
                      # else: logger.info(f"  Observation: {observation}") # Log full observation if needed

                  # Log model input summary (optional, can be verbose)
                  # if model_invocation_input:
                  #     fm_input = model_invocation_input.get('text','') or model_invocation_input.get('prompt', '')
                  #     logger.info(f"  Model Input: {fm_input[:150]}...")
        else:
            logger.info("No trace events received.")

        return final_response_text

    except ClientError as e:
        logger.error(f"Error invoking agent: {e}")
        logger.error(traceback.format_exc())
        return f"ERROR: {e}"
    except Exception as e:
         logger.error(f"Unexpected error during agent invocation: {e}")
         logger.error(traceback.format_exc())
         return f"UNEXPECTED ERROR: {e}"

# --- Execute Test Invocation ---
agent_alias_id = "TSTALIASID"
session_id = str(uuid.uuid4())
# test_prompt = "What were the key findings in the llama 2 paper? Format the answer as bullet points."
test_prompt = "What do you know about the Cline AI assistant? Format the answer as bullet points."

if agent_id and agent_alias_id:
    agent_response = test_agent_invocation(
        agent_runtime_client=bedrock_agent_runtime_client,
        agent_id=agent_id,
        agent_alias_id=agent_alias_id,
        session_id=session_id,
        prompt=test_prompt
    )
    print(f"\n\n--- Final Agent Response Text ---\n{agent_response}")
else:
    logger.error("Agent ID or Alias ID not available, skipping invocation test.")

logger.info("--- Script Execution Finished --- ")

2025-05-01 07:51:57,799 - INFO - --- Testing Agent Invocation (Agent ID: QRON0D6Y95, Alias: TSTALIASID) ---
2025-05-01 07:51:57,800 - INFO - Session ID: 3cab08ee-2e58-4bf6-959c-e462a5320283
2025-05-01 07:51:57,803 - INFO - Prompt: "What do you know about the Cline AI assistant? Format the answer as bullet points."
2025-05-01 07:51:58,932 - INFO - Agent invocation successful. Processing response stream...



--- Agent Response Stream ---


2025-05-01 07:52:15,257 - INFO - 
--- Invocation Trace Summary ---
2025-05-01 07:52:15,257 - INFO - Trace 1: Type=None, Step=None
2025-05-01 07:52:15,257 - INFO - Trace 2: Type=None, Step=None
2025-05-01 07:52:15,257 - INFO - Trace 3: Type=None, Step=None
2025-05-01 07:52:15,258 - INFO - Trace 4: Type=None, Step=None
2025-05-01 07:52:15,258 - INFO - Trace 5: Type=None, Step=None
2025-05-01 07:52:15,258 - INFO - Trace 6: Type=None, Step=None
2025-05-01 07:52:15,258 - INFO - Trace 7: Type=None, Step=None
2025-05-01 07:52:15,259 - INFO - Trace 8: Type=None, Step=None
2025-05-01 07:52:15,259 - INFO - Trace 9: Type=None, Step=None
2025-05-01 07:52:15,259 - INFO - Trace 10: Type=None, Step=None
2025-05-01 07:52:15,260 - INFO - Trace 11: Type=None, Step=None
2025-05-01 07:52:15,260 - INFO - Trace 12: Type=None, Step=None
2025-05-01 07:52:15,260 - INFO - Trace 13: Type=None, Step=None
2025-05-01 07:52:15,261 - INFO - --- Script Execution Finished --- 


• I do not have any information about the Cline AI assistant in my knowledge base.
--- End of Stream ---


--- Final Agent Response Text ---
• I do not have any information about the Cline AI assistant in my knowledge base.


## 11. Cleanup (Optional)

To avoid incurring further costs, you can delete the AWS resources created by this notebook. Run the following cell to delete the Bedrock Agent (and its action groups) and the Lambda functions.

In [11]:
# Set this to True to run the cleanup steps
RUN_CLEANUP = False

if RUN_CLEANUP:
    logger.warning("--- Starting Resource Cleanup --- ")

    # Delete Agent and Action Groups
    if agent_name: # Use the name defined earlier
        logger.info(f"Deleting agent '{agent_name}' and its resources...")
        delete_agent_and_resources(bedrock_agent_client, agent_name)
    else:
        logger.info("Agent name not defined, skipping agent deletion.")

    # Delete Lambda Functions
    if researcher_lambda_name:
        logger.info(f"Deleting Lambda function '{researcher_lambda_name}'...")
        delete_lambda_function(lambda_client, researcher_lambda_name)
    if writer_lambda_name:
        logger.info(f"Deleting Lambda function '{writer_lambda_name}'...")
        delete_lambda_function(lambda_client, writer_lambda_name)

    # Note: IAM role deletion is often manual or requires careful checks
    # to ensure it's not used by other resources.
    # Consider deleting the role 'bedrock_agent_lambda_exp_role' via the AWS Console/CLI
    # if it's no longer needed.
    logger.warning(f"Cleanup script does not automatically delete the IAM role: {agent_role_name}")
    logger.warning(f"Please delete it manually via AWS Console/CLI if no longer needed.")

    # Note: S3 bucket created for large Lambda uploads is also not deleted automatically.
    # Check S3 for buckets named like 'lambda-deployment-*' and delete if necessary.
    logger.warning("Cleanup script does not automatically delete S3 buckets used for deployment.")

    # Note: Couchbase bucket/scope/collection are not deleted.
    logger.warning(f"Cleanup script does not delete Couchbase resources ({CB_BUCKET_NAME}/{SCOPE_NAME}/{COLLECTION_NAME}).")

    logger.info("--- Resource Cleanup Attempt Complete --- ")
else:
    logger.info("Cleanup skipped (RUN_CLEANUP is False).")

2025-05-01 07:52:15,268 - INFO - Cleanup skipped (RUN_CLEANUP is False).
