# Lab 1: Knowledge Base Setup with Amazon Bedrock

This notebook guides you through creating a Bedrock Knowledge Base containing your underwriting manual. This knowledge base will ground your AI agents in underwriting policy and best practices.

## What You'll Build

By the end of this notebook, you will have:
1. Uploaded underwriting manual documents to S3
2. Created a Bedrock Knowledge Base with OpenSearch Serverless vector store
3. Configured and synced a data source from S3
4. Tested retrieval using Bedrock's Retrieve and RetrieveAndGenerate APIs
5. Persisted Knowledge Base identifiers for use in later labs

## Architecture

The Knowledge Base architecture includes:
- **Source Documents**: Underwriting manual Markdown files stored in S3
- **Bedrock Knowledge Base**: Orchestration layer for RAG pipeline
- **Vector Store**: Amazon OpenSearch Serverless for embeddings
- **Foundation Models**: Claude 3.7 Sonnet for generation and Amazon Titan Text Embeddings V2 for embeddings

Let's get started!


<div class="alert alert-block alert-info">
<b>Note:</b> Before proceeding, enable model access for <b>Claude 3.7 Sonnet</b> and <b>Titan Text Embeddings V2</b> in the Amazon Bedrock console:
<ol>
<li>Go to the <a href="https://console.aws.amazon.com/bedrock/home#/modelaccess">Bedrock Model Access page</a></li>
<li>Click "Enable specific models"</li>
<li>Select:
  <ul>
    <li><b>Claude 3.7 Sonnet</b> (for answer generation)</li>
    <li><b>Titan Text Embeddings V2</b> (for creating embeddings)</li>
  </ul>
</li>
<li>Click "Save changes"</li>
</ol>
Run the notebook cell by cell instead of using the "Run All Cells" option.
</div>


## 1. Environment Setup

First, we'll install required packages and set up AWS SDK clients.


In [None]:
# Install required packages
%pip install --upgrade pip --quiet
%pip install boto3 --upgrade --quiet
%pip install python-dotenv --quiet
%pip install opensearch-py --quiet
%pip install requests_aws4auth --quiet

In [None]:
import os
import sys
import time
import boto3
import logging
import json
from datetime import datetime

# Initialize AWS clients
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime')
iam_client = boto3.client('iam')
oss_client = boto3.client('opensearchserverless')

# Get account and region info
session = boto3.session.Session()
region = session.region_name
account_id = sts_client.get_caller_identity()["Account"]

# Set up logging
logging.basicConfig(format='[%(asctime)s] %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

print(f"✅ AWS Setup Complete")
print(f"   Region: {region}")
print(f"   Account ID: {account_id}")
print(f"   Boto3 Version: {boto3.__version__}")


## 2. The Knowledge Base

Before we build our AI agents, let's understand **Retrieval Augmented Generation (RAG)** and **contextual grounding**.

### What is RAG?

**RAG** is a technique that enhances AI models by giving them access to external knowledge sources. Instead of relying solely on their training data, RAG-enabled models can:
1. **Retrieve** relevant information from a knowledge base
2. **Augment** their responses with that specific context
3. **Generate** accurate, citation-backed answers

This is crucial for underwriting because our AI agents need to follow specific company policies and guidelines—not just general knowledge.
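
The three steps can be sketched in a few lines of plain Python. This is a toy illustration, not how Bedrock implements RAG: the two documents in `DOCS` are made up, retrieval uses simple word overlap instead of embeddings, and the "generate" step is represented only by the prompt that would be sent to a model.

```python
import re

# Hypothetical mini knowledge base; the real one is the underwriting manual.
DOCS = {
    "hypertension.md": "Hypertension ratings depend on blood pressure readings and age.",
    "diabetes.md": "Diabetes ratings depend on HbA1c control and complications.",
}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, top_k=1):
    """Retrieve: rank documents by the number of words shared with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        docs,
        key=lambda name: len(query_tokens & tokenize(docs[name])),
        reverse=True,
    )
    return ranked[:top_k]


def augment(query, docs, sources):
    """Augment: place the retrieved passages in front of the question."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in sources)
    return f"Use only this context:\n{context}\n\nQuestion: {query}"


question = "How is high blood pressure rated?"
sources = retrieve(question, DOCS)
prompt = augment(question, DOCS, sources)

# Generate: in a real pipeline this prompt is sent to the foundation model.
print(prompt)
```

The Knowledge Base built in this lab replaces the word-overlap ranking with vector search over embeddings, but the retrieve, augment, generate flow is the same.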

### Our Knowledge Base: The Underwriting Manual

In this workshop, our knowledge base is the **underwriting manual**—the same resource human underwriters use to make decisions. Let's take a look at it!


In [None]:
# Let's explore the underwriting manual structure
import os
from pathlib import Path

manual_path = Path("../underwriting-manual")

print("📚 Underwriting Manual Contents:\n")
for item in sorted(manual_path.iterdir()):
    if item.is_dir():
        file_count = len(list(item.glob("**/*.md")))
        print(f"   📁 {item.name}/ ({file_count} files)")
    elif item.suffix == ".md":
        print(f"   📄 {item.name}")

print(f"\n💡 The manual contains comprehensive underwriting guidelines organized by topic.")
print(f"   Human underwriters reference these documents when evaluating applications.")


### Example: Hypertension Guidelines

Let's look at a specific example. Open the file `underwriting-manual/3-medical-impairments/cardiovascular/hypertension.md` and browse through it.

You'll notice it contains:
- **Definition & Classification**: What the condition is and how it's categorized
- **Required Evidence**: What information underwriters need to evaluate the risk
- **Rating Guidelines**: Specific scoring tables based on blood pressure readings, age, complications, etc.
- **Medication Considerations**: How different treatments affect risk assessment

This is the exact same knowledge that human underwriters use! Now, we're going to teach our AI agents to use this same knowledge source to make consistent, policy-compliant decisions.

Let's read a snippet of the hypertension guidelines to see what they look like:


In [None]:
# Read and display a portion of the hypertension guidelines
hypertension_file = Path("../underwriting-manual/3-medical-impairments/cardiovascular/hypertension.md")

with open(hypertension_file, 'r') as f:
    content = f.read()

# Show the first section (Definition & Classification) and part of Rating Guidelines
lines = content.split('\n')

print("📖 Sample from Hypertension Underwriting Guidelines:\n")
print("=" * 80)

# Show the first 27 lines (Definition & Classification section)
print('\n'.join(lines[:27]))
print("\n...")
print("\n# Rating Guidelines Section (lines 63-75):\n")
print('\n'.join(lines[63:75]))
print("\n...")

print("=" * 80)
print("\n✨ Notice how this provides structured, actionable guidance for risk assessment!")
print("   These guidelines will ground our AI agents in company underwriting policy.")


## 3. Configuration

Now let's talk through what we're setting up:

We're going to create a **Bedrock Knowledge Base** with an **OpenSearch Serverless vector store** backing it. Here's how the pieces fit together:

- **Bedrock Knowledge Base**: The orchestration layer that manages document ingestion, embeddings, and retrieval
- **OpenSearch Serverless**: A vector database that stores document embeddings for semantic search
- **S3**: Holds the source documents (our underwriting manual)
- **Embedding Model**: Converts text into vectors that capture semantic meaning (we'll use Amazon Titan Text Embeddings V2)

This infrastructure will make the underwriting manual accessible to our AI agents through semantic search—meaning they can find relevant policy information based on meaning, not just keyword matching.
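
To make "search by meaning" concrete, here is a minimal sketch of nearest-neighbor lookup over embedding vectors. The 3-dimensional vectors and chunk names are invented for illustration (Titan Text Embeddings V2 vectors in this lab have 1024 dimensions), and the sketch uses L2 distance because the index created later in this lab is configured with the `l2` space type.

```python
import math

# Made-up 3-dimensional vectors standing in for real 1024-dimensional embeddings.
chunk_vectors = {
    "blood pressure rating table": [0.9, 0.1, 0.0],
    "smoker classification": [0.1, 0.9, 0.1],
    "aviation hobbies": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the query "hypertension guidelines".
query_vector = [0.8, 0.2, 0.1]


def nearest_chunk(query, chunks):
    """Return the chunk name whose vector has the smallest L2 distance to the query."""
    return min(chunks, key=lambda name: math.dist(query, chunks[name]))


best = nearest_chunk(query_vector, chunk_vectors)
print(best)
```

Note that the query shares no words with the winning chunk name; the match comes entirely from the vectors being close together, which is what an embedding model is trained to produce for related text.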

Let's define the configuration:


In [None]:
# Generate unique suffix for resource names
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")[-7:]
suffix = f"{timestamp}"

# Knowledge Base configuration
knowledge_base_name = f"underwriting-kb-{suffix}"
knowledge_base_description = "Underwriting manual and policy guidelines"

# S3 bucket for documents
kb_bucket_name = os.environ.get('KB_BUCKET_NAME', f'underwriting-kb-docs-{account_id}')
kb_s3_prefix = f"underwriting-manual-{suffix}/"

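# Foundation model configuration
# Note: despite the variable name, this value is a cross-region inference profile ID, not a full ARN
model_arn = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"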
embedding_model_id = "amazon.titan-embed-text-v2:0"

# OpenSearch Serverless configuration (keep names short for 32 char policy name limit)
vector_store_name = f"uw-kb-{suffix}"  # Shortened to fit policy name limits
index_name = "uw-index"

print(f"✅ Configuration Set")
print(f"   KB Name: {knowledge_base_name}")
print(f"   S3 Bucket: {kb_bucket_name}")
print(f"   Vector Store: {vector_store_name}")


## 4. Upload Underwriting Manual to S3

We'll upload the underwriting manual documents to S3.
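
Each local file is stored under an S3 key made of the prefix plus the file's path relative to the manual folder, so the folder structure is preserved in the bucket. A minimal sketch of that mapping follows; `s3_key_for` is a hypothetical helper and the `1234567` suffix is a placeholder, not a value from your account.

```python
import posixpath
from pathlib import PurePosixPath


def s3_key_for(local_root, local_file, s3_prefix):
    """Build the S3 key for a local file, keeping its path relative to local_root."""
    relative = PurePosixPath(local_file).relative_to(local_root)
    return posixpath.join(s3_prefix, relative.as_posix())


key = s3_key_for(
    "../underwriting-manual",
    "../underwriting-manual/3-medical-impairments/cardiovascular/hypertension.md",
    "underwriting-manual-1234567/",  # placeholder prefix
)
print(key)
```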


In [None]:
# Create S3 bucket if it doesn't exist
try:
    s3_client.head_bucket(Bucket=kb_bucket_name)
    print(f"✅ S3 bucket '{kb_bucket_name}' already exists")
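except s3_client.exceptions.ClientError:
    # head_bucket raises ClientError when the bucket is missing or not accessible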
    try:
        if region == 'us-east-1':
            s3_client.create_bucket(Bucket=kb_bucket_name)
        else:
            s3_client.create_bucket(
                Bucket=kb_bucket_name,
                CreateBucketConfiguration={'LocationConstraint': region}
            )
        print(f"✅ Created S3 bucket: {kb_bucket_name}")
    except Exception as e:
        print(f"⚠️  Error creating bucket: {e}")


In [None]:
# Upload underwriting manual documents to S3
def upload_directory_to_s3(local_path, bucket_name, s3_prefix):
    """Upload all files from a local directory to S3"""
    uploaded_count = 0
    
    for root, dirs, files in os.walk(local_path):
        for file in files:
            if file.startswith('.') or file.endswith(('.pyc', '.pyo')):
                continue
                
            local_file_path = os.path.join(root, file)
            relative_path = os.path.relpath(local_file_path, local_path)
            s3_key = os.path.join(s3_prefix, relative_path).replace("\\", "/")
            
            try:
                s3_client.upload_file(local_file_path, bucket_name, s3_key)
                print(f"   Uploaded: {file}")
                uploaded_count += 1
            except Exception as e:
                print(f"   ⚠️  Failed to upload {file}: {e}")
    
    return uploaded_count

# Path to underwriting manual documents
underwriting_manual_path = "../underwriting-manual"

if os.path.exists(underwriting_manual_path):
    print(f"📤 Uploading documents from {underwriting_manual_path}...")
    uploaded = upload_directory_to_s3(underwriting_manual_path, kb_bucket_name, kb_s3_prefix)
    print(f"\n✅ Uploaded {uploaded} documents to S3")
else:
    print(f"⚠️  Directory not found: {underwriting_manual_path}")
    print(f"   No documents were uploaded; check the path before continuing")


## 5. Create IAM Role for Knowledge Base

The Knowledge Base needs an IAM role to access S3 and invoke Bedrock models.
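
Before creating a role, it can help to confirm that the permissions policy actually grants every action the Knowledge Base needs. The sketch below is a simplified check under stated assumptions: `allowed_actions` is a hypothetical helper, it matches action names exactly (no wildcard handling), and the example policy deliberately omits the OpenSearch Serverless permission to show what a gap looks like.

```python
# Actions this lab grants to the Knowledge Base role.
REQUIRED_ACTIONS = {
    "s3:GetObject",
    "s3:ListBucket",
    "bedrock:InvokeModel",
    "aoss:APIAccessAll",
}


def allowed_actions(policy):
    """Collect the actions named in Allow statements (exact names only)."""
    actions = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        value = statement["Action"]
        actions.update([value] if isinstance(value, str) else value)
    return actions


# Illustrative policy that is missing the aoss permission on purpose.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"], "Resource": "*"},
        {"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": "*"},
    ],
}

missing = REQUIRED_ACTIONS - allowed_actions(example_policy)
print(f"Missing actions: {sorted(missing)}")
```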


In [None]:
# Create IAM role for Knowledge Base
kb_role_name = f"AmazonBedrockExecutionRoleForKB_{suffix}"

# Trust policy - allows Bedrock to assume this role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "bedrock.amazonaws.com"},
        "Action": "sts:AssumeRole",
        "Condition": {
            "StringEquals": {"aws:SourceAccount": account_id},
            "ArnLike": {"aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:knowledge-base/*"}
        }
    }]
}

# Permissions policy - S3 access and Bedrock model invocation
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{kb_bucket_name}",
                f"arn:aws:s3:::{kb_bucket_name}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": [f"arn:aws:bedrock:{region}::foundation-model/*"]
        },
        {
            "Effect": "Allow",
            "Action": ["aoss:APIAccessAll"],
            "Resource": [f"arn:aws:aoss:{region}:{account_id}:collection/*"]
        }
    ]
}

# Create the IAM role
try:
    response = iam_client.create_role(
        RoleName=kb_role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
        Description="Execution role for Bedrock Knowledge Base"
    )
    kb_role_arn = response['Role']['Arn']
    print(f"✅ Created IAM role: {kb_role_name}")
    
    # Attach inline policy
    iam_client.put_role_policy(
        RoleName=kb_role_name,
        PolicyName="BedrockKBPermissions",
        PolicyDocument=json.dumps(permissions_policy)
    )
    print(f"✅ Attached permissions policy")
    
    # Wait for role to propagate
    print("   Waiting for IAM role to propagate (10 seconds)...")
    time.sleep(10)
    
except iam_client.exceptions.EntityAlreadyExistsException:
    response = iam_client.get_role(RoleName=kb_role_name)
    kb_role_arn = response['Role']['Arn']
    print(f"✅ Using existing IAM role: {kb_role_name}")

print(f"   Role ARN: {kb_role_arn}")


## 6. Create OpenSearch Serverless Collection

We'll create a vector store using OpenSearch Serverless for storing document embeddings.
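
The collection needs three policies (encryption, network, and data access), and this lab keeps their names within a 32-character limit. A small sketch of deriving and checking those names is shown below; `policy_names` is a hypothetical helper and `uw-kb-1234567` is a placeholder collection name.

```python
MAX_POLICY_NAME_LENGTH = 32  # limit noted in this lab for OpenSearch Serverless policy names


def policy_names(vector_store_name):
    """Derive the three policy names and fail early if any exceeds the limit."""
    names = {kind: f"{vector_store_name}-{kind}" for kind in ("enc", "net", "acc")}
    too_long = [name for name in names.values() if len(name) > MAX_POLICY_NAME_LENGTH]
    if too_long:
        raise ValueError(f"Policy names exceed {MAX_POLICY_NAME_LENGTH} characters: {too_long}")
    return names


names = policy_names("uw-kb-1234567")
for kind, name in names.items():
    print(f"{kind}: {name} ({len(name)} chars)")
```

Failing early on a long name is cheaper than discovering it from an API validation error halfway through the setup.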


In [None]:
# Get current role ARN for data access policy
current_role = sts_client.get_caller_identity()['Arn']

# Create policy names (max 32 chars for OSS policies)
# Format: uw-kb-{7-digit-suffix}-{policy-type} = 17 chars, well under the limit
encryption_policy_name = f"{vector_store_name}-enc"
network_policy_name = f"{vector_store_name}-net"
access_policy_name = f"{vector_store_name}-acc"

# Create encryption policy
encryption_policy = {
    "Rules": [{
        "ResourceType": "collection",
        "Resource": [f"collection/{vector_store_name}"]
    }],
    "AWSOwnedKey": True
}

try:
    oss_client.create_security_policy(
        name=encryption_policy_name,
        type="encryption",
        policy=json.dumps(encryption_policy)
    )
    print(f"✅ Created encryption policy: {encryption_policy_name}")
except oss_client.exceptions.ConflictException:
    print(f"✅ Encryption policy already exists: {encryption_policy_name}")

# Create network policy (allow public access)
network_policy = [{
    "Rules": [
        {
            "ResourceType": "collection",
            "Resource": [f"collection/{vector_store_name}"]
        },
        {
            "ResourceType": "dashboard",
            "Resource": [f"collection/{vector_store_name}"]
        }
    ],
    "AllowFromPublic": True
}]

try:
    oss_client.create_security_policy(
        name=network_policy_name,
        type="network",
        policy=json.dumps(network_policy)
    )
    print(f"✅ Created network policy: {network_policy_name}")
except oss_client.exceptions.ConflictException:
    print(f"✅ Network policy already exists: {network_policy_name}")

# Create data access policy
data_access_policy = [{
    "Rules": [
        {
            "Resource": [f"collection/{vector_store_name}"],
            "Permission": [
                "aoss:CreateCollectionItems",
                "aoss:DeleteCollectionItems",
                "aoss:UpdateCollectionItems",
                "aoss:DescribeCollectionItems"
            ],
            "ResourceType": "collection"
        },
        {
            "Resource": [f"index/{vector_store_name}/*"],
            "Permission": [
                "aoss:CreateIndex",
                "aoss:DeleteIndex",
                "aoss:UpdateIndex",
                "aoss:DescribeIndex",
                "aoss:ReadDocument",
                "aoss:WriteDocument"
            ],
            "ResourceType": "index"
        }
    ],
    "Principal": [kb_role_arn, current_role],
    "Description": "Data access for Bedrock KB"
}]

try:
    oss_client.create_access_policy(
        name=access_policy_name,
        type="data",
        policy=json.dumps(data_access_policy)
    )
    print(f"✅ Created data access policy: {access_policy_name}")
except oss_client.exceptions.ConflictException:
    print(f"✅ Data access policy already exists: {access_policy_name}")


In [None]:
# Create the OpenSearch Serverless collection
try:
    response = oss_client.create_collection(
        name=vector_store_name,
        type="VECTORSEARCH",
        description="Vector store for underwriting knowledge base"
    )
    collection_id = response['createCollectionDetail']['id']
    collection_arn = response['createCollectionDetail']['arn']
    print(f"✅ Creating OpenSearch Serverless collection...")
    print(f"   Collection ID: {collection_id}")
    
    # Wait for collection to become active (2-3 minutes)
    print("   Waiting for collection to become active (this may take 2-3 minutes)...")
    while True:
        response = oss_client.batch_get_collection(names=[vector_store_name])
        status = response['collectionDetails'][0]['status']
        if status == 'ACTIVE':
            collection_endpoint = response['collectionDetails'][0]['collectionEndpoint']
            print(f"\n✅ Collection is active!")
            print(f"   Endpoint: {collection_endpoint}")
            break
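        elif status == 'FAILED':
            # Stop here: later cells need collection_endpoint, which is never set on failure
            raise RuntimeError("OpenSearch Serverless collection creation failed")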
        else:
            print(f"   Status: {status}...", end='\r')
            time.sleep(30)
            
except oss_client.exceptions.ConflictException:
    # Collection already exists
    response = oss_client.batch_get_collection(names=[vector_store_name])
    collection_id = response['collectionDetails'][0]['id']
    collection_arn = response['collectionDetails'][0]['arn']
    collection_endpoint = response['collectionDetails'][0]['collectionEndpoint']
    print(f"✅ Using existing collection")
    print(f"   Collection ID: {collection_id}")
    print(f"   Endpoint: {collection_endpoint}")


### Create Index in OpenSearch Collection

After the collection is active, we need to create the vector index with the proper Bedrock field mappings.
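
The field names in the index mapping must match the names given to the Knowledge Base in its field mapping, otherwise creation or ingestion can fail. A minimal consistency check might look like the following; `unmapped_fields` is a hypothetical helper, and the index body is trimmed to the parts that matter for the comparison.

```python
def unmapped_fields(index_body, field_mapping):
    """Return Knowledge Base field names that are absent from the index mapping."""
    properties = index_body["mappings"]["properties"]
    return sorted(name for name in field_mapping.values() if name not in properties)


# Trimmed-down index body using the field names from this lab.
example_index_body = {
    "mappings": {
        "properties": {
            "vector": {"type": "knn_vector", "dimension": 1024},
            "text": {"type": "text"},
            "metadata": {"type": "text", "index": False},
        }
    }
}

matching = {"vectorField": "vector", "textField": "text", "metadataField": "metadata"}
mismatched = {"vectorField": "embedding", "textField": "text", "metadataField": "metadata"}

print(unmapped_fields(example_index_body, matching))    # nothing missing
print(unmapped_fields(example_index_body, mismatched))  # 'embedding' is not in the index
```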

In [None]:
# Create OpenSearch index for vector storage
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# Get AWS credentials for OpenSearch authentication
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(
    credentials.access_key,
    credentials.secret_key,
    region,
    'aoss',
    session_token=credentials.token
)

# Connect to OpenSearch Serverless collection
oss_host = collection_endpoint.replace('https://', '')
oss_client_py = OpenSearch(
    hosts=[{'host': oss_host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300
)

# Define index mapping for Bedrock Knowledge Base
# Field names must match what we specify in the KB fieldMapping
index_body = {
    "settings": {
        "index": {
            "knn": True,
            "knn.algo_param.ef_search": 512
        }
    },
    "mappings": {
        "properties": {
            "vector": {
                "type": "knn_vector",
                "dimension": 1024,  # Titan Embeddings V2 uses 1024 dimensions
                "method": {
                    "name": "hnsw",
                    "engine": "faiss",
                    "parameters": {
                        "ef_construction": 512,
                        "m": 16
                    },
                    "space_type": "l2"
                }
            },
            "text": {
                "type": "text",
                "index": True
            },
            "metadata": {
                "type": "text",
                "index": False
            }
        }
    }
}

# Create the index
try:
    print(f"Creating index '{index_name}' in collection...")
    if not oss_client_py.indices.exists(index=index_name):
        response = oss_client_py.indices.create(index=index_name, body=index_body)
        print(f"✅ Created OpenSearch index: {index_name}")
        
        # Wait for index to be fully ready
        time.sleep(5)
        
        # Verify index exists
        if oss_client_py.indices.exists(index=index_name):
            print(f"✅ Verified index exists and is ready")
        else:
            print(f"⚠️  Warning: Index not found after creation")
    else:
        print(f"✅ Index already exists: {index_name}")
except Exception as e:
    print(f"❌ Error creating index: {e}")
    print(f"   Attempting to connect to: {oss_host}")
    raise

## 7. Create Knowledge Base

Now we'll create the Bedrock Knowledge Base that ties everything together.


In [None]:
# Create Knowledge Base
kb_config = {
    "name": knowledge_base_name,
    "description": knowledge_base_description,
    "roleArn": kb_role_arn,
    "knowledgeBaseConfiguration": {
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            "embeddingModelArn": f"arn:aws:bedrock:{region}::foundation-model/{embedding_model_id}"
        }
    },
    "storageConfiguration": {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": collection_arn,
            "vectorIndexName": index_name,
            "fieldMapping": {
                "vectorField": "vector",
                "textField": "text",
                "metadataField": "metadata"
            }
        }
    }
}

try:
    response = bedrock_agent_client.create_knowledge_base(**kb_config)
    kb_id = response['knowledgeBase']['knowledgeBaseId']
    kb_arn = response['knowledgeBase']['knowledgeBaseArn']
    print(f"✅ Created Knowledge Base")
    print(f"   KB ID: {kb_id}")
    print(f"   KB ARN: {kb_arn}")
    
    # Wait for KB to be ready
    time.sleep(10)
    
except Exception as e:
    print(f"❌ Error creating Knowledge Base: {e}")
    raise


## 8. Create Data Source and Start Ingestion

Let's create a data source pointing to our S3 bucket and start ingesting documents.
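
Ingestion runs asynchronously, so the notebook polls the job status until it finishes. One way to structure such a loop with a timeout is sketched below. `wait_for_status` is a hypothetical helper, not part of boto3; the status source is injected as a callable so the sketch can run here against simulated statuses instead of a real ingestion job.

```python
import time


def wait_for_status(get_status, success, failure, interval=15, timeout=600, sleep=time.sleep):
    """Poll get_status() until it returns a success or failure status, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in success:
            return status
        if status in failure:
            raise RuntimeError(f"Job ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence; a real caller would wrap get_ingestion_job here.
simulated = iter(["STARTING", "IN_PROGRESS", "COMPLETE"])
final = wait_for_status(
    lambda: next(simulated),
    success={"COMPLETE"},
    failure={"FAILED"},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final)
```

Raising on failure and on timeout keeps later cells from running against a Knowledge Base that has no indexed documents.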


In [None]:
# Create Data Source
data_source_config = {
    "knowledgeBaseId": kb_id,
    "name": f"underwriting-manual-source-{suffix}",
    "description": "Underwriting manual documents from S3",
    "dataSourceConfiguration": {
        "type": "S3",
        "s3Configuration": {
            "bucketArn": f"arn:aws:s3:::{kb_bucket_name}",
            "inclusionPrefixes": [kb_s3_prefix]
        }
    },
    "vectorIngestionConfiguration": {
        "chunkingConfiguration": {
            "chunkingStrategy": "NONE"  # No chunking - preserve document structure
        }
    }
}

try:
    response = bedrock_agent_client.create_data_source(**data_source_config)
    data_source_id = response['dataSource']['dataSourceId']
    print(f"✅ Created data source")
    print(f"   Data Source ID: {data_source_id}")
    
except Exception as e:
    print(f"❌ Error creating data source: {e}")
    raise


In [None]:
# Start ingestion job
print("🔄 Starting ingestion job...")
print("   This will read documents from S3, create embeddings, and store them in the vector database")
print("   This may take 2-5 minutes depending on the number of documents...\n")

try:
    response = bedrock_agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id,
        dataSourceId=data_source_id
    )
    ingestion_job_id = response['ingestionJob']['ingestionJobId']
    print(f"   Job ID: {ingestion_job_id}")
    
    # Monitor ingestion progress
    while True:
        response = bedrock_agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=ingestion_job_id
        )
        status = response['ingestionJob']['status']
        
        if status == 'COMPLETE':
            stats = response['ingestionJob'].get('statistics', {})
            print(f"\n✅ Ingestion complete!")
            print(f"   Documents scanned: {stats.get('numberOfDocumentsScanned', 'N/A')}")
            print(f"   Documents indexed: {stats.get('numberOfDocumentsIndexed', 'N/A')}")
            if 'numberOfDocumentsFailed' in stats and stats['numberOfDocumentsFailed'] > 0:
                print(f"   Documents failed: {stats['numberOfDocumentsFailed']}")
            break
        elif status == 'FAILED':
            print(f"\n❌ Ingestion failed")
            failure_reasons = response['ingestionJob'].get('failureReasons', [])
            if failure_reasons:
                print(f"   Error: {failure_reasons}")
            break
        else:
            print(f"   Status: {status}...", end='\r')
            time.sleep(15)
            
except Exception as e:
    print(f"❌ Error during ingestion: {e}")
    raise


## 9. Test Knowledge Base Retrieval

Now that our knowledge base is ready, let's validate that it works by testing both retrieval APIs.

The first test is the RetrieveAndGenerate API. This API returns a complete answer with context from the knowledge base.
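
Besides the generated text, the response typically carries citations that point back to the source documents. The sketch below collects the S3 URIs from a response-shaped dictionary. The nesting (`citations`, `retrievedReferences`, `location.s3Location.uri`) reflects my understanding of the response format and should be checked against the boto3 documentation; `cited_sources` and the sample response are hypothetical.

```python
def cited_sources(response):
    """Return the unique S3 URIs cited in a RetrieveAndGenerate-style response."""
    uris = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in uris:
                uris.append(uri)
    return uris


# Hand-written sample with placeholder bucket and key names.
sample_response = {
    "output": {"text": "Sample answer."},
    "citations": [
        {"retrievedReferences": [
            {"location": {"s3Location": {"uri": "s3://example-bucket/manual/diabetes.md"}}},
            {"location": {"s3Location": {"uri": "s3://example-bucket/manual/diabetes.md"}}},
        ]},
        {"retrievedReferences": [
            {"location": {"s3Location": {"uri": "s3://example-bucket/manual/overview.md"}}},
        ]},
    ],
}

print(cited_sources(sample_response))
```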

In [None]:
# Test 1: RetrieveAndGenerate API - Get complete answers with context
query = "What are the key underwriting considerations for Diabetes Type 2? Please include specific quotes from the underwriting manual."

print(f"📝 Test Query 1: {query}\n")

response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={"text": query},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": kb_id,
            "modelArn": f"{model_arn}",
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults": 5
                }
            }
        }
    }
)

print("💡 Generated Answer:")
print("=" * 80)
print(response['output']['text'])
print("=" * 80)


### Test Retrieve API

This API returns raw chunks with similarity scores. This is useful when we want more control over how the context is used. This is the API we will use with our agents.
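
Having the raw chunks means the caller decides what reaches the model. As a rough sketch of that control, the helper below drops low-scoring chunks and joins the rest into a context string. `build_context`, the `0.4` threshold, and the sample results are illustrative assumptions; only the `content.text` and `score` fields come from the response format used in this lab.

```python
def build_context(retrieval_results, min_score=0.4, max_chars=500):
    """Keep chunks at or above min_score and format them, best score first."""
    kept = [r for r in retrieval_results if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return "\n\n".join(
        f"[{i}] (score {r['score']:.2f}) {r['content']['text'][:max_chars]}"
        for i, r in enumerate(kept, 1)
    )


# Made-up results shaped like the entries in 'retrievalResults'.
sample_results = [
    {"content": {"text": "Sample passage about build charts."}, "score": 0.18},
    {"content": {"text": "Sample passage about lymphoma."}, "score": 0.71},
]

context = build_context(sample_results)
print(context)
```

A suitable threshold depends on the embedding model and the distance metric, so treat the value here as a starting point to tune against real queries.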


In [None]:
# Test 2: Retrieve API - Get raw chunks with scores
query2 = "Lymphoma"

print(f"\n📝 Test Query 2: {query2}\n")

response_retrieve = bedrock_agent_runtime_client.retrieve(
    knowledgeBaseId=kb_id,
    retrievalQuery={"text": query2},
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 3
        }
    }
)

print(f"🔍 Retrieved {len(response_retrieve['retrievalResults'])} chunks:\n")

for i, result in enumerate(response_retrieve['retrievalResults'], 1):
    print(f"Chunk {i}:")
    print(f"  Similarity Score: {result['score']:.4f}")
    print(f"  Text Preview: {result['content']['text'][:200]}...")
    print()


## 10. Persist Environment Variables

Let's save the Knowledge Base identifiers for use in later labs.
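
The values are written as plain `KEY=VALUE` lines, with `#` comment lines in between. Later labs most likely load the file with `python-dotenv` (installed earlier in this notebook), which is an assumption on my part. The standard-library sketch below only illustrates how the format reads back; `parse_env` is a hypothetical helper and the sample values are placeholders.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Placeholder content in the same layout as the generated .env file.
sample_env = """# Bedrock Knowledge Base Configuration
BEDROCK_KB_ID=EXAMPLEKBID
KB_BUCKET_NAME=example-bucket

# OpenSearch Serverless
VECTOR_STORE_NAME=uw-kb-1234567
"""

config = parse_env(sample_env)
print(config["BEDROCK_KB_ID"])
```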


In [None]:
from pathlib import Path

# Set environment variables for current session
os.environ['BEDROCK_KB_ID'] = kb_id
os.environ['BEDROCK_KB_ARN'] = kb_arn
os.environ['BEDROCK_KB_DATASOURCE_ID'] = data_source_id
os.environ['KB_BUCKET_NAME'] = kb_bucket_name

# Write to .env file for persistence across sessions
env_file_path = Path("../.env")
env_content = f"""# Bedrock Knowledge Base Configuration
# Generated by Lab 1: Knowledge Base Setup
# Last updated: {datetime.now().isoformat()}

BEDROCK_KB_ID={kb_id}
BEDROCK_KB_ARN={kb_arn}
BEDROCK_KB_DATASOURCE_ID={data_source_id}
KB_BUCKET_NAME={kb_bucket_name}
REGION={region}
ACCOUNT_ID={account_id}

# OpenSearch Serverless
VECTOR_STORE_NAME={vector_store_name}
COLLECTION_ENDPOINT={collection_endpoint}
"""

with open(env_file_path, 'w') as f:
    f.write(env_content)

print("✅ Environment variables persisted")
print(f"   Saved to: {env_file_path.absolute()}")
print(f"\n📋 Configuration Summary:")
print(f"   KB ID: {kb_id}")
print(f"   KB ARN: {kb_arn}")
print(f"   Data Source ID: {data_source_id}")
print(f"   S3 Bucket: {kb_bucket_name}")
print(f"   Vector Store: {vector_store_name}")
print(f"\n💡 These values will be automatically loaded in Labs 2-4")

## 11. Verification and Summary

Let's verify everything is working correctly.


In [None]:
print("🔍 Running verification checks...\n")

checks_passed = 0
total_checks = 4

# Check 1: KB is active
try:
    response = bedrock_agent_client.get_knowledge_base(knowledgeBaseId=kb_id)
    status = response['knowledgeBase']['status']
    if status == 'ACTIVE':
        print("✅ Check 1: Knowledge Base is ACTIVE")
        checks_passed += 1
    else:
        print(f"⚠️  Check 1: Knowledge Base status is {status}")
except Exception as e:
    print(f"❌ Check 1: Failed - {e}")

# Check 2: Can retrieve documents
try:
    test_response = bedrock_agent_runtime_client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": "underwriting"}
    )
    if len(test_response['retrievalResults']) > 0:
        print("✅ Check 2: Retrieval is working")
        checks_passed += 1
    else:
        print("⚠️  Check 2: No results returned")
except Exception as e:
    print(f"❌ Check 2: Failed - {e}")

# Check 3: Environment variables set
if os.getenv('BEDROCK_KB_ID') and os.getenv('BEDROCK_KB_ARN'):
    print("✅ Check 3: Environment variables are set")
    checks_passed += 1
else:
    print("⚠️  Check 3: Environment variables not set")

# Check 4: .env file exists
if Path("../.env").exists():
    print("✅ Check 4: .env file created")
    checks_passed += 1
else:
    print("⚠️  Check 4: .env file not found")

print(f"\n{'='*80}")
print(f"Verification: {checks_passed}/{total_checks} checks passed")

if checks_passed == total_checks:
    print("\n🎉 SUCCESS! Lab 1 is complete!")
    print("\nYou've successfully:")
    print("  • Uploaded underwriting manual documents to S3")
    print("  • Created a Bedrock Knowledge Base with OpenSearch Serverless")
    print("  • Ingested and indexed all documents")
    print("  • Validated retrieval is working")
    print("  • Persisted configuration for later labs")
    print("\n✨ You're now ready to proceed to Lab 2: Impairment Detection Agent")
else:
    print("\n⚠️  Some checks failed. Please review the errors above.")


## Next Steps

🎉 Congratulations on completing Lab 1!

You've built the foundational Knowledge Base that will ground your AI agents in underwriting policy. The Knowledge Base identifiers have been saved and are ready to use in subsequent labs.

**What's Next:**
- **Lab 2:** Build the Impairment Detection Agent that uses this Knowledge Base to identify risks
- **Lab 3:** Create the Scoring Agent that computes risk scores based on detected impairments
- **Lab 4:** Implement the Actions Agent that routes submissions and triggers workflows

Ready to build your first agent? Proceed to **Lab 2: Impairment Detection Agent**!
