# Lab 1: Amazon Bedrock Fundamentals & Embeddings

**Duration:** 45-60 minutes  
**Cost:** < $0.50 (using Claude Haiku & Titan Embeddings)

## Learning Objectives
1. Set up an Amazon Bedrock client and invoke foundation models
2. Work with Claude Haiku for cost-effective text generation
3. Generate and use embeddings with Titan Embeddings
4. Implement semantic search using vector similarity
5. Compare different embedding approaches

## Prerequisites
- AWS Account with Bedrock access
- Model access enabled for: Claude Haiku, Titan Embeddings
- IAM permissions for Bedrock (at minimum `bedrock:InvokeModel`)

## 1. Setup and Configuration

In [None]:
# Install required packages
!pip install -q boto3 numpy scikit-learn pandas matplotlib seaborn

In [None]:
import boto3
import json
import numpy as np
from datetime import datetime
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

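# Region where your Bedrock model access is enabled; change to your preferred region
AWS_REGION = 'us-east-1'

# Runtime client: used to invoke models (invoke_model)
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name=AWS_REGION
)

# Control-plane client: used for management calls such as listing available models
bedrock = boto3.client(
    service_name='bedrock',
    region_name=AWS_REGION
)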

print("✓ Bedrock clients initialized successfully")

## 2. Working with Claude Haiku (Cost-Effective LLM)

In [None]:
def invoke_claude_haiku(prompt, max_tokens=512, temperature=0.7):
    """
    Invoke Claude 3 Haiku, a low-cost model in the Claude 3 family.
    Pricing (approximate, on-demand): $0.25 per 1M input tokens, $1.25 per 1M output tokens.
    Returns the text of the first content block in the response.
    """
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ]
    })
    
    response = bedrock_runtime.invoke_model(
        modelId='anthropic.claude-3-haiku-20240307-v1:0',
        body=body
    )
    
    response_body = json.loads(response['body'].read())
    return response_body['content'][0]['text']

# Test Claude Haiku
test_prompt = "Explain Amazon Bedrock in 2 sentences."
response = invoke_claude_haiku(test_prompt, max_tokens=100)
print(f"Prompt: {test_prompt}\n")
print(f"Response: {response}")
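
The parsed Messages API response also carries a `usage` object with `input_tokens` and `output_tokens`, so the cost of a single call can be estimated from the per-token prices quoted in the docstring above. The helper below is a minimal sketch: `estimate_haiku_cost` is a hypothetical name, the sample response is hand-written for illustration rather than real model output, and the prices are the approximate figures used in this lab, so check the Bedrock pricing page before relying on them.

```python
# Approximate Claude 3 Haiku prices used in this lab (USD per 1M tokens).
INPUT_PRICE_PER_1M = 0.25
OUTPUT_PRICE_PER_1M = 1.25

def estimate_haiku_cost(response_body):
    """Estimate the USD cost of one call from the parsed response body."""
    usage = response_body.get("usage", {})
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_1M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_1M
    )

# Hand-written stand-in for json.loads(response['body'].read())
sample_response_body = {
    "content": [{"type": "text", "text": "..."}],
    "usage": {"input_tokens": 1000, "output_tokens": 2000},
}

sample_cost = estimate_haiku_cost(sample_response_body)
print(f"Estimated cost: ${sample_cost:.6f}")
```

To use this with `invoke_claude_haiku`, the function would need to return the whole `response_body` rather than only the text.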

## 3. Titan Embeddings - Generating Vector Representations

In [None]:
def get_embedding(text, model_id='amazon.titan-embed-text-v1'):
    """
    Generate an embedding with a Titan text embedding model.
    Pricing: ~$0.0001 per 1K input tokens for Titan Embeddings V1 (very cost-effective)
    Returns a NumPy vector: 1536 dimensions for V1; V2 defaults to 1024
    """
    body = json.dumps({
        "inputText": text
    })
    
    response = bedrock_runtime.invoke_model(
        modelId=model_id,
        body=body
    )
    
    response_body = json.loads(response['body'].read())
    return np.array(response_body['embedding'])

# Test embedding generation
test_text = "Amazon Bedrock is a fully managed service for foundation models."
embedding = get_embedding(test_text)
print(f"Text: {test_text}")
print(f"Embedding dimension: {len(embedding)}")
print(f"First 10 values: {embedding[:10]}")

## 4. Semantic Search Implementation

In [None]:
# Sample knowledge base - AWS services descriptions
documents = [
    "Amazon S3 is an object storage service offering scalability, data availability, security, and performance.",
    "Amazon EC2 provides secure, resizable compute capacity in the cloud as virtual servers.",
    "Amazon RDS makes it easy to set up, operate, and scale a relational database in the cloud.",
    "AWS Lambda lets you run code without provisioning or managing servers, paying only for compute time.",
    "Amazon DynamoDB is a fast and flexible NoSQL database service for any scale.",
    "Amazon Bedrock offers foundation models from leading AI companies through a single API.",
    "Amazon SageMaker helps build, train, and deploy machine learning models quickly.",
    "Amazon CloudWatch monitors AWS resources and applications in real-time.",
    "Amazon VPC lets you provision a logically isolated section of the AWS Cloud.",
    "Amazon SNS is a fully managed messaging service for both application-to-application and application-to-person communication."
]

print(f"Knowledge base created with {len(documents)} documents")

In [None]:
# Generate embeddings for all documents
print("Generating embeddings for knowledge base...")
document_embeddings = []

for i, doc in enumerate(documents):
    embedding = get_embedding(doc)
    document_embeddings.append(embedding)
    print(f"✓ Document {i+1}/{len(documents)} embedded")

document_embeddings = np.array(document_embeddings)
print(f"\nEmbeddings matrix shape: {document_embeddings.shape}")

In [None]:
def semantic_search(query, top_k=3):
    """
    Perform semantic search using cosine similarity
    """
    # Get query embedding
    query_embedding = get_embedding(query).reshape(1, -1)
    
    # Calculate cosine similarity
    similarities = cosine_similarity(query_embedding, document_embeddings)[0]
    
    # Get top K results
    top_indices = np.argsort(similarities)[::-1][:top_k]
    
    results = []
    for rank, idx in enumerate(top_indices, start=1):
        results.append({
            'document': documents[idx],
            'similarity': similarities[idx],
            'rank': rank
        })
    
    return results

# Test semantic search
queries = [
    "How can I store files in the cloud?",
    "I need a database that scales automatically",
    "What service helps with machine learning?"
]

for query in queries:
    print(f"\n{'='*80}")
    print(f"Query: {query}")
    print(f"{'='*80}")
    
    results = semantic_search(query, top_k=3)
    
    for result in results:
        print(f"\n[Rank {result['rank']}] Similarity: {result['similarity']:.4f}")
        print(f"Document: {result['document']}")
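
For intuition about what `cosine_similarity` computes, the score between two vectors is their dot product divided by the product of their lengths. The sketch below reproduces that arithmetic and the top-K ranking step in plain Python on made-up 3-dimensional vectors. The names `cosine`, `rank_top_k`, and the toy data are illustrative only and are not part of the lab code.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|). Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_top_k(query_vec, doc_vecs, top_k=2):
    """Return (index, score) pairs for the top_k most similar documents."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Toy "embeddings": real Titan vectors have hundreds of dimensions.
toy_docs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
toy_query = [2.0, 0.1, 0.0]

toy_ranking = rank_top_k(toy_query, toy_docs, top_k=2)
for idx, score in toy_ranking:
    print(f"doc {idx}: {score:.4f}")
```

Because the score depends only on direction, the query `[2.0, 0.1, 0.0]` ranks closest to `[1.0, 0.0, 0.0]` even though their lengths differ.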

## 5. RAG (Retrieval-Augmented Generation) Example

In [None]:
def rag_query(question, top_k=2):
    """
    Implement simple RAG pattern:
    1. Retrieve relevant documents using semantic search
    2. Augment prompt with retrieved context
    3. Generate answer using Claude Haiku
    """
    # Retrieve relevant documents
    search_results = semantic_search(question, top_k=top_k)
    
    # Build context from retrieved documents
    context = "\n\n".join([r['document'] for r in search_results])
    
    # Create augmented prompt
    prompt = f"""Based on the following context, please answer the question.

Context:
{context}

Question: {question}

Answer concisely and only use information from the context provided."""
    
    # Generate answer
    answer = invoke_claude_haiku(prompt, max_tokens=200)
    
    return {
        'question': question,
        'context': search_results,
        'answer': answer
    }

# Test RAG
questions = [
    "What AWS service should I use for object storage?",
    "Which service is best for serverless computing?",
    "How can I monitor my AWS resources?"
]

for question in questions:
    print(f"\n{'='*80}")
    result = rag_query(question)
    
    print(f"Question: {result['question']}\n")
    print("Retrieved Context:")
    for ctx in result['context']:
        print(f"  - {ctx['document']} (similarity: {ctx['similarity']:.4f})")
    
    print(f"\nAnswer: {result['answer']}")
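
One possible refinement of the augmentation step is to drop weakly related passages and number the remaining ones before they go into the prompt. The sketch below is an assumption-laden variation, not part of the lab: `build_rag_prompt` and its `min_similarity` parameter are hypothetical, the threshold of 0.3 is arbitrary, and the similarity values in the sample data are made up.

```python
def build_rag_prompt(question, search_results, min_similarity=0.0):
    """Build an augmented prompt from results shaped like semantic_search output."""
    # Keep only passages at or above the (hypothetical) similarity threshold.
    kept = [r for r in search_results if r["similarity"] >= min_similarity]
    if kept:
        context = "\n".join(
            f"[{i}] {r['document']}" for i, r in enumerate(kept, start=1)
        )
    else:
        context = "(no relevant context found)"
    return (
        "Based on the following context, please answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Answer concisely and only use information from the context provided."
    )

# Made-up results for illustration; real scores come from semantic_search().
sample_results = [
    {"document": "Amazon S3 is an object storage service.", "similarity": 0.82},
    {"document": "Amazon SNS is a fully managed messaging service.", "similarity": 0.12},
]

prompt = build_rag_prompt(
    "What should I use for object storage?", sample_results, min_similarity=0.3
)
print(prompt)
```

A suitable threshold depends on the embedding model and the corpus, so it is worth inspecting real similarity scores before choosing one.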

## 6. Embedding Analysis & Visualization

In [None]:
# Calculate similarity matrix
similarity_matrix = cosine_similarity(document_embeddings)

# Create heatmap
plt.figure(figsize=(12, 10))
sns.heatmap(similarity_matrix, 
            annot=True, 
            fmt='.2f', 
            cmap='coolwarm',
            xticklabels=[f"Doc {i+1}" for i in range(len(documents))],
            yticklabels=[f"Doc {i+1}" for i in range(len(documents))],
            cbar_kws={'label': 'Cosine Similarity'})

plt.title('Document Similarity Matrix', fontsize=16, pad=20)
plt.tight_layout()
plt.show()

# Print document labels
print("\nDocument Index:")
for i, doc in enumerate(documents):
    print(f"Doc {i+1}: {doc[:60]}...")
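
The diagonal of a similarity matrix is always 1.0, because each document is compared with itself, so the informative entries are off the diagonal. As a small sketch, the hypothetical helper `most_similar_pair` below scans the upper triangle for the highest score. The toy matrix is hand-written, and the same indexing should also work on the `similarity_matrix` array computed above.

```python
def most_similar_pair(matrix):
    """Return (i, j, score) for the highest off-diagonal entry, or None."""
    best = None
    n = len(matrix)
    for i in range(n):
        # The matrix is symmetric, so only the upper triangle is scanned.
        for j in range(i + 1, n):
            if best is None or matrix[i][j] > best[2]:
                best = (i, j, matrix[i][j])
    return best

# Hand-written symmetric matrix for three hypothetical documents.
toy_matrix = [
    [1.00, 0.20, 0.75],
    [0.20, 1.00, 0.40],
    [0.75, 0.40, 1.00],
]

pair = most_similar_pair(toy_matrix)
print(f"Most similar: Doc {pair[0] + 1} and Doc {pair[1] + 1} ({pair[2]:.2f})")
```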

In [None]:
# PCA visualization of embeddings
from sklearn.decomposition import PCA

# Reduce to 2D
pca = PCA(n_components=2)
embeddings_2d = pca.fit_transform(document_embeddings)

# Plot
plt.figure(figsize=(12, 8))
plt.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1], s=200, alpha=0.6, c=range(len(documents)), cmap='tab10')

# Add labels
for i, doc in enumerate(documents):
    plt.annotate(f"Doc {i+1}", 
                (embeddings_2d[i, 0], embeddings_2d[i, 1]),
                fontsize=10,
                xytext=(5, 5),
                textcoords='offset points')

plt.xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.1%} variance)', fontsize=12)
plt.ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.1%} variance)', fontsize=12)
plt.title('2D Visualization of Document Embeddings (PCA)', fontsize=14, pad=15)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"Total variance explained: {sum(pca.explained_variance_ratio_):.1%}")

## 7. Comparing Different Embedding Models (Optional)

In [None]:
def compare_embedding_models(text):
    """
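    Compare the embedding dimensions returned by Titan Embeddings V1 and V2.
    A model that is not enabled in your account or region is reported and skipped.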
    """
    models = {
        'Titan V1': 'amazon.titan-embed-text-v1',
        'Titan V2': 'amazon.titan-embed-text-v2:0'
    }
    
    results = {}
    
    for name, model_id in models.items():
        try:
            embedding = get_embedding(text, model_id=model_id)
            results[name] = {
                'dimension': len(embedding),
                'embedding': embedding,
                'model_id': model_id
            }
            print(f"✓ {name}: {len(embedding)} dimensions")
        except Exception as e:
            print(f"✗ {name}: Not available - {str(e)}")
    
    return results

test_text = "Machine learning and artificial intelligence"
comparison = compare_embedding_models(test_text)
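
One practical difference between the two models is that Titan Text Embeddings V2 accepts optional `dimensions` and `normalize` fields in the request body, whereas `get_embedding` sends only `inputText`. The sketch below builds such a body. `build_titan_v2_body` is a hypothetical helper, and the allowed sizes (256, 512, 1024) reflect the V2 documentation as understood at the time of writing, so verify them against the current model parameters page.

```python
import json

# Output sizes assumed to be supported by Titan Text Embeddings V2.
TITAN_V2_DIMENSIONS = (256, 512, 1024)

def build_titan_v2_body(text, dimensions=1024, normalize=True):
    """Build the JSON request body for amazon.titan-embed-text-v2:0."""
    if dimensions not in TITAN_V2_DIMENSIONS:
        raise ValueError(
            f"dimensions must be one of {TITAN_V2_DIMENSIONS}, got {dimensions}"
        )
    return json.dumps({
        "inputText": text,
        "dimensions": dimensions,
        "normalize": normalize,
    })

v2_body = build_titan_v2_body("Machine learning and artificial intelligence", dimensions=256)
print(v2_body)
```

The resulting string could be passed as `body` to `bedrock_runtime.invoke_model` with `modelId='amazon.titan-embed-text-v2:0'`. Smaller vectors reduce storage and comparison cost, usually with some loss of retrieval accuracy.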

## 8. Cost Tracking & Optimization Tips

In [None]:
# Estimated costs for this lab
# Call counts and token averages are rough assumptions; prices are approximate on-demand rates
estimated_costs = {
    'Claude Haiku Invocations': {
        'count': 10,
        'avg_input_tokens': 100,
        'avg_output_tokens': 150,
        'input_cost_per_1M': 0.25,
        'output_cost_per_1M': 1.25
    },
    'Titan Embeddings': {
        'count': 20,
        'avg_tokens': 50,
        'cost_per_1K': 0.0001
    }
}

# Calculate costs
haiku = estimated_costs['Claude Haiku Invocations']
haiku_cost = (
    (haiku['count'] * haiku['avg_input_tokens'] / 1_000_000 * haiku['input_cost_per_1M']) +
    (haiku['count'] * haiku['avg_output_tokens'] / 1_000_000 * haiku['output_cost_per_1M'])
)

embeddings = estimated_costs['Titan Embeddings']
embeddings_cost = (
    embeddings['count'] * embeddings['avg_tokens'] / 1_000 * embeddings['cost_per_1K']
)

total_cost = haiku_cost + embeddings_cost

print("Estimated Lab Costs:")
print(f"  Claude Haiku: ${haiku_cost:.4f}")
print(f"  Titan Embeddings: ${embeddings_cost:.4f}")
print(f"  Total: ${total_cost:.4f}")
print("\nCost Optimization Tips:")
print("  1. Use Claude Haiku instead of Sonnet/Opus for simple tasks")
print("  2. Cache embeddings instead of regenerating them")
print("  3. Consider Bedrock batch inference for large, non-interactive workloads")
print("  4. Use appropriate max_tokens limits")
print("  5. Monitor usage with CloudWatch")
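
Tip 2 can be sketched as a small in-memory cache keyed by a hash of the model ID and the text, so that repeated texts trigger only one embedding call. Everything here is illustrative: `EmbeddingCache` is a hypothetical class, and `fake_embed` stands in for a real embedding call so the example runs without AWS access.

```python
import hashlib

class EmbeddingCache:
    """In-memory cache that calls embed_fn only for unseen (model, text) pairs."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.misses = 0  # number of times embed_fn was actually called

    def get(self, text, model_id="amazon.titan-embed-text-v1"):
        key = hashlib.sha256(f"{model_id}\n{text}".encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text, model_id)
            self.misses += 1
        return self._store[key]

def fake_embed(text, model_id):
    """Stand-in for a real embedding call; returns a made-up 2-d vector."""
    return [float(len(text)), 0.0]

cache = EmbeddingCache(fake_embed)
first = cache.get("hello")
second = cache.get("hello")   # served from the cache, no new call
third = cache.get("world!")   # new text, one more call

print(f"Embedding calls made: {cache.misses}")
```

In the notebook, the cache could wrap the real function with `EmbeddingCache(lambda text, model_id: get_embedding(text, model_id=model_id))`. For reuse across sessions, the vectors would need to be persisted, for example to disk or a vector store.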

## 9. Exercise: Build Your Own Semantic Search

**Task:** Create a semantic search system for your own domain

1. Create a list of 10-15 documents about a topic of your choice
2. Generate embeddings for all documents
3. Implement a semantic search function
4. Test with 3-5 queries
5. Visualize the results

In [None]:
# Your code here
my_documents = [
    # Add your documents
]

# Generate embeddings

# Implement search

# Test queries

## Summary

In this lab, you learned how to:
- ✅ Invoke Amazon Bedrock foundation models
- ✅ Use Claude Haiku for cost-effective text generation
- ✅ Generate embeddings with Titan Embeddings
- ✅ Implement semantic search with cosine similarity
- ✅ Build a simple RAG system
- ✅ Visualize embeddings and similarities
- ✅ Estimate Bedrock costs and apply optimization tips

**Next Steps:**
- Lab 2: Amazon Bedrock Knowledge Bases & Advanced RAG
- Lab 3: LLM Evaluation & Agentic AI

**Additional Resources:**
- [Amazon Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
- [Bedrock Workshop](https://github.com/aws-samples/amazon-bedrock-workshop)