# Chapter 8 - Custom Models: Fine-tuning on Amazon Bedrock

## Overview
This notebook demonstrates how to fine-tune a foundation model on Amazon Bedrock for a specialized task. We'll walk through the complete process of customizing Amazon Titan Text Lite to identify the topic of a dialogue.

## Introduction
Fine-tuning lets you create a custom model tailored to your specific use case while building on the capabilities of a pre-trained foundation model. Here, the use case is dialogue topic identification and the base model is Amazon Titan Text Lite.

## Prerequisites
- AWS account with Amazon Bedrock access
- Permissions to create IAM roles and S3 buckets
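- Model access to Amazon Titan Text Lite in a Region that supports model customization (this notebook assumes `us-east-1`)
- Python environment with `boto3` and the Hugging Face `datasets` package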

## Setup

### Install Required Dependencies

In [None]:
# Install the HuggingFace datasets library for data processing
!pip install datasets==2.15.0

### Import Libraries

In [None]:
# Core libraries for AWS services and data processing
import boto3      # AWS SDK for Python
import json       # JSON data handling
import datetime   # Timestamp generation
import os         # File system operations

## AWS Infrastructure Setup

### Create S3 Bucket and IAM Role

In [None]:
# Initialize AWS clients
iam = boto3.client("iam")
s3 = boto3.client('s3')

# Get current AWS account ID for unique resource naming
account_id = boto3.client('sts').get_caller_identity()['Account']
bucket_name = f"bedrock-finetuning-{account_id}"

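# Create S3 bucket for storing training data and model outputs.
# Outside us-east-1, S3 requires an explicit LocationConstraint.
region = boto3.session.Session().region_name
print(f"Creating S3 bucket: {bucket_name}")
if region in (None, "us-east-1"):
    s3.create_bucket(Bucket=bucket_name)
else:
    s3.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )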

# Create IAM role that Bedrock can assume for fine-tuning
role_name = f"Bedrock-Finetuning-Role-{account_id}"
print(f"Creating IAM role: {role_name}")

role = iam.create_role(
    RoleName=role_name,
    AssumeRolePolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "bedrock.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ] 
    })
)["Role"]["RoleName"]

# Create IAM policy with S3 permissions for the training bucket
policy_arn = iam.create_policy(
    PolicyName="Bedrock-Finetuning-Role-Policy",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",      # Read training data
                    "s3:PutObject",      # Write model outputs
                    "s3:ListBucket"      # List bucket contents
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*"
                ]
            }
        ]
    })
)["Policy"]["Arn"]

# Attach the policy to the role
iam.attach_role_policy(
    RoleName=role,
    PolicyArn=policy_arn
)

print("✅ AWS infrastructure setup complete!")
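
The trust policy above allows the Bedrock service to assume the role without further restriction. AWS guidance for service roles generally recommends adding `aws:SourceAccount` and `aws:SourceArn` conditions to guard against the confused-deputy problem. The following is a minimal sketch of such a policy, assuming the job runs in `us-east-1` and that the `model-customization-job/*` ARN pattern covers your jobs. `build_trust_policy` is a hypothetical helper for illustration, not part of boto3.

```python
import json


def build_trust_policy(account_id: str, region: str = "us-east-1") -> dict:
    """Return a trust policy scoped to model customization jobs in one account.

    Assumption: the region and ARN pattern match where your jobs run.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # Only requests made on behalf of this account
                    "StringEquals": {"aws:SourceAccount": account_id},
                    # Only model customization jobs in this region and account
                    "ArnEquals": {
                        "aws:SourceArn": (
                            f"arn:aws:bedrock:{region}:{account_id}"
                            ":model-customization-job/*"
                        )
                    },
                },
            }
        ],
    }


# Placeholder account ID for illustration only
trust_policy_json = json.dumps(build_trust_policy("123456789012"))
print(trust_policy_json)
```

The resulting JSON string could be passed as `AssumeRolePolicyDocument` in place of the inline document used in the cell above.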

## Dataset Preparation

### Load and Process Dataset

In [None]:
# Load the DialogSum dataset from HuggingFace
# Citation: https://huggingface.co/datasets/knkarthick/dialogsum
from datasets import load_dataset

print("📥 Loading DialogSum dataset from HuggingFace...")
ds = load_dataset("knkarthick/dialogsum", split="train")
print(f"Dataset loaded with {len(ds)} examples")

In [None]:
# Clean and prepare the dataset
print("🔧 Processing dataset...")

# Remove unnecessary columns (we only need 'dialogue' and 'topic')
dataset = ds.remove_columns(["id", "summary"])

# Use a subset of 10,000 examples for faster training
dataset = dataset.select(range(10000))
print(f"Using {len(dataset)} examples for training")

# Split dataset: 90% training, 10% validation
train_and_validation_dataset = dataset.train_test_split(test_size=0.1)
print(f"Training examples: {len(train_and_validation_dataset['train'])}")
print(f"Validation examples: {len(train_and_validation_dataset['test'])}")

### Format and Save Dataset

In [None]:


# Local directory for the formatted dataset files (created on first save)
dataset_dir = "dataset"

def format_save_dataset(filename, dataset):
    """
    Convert dataset to JSONL format required by Bedrock fine-tuning.
    Each line contains a prompt-completion pair for topic identification.
    """
    os.makedirs(dataset_dir, exist_ok=True)
    
    with open(f"{dataset_dir}/{filename}", "w") as f:
        for i in dataset:
            dialogue = i["dialogue"]
            topic = i["topic"]
            
            # Format as prompt-completion pair for fine-tuning
            template = {
                "prompt": f"Identify the key topic representing the dialogue. \n\nDialogue: {dialogue}",
                "completion": f"{topic}",
            }
            
            # Write each example as a separate JSON line
            json.dump(template, f)
            f.write('\n')
    
    print(f"✅ Saved {filename} with {len(dataset)} examples")

# Save training and validation datasets
format_save_dataset("train.jsonl", train_and_validation_dataset["train"])
format_save_dataset("validation.jsonl", train_and_validation_dataset["test"])
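
Before uploading, it can help to sanity-check the generated files, since malformed records may cause the customization job to fail validation. The sketch below assumes each line must be a JSON object with non-empty string `prompt` and `completion` fields, which is the format written above. `find_invalid_records` is a hypothetical helper; it does not check Bedrock's token or record-count limits.

```python
import json


def find_invalid_records(lines):
    """Return (line_number, reason) for each line that is not a valid
    prompt-completion record."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems


# Small in-memory sample: one valid record and two broken ones
sample = [
    '{"prompt": "Identify the key topic representing the dialogue. \\n\\nDialogue: Hi", "completion": "greeting"}',
    '{"prompt": "no completion here"}',
    "not json",
]
print(find_invalid_records(sample))
# → [(2, "missing or empty 'completion'"), (3, 'not valid JSON')]
```

To check a saved file, pass an open file handle, for example `find_invalid_records(open("dataset/train.jsonl"))`; an empty list means no problems were found by these checks.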

### Upload Dataset to S3

In [None]:
# Upload formatted datasets to S3 bucket
print("📤 Uploading datasets to S3...")

s3 = boto3.client('s3')
account_id = boto3.client('sts').get_caller_identity()['Account']
bucket_name = f"bedrock-finetuning-{account_id}"

# Upload all files in the dataset directory
uploaded_files = []
for root, dirs, files in os.walk(dataset_dir):
    for file in files:
        full_path = os.path.join(root, file)
        relative_path = os.path.relpath(full_path, dataset_dir)
        
        # Upload file to S3
        s3.upload_file(full_path, bucket_name, relative_path)
        uploaded_files.append(relative_path)
        print(f"  ✅ Uploaded: {relative_path}")

print(f"📊 Dataset upload complete! Files available at s3://{bucket_name}/")

## Fine-tuning Job Creation

### Configure and Create Job

In [None]:
# Initialize Bedrock client for model customization
bedrock = boto3.client(service_name='bedrock')
account_id = boto3.client('sts').get_caller_identity()['Account']

print("🔧 Setting up Bedrock fine-tuning job...")

In [None]:
# Configure fine-tuning job parameters
datetime_string = datetime.datetime.now().strftime("%Y%m%d%H%M%S")

# Job configuration
customizationType = "FINE_TUNING"
customModelName = "custom-titan-lite-model"
baseModelIdentifier = "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-text-lite-v1:0:4k"
roleArn = f"arn:aws:iam::{account_id}:role/Bedrock-Finetuning-Role-{account_id}"
jobName = f"Titan-Lite-Finetune-Job-{datetime_string}"

# Hyperparameters for training
hyperParameters = {
    "epochCount": "1",                    # Number of training epochs
    "batchSize": "1",                    # Batch size for training
    "learningRate": ".0001",              # Learning rate
    "learningRateWarmupSteps": "0"       # Warmup steps
}

print(f"📋 Job Name: {jobName}")
print(f"🎯 Base Model: Amazon Titan Text Lite v1")
print(f"⚙️ Hyperparameters: {hyperParameters}")

# Create the fine-tuning job
print("🚀 Starting fine-tuning job...")
response_ft = bedrock.create_model_customization_job(
    jobName=jobName,
    customModelName=customModelName,
    customizationType=customizationType,
    roleArn=roleArn,
    baseModelIdentifier=baseModelIdentifier,
    hyperParameters=hyperParameters,
    # Training data location
    trainingDataConfig={"s3Uri": f"s3://bedrock-finetuning-{account_id}/train.jsonl"},
    # Validation data location
    validationDataConfig={'validators': [{"s3Uri": f"s3://bedrock-finetuning-{account_id}/validation.jsonl"}]},
    # Output location for trained model
    outputDataConfig={"s3Uri": f"s3://bedrock-finetuning-{account_id}/finetuning-output"},
)

In [None]:
# Get the job ARN for tracking
jobArn = response_ft.get('jobArn')
print(f"✅ Fine-tuning job created successfully!")
print(f"📍 Job ARN: {jobArn}")
print(f"⏱️ Training will take several hours to complete...")

## Monitoring Training Progress

### Check Job Status

In [None]:
# Check the current status of the fine-tuning job
print(f"🔍 Checking status for job: {jobName}")

job_details = bedrock.get_model_customization_job(jobIdentifier=jobName)
status = job_details["status"]

print(f"📊 Current Status: {status}")

# Display additional job information
if 'creationTime' in job_details:
    print(f"🕐 Started: {job_details['creationTime']}")
if 'endTime' in job_details:
    print(f"🏁 Completed: {job_details['endTime']}")

# Status-specific messages
if status == 'InProgress':
    print("⏳ Training is in progress. Please wait and check again later.")
elif status == 'Completed':
    print("✅ Training completed successfully! Ready for provisioned throughput.")
elif status == 'Failed':
    print("❌ Training failed. Check the job details for error information.")
    if 'failureMessage' in job_details:
        print(f"Error: {job_details['failureMessage']}")
else:
    print(f"ℹ️ Status: {status}")
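
Because training can run for hours, re-running the status cell by hand gets tedious. The sketch below polls until the job reaches a terminal status. It assumes the terminal statuses are `Completed`, `Failed`, and `Stopped`; `wait_for_job` is a hypothetical helper that takes any zero-argument callable returning the current status, so it can be tried without an AWS connection.

```python
import time

# Assumption: these are the statuses after which the job will not change again
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(get_status, poll_seconds=60, max_polls=600, sleep=time.sleep):
    """Call get_status() until it returns a terminal status.

    Raises TimeoutError if max_polls is reached first.
    """
    status = None
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job still in status {status!r} after {max_polls} polls")


# Demonstration with a fake status sequence instead of a real job
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
print(wait_for_job(lambda: next(fake_statuses), poll_seconds=0))
# → Completed
```

With the client from this notebook, the callable could be `lambda: bedrock.get_model_customization_job(jobIdentifier=jobName)["status"]`.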

## Provisioned Throughput Setup

### Create Provisioned Throughput

In [None]:
# Purchase provisioned throughput for the custom model
print("💰 Setting up provisioned throughput...")

# ⚠️ WARNING: This will incur hourly charges!
print("⚠️  WARNING: Provisioned throughput incurs hourly charges!")
print("💡 Remember to delete the provisioned throughput when done testing.")

response_pt = bedrock.create_provisioned_model_throughput(
    modelId=customModelName,                           # Our custom model
    provisionedModelName="ProvisionedCustomTitanLite", # Name for the provisioned instance
    modelUnits=1                                       # Minimum 1 unit required
)

provisionedModelArn = response_pt.get('provisionedModelArn')

print(f"✅ Provisioned throughput created!")
print(f"📍 Provisioned Model ARN: {provisionedModelArn}")
print(f"⏱️ Provisioning may take a few minutes to become active...")

## Testing the Fine-tuned Model

In [None]:
# Test the fine-tuned model with a sample dialogue
bedrock_runtime = boto3.client(service_name='bedrock-runtime')

# Example test prompt (replace with your own dialogue)
test_dialogue = """
Person A: Hi, I'm calling about my credit card bill. I noticed some charges I don't recognize.
Person B: I'd be happy to help you with that. Can you provide me with your account number?
Person A: Sure, it's 1234-5678-9012-3456.
Person B: Thank you. I can see the charges you're referring to. Let me investigate these for you.
"""

# Format the prompt using the same template as training
prompt = f"Identify the key topic representing the dialogue. \n\nDialogue: {test_dialogue}"

print(f"🧪 Testing with prompt:")
print(f"{prompt}\n")

# Configure inference parameters using the Amazon Titan Text request schema
body = {
    "inputText": prompt,
    "textGenerationConfig": {
        "temperature": 0.5,     # Controls randomness (0.0 = deterministic, 1.0 = very random)
        "topP": 0.9,            # Top-p sampling (nucleus sampling)
        "maxTokenCount": 512,   # Maximum tokens to generate
    },
}

print("🚀 Invoking fine-tuned model...")

# Call the fine-tuned model
response = bedrock_runtime.invoke_model(
    modelId=provisionedModelArn,  # Use our provisioned custom model
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)

# Parse and display the response; Titan Text returns generations under "results"
response_body = json.loads(response['body'].read())
results = response_body.get('results', [])
generated_text = results[0].get('outputText', '') if results else ''

print(f"🎯 Model Response:")
print(f"{generated_text}")

print("✅ Fine-tuned model test complete!")
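
A single prompt only shows that the model responds. To get a rough sense of quality, you could compare predicted topics against the reference topics from the validation split. The sketch below assumes you have already collected model outputs into a list; `normalize` and `exact_match_accuracy` are hypothetical helpers, and exact match is a deliberately strict metric that ignores paraphrases.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions, references):
    """Return the fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Illustrative values only, not real model outputs
predicted = ["Credit card  dispute", "job interview"]
expected = ["credit card dispute", "shopping"]
print(exact_match_accuracy(predicted, expected))
# → 0.5
```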

## 🧹 Cleanup (Important!)

**⚠️ Don't forget to clean up resources to avoid ongoing charges:**

1. **Delete Provisioned Throughput** (incurs hourly charges)
2. **Delete Custom Model** (if no longer needed)
3. **Delete S3 Bucket** (if no longer needed)
4. **Delete IAM Role and Policy** (if no longer needed)

```python
# Delete provisioned throughput
bedrock.delete_provisioned_model_throughput(
    provisionedModelId='ProvisionedCustomTitanLite'
)

# Delete custom model
bedrock.delete_custom_model(modelIdentifier=customModelName)
```
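
The snippet above covers the first two items in the list. Items 3 and 4 can be sketched as follows, assuming the bucket is not versioned, the policy is attached only to this role, and the policy has no extra versions. `cleanup_storage_and_iam` is a hypothetical helper; it takes the boto3 clients as arguments so nothing is deleted until you call it.

```python
def cleanup_storage_and_iam(s3_client, iam_client, bucket_name, role_name, policy_arn):
    """Delete the training bucket and the IAM role and policy created earlier."""
    # A bucket must be empty before it can be deleted
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
    s3_client.delete_bucket(Bucket=bucket_name)

    # A policy must be detached before the policy or the role can be deleted
    iam_client.detach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    iam_client.delete_policy(PolicyArn=policy_arn)
    iam_client.delete_role(RoleName=role_name)


# Usage with the variables from this notebook (uncomment to run):
# cleanup_storage_and_iam(s3, iam, bucket_name, role_name, policy_arn)
```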

## Conclusion

In this notebook, we've demonstrated the complete workflow for fine-tuning a foundation model on Amazon Bedrock. This process enables you to create specialized AI models for specific tasks while leveraging the capabilities of pre-trained foundation models.

Key accomplishments:
1. **Infrastructure Setup**: We created the necessary AWS resources including an S3 bucket for data storage and IAM roles with appropriate permissions.

2. **Dataset Preparation**: We processed the DialogSum dataset to extract dialogue-topic pairs, formatted them for fine-tuning, and uploaded them to S3.

3. **Model Fine-tuning**: We configured and launched a fine-tuning job on Amazon Titan Text Lite, specifying appropriate hyperparameters for our task.

4. **Deployment**: We set up provisioned throughput to make our custom model available for inference.

5. **Testing**: We demonstrated how to use the fine-tuned model to identify topics from dialogue samples.

This fine-tuning approach has several advantages:
- **Task Specialization**: The model is optimized specifically for dialogue topic identification
- **Improved Performance**: Fine-tuned models typically outperform the base model on the task they were trained for, provided the training data is representative
- **Consistent Output Format**: Responses follow the training patterns, providing more predictable outputs
- **API Compatibility**: The custom model is invoked through the same `invoke_model` API as the base model, with the provisioned model ARN as the model ID

Important considerations:
- **Cost Management**: Remember to delete provisioned throughput when not in use to avoid ongoing charges
- **Training Time**: Fine-tuning jobs can take several hours depending on dataset size and model complexity
- **Dataset Quality**: The quality and representativeness of your training data significantly impact model performance