# SageMaker Image Classification Exercise

This notebook demonstrates Amazon SageMaker's **Image Classification** algorithm for classifying images into categories.

## What You'll Learn
1. How to prepare image data for classification
2. How to train an image classifier with transfer learning
3. How to interpret classification predictions

## What is Image Classification?

Image Classification assigns one or more labels to an entire image. Unlike object detection, it doesn't localize objects.

**SageMaker provides two implementations:**
- **MXNet-based**: CNN with ResNet architecture
- **TensorFlow-based**: Transfer learning from TensorFlow Hub models

## Use Cases

| Industry | Application |
|----------|-------------|
| E-commerce | Product categorization |
| Healthcare | Medical image diagnosis |
| Manufacturing | Defect detection |
| Social Media | Content moderation |
| Agriculture | Plant disease identification |

---

## Step 1: Setup and Imports

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.image_uris import retrieve
from sagemaker.estimator import Estimator
import numpy as np
import json
import os
from datetime import datetime
from dotenv import load_dotenv
import matplotlib.pyplot as plt

# Load environment variables from .env file
load_dotenv()

# Configure AWS session from environment variables
aws_profile = os.getenv('AWS_PROFILE')
aws_region = os.getenv('AWS_REGION', 'us-west-2')
sagemaker_role = os.getenv('SAGEMAKER_ROLE_ARN')

if aws_profile:
    boto3.setup_default_session(profile_name=aws_profile, region_name=aws_region)
else:
    boto3.setup_default_session(region_name=aws_region)

# SageMaker session and role
sagemaker_session = sagemaker.Session()

if sagemaker_role:
    role = sagemaker_role
else:
    role = get_execution_role()

region = sagemaker_session.boto_region_name

print(f"AWS Profile: {aws_profile or 'default'}")
print(f"SageMaker Role: {role}")
print(f"Region: {region}")
print(f"SageMaker SDK Version: {sagemaker.__version__}")

In [None]:
# Configuration
BUCKET_NAME = sagemaker_session.default_bucket()
PREFIX = "image-classification"

print(f"S3 Bucket: {BUCKET_NAME}")
print(f"S3 Prefix: {PREFIX}")

## Step 2: Understand Data Format

SageMaker Image Classification supports multiple data formats:

### RecordIO Format (Recommended for MXNet)
A binary format that packs images and their labels into a single file. Pass it with content type `application/x-recordio`.

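### Image Files + LST File
Each row of a `.lst` file has three tab-separated fields: a unique image index, the class label, and the image path. `\t` below stands for a tab character. Pass these channels with content type `application/x-image`.
```
0\t0\ttrain/cat/image001.jpg
1\t0\ttrain/cat/image002.jpg
2\t1\ttrain/dog/image001.jpg
3\t1\ttrain/dog/image002.jpg
```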
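For a dataset stored as one folder per class, the rows of an LST file can be derived from the directory tree. The sketch below is one way to do that, assuming a hypothetical layout `root/<class_name>/<image>`. The helper name `build_lst_rows` and the alphabetical class-to-index mapping are illustrative choices, not part of SageMaker.

```python
from pathlib import Path


def build_lst_rows(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return (rows, class_names) for images stored as root/<class_name>/<file>.

    Each row is "index<TAB>label<TAB>relative_path". Class indices are
    assigned alphabetically, which is an assumption of this sketch.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    rows = []
    for label, class_name in enumerate(class_names):
        for image_path in sorted((root / class_name).iterdir()):
            # Skip anything that is not an image file.
            if image_path.suffix.lower() in extensions:
                relative = image_path.relative_to(root).as_posix()
                rows.append(f"{len(rows)}\t{label}\t{relative}")
    return rows, class_names
```

Writing `"\n".join(rows)` to `train.lst` yields a file in the format shown above, with real tab characters between the fields.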

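### Augmented Manifest Format
A JSON Lines file with one object per image: `source-ref` holds the image's S3 URI and a second attribute (here `class`) holds the label. This format is read in Pipe input mode.
```json
{"source-ref": "s3://bucket/image.jpg", "class": 0}
```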
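A manifest like this can be generated from parallel lists of image URIs and labels. The following is a minimal sketch: `build_manifest_lines` is a hypothetical helper, and the S3 URIs are placeholders.

```python
import json


def build_manifest_lines(image_uris, labels, label_attribute="class"):
    """Return one JSON string per image, ready to be joined into a JSON Lines file."""
    if len(image_uris) != len(labels):
        raise ValueError("image_uris and labels must have the same length")
    return [
        json.dumps({"source-ref": uri, label_attribute: label})
        for uri, label in zip(image_uris, labels)
    ]


lines = build_manifest_lines(
    ["s3://bucket/cat1.jpg", "s3://bucket/dog1.jpg"],  # placeholder URIs
    [0, 1],
)
print("\n".join(lines))
```

Each list element becomes one line of the manifest file. The label attribute name is a parameter because the training job is told which attribute names to read.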

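### Multi-label Classification
Set `multi_label` to 1. In an LST file, each image then gets one tab-separated 0/1 column per class (multi-hot encoding) between the index and the path.
```
# 5 classes; this image belongs to classes 0 and 2
0\t1\t0\t1\t0\t0\ttrain/image001.jpg
```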
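Multi-label targets are often kept as lists of class IDs, while a multi-hot vector has one 0/1 entry per class. A minimal sketch of converting between the two follows; both function names are hypothetical.

```python
def to_multi_hot(class_ids, num_classes):
    """Convert a list of class IDs into a 0/1 vector of length num_classes."""
    vector = [0] * num_classes
    for class_id in class_ids:
        if not 0 <= class_id < num_classes:
            raise ValueError(f"class id {class_id} is outside [0, {num_classes})")
        vector[class_id] = 1
    return vector


def from_multi_hot(vector):
    """Return the class IDs whose entry in the vector is set."""
    return [i for i, flag in enumerate(vector) if flag]


print(to_multi_hot([0, 2], num_classes=5))  # [1, 0, 1, 0, 0]
```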

In [None]:
def generate_sample_lst_file(num_samples=100, num_classes=5):
    """
    Generate a sample LST file content.
    
    LST format: index \t label \t image_path
    """
    np.random.seed(42)
    
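    class_names = ['airplane', 'automobile', 'bird', 'cat', 'dog']
    # Guard against label indices that have no matching class name
    if not 1 <= num_classes <= len(class_names):
        raise ValueError(f"num_classes must be between 1 and {len(class_names)}")
    class_names = class_names[:num_classes]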
    lines = []
    
    for i in range(num_samples):
        label = np.random.randint(0, num_classes)
        class_name = class_names[label]
        image_path = f"train/{class_name}/image_{i:04d}.jpg"
        lines.append(f"{i}\t{label}\t{image_path}")
    
    return lines, class_names

lst_lines, class_names = generate_sample_lst_file()

print("Sample LST file content:")
print("Format: index\tlabel\tpath")
print("-" * 50)
for line in lst_lines[:10]:
    print(line)
print("...")

print(f"\nClasses: {class_names}")

## Step 3: Training Configuration

### Key Hyperparameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `num_classes` | Number of output classes | Required |
| `num_training_samples` | Number of training images | Required |
| `num_layers` | ResNet depth: 18, 34, 50, 101, 152, 200 | 152 |
| `use_pretrained_model` | Use ImageNet pretrained weights (1) or train from scratch (0) | 0 |
| `epochs` | Training epochs | 30 |
| `learning_rate` | Initial learning rate | 0.1 |
| `mini_batch_size` | Batch size | 32 |
| `image_shape` | "channels,height,width" | 3,224,224 |
| `augmentation_type` | crop, crop_color, crop_color_transform | None (no augmentation) |
| `top_k` | Report top-k accuracy (must be greater than 1) | 5 |
| `multi_label` | Enable multi-label classification | 0 |
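Catching a malformed hyperparameter locally is cheaper than waiting for a training job to fail. The sketch below checks a few of the constraints from the table. `check_hyperparameters` is a hypothetical helper, the allowed depths are simply those listed above, and the batch-size check is an assumption of this sketch rather than a documented rule.

```python
VALID_NUM_LAYERS = {18, 34, 50, 101, 152, 200}  # depths listed in the table above


def check_hyperparameters(hp):
    """Return a list of problems found in a hyperparameter dict (empty if none)."""
    problems = []
    for required in ("num_classes", "num_training_samples"):
        if required not in hp:
            problems.append(f"missing required hyperparameter: {required}")
    if hp.get("num_layers", 152) not in VALID_NUM_LAYERS:
        problems.append(f"num_layers must be one of {sorted(VALID_NUM_LAYERS)}")
    # image_shape is a string of three comma-separated integers
    shape = str(hp.get("image_shape", "3,224,224")).split(",")
    if len(shape) != 3 or not all(part.strip().isdigit() for part in shape):
        problems.append('image_shape must look like "channels,height,width"')
    # Assumption of this sketch: a batch cannot be larger than the dataset
    if "num_training_samples" in hp and hp.get("mini_batch_size", 32) > hp["num_training_samples"]:
        problems.append("mini_batch_size exceeds num_training_samples")
    return problems
```

An empty list means none of these checks failed. It does not guarantee that the training job will accept the configuration.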

### Training Modes

| Mode | Description | When to Use |
|------|-------------|-------------|
| Full Training | Train from scratch | Large datasets, unique domains |
| Transfer Learning | Fine-tune pretrained model | Small datasets, similar to ImageNet |
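In hyperparameter terms, the two modes differ mainly in `use_pretrained_model` and in how aggressive the learning rate can be. The sketch below builds a starting configuration for each mode. `hyperparameters_for_mode` is a hypothetical helper, and the learning rates are starting points taken from this notebook, not tuned values.

```python
def hyperparameters_for_mode(mode, num_classes, num_training_samples):
    """Return a starting hyperparameter dict for 'transfer' or 'full' training."""
    hyperparameters = {
        "num_classes": num_classes,
        "num_training_samples": num_training_samples,
        "image_shape": "3,224,224",
        "mini_batch_size": 32,
        "epochs": 30,
    }
    if mode == "transfer":
        # Start from ImageNet weights and take small steps.
        hyperparameters.update({"use_pretrained_model": 1, "learning_rate": 0.001})
    elif mode == "full":
        # Random initialization with the default learning rate from the table.
        hyperparameters.update({"use_pretrained_model": 0, "learning_rate": 0.1})
    else:
        raise ValueError("mode must be 'transfer' or 'full'")
    return hyperparameters
```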

In [None]:
# Get Image Classification container image
image_classification_image = retrieve(
    framework='image-classification',
    region=region,
    version='1'
)

print(f"Image Classification Image URI: {image_classification_image}")

In [None]:
# Example estimator configuration (for reference)
print("""
Image Classification Estimator Configuration:
=============================================

image_classification_estimator = Estimator(
    image_uri=image_classification_image,
    role=role,
    instance_count=1,
    instance_type='ml.p3.2xlarge',  # GPU required
    output_path=f's3://{BUCKET_NAME}/{PREFIX}/output',
    sagemaker_session=sagemaker_session,
    base_job_name='image-classification'
)

hyperparameters = {
    "num_classes": 5,
    "num_training_samples": 10000,
    "num_layers": 50,              # ResNet-50
    "use_pretrained_model": 1,     # Transfer learning
    "epochs": 30,
    "learning_rate": 0.001,        # Lower for fine-tuning
    "lr_scheduler_step": "10,20",
    "lr_scheduler_factor": 0.1,
    "mini_batch_size": 32,
    "image_shape": "3,224,224",
    "augmentation_type": "crop_color_transform",
    "optimizer": "sgd",
    "momentum": 0.9,
    "weight_decay": 0.0001,
    "top_k": 2,                    # Must be > 1; top-5 is trivially 100% with 5 classes
    "precision_dtype": "float32",
}

Data channels:
- train: Training images (RecordIO or image folder + lst)
- validation: Validation images
- train_lst: Training list file (if using images)
- validation_lst: Validation list file
""")
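
The channel list above depends on the data format: RecordIO needs only `train` and `validation`, while image files also need the two list-file channels. A minimal sketch of that mapping follows, assuming a hypothetical S3 layout of one sub-prefix per channel; `channel_specs` is an illustrative helper, not a SageMaker API.

```python
def channel_specs(bucket, prefix, data_format="image"):
    """Map channel names to S3 URIs and content types for one data format."""
    base = f"s3://{bucket}/{prefix}"
    if data_format == "recordio":
        names = ["train", "validation"]
        content_type = "application/x-recordio"
    elif data_format == "image":
        names = ["train", "validation", "train_lst", "validation_lst"]
        content_type = "application/x-image"
    else:
        raise ValueError("data_format must be 'recordio' or 'image'")
    # Hypothetical layout: each channel lives under its own sub-prefix.
    return {
        name: {"s3_data": f"{base}/{name}", "content_type": content_type}
        for name in names
    }
```

Each entry could then be wrapped in `sagemaker.inputs.TrainingInput(s3_data=..., content_type=...)` and the resulting dictionary passed to the estimator's `fit` call.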

## Step 4: Understanding Model Output

The model outputs a probability distribution over classes.
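When the model is served from an endpoint, that distribution has to be decoded from the response body first. The sketch below assumes the body is a flat JSON array of per-class probabilities. The payload is made up for illustration and `decode_response` is a hypothetical helper.

```python
import json


def decode_response(body):
    """Decode a JSON response body into (predicted_index, probabilities)."""
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    probabilities = [float(p) for p in json.loads(body)]
    # Index of the largest probability (arg max)
    predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return predicted_index, probabilities


index, probs = decode_response(b"[0.05, 0.80, 0.10, 0.03, 0.02]")  # made-up payload
print(index, probs[index])
```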

In [None]:
def parse_classification_output(probabilities, class_names, top_k=5):
    """
    Parse classification output and return top-k predictions.
    
    Args:
        probabilities: Array of class probabilities
        class_names: List of class names
        top_k: Number of top predictions to return
    """
    # Sort by probability descending
    top_indices = np.argsort(probabilities)[::-1][:top_k]
    
    results = []
    for idx in top_indices:
        results.append({
            'class': class_names[idx],
            'class_id': int(idx),  # plain Python types keep the result JSON-serializable
            'probability': float(probabilities[idx])
        })
    
    return results

# Simulate model output
np.random.seed(42)
sample_probs = np.random.dirichlet(np.ones(5))  # Random probability distribution

predictions = parse_classification_output(sample_probs, class_names)

print("Sample classification output:")
print("=" * 40)
for pred in predictions:
    print(f"  {pred['class']}: {pred['probability']:.4f}")

In [None]:
def visualize_predictions(probabilities, class_names, title="Predictions"):
    """
    Visualize classification probabilities as a bar chart.
    """
    fig, ax = plt.subplots(figsize=(10, 5))
    
    colors = plt.cm.viridis(probabilities / max(probabilities))
    bars = ax.barh(class_names, probabilities, color=colors)
    
    ax.set_xlabel('Probability')
    ax.set_title(title)
    ax.set_xlim(0, 1)
    
    # Add value labels
    for bar, prob in zip(bars, probabilities):
        ax.text(bar.get_width() + 0.02, bar.get_y() + bar.get_height()/2,
               f'{prob:.4f}', va='center')
    
    plt.tight_layout()
    plt.show()

visualize_predictions(sample_probs, class_names, "Sample Image Classification")

## Step 5: Evaluation Metrics

### Single-label Classification
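- **Top-1 Accuracy**: Correct if the highest-probability class matches the label
- **Top-5 Accuracy**: Correct if the label is among the five highest-probability classes (informative only when there are more than five classes)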

### Multi-label Classification
- **Precision**: Fraction of predicted labels that are correct
- **Recall**: Fraction of true labels that are predicted
- **F1 Score**: Harmonic mean of precision and recall
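
The multi-label metrics can be computed from sets of true and predicted class IDs. The following is a minimal sketch using micro-averaging (counts pooled over all samples). The function names and the 0.5 threshold are assumptions of this sketch.

```python
def labels_above_threshold(probabilities, threshold=0.5):
    """Turn per-class probabilities into a list of predicted class IDs."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]


def multilabel_prf(true_label_sets, predicted_label_sets):
    """Micro-averaged precision, recall, and F1 over a list of samples."""
    tp = fp = fn = 0
    for true, pred in zip(true_label_sets, predicted_label_sets):
        true, pred = set(true), set(pred)
        tp += len(true & pred)   # predicted and correct
        fp += len(pred - true)   # predicted but wrong
        fn += len(true - pred)   # missed labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


precision, recall, f1 = multilabel_prf([[0, 2], [1]], [[0], [1, 3]])
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```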

In [None]:
def calculate_topk_accuracy(predictions_list, true_labels, k=5):
    """
    Calculate top-k accuracy.
    
    Args:
        predictions_list: List of probability arrays
        true_labels: List of true class indices
        k: Top-k parameter
    """
    correct = 0
    for probs, true_label in zip(predictions_list, true_labels):
        top_k_preds = np.argsort(probs)[::-1][:k]
        if true_label in top_k_preds:
            correct += 1
    
    return correct / len(predictions_list)

# Simulate evaluation
np.random.seed(42)
num_test_samples = 100

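# Generate random predictions and labels. With 5 classes, random top-k accuracy
# is about k/5, and top-5 is always 1.0 because every class is in the top 5.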
test_predictions = [np.random.dirichlet(np.ones(5)) for _ in range(num_test_samples)]
test_labels = np.random.randint(0, 5, num_test_samples)

top1_acc = calculate_topk_accuracy(test_predictions, test_labels, k=1)
top3_acc = calculate_topk_accuracy(test_predictions, test_labels, k=3)
top5_acc = calculate_topk_accuracy(test_predictions, test_labels, k=5)

print("Evaluation Metrics (random predictions):")
print("=" * 40)
print(f"Top-1 Accuracy: {top1_acc:.4f}")
print(f"Top-3 Accuracy: {top3_acc:.4f}")
print(f"Top-5 Accuracy: {top5_acc:.4f}")

## Step 6: Data Augmentation

SageMaker Image Classification supports these augmentation types:

| Type | Description |
|------|-------------|
| `crop` | Random crop + horizontal flip |
| `crop_color` | Random crop + color jittering |
| `crop_color_transform` | Crop + color + geometric transforms |
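
The algorithm applies these transforms inside the training container, so no user code is required. Purely to illustrate what the `crop` option means, here is a NumPy sketch of a random crop followed by a random horizontal flip. It is not SageMaker's implementation, and `random_crop_and_flip` is a hypothetical helper.

```python
import numpy as np


def random_crop_and_flip(image, crop_height, crop_width, rng):
    """Randomly crop an (H, W, C) array and flip it horizontally half the time."""
    height, width = image.shape[:2]
    if crop_height > height or crop_width > width:
        raise ValueError("crop size exceeds image size")
    # Pick the top-left corner so the crop stays inside the image
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    patch = image[top:top + crop_height, left:left + crop_width]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # reverse the width axis
    return patch


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)  # synthetic image
patch = random_crop_and_flip(image, 224, 224, rng)
print(patch.shape)  # (224, 224, 3)
```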

In [None]:
print("""
Data Augmentation Options:
==========================

1. crop:
   - Random crop from larger image
   - Horizontal flip

2. crop_color:
   - All of 'crop' plus:
   - Brightness adjustment
   - Saturation adjustment
   - Hue adjustment

3. crop_color_transform:
   - All of 'crop_color' plus:
   - Rotation
   - Shear
   - Aspect ratio change

Best Practices:
- Use 'crop_color_transform' for small datasets
- Use 'crop' for large datasets (faster training)
- Match image_shape to your network (224x224 for ResNet)
""")

---

## Summary

In this exercise, you learned:

1. **Data Formats**:
   - RecordIO (binary, efficient)
   - Image folder + LST file
   - Augmented manifest

2. **Key Hyperparameters**:
   - `num_layers`: ResNet depth (50 recommended)
   - `use_pretrained_model`: Transfer learning
   - `image_shape`: Input dimensions
   - `augmentation_type`: Data augmentation

3. **Training Modes**:
   - Full training: Large datasets
   - Transfer learning: Small datasets

4. **Output Format**:
   - Probability distribution over classes
   - Top-k predictions

5. **Evaluation Metrics**:
   - Top-1, Top-5 accuracy
   - Precision, Recall, F1 for multi-label

### Instance Requirements

| Task | Instance Types |
|------|----------------|
| Training | ml.p2.xlarge, ml.p3.2xlarge, ml.g4dn.xlarge |
| Inference | ml.m5.large or ml.c5.large (both CPU) |

### Next Steps

- Prepare a real image dataset
- Use SageMaker Ground Truth for labeling
- Try TensorFlow Image Classification for more models
- Implement multi-label classification