# Foundation Model Fine-tuning Summer School
## Using AWS SageMaker for Earth Observation Model Training

Welcome to this HDCRS summer school notebook, a **practical guide to fine-tuning the Prithvi Earth Observation (EO) v2.0 model**! This hands-on session will teach you how to adapt the pre-trained model to specific Earth observation tasks using cloud-based infrastructure (AWS SageMaker).

## What You'll Learn

By the end of this notebook, you will understand:

### **Foundation Models & Transfer Learning Concepts**
- **Foundation Model Fine-tuning**: Taking a pre-trained model and training it further on your own dataset. (Foundation models are large models pre-trained on vast datasets; they capture general patterns and can be adapted to specific tasks.)
- **Transfer Learning**: Leveraging knowledge learned from one task to improve performance on a related task

### **Technical Skills**
- How to use **AWS SageMaker** for model training in the cloud
- Working with **Hugging Face datasets** for machine learning workflows
- Understanding training parameters and their impact on model performance
- Deploying the fine-tuned model for use by applications

### **Real-World Application**
- Fine-tuning the **Prithvi Earth Observation (EO) v2.0** model for burn scar detection
- Using **Harmonized Landsat Sentinel (HLS)** satellite imagery
- Understanding the complete ML pipeline from data preparation to model deployment using cloud resources

## Use Case for Hands-on: Burn Scar Detection Using Harmonized Landsat Sentinel and Prithvi EO v2.0

**Burn scar detection** is crucial for:
- **Environmental monitoring**: Understanding wildfire impact and recovery
- **Climate research**: Tracking ecosystem changes over time
- **Disaster response**: Rapid assessment of fire damage
- **Policy making**: Evidence-based land management decisions

## Key Concepts

### **AWS SageMaker**
Amazon SageMaker is a fully managed machine learning service that provides:
- **Managed training infrastructure**: No need to set up servers or manage hardware
- **Scalable compute resources**: From single GPUs to distributed multi-node training
- **MLOps capabilities**: Model versioning, monitoring, and deployment

### **Prithvi EO Foundation Model**
- **Pre-trained on massive satellite imagery datasets**
- **Multiple model sizes**: 100M, 300M, and 600M parameters
- **Temporal and spatial understanding**: Designed for Earth observation tasks
- **Transfer learning ready**: Can be fine-tuned for various geospatial applications

## Setup Instructions

**Before starting:**
1. Go to "Kernel" → Select "prithvi_eo" 
2. Ensure you have AWS credentials configured
3. Have an S3 bucket ready for data storage

Let's begin our journey into foundation model fine-tuning! 🚀

## Libraries Used

### **Core Libraries**
- **`boto3`**: AWS SDK for Python - allows us to interact with AWS services programmatically
- **`sagemaker`**: High-level interface for AWS SageMaker - simplifies model training and deployment
- **`rasterio`**: Geospatial raster data processing - reads satellite imagery files

### **Hugging Face**
- **`huggingface_hub`**: Access to pre-trained models and datasets from Hugging Face Hub
- **`snapshot_download`**: Downloads entire repositories (models/datasets) from Hugging Face
- **`hf_hub_download`**: Downloads specific files from Hugging Face repositories

### **Why These Libraries Matter**
Each library serves a specific purpose in our ML pipeline:
1. **Data handling**: `rasterio`, `numpy` for processing satellite imagery
2. **Cloud infrastructure**: `boto3`, `sagemaker` for scalable training
3. **Configuration management**: `yaml` for reproducible experiments
4. **Pre-trained assets**: `huggingface_hub` for accessing foundation models

In [None]:
import boto3
import numpy as np
import yaml
import rasterio
import sagemaker

from datetime import time
from glob import glob

from huggingface_hub import hf_hub_download, snapshot_download
from pathlib import Path

from sagemaker import get_execution_role
from sagemaker.estimator import Estimator

## Dataset preparation

This hands-on session uses the Burn Scars example for fine-tuning. All of the data and pre-trained models are available on Hugging Face. The Hugging Face packages and git are used to download and prepare the datasets and pre-trained models.


## Understanding Dataset: HLS Burn Scars

### **What is HLS Data?**
**Harmonized Landsat Sentinel (HLS)** combines data from two major satellite missions:
- **Landsat 8/9 (NASA)**: 30m resolution, 16-day revisit cycle
- **Sentinel-2 (ESA)**: 10-20m resolution, 5-day revisit cycle

**Why combine them?**
- **Higher temporal resolution**: More frequent observations for change detection
- **Consistent data format**: Harmonized processing makes data easier to use
- **Better coverage**: Reduces data gaps from cloud cover or orbit patterns

### **Burn Scar Dataset**
The dataset we'll use contains:
- **Training samples**: HLS imagery with burn scar labels
- **Validation samples**: Used to monitor training progress
- **Test samples**: Final evaluation of model performance

### **Dataset Structure**
Each sample typically includes:
- **Multi-spectral imagery**: 6 bands (Blue, Green, Red, NIR, SWIR1, SWIR2)
- **Labels**: Binary masks (burn scar vs. no burn scar)
- **Metadata**: Location, date, acquisition parameters

### **Why Hugging Face for Datasets?**
- **Standardized format**: Easy to share and reproduce
- **Version control**: Track dataset changes over time
- **Community access**: Researchers can build upon each other's work
- **Integration**: Seamless connection with ML frameworks

Let's download and explore our dataset! 📊

### Step 1: Download HLS Burn Scars Dataset

**Dataset Source**: [ibm-nasa-geospatial/hls_burn_scars](https://huggingface.co/datasets/ibm-nasa-geospatial/hls_burn_scars)

**What happens in this step:**
1. **`snapshot_download()`**: Downloads files from the dataset repository on Hugging Face
2. **`allow_patterns`**: Restricts the download to the compressed tar.gz file (saves bandwidth)
3. **`tar -xvzf`**: Extracts the compressed dataset to our local directory

**File Organization:**
```
../data/hls_burn_scars/
├── training/          # Training images and labels
├── validation/        # Validation images and labels  
└── test/             # Test images and labels (for final evaluation)
```

In [None]:
DATA_PATH = "../data/hls_burn_scars"

snapshot_download(
    repo_id="ibm-nasa-geospatial/hls_burn_scars",
    allow_patterns="hls_burn_scars.tar.gz",
    repo_type="dataset",
    local_dir=DATA_PATH,
)
!tar -xvzf ../data/hls_burn_scars/hls_burn_scars.tar.gz -C ../data/hls_burn_scars

### Step 2: Upload Data to AWS S3

**Why use S3?**
- **Scalable storage**: Handle datasets of any size
- **Global access**: Training jobs can access data from anywhere
- **Cost-effective**: Pay only for what you store and transfer
- **Integration**: Native integration with SageMaker training jobs

**Important Note**: Replace `<BUCKET_NAME>` with your actual S3 bucket name

**What's happening here:**
1. **`sagemaker.Session()`**: Creates a session to manage AWS interactions
2. **`upload_data()`**: Uploads local data to S3 with organized prefixes
3. **Key prefixes**: Organize data in S3 like folders (`data/training/`, `data/validation/`, etc.)

**S3 Structure Created:**
```
s3://your-bucket/
└── data/
    ├── training/      # Training data
    ├── validation/    # Validation data
    ├── test/          # Test data
    └── configs/       # Configuration files (added later)
```
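
To make the prefix layout concrete, the sketch below lists the object keys a directory upload would produce, assuming each file keeps its relative path under `key_prefix` (which is how `upload_data()` is understood to treat directories). `expected_s3_keys` and the file names are hypothetical and used only for illustration; nothing here contacts AWS.

```python
import tempfile
from pathlib import Path


def expected_s3_keys(local_dir, key_prefix):
    """Return the S3 object keys a directory upload would create (no AWS calls)."""
    root = Path(local_dir)
    keys = []
    for file_path in sorted(root.rglob("*")):
        if file_path.is_file():
            # Keep the path relative to the uploaded folder, with forward slashes.
            relative = file_path.relative_to(root).as_posix()
            keys.append(f"{key_prefix}/{relative}")
    return keys


# Build a tiny fake dataset folder to demonstrate the mapping.
with tempfile.TemporaryDirectory() as tmp:
    training_dir = Path(tmp) / "training"
    training_dir.mkdir()
    (training_dir / "scene_a_merged.tif").touch()  # hypothetical image file
    (training_dir / "scene_a.mask.tif").touch()    # hypothetical label file
    demo_keys = expected_s3_keys(training_dir, "data/training")

print(demo_keys)
```

Running a check like this before uploading is a cheap way to confirm that the keys will line up with the paths the training job expects.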

In [None]:
# Name of the S3 bucket that will hold the data (replace the placeholder with your bucket name)
BUCKET_NAME = "<BUCKET_NAME>"

# Create a SageMaker session and upload each data split to the S3 bucket
sagemaker_session = sagemaker.Session()
train_images = sagemaker_session.upload_data(path=f'{DATA_PATH}/training', bucket=BUCKET_NAME, key_prefix='data/training')
val_images = sagemaker_session.upload_data(path=f'{DATA_PATH}/validation', bucket=BUCKET_NAME, key_prefix='data/validation')
# NOTE: this uploads the validation folder again under the test prefix;
# point `path` at a separate test folder if your dataset provides one.
test_images = sagemaker_session.upload_data(path=f'{DATA_PATH}/validation', bucket=BUCKET_NAME, key_prefix='data/test')

## Prepare config and parameters


The **config file** contains all the settings that determine:
- **Model architecture**: Which foundation model to use, how many layers, etc.
- **Training parameters**: Learning rate, batch size, number of epochs
- **Data settings**: Where to find data, how to preprocess it
- **Hardware settings**: GPU usage, memory allocation

### **Why Use YAML?**
**YAML** (a recursive acronym for "YAML Ain't Markup Language") is human-readable and well suited to configuration files:
- **Easy to read**: Clear structure with indentation
- **Easy to modify**: Change parameters without touching code
- **Version control friendly**: Track configuration changes over time
- **Reproducible**: Same config = same experimental conditions

### **Key Configuration Sections**
1. **Model**: Foundation model selection and architecture
2. **Data**: Dataset paths, preprocessing, and augmentation
3. **Trainer**: Training parameters, callbacks, and logging
4. **Optimization**: Learning rate, optimizer settings
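
Once loaded, a YAML config is a plain nested dictionary, so changing a parameter means assigning to a nested key. The sketch below shows one way to apply overrides without mutating the original config. `set_nested` is a hypothetical helper and `base_config` is a small stand-in, not the real file.

```python
import copy


def set_nested(config, keys, value):
    """Return a copy of config with config[k0][k1]...[kn] set to value."""
    updated = copy.deepcopy(config)
    node = updated
    for key in keys[:-1]:
        node = node[key]  # raises KeyError if a section is missing
    node[keys[-1]] = value
    return updated


# Stand-in for a loaded YAML config; the real file has many more fields.
base_config = {
    "data": {"init_args": {"batch_size": 8, "num_workers": 4}},
    "trainer": {"max_epochs": 100},
}

run_config = set_nested(base_config, ["data", "init_args", "batch_size"], 4)
run_config = set_nested(run_config, ["trainer", "max_epochs"], 1)

print(run_config["data"]["init_args"]["batch_size"], run_config["trainer"]["max_epochs"])
print(base_config["trainer"]["max_epochs"])  # the original config is untouched
```

Keeping the base config unchanged makes it easier to compare runs, since each experiment is the base plus an explicit list of overrides.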

In [None]:
with open('../configs/prithvi_v2_eo_300_tl_unet_burnscars.yaml') as config_file:
    config = yaml.safe_load(config_file)

config

### Dataset Statistics

**Why calculate band statistics?**
In satellite imagery, different spectral bands have vastly different value ranges. **Normalization** ensures:
- **Stable training**: Prevents any single band from dominating the learning process
- **Faster convergence**: Helps the model learn more efficiently
- **Better performance**: Standardized inputs lead to better feature learning

**The normalization applied:**
```
normalized_value = (pixel_value - mean) / standard_deviation
```

This transforms each band so that its pixel values have **mean = 0** and **standard deviation = 1** (called **z-score normalization**).
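
As a small worked example of the formula, the sketch below normalizes a handful of made-up values for a single band and checks the result. The numbers are hypothetical and only illustrate the arithmetic.

```python
import statistics

# Hypothetical pixel values for one spectral band.
band_values = [120.0, 340.0, 560.0, 780.0, 1000.0]

mean = statistics.fmean(band_values)
std = statistics.pstdev(band_values)  # population standard deviation

# Apply (pixel_value - mean) / standard_deviation to every value.
normalized = [(value - mean) / std for value in band_values]

print(round(statistics.fmean(normalized), 6))   # mean after normalization
print(round(statistics.pstdev(normalized), 6))  # standard deviation after normalization
```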

In [None]:
def calculate_band_statistics(image_directory, image_pattern, bands=(0, 1, 2, 3, 4, 5)):
    """
    Calculate the mean and standard deviation of each band in a folder of GeoTIFF files.

    Args:
        image_directory (str): Directory where the source GeoTIFF files are stored that are passed to model for training.
        image_pattern (str): Pattern of the GeoTIFF file names that globs files for computing stats.
        bands (list, optional): List of bands to calculate statistics for. Defaults to [0, 1, 2, 3, 4, 5].

    Raises:
        Exception: If no images are found in the given directory.

    Returns:
        tuple: Two lists containing the means and standard deviations of each band.
    """
    # Initialize lists to store the means and standard deviations
    all_means = []
    all_stds = []

    # Use glob to get a list of all .tif images in the directory
    all_images = glob(f"{image_directory}/{image_pattern}")

    # Make sure there are images to process
    if not all_images:
        raise Exception("No images found")

    # Get the number of bands
    num_bands = len(bands)

    # Initialize arrays to hold sums and sum of squares for each band
    band_sums = np.zeros(num_bands)
    band_sq_sums = np.zeros(num_bands)
    pixel_counts = np.zeros(num_bands)

    # Iterate over each image
    for image_file in all_images:
        with rasterio.open(image_file) as src:
            # For each band, accumulate the sum, square sum, and pixel count.
            # `i` is the position in the accumulator arrays, `band` the zero-based band number.
            for i, band in enumerate(bands):
                # rasterio band indices start at 1; cast to float64 so that
                # squaring integer-typed imagery cannot overflow
                data = src.read(band + 1).astype(np.float64)
                band_sums[i] += np.nansum(data)
                band_sq_sums[i] += np.nansum(data**2)
                pixel_counts[i] += np.count_nonzero(~np.isnan(data))

    # Calculate means and standard deviations for each band
    for i in range(num_bands):
        mean = band_sums[i] / pixel_counts[i]
        std = np.sqrt((band_sq_sums[i] / pixel_counts[i]) - (mean**2))
        all_means.append(mean)
        all_stds.append(std)

    return all_means, all_stds
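
The function above never holds the whole dataset in memory: it keeps a running sum, a running sum of squares, and a count, then derives the statistics at the end. The sketch below checks that identity on made-up numbers using only the standard library. The `chunks` list is a hypothetical stand-in for pixel values read from three images.

```python
import math
import statistics

# Hypothetical pixel values of one band, split across three "images".
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]

total, total_sq, count = 0.0, 0.0, 0
for chunk in chunks:  # single pass, one chunk in memory at a time
    total += sum(chunk)
    total_sq += sum(value * value for value in chunk)
    count += len(chunk)

running_mean = total / count
# Variance = E[x^2] - (E[x])^2, the same identity the function relies on.
running_std = math.sqrt(total_sq / count - running_mean**2)

# Reference values computed from all pixels at once.
all_values = [value for chunk in chunks for value in chunk]
print(running_mean, statistics.fmean(all_values))
print(running_std, statistics.pstdev(all_values))
```

Note that this identity yields the population standard deviation, and it can lose precision when the mean is very large relative to the spread.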

# Configuration for training

-   `identifier`: This variable will be used as a prefix for all artifacts related to fine-tuning and deployments. Please update it with an appropriate identifier.
-   `usecase`: This variable refers to the use cases the Prithvi model will be fine-tuned for, e.g., `burn_scars`, `flood_detection`, etc. For this hands-on session, we will be using `burn_scars`. If you have your own data, please update accordingly.
-   `data_path`: Data path is where the data locally resides. This will be used to find the files for fine-tuning. These files will then be used to calculate statistics like `mean` and `standard deviation`. These files will also be uploaded to an S3 bucket for the training job to use.
-   `batch_size`: This is the number of data samples processed by the model in one iteration during training. We are using `4` by default. Depending on the availability of GPUs and resources, this can be increased.
-   `num_workers`: This is the number of worker processes used for data loading during training. We are using `2` by default. This can be adjusted based on CPU and I/O capabilities.
-   `num_classes`: This variable represents the number of classes in the fine-tuning job. For `burn_scars`, we have two classes: `burn_scar` and `no_burn_scar`. Update it according to the data you are using for training.
-   `prithvi_backbone`: This variable represents the Prithvi Earth Observation Foundation Model (EO FM) pre-trained using HLS data. There are several variations:
    -   `prithvi_eo_v1_100`: This is an older version of the Prithvi EO FM. It will not be used in this hands-on session.
    -   `prithvi_eo_v2_300`: This version of the Prithvi EO FM has approximately 300 million parameters (typically around 24 Transformer encoder layers). It can be selected for faster training and a lower memory footprint.
    -   `prithvi_eo_v2_300_tl`: This version also has ~300 million parameters (typically around 24 Transformer encoder layers) and is pre-trained with **T**emporal and **L**ocation embeddings. It is ideal for fine-tuning use cases where spatial and temporal information is important and a smaller footprint is desired. For example, crop classification using imagery from multiple time steps.
    -   `prithvi_eo_v2_600`: This is a larger version of the Prithvi EO FM with approximately 600 million parameters (typically 32 Transformer encoder layers). It can be selected for use cases requiring high precision, accuracy, or recall. Note: the memory footprint of this model is significantly larger than the 300M versions. Ensure sufficient resources are available.
    -   `prithvi_eo_v2_600_tl`: This version also has ~600 million parameters (typically 32 Transformer encoder layers) and includes **T**emporal and **L**ocation embeddings. It is best suited for high-performance fine-tuning on use cases where precise spatial and temporal information is crucial, such as detailed change detection or multi-temporal crop type mapping. The resource considerations are similar to the `prithvi_eo_v2_600` model.
-   `base_path`: This variable specifies the base directory for training operations, including the path for input data, configuration files, and the location for storing model artifacts post-training. For SageMaker training jobs, `/opt/ml/data` is commonly used. If using a different environment, please update accordingly.
-   `max_epochs`: This variable limits the number of `epochs` (full passes through the training dataset) a fine-tuning job runs for. A higher number of epochs equates to longer training time and may lead to better-performing models, but this needs to be validated on a case-by-case basis to avoid overfitting.
-   `indices`: For most of our fine-tuning jobs, a `decoder` is added on top of the selected Prithvi backbone. This variable specifies which Transformer blocks (by their index) from the Prithvi backbone will provide their output feature embeddings to this decoder. Commonly, features from blocks at or around 1/4, 1/2, 3/4, and the final depth of the backbone are used to capture multi-scale information. The selection of these indices impacts the architecture and parameter count of the decoder, not the Prithvi backbone itself.
-   `means`: This is the mean of the pixel values across each channel of the training dataset. The mean values, along with standard deviations, will be used for zero-center normalization of input values.
-   `stds`: This is the standard deviation of the pixel values across each channel of the training dataset. The standard deviation values, along with means, will be used for zero-center normalization of input values.
-   `model_path`: This variable specifies where the model artifacts will be stored after training.


## Training Configuration Parameters

Each parameter plays a crucial role in determining your model's training behavior and performance. Let's understand what each one does:

### **Experiment Management**
- **`identifier`**: Your unique experiment name (like "summer_school_2025")
  - *Why important*: Helps organize and track different training runs
  - *Best practice*: Use descriptive names with dates or versions

- **`usecase`**: The specific task you're solving ("burn_scars", "flood_detection", etc.)
  - *Why important*: Organizes model artifacts by application
  - *Example*: "burn_scars" for wildfire damage assessment

### **Data Configuration**
- **`data_path`**: Local path to your training data
  - *Purpose*: Points to images for statistics calculation and initial processing
  - *Note*: Data gets uploaded to S3 for actual training

- **`batch_size`**: Number of images processed together (default: 4)
  - *Trade-offs*: Larger = faster training but more memory usage
  - *GPU memory consideration*: Reduce if you get "out of memory" errors

- **`num_workers`**: Parallel processes for data loading (default: 2)
  - *Purpose*: Speeds up data loading while model trains
  - *Optimization*: Usually set to number of CPU cores available

- **`num_classes`**: Number of prediction categories (2 for burn_scar vs. no_burn_scar)
  - *Critical*: Must match your dataset's label structure

### **Foundation Model Selection**

The **`prithvi_backbone`** determines which pre-trained model you start with:

| Model | Parameters | Layers | Use Case | Memory |
|-------|------------|--------|----------|---------|
| `prithvi_eo_v2_300` | 300M | 24 | General purpose, faster training | Lower |
| `prithvi_eo_v2_300_tl` | 300M | 24 | **T**emporal + **L**ocation aware | Lower |
| `prithvi_eo_v2_600` | 600M | 32 | High precision tasks | Higher |
| `prithvi_eo_v2_600_tl` | 600M | 32 | Temporal analysis, change detection | Higher |

**Choosing the right model:**
- **300M models**: Great for learning and most applications
- **600M models**: When you need maximum accuracy
- **TL versions**: When time/location context matters
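
For a rough sense of what "Lower" and "Higher" memory mean, the sketch below estimates the size of the raw weights from the nominal parameter counts in the table, assuming 4 bytes per parameter (float32). This is only a lower bound: training also needs memory for gradients, optimizer state, and activations. `weights_size_gb` is a hypothetical helper.

```python
def weights_size_gb(num_parameters, bytes_per_parameter=4):
    """Approximate size of the raw model weights in gigabytes (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Nominal parameter counts taken from the table above.
for name, params in (("300M", 300_000_000), ("600M", 600_000_000)):
    print(name, weights_size_gb(params), "GB of float32 weights")
```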

### **Training Parameters**
- **`max_epochs`**: How many times to see the entire dataset
  - *Rule of thumb*: Start with 10-50, monitor validation performance
  - *Warning*: Too many epochs can cause overfitting

- **`indices`**: Which model layers to use for the decoder
  - *Purpose*: Extracts features at different scales (like zoom levels)
  - *Multi-scale learning*: Combines coarse and fine-grained features
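
The index lists used in this notebook follow a simple pattern: the last block of each quarter of the backbone, counted from zero. The sketch below reproduces them from the number of Transformer blocks. `quarter_depth_indices` is a hypothetical helper, and the depth of 12 for the 100M backbone is inferred from its index list rather than stated in this notebook.

```python
def quarter_depth_indices(num_blocks):
    """Zero-based indices of the blocks that end each quarter of the backbone."""
    return [num_blocks * k // 4 - 1 for k in range(1, 5)]


# 24 and 32 blocks correspond to the 300M and 600M backbones.
for depth in (12, 24, 32):
    print(depth, quarter_depth_indices(depth))
```

If you switch to a backbone with a different depth, indices beyond the last block would point at layers that do not exist, which is why the list has to change with the model.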

### **Normalization Statistics**
- **`means`** & **`stds`**: Dataset-specific normalization values
  - *Why critical*: Ensures your data matches what the foundation model expects
  - *Calculated automatically*: From your training data

### **Set Your Parameters**

**Replace the placeholder values with your specific settings:**

```python
# REQUIRED: Replace these with your values
identifier = "your_name_summer_school_2025"  # Your unique identifier
usecase = "burn_scars"                       # Keep as burn_scars for this exercise
```

**💡 Parameter Recommendations for Learning:**
- Keep `batch_size = 4` (good for learning, won't overwhelm GPU memory)
- Start with `max_epochs = 10` (faster for experimentation)
- Use `prithvi_eo_v2_300` (smaller model, faster training)

**🚀 Advanced Students:**
- Try `prithvi_eo_v2_300_tl` for temporal understanding
- Increase `max_epochs` to 50-100 for better performance
- Experiment with `batch_size = 8` if you have sufficient GPU memory


In [None]:
identifier = "mr01eo"
usecase = "burnscars"
#local data path
data_path = '../data/hls_burn_scars/'

batch_size = 4
num_workers = 2

num_classes = 2

**What's happening in this cell:**

1. **Dynamic Model Architecture Setup**: The `indices` are automatically set based on your chosen backbone
   - **Why different indices?**: Each model has different numbers of layers
   - **Multi-scale features**: We extract features from 1/4, 1/2, 3/4, and final model depths

2. **Dataset Statistics Calculation**: 
   - **`calculate_band_statistics()`**: Computes mean and std for each spectral band
   - **Pattern matching**: `'training/*_merged.tif'` finds all training images
   - **Critical for normalization**: Ensures model sees appropriately scaled data

3. **Config updates**: The calculated values and paths are written into the config
   - **Setting normalization parameters**: Uses the means and stds calculated from the training data
   - **Configuring training duration**: Sets the number of epochs for the training process
   - **Setting up data paths**: Tells SageMaker where to find training, validation, and test data
   - **Organizing outputs**: Specifies where to save training logs and model checkpoints


**IMPORTANT**: Notice how we configure different indices for different model sizes - this is a key aspect of working with foundation models of varying depths!

In [None]:
"""
Model backbone can be either:
  - prithvi_eo_v1_100
  - prithvi_eo_v2_300
  - prithvi_eo_v2_300_tl
  - prithvi_eo_v2_600
  - prithvi_eo_v2_600_tl
"""
prithvi_backbone = 'prithvi_eo_v2_300'

base_path = '/opt/ml/data'

max_epochs = 1

config['data']['init_args']['batch_size'] = batch_size
config['data']['init_args']['num_workers'] = num_workers

config['data']['init_args']['num_classes'] = num_classes


config['model']['init_args']['model_args']['backbone'] = prithvi_backbone


# Default to the 300M indices, then override based on the selected backbone
indices = [5, 11, 17, 23]
if 'prithvi_eo_v1_100' in prithvi_backbone:
    indices = [2, 5, 8, 11]  # indices for prithvi_eo_v1_100
elif 'prithvi_eo_v2_300' in prithvi_backbone:
    indices = [5, 11, 17, 23]  # indices for prithvi_eo_v2_300 and prithvi_eo_v2_300_tl
elif 'prithvi_eo_v2_600' in prithvi_backbone:
    indices = [7, 15, 23, 31]  # indices for prithvi_eo_v2_600 and prithvi_eo_v2_600_tl

config['model']['init_args']['model_args']['necks'][0]['indices'] = indices

means, stds = calculate_band_statistics(data_path, 'training/*_merged.tif')

model_path = f"{base_path}/{usecase}/checkpoints"

In [None]:
# Mean and standard deviation calculated from the training dataset for all 6 bands,
# for zero center normalization.
config['data']['init_args']['means'] = [float(val) for val in means]
config['data']['init_args']['stds'] = [float(val) for val in stds]

# Total number of epochs the training will run for. Since we are short on time,
# we will just be running it for 1 epoch. This can be updated to any positive integer.
config['trainer']['max_epochs'] = max_epochs

config['data']['init_args']['test_data_root'] = f"{base_path}/test"
config['data']['init_args']['val_data_root'] = f"{base_path}/validation"
config['data']['init_args']['train_data_root'] = f"{base_path}/training"

config['trainer']['default_root_dir'] = f"{base_path}/{usecase}"

config['trainer']['callbacks'][2]['init_args']["dirpath"] = model_path
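
The cell above addresses the checkpoint callback by position (`callbacks[2]`), which would silently target the wrong entry if the callback order in the YAML file changed. A more defensive option, sketched below, looks the callback up by name. It assumes each callback entry is a dict with a `class_path` key; `find_callback` and the demo callback list are hypothetical and may not match the real config.

```python
def find_callback(callbacks, class_name):
    """Return the first callback entry whose class_path ends with class_name."""
    for callback in callbacks:
        if callback.get("class_path", "").endswith(class_name):
            return callback
    raise KeyError(f"No callback matching {class_name!r}")


# Stand-in callback list; the real config may contain different entries.
demo_callbacks = [
    {"class_path": "RichProgressBar"},
    {"class_path": "LearningRateMonitor", "init_args": {"logging_interval": "epoch"}},
    {"class_path": "ModelCheckpoint", "init_args": {"dirpath": "old/path"}},
]

checkpoint = find_callback(demo_callbacks, "ModelCheckpoint")
# The returned dict is the same object as the list entry, so this updates the list.
checkpoint.setdefault("init_args", {})["dirpath"] = "/opt/ml/data/burnscars/checkpoints"
print(demo_callbacks[2]["init_args"]["dirpath"])
```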

## AWS SageMaker Training Setup

### **Understanding SageMaker Training Jobs**

**What is a SageMaker Training Job?**
- **Managed infrastructure**: AWS provisions and manages compute resources automatically
- **Scalable**: Can use multi-GPU clusters
- **Pay-per-use**: Only pay for compute time actually used.
- **Integrated**: Works with S3, CloudWatch, and other AWS services

### **Key Components Explained:**

**Environment Variables Recap**
- **`CONFIG_FILE`**: Path to your YAML configuration
- **`MODEL_DIR`**: Where to save the trained model
- **`S3_URL`**: Where your data is stored
- **`ROLE_ARN`**: AWS permissions for accessing resources
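
Environment variables are how these settings reach the code running inside the training container. The training script itself is not shown in this notebook, so the sketch below is only an assumption about how such a script might read them and fail early when one is missing. `read_training_env` and the demo values are hypothetical.

```python
import os


def read_training_env(environ=os.environ):
    """Collect the settings a training script needs, failing early if any is missing."""
    required = ("CONFIG_FILE", "MODEL_DIR", "S3_URL")
    missing = [name for name in required if name not in environ]
    if missing:
        raise KeyError(f"Missing environment variables: {missing}")
    return {name: environ[name] for name in required}


# Simulated environment; in a real job the variables are set for the container.
demo_env = {
    "CONFIG_FILE": "/opt/ml/data/configs/example.yaml",
    "MODEL_DIR": "/opt/ml/data/burn_scars/checkpoints",
    "S3_URL": "s3://your-bucket/data",
}
settings = read_training_env(demo_env)
print(settings["CONFIG_FILE"])
```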

**Instance Types**
- **`ml.p3.2xlarge`**: A GPU instance that is a good choice for single-GPU deep learning
  - 1 NVIDIA V100 GPU (16GB memory)
  - 8 vCPUs, 61 GB RAM
  - Good balance of performance and cost

**Container Image**
- **ECR (Elastic Container Registry)**: Stores your training environment
- **Pre-built image**: Contains all the necessary libraries and frameworks, so the setup is the same every time you (or a teammate) train

**Cost Considerations:**
- **On-demand pricing**: Roughly 3-4 USD per hour for `ml.p3.2xlarge` (varies by region)
- **Training time**: Usually 1-3 hours for this exercise
- **Total cost**: Typically 5-15 USD for a complete training run
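
The compute part of the estimate is simply hours multiplied by the hourly rate. The sketch below works that out for the ranges quoted above; it covers instance time only, so the slightly higher total quoted above presumably leaves some headroom for other charges. `training_cost_range` is a hypothetical helper.

```python
def training_cost_range(hours_range, hourly_rate_range):
    """Return the (lowest, highest) compute cost for the given ranges."""
    low = hours_range[0] * hourly_rate_range[0]
    high = hours_range[1] * hourly_rate_range[1]
    return low, high


# 1-3 hours of training at 3-4 USD per hour.
low, high = training_cost_range((1, 3), (3.0, 4.0))
print(f"Estimated compute cost: {low:.0f} to {high:.0f} USD")
```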

In [None]:
# Save the updated configuration under a user-specific filename
import os

config_filename = f"{identifier}-burn_scars_Prithvi-EO.yaml"
config_filepath = f"../configs/{config_filename}"
with open(config_filepath, 'w') as config_file:
    yaml.dump(config, config_file, default_flow_style=False)

# Upload config files to s3 bucket
configs = sagemaker_session.upload_data(path=config_filepath, bucket=BUCKET_NAME, key_prefix='data/configs')

In [None]:
# Setup variables for training using sagemaker

name = f'{identifier}-sagemaker'
role = get_execution_role()
input_s3_uri = f"s3://{BUCKET_NAME}/data"
model_name = f"{identifier}-workshop.ckpt"

environment_variables = {
    'CONFIG_FILE': f"{base_path}/configs/{config_filename}",
    'MODEL_DIR': model_path,
    'MODEL_NAME': model_name,
    'S3_URL': input_s3_uri,
    'ROLE_ARN': role,
    'ROLE_NAME': role.split('/')[-1],
    'EVENT_TYPE': usecase,
    'BUCKET_NAME': BUCKET_NAME,
    'VERSION': 'v1'
}
account_id = boto3.client('sts').get_caller_identity().get('Account')
ecr_container_url = f'{account_id}.dkr.ecr.us-west-2.amazonaws.com/eo_training:latest'
sagemaker_role = role.split('/')[-1]

instance_type = 'ml.p3.2xlarge'

instance_count = 1
memory_volume = 50

In [None]:
# Establish an estimator (model) using sagemaker and the configurations from the previous cell.
estimator = Estimator(image_uri=ecr_container_url,
                      role=role,
                      base_job_name=name,
                      instance_count=instance_count,
                      volume_size=memory_volume,  # size of the attached storage volume in GB
                      environment=environment_variables,
                      instance_type=instance_type)

estimator.fit()

In [None]:
# Save important values in a file for reuse.
export_values = {
    'identifier': identifier,
    'model_name': model_name,
    'config_filename': config_filename,
    'bucket_name': BUCKET_NAME
}

with open('../variables.yaml', 'w') as variable_export:
    yaml.dump(export_values, variable_export, default_flow_style=False)


## Next Lesson: **Model Deployment**
Continue with `deploy_and_use.ipynb` to deploy the trained model for inference.