# Train, tune, and deploy a custom ML model using Degas 100M from AWS Marketplace


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

---

Degas 100M: A geospatial foundational model that can be fine-tuned to your specific earth observation tasks. 

See the model's technical details here: https://arxiv.org/pdf/2405.02512v1

This sample notebook shows you how to train a custom ML model using [Degas 100M](https://aws.amazon.com/marketplace/pp/prodview-6bl3fwy2k5h5i) from AWS Marketplace.

> **Note**: This is a reference notebook. It cannot run until you make the changes suggested in the notebook.

## Pre-requisites
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has **AmazonSageMakerFullAccess**.
1. Some hands-on experience using [Amazon SageMaker](https://aws.amazon.com/sagemaker/).
1. To use this algorithm successfully, ensure that:
    1. Either your IAM role has these three permissions and you have authority to make AWS Marketplace subscriptions in the AWS account used: 
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**  
    2. or your AWS account has a subscription to [Degas 100M](https://aws.amazon.com/marketplace/pp/prodview-6bl3fwy2k5h5i). 

## Contents
1. [Subscribe to the algorithm](#1.-Subscribe-to-the-algorithm)
1. [Prepare dataset](#2.-Prepare-dataset)
	1. [Dataset format expected by the algorithm](#A.-Dataset-format-expected-by-the-algorithm)
	1. [Configure and visualize train and test dataset](#B.-Configure-and-visualize-train-and-test-dataset)
	1. [Upload datasets to Amazon S3](#C.-Upload-datasets-to-Amazon-S3)
1. [Train a machine learning model](#3:-Train-a-machine-learning-model)
	1. [Set up environment](#3.1-Set-up-environment)
	1. [Train a model](#3.2-Train-a-model)
1. [Deploy model and verify results](#4:-Deploy-model-and-verify-results)
    1. [Deploy trained model](#A.-Deploy-trained-model)
    1. [Create input payload](#B.-Create-input-payload)
    1. [Perform real-time inference](#C.-Perform-real-time-inference)
    1. [Visualize output](#D.-Visualize-output)
    1. [Calculate relevant metrics](#E.-Calculate-relevant-metrics)
    1. [Delete the endpoint](#F.-Delete-the-endpoint)
1. [Perform Batch inference](#5.-Perform-Batch-inference)
1. [Clean-up](#7.-Clean-up)
	1. [Delete the model](#A.-Delete-the-model)
	1. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))


## Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

## 1. Subscribe to the algorithm

To subscribe to the algorithm:
1. Open the algorithm listing page [Degas 100M](https://aws.amazon.com/marketplace/pp/prodview-6bl3fwy2k5h5i).
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **"Accept Offer"** if you agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product Arn**. This is the algorithm ARN that you need to specify when training a custom ML model. Copy the ARN corresponding to your region and specify it in the following cell.

In [None]:
algo_arn = "<Customer to specify algorithm ARN corresponding to their AWS region>"
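Before launching a training job, it can help to check that the value pasted above has the general shape of a SageMaker algorithm ARN. The following is a minimal sketch: `looks_like_algorithm_arn` is a hypothetical helper, and the pattern only assumes the standard `arn:<partition>:sagemaker:<region>:<account-id>:algorithm/<name>` layout, not anything specific to this listing.

```python
import re

# Assumed layout: arn:<partition>:sagemaker:<region>:<12-digit account>:algorithm/<name>
_ALGORITHM_ARN = re.compile(
    r"^arn:aws[a-z-]*:sagemaker:[a-z0-9-]+:\d{12}:algorithm/.+$"
)


def looks_like_algorithm_arn(value: str) -> bool:
    """Return True if `value` has the general shape of a SageMaker algorithm ARN."""
    return bool(_ALGORITHM_ARN.match(value))


# The account ID and algorithm name below are made up for illustration.
print(looks_like_algorithm_arn("arn:aws:sagemaker:us-west-2:123456789012:algorithm/example-algo"))
# The unedited placeholder from the cell above does not pass the check.
print(looks_like_algorithm_arn("<Customer to specify algorithm ARN corresponding to their AWS region>"))
```

A passing check only means the string is well formed. It does not confirm that your account is subscribed or that the ARN belongs to your region.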

## 2. Prepare dataset

In [None]:
import boto3
import os
import sagemaker
import yaml


### A. Dataset format expected by the algorithm

Supported file type for training: TIFF

You will need to:

- Fill in a template YAML file to configure the fine-tuning task.

- Provide a directory with the correct folder structure for the data splits.

- Ensure the input files have the correct shape.

These requirements are described in more detail below.

YAML configuration file:
The YAML configuration file provides the task-specific information used to fine-tune the model. It requires dataset information, such as the path to the root folder and the names of the splits, and task-specific information, such as the name of the satellite, the bands to use, and the number of classes to predict.

Directory structure:
The datasets should be saved in a parent folder that contains at least two subfolders: one for training data and one for validation data. You can optionally add a test subfolder.
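As a quick local check of this layout, the sketch below lists any required split subfolders that are missing under a dataset root. `missing_splits` is a hypothetical helper, and the folder names `training` and `validation` are assumptions taken from the split names used later in this notebook. Use whatever names your configuration declares.

```python
from pathlib import Path
import tempfile


def missing_splits(root, required=("training", "validation")):
    """Return the required split subfolders that are absent under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).is_dir()]


# Demonstrate on a throwaway directory that only has a training split.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "training").mkdir()
    print(missing_splits(tmp))  # ['validation']
```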

For our input datasets, we adopt the same data format as used by the foundational model [Prithvi](https://arxiv.org/abs/2310.18660). In this format, data is arranged as a collection of TIFF files and their corresponding masks. Each TIFF file should hold a scene larger than 224x224 pixels, where the first dimension is the band dimension and the other two are the spatial dimensions. The mask associates each pixel with a label. The number of mask channels is equal to the number of classes in the scene, plus an extra band for missing data. For fine-tuning, each TIFF is center-cropped to a 224x224 scene.
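To make the center crop concrete, the sketch below computes the pixel window that a symmetric 224x224 crop would select from a scene of a given size. `center_crop_window` is a hypothetical helper, and the rounding (integer division of the leftover margin) is an assumption. The algorithm's own cropping code may round differently for odd margins.

```python
def center_crop_window(height: int, width: int, size: int = 224):
    """Return (top, left, bottom, right) of a centered size x size window."""
    if height < size or width < size:
        raise ValueError(f"scene {height}x{width} is smaller than {size}x{size}")
    top = (height - size) // 2
    left = (width - size) // 2
    return top, left, top + size, left + size


# A 512x512 scene, the size of the burn scars scenes used later in this notebook.
print(center_crop_window(512, 512))  # (144, 144, 368, 368)
```

With an array shaped `(bands, height, width)`, the window would be applied as `scene[:, top:bottom, left:right]`.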

#### Example input:
Here you can see example datasets: [burn scars](https://huggingface.co/datasets/ibm-nasa-geospatial/hls_burn_scars) and [multi temporal crop classification](https://huggingface.co/datasets/ibm-nasa-geospatial/multi-temporal-crop-classification).



You can also find more information about the dataset format in the **Usage Information** section of [Degas 100M](https://aws.amazon.com/marketplace/pp/prodview-6bl3fwy2k5h5i).

### B. Configure and visualize train and test dataset

In this example, we use the [burn scars](https://huggingface.co/datasets/ibm-nasa-geospatial/hls_burn_scars) dataset. First, install the tools needed to download and preprocess the dataset.

In [None]:
!pip install huggingface_hub
!pip install rasterio


Next, we download the dataset to a local directory and extract it.

In [None]:
!mkdir -p dataset

In [None]:
from huggingface_hub import snapshot_download

training_dataset = "./dataset"  
snapshot_download(repo_id="ibm-nasa-geospatial/hls_burn_scars", repo_type="dataset", local_dir=training_dataset)


In [None]:
!cd {training_dataset} && tar -zxvf hls_burn_scars.tar.gz

### C. Upload datasets to Amazon S3

We upload the dataset to Amazon S3 so that it can be used by the SageMaker training job.

In [None]:
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
bucket


In [None]:
training_data = sagemaker_session.upload_data(
    training_dataset, bucket=bucket, key_prefix="degas-100M-burn-scar-dataset/training"
)


## 3: Train a machine learning model

Now that the dataset is available in an accessible Amazon S3 bucket, we are ready to train a machine learning model.

### 3.1 Set up environment

In [None]:
role = sagemaker.get_execution_role()

output_location = "s3://{}/degas_100M_example/{}".format(
    bucket, "output"
)


### 3.2 Train a model

The burn scars dataset is derived from HLS (Harmonized Landsat and Sentinel-2) data. Its labels have two classes (Unburnt land and Burn scar), and training and validation splits are provided. We set hyperparameters to customize fine-tuning for this task, as shown below. You can find more information about the hyperparameters in the **Hyperparameters** section of [Degas 100M](https://aws.amazon.com/marketplace/pp/prodview-6bl3fwy2k5h5i).

In [None]:
# Define hyperparameters
hyperparameters = {
    "training_args": {
        "train_split": "training",
        "val_split": "validation",
        "test_split": "validation",
        "lr": 1.3e-05,
        "max_train_iters": 6000,  # iterations for training
        "max_eval_iters": 1000,  # evaluation interval       
    },
    "task_args": {  
        "satellite": "S2",
        "bands": [0, 1, 2, 3, 4, 5],
        "classes": ["Unburnt land", "Burn scar"],
    }
}
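Because a training job takes time to start, a few local sanity checks on the dictionary above can catch typos early. The sketch below is illustrative only: `check_hyperparameters` is a hypothetical helper, and the rules it applies (non-empty split names, positive learning rate, unique bands, at least two classes) are assumptions, not constraints documented by the algorithm.

```python
def check_hyperparameters(hp: dict) -> list:
    """Return a list of human-readable problems found in `hp` (empty if none)."""
    problems = []
    training = hp.get("training_args", {})
    task = hp.get("task_args", {})
    # Assumed rule: both split names must be present and non-empty.
    for key in ("train_split", "val_split"):
        if not training.get(key):
            problems.append(f"training_args.{key} is missing")
    # Assumed rule: the learning rate must be a positive number.
    if not training.get("lr", 0) > 0:
        problems.append("training_args.lr must be positive")
    # Assumed rule: each band index should be listed once.
    bands = task.get("bands", [])
    if len(set(bands)) != len(bands):
        problems.append("task_args.bands contains duplicates")
    # Assumed rule: a segmentation task needs at least two classes.
    if len(task.get("classes", [])) < 2:
        problems.append("task_args.classes needs at least two entries")
    return problems


example = {
    "training_args": {"train_split": "training", "val_split": "validation", "lr": 1.3e-05},
    "task_args": {"bands": [0, 1, 2, 3, 4, 5], "classes": ["Unburnt land", "Burn scar"]},
}
print(check_hyperparameters(example))  # []
```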


For information on creating an `Estimator` object, see the [documentation](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html).

In [None]:
# Set to True to load the hyperparameters from a YAML file instead of the dictionary above
are_hp_in_yaml = False

# Load yaml parameters if needed
if are_hp_in_yaml:
    params_path = "./configs/params/fm_finetune_params.yaml"
    with open(params_path) as f:
        hyperparameters = yaml.safe_load(f)

estimator = sagemaker.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    base_job_name="degas-100M-finetune-burnscars",
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    input_mode="File",
    output_path=output_location,
    sagemaker_session=sagemaker_session,
    hyperparameters={
        "training_args": '"' + str(yaml.dump(hyperparameters["training_args"])).replace("\n", "\\n") + '"',
        "task_args": '"' + str(yaml.dump(hyperparameters["task_args"])).replace("\n", "\\n") + '"'
    },
)

# Run the training job.
estimator.fit({"training": training_data})        


See this [blog post](https://aws.amazon.com/blogs/machine-learning/easily-monitor-and-visualize-metrics-while-training-models-on-amazon-sagemaker/) for more information on how to visualize metrics during training. You can also open the training job from the [Amazon SageMaker console](https://console.aws.amazon.com/sagemaker/home?#/jobs/) and monitor the metrics and logs in the **Monitor** section.

## 4: Deploy model and verify results

Now you can deploy the model for performing real-time inference.

In [None]:
model_name = "degas-100m-finetuned-burn-scar"

content_type = "application/x-npy"

real_time_inference_instance_type = (
    "ml.p3.2xlarge"
)
batch_transform_inference_instance_type = (
    "ml.p3.2xlarge"
)

### A. Deploy trained model

In [None]:
from sagemaker.base_serializers import NumpySerializer
from sagemaker.base_deserializers import NumpyDeserializer

predictor = estimator.deploy(
    1, real_time_inference_instance_type, serializer=NumpySerializer(), deserializer=NumpyDeserializer(allow_pickle=True)
)

Once the endpoint is created, you can perform real-time inference.

### B. Create input payload

For inference, the TIFF file must be converted to a NumPy array.
The following cell performs the preprocessing, which includes normalizing and cropping the image.

In [None]:
import rasterio
import numpy as np
import torchvision.transforms.functional as F
import torch

sample = training_dataset + "/validation/subsetted_512x512_HLS.S30.T10SEH.2018190.v1.4_merged.tif"

# Use rasterio to read the geospatial .tif file
with rasterio.open(sample) as src:
    # Read the entire image
    image_array = src.read()

# Optional: Process the image_array as needed for your model
# For example, normalize, resize, etc.
# Assuming the model expects a specific format

img_norm_cfg = dict(
    means=[
        0.033349706741586264,
        0.05701185520536176,
        0.05889748132001316,
        0.2323245113436119,
        0.1972854853760658,
        0.11944914225186566,
    ],
    stds=[
        0.02269135568823774,
        0.026807560223070237,
        0.04004109844362779,
        0.07791732423672691,
        0.08708738838140137,
        0.07241979477437814,
    ],
)

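image_array = image_array.astype(np.float32)
# Replace missing values (NaN) with 0.0
image_array = np.where(np.isnan(image_array), 0.0, image_array)
# Add a singleton time dimension: (bands, height, width) -> (bands, 1, height, width)
image_array = np.expand_dims(image_array, 1)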

# Define the desired shape
target_shape = (6, 1, 224, 224)
x_bias = 200
y_bias = 0
norm = F.normalize(
    torch.from_numpy(image_array[:, 0, x_bias:target_shape[2]+x_bias, y_bias:target_shape[3]+y_bias]),
    img_norm_cfg["means"],
    img_norm_cfg["stds"],
    False,
)
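The `F.normalize` call in the cell above subtracts a per-band mean and divides by a per-band standard deviation. The sketch below shows the same arithmetic in plain NumPy on a tiny synthetic array. `normalize_bands` is a hypothetical helper and the numbers are illustrative only, not statistics of any real dataset.

```python
import numpy as np


def normalize_bands(image: np.ndarray, means, stds) -> np.ndarray:
    """Normalize a (bands, height, width) array band by band."""
    # Reshape the statistics to (bands, 1, 1) so they broadcast over the spatial axes.
    means = np.asarray(means, dtype=np.float32)[:, None, None]
    stds = np.asarray(stds, dtype=np.float32)[:, None, None]
    return (image.astype(np.float32) - means) / stds


# Tiny synthetic scene: 2 bands, 2x2 pixels.
scene = np.array([[[1.0, 3.0], [5.0, 7.0]],
                  [[10.0, 10.0], [20.0, 20.0]]])
out = normalize_bands(scene, means=[1.0, 10.0], stds=[2.0, 5.0])
print(out.shape)  # (2, 2, 2)
print(out[0])     # first band after normalization
```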

Let's visualize the input sample.

In [None]:
from matplotlib import pyplot as plt

imgplot = plt.imshow(image_array[0, 0, x_bias:target_shape[2]+x_bias, y_bias:target_shape[3]+y_bias])

### C. Perform real-time inference

In [None]:
res = predictor.predict(np.expand_dims(np.expand_dims(norm, 0), 2))
pred_value, pred_label = torch.from_numpy(res)[0].topk(1, dim=0)
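Taking the top-1 entry along the class axis is the same as a per-pixel argmax over class scores. The sketch below illustrates this with NumPy on synthetic scores. It assumes the response is laid out as `(batch, classes, height, width)`, which matches how the cell above indexes it. `top1_labels` is a hypothetical helper.

```python
import numpy as np


def top1_labels(scores: np.ndarray) -> np.ndarray:
    """Return the highest-scoring class index per pixel for (batch, classes, H, W) scores."""
    return scores.argmax(axis=1)


# Synthetic scores: 1 scene, 2 classes, 1x2 pixels.
scores = np.array([[[[0.9, 0.2]],
                    [[0.1, 0.8]]]])
print(top1_labels(scores)[0])  # [[0 1]]
```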


### D. Visualize output

In [None]:
imgplot = plt.imshow(pred_label.detach().numpy()[0])

### E. Calculate relevant metrics

Next, we calculate the pixel-wise accuracy of the prediction.

In [None]:
sample_label = training_dataset + "/validation/subsetted_512x512_HLS.S30.T10SEH.2018190.v1.4.mask.tif"

with rasterio.open(sample_label) as src:
    label_array = src.read()

target = label_array[0, x_bias:target_shape[2]+x_bias, y_bias:target_shape[3]+y_bias]
pred = pred_label[0].numpy()
    
acc = np.sum(np.equal(target, pred))/pred.size
print(f"{acc*100}%")
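Plain accuracy counts every pixel, including any that carry a missing-data label, and it can look high when one class dominates the scene. The sketch below shows two common refinements on synthetic arrays: accuracy restricted to valid pixels, and intersection-over-union (IoU) for a single class. Both helpers are hypothetical, and the use of `-1` as the missing-data label is an assumption that you should check against your dataset's documentation.

```python
import numpy as np


def masked_accuracy(target, pred, ignore_label=-1):
    """Fraction of valid pixels (target != ignore_label) where pred equals target."""
    valid = target != ignore_label
    if not valid.any():
        return float("nan")
    return float((target[valid] == pred[valid]).mean())


def class_iou(target, pred, cls, ignore_label=-1):
    """Intersection-over-union for class `cls`, computed over valid pixels only."""
    valid = target != ignore_label
    t = (target == cls) & valid
    p = (pred == cls) & valid
    union = (t | p).sum()
    return float((t & p).sum() / union) if union else float("nan")


# Synthetic 2x2 example: one pixel is missing data and is excluded from both metrics.
target = np.array([[0, 1], [1, -1]])
pred = np.array([[0, 1], [0, 1]])
print(masked_accuracy(target, pred))   # 2 of the 3 valid pixels match
print(class_iou(target, pred, cls=1))  # 0.5
```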

If [Amazon SageMaker Model Monitor](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html) supports the type of problem you are trying to solve using this algorithm, you can use the following examples to add Model Monitor support.
For sample code to enable and monitor the model, see the following notebooks:
1. [Enable Amazon SageMaker Model Monitor](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_model_monitor/enable_model_monitor/SageMaker-Enable-Model-Monitor.ipynb)
2. [Amazon SageMaker Model Monitor - visualizing monitoring results](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_model_monitor/visualization/SageMaker-Model-Monitor-Visualize.ipynb)

### F. Delete the endpoint

Now that you have successfully performed real-time inference, you no longer need the endpoint. You can delete it to avoid being charged.

In [None]:
predictor.delete_endpoint(delete_endpoint_config=True)

Since this is an experiment, you do not need to run a hyperparameter tuning job, and this notebook does not include one. If you would like to tune a model trained using a third-party algorithm, you can use Amazon SageMaker's hyperparameter tuning functionality.

## 5. Perform Batch inference

In this section, you will perform batch inference using multiple input payloads together.

In [None]:
# upload the batch-transform job input files to S3

np.save("test.npy", np.expand_dims(np.expand_dims(norm, 0), 2))

transform_input_folder = "test.npy"
transform_input = sagemaker_session.upload_data(transform_input_folder, key_prefix=model_name)
print("Transform input uploaded to " + transform_input)

In [None]:
# Run the batch-transform job

output_path_s3 = "s3://{}/inference/output/".format(bucket)

transformer = estimator.transformer(
    instance_count=1,
    instance_type=batch_transform_inference_instance_type,
    strategy="SingleRecord",
    output_path=output_path_s3,
    accept="application/x-npy",
)

transformer.transform(data=transform_input, content_type="application/x-npy")

In [None]:
# The output is available at the following path
transformer.output_path
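Batch transform writes one result object per input object, named after the input file with `.out` appended. The sketch below derives the expected result location from an output path and an input URI. `expected_transform_output_uri` is a hypothetical helper and the bucket name is a placeholder.

```python
import posixpath


def expected_transform_output_uri(output_path: str, input_uri: str) -> str:
    """Return the S3 URI where batch transform is expected to write the result for one input."""
    filename = posixpath.basename(input_uri)
    return output_path.rstrip("/") + "/" + filename + ".out"


# The bucket name below is a placeholder.
print(expected_transform_output_uri(
    "s3://example-bucket/inference/output/",
    "s3://example-bucket/degas-100m-finetuned-burn-scar/test.npy",
))
# s3://example-bucket/inference/output/test.npy.out
```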

## 7. Clean-up

### A. Delete the model

In [None]:
transformer.delete_model()

### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the algorithm, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust)
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.



## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/aws_marketplace|curating_aws_marketplace_listing_and_sample_notebook|Algorithm|Sample_Notebook_Template|title_of_your_product-Algorithm.ipynb)
