1. [Set Up](#1.-Set-Up)
2. [Fine-tune the pre-trained RESNET model on UIBK Avalanche dataset](#2.-Fine-tune-the-pre-trained-model-on-a-custome-dataset)
    * [Retrieve Training artifacts](#2.1.-Retrieve-Training-artifacts)
    * [Set Training parameters](#2.2.-Set-Training-parameters)
    * [Train with Automatic Model Tuning (HPO)](#AMT)
    * [Start Training](#2.4.-Start-Training)
    * [Deploy & run Inference on the fine-tuned model](#2.5.-Deploy-&-run-Inference-on-the-fine-tuned-model)


# 1. Set-up

This notebook requires ipywidgets (use dropdown)

In [15]:
!pip install ipywidgets>=7.7.0 --quiet

[0m

To train and host on Amazon Sagemaker, we need to setup and authenticate the use of AWS services. 

Here, we use the execution role associated with the current notebook instance as the AWS account role with SageMaker access. 

It has necessary permissions, including access to our data in S3. 

In [16]:
import sagemaker, boto3, json
from sagemaker.session import Session

sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()
aws_region = boto3.Session().region_name
sess = sagemaker.Session()

Init the model id

In [17]:
model_id, model_version = "tensorflow-ic-imagenet-resnet-v2-152-classification-4", "*"
DELETE_END_POINT_AT_END = False # we will use this flag to decide if we delete the endpoint at the end or let it run

Show a dropdown of models (in case we want a different model)

In [18]:
import IPython
from ipywidgets import Dropdown
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# Retrieves all Image Classification models available by SageMaker Built-In Algorithms.
filter_value = "task == ic"
ic_models = list_jumpstart_models(filter=filter_value)

# display the model-ids in a dropdown, for user to select a model.
dropdown = Dropdown(
    options=ic_models,
    value=model_id,
    description="SageMaker Built-In Image Classification Models:",
    style={"description_width": "initial"},
    layout={"width": "max-content"},
)
display(IPython.display.Markdown("## Select a SageMaker pre-trained model from the dropdown below"))
display(dropdown)

## Select a SageMaker pre-trained model from the dropdown below

Dropdown(description='SageMaker Built-In Image Classification Models:', index=148, layout=Layout(width='max-co…

### 2.1. Retrieve Training artifacts
***
Here, for the selected model, we retrieve the training docker container, the training algorithm source, the pre-trained base model, and a python dictionary of the training hyper-parameters that the algorithm accepts with their default values. Note that the model_version="*" fetches the lates model. Also, we do need to specify the training_instance_type to fetch train_image_uri.
***

In [19]:
# use this
from sagemaker import image_uris, model_uris, script_uris, hyperparameters

model_id, model_version = dropdown.value, "*"
training_instance_type = "ml.g4dn.xlarge"

# Retrieve the docker image
train_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    model_id=model_id,
    model_version=model_version,
    image_scope="training",
    instance_type=training_instance_type,
)
# Retrieve the training script
train_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="training"
)
# Retrieve the pre-trained model tarball to further fine-tune
train_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="training"
)

### 2.2. Set Training parameters
***
Now that we are done with all the setup that is needed, we are ready to fine-tune our Image Classification model. 

To begin, let us create a [``sageMaker.estimator.Estimator``](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html) object. This estimator will launch the training job. 

There are two kinds of parameters that need to be set for training. 

* The first one are the parameters for the training job. 

These include: 

> (i) Training data path. This is S3 folder in which the input data is stored, 

> (ii) Output path: This the s3 folder in which the training output is stored. 

> (iii) Training instance type: This indicates the type of machine on which to run the training. Typically, we use GPU instances for these training. We defined the training instance type above to fetch the correct train_image_uri. 

* The second set of parameters are algorithm specific training hyper-parameters.
***

In [20]:
training_data_bucket = f"s3-avalanche-guard"
training_data_prefix = "data/cv/uibk/ResNetClassify/training"



training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}"

output_bucket = sess.default_bucket()
output_prefix = "AR-AVALANCHEGUARD-training"

s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"

Algorithm specific 

In [21]:
from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

# [Optional] Override default hyperparameters with custom values
hyperparameters["epochs"] = "5"
print(hyperparameters)

{'train_only_top_layer': 'True', 'epochs': '5', 'batch_size': '32', 'optimizer': 'adam', 'learning_rate': '0.001', 'beta_1': '0.9', 'beta_2': '0.999', 'momentum': '0.9', 'epsilon': '1e-07', 'rho': '0.95', 'initial_accumulator_value': '0.1', 'reinitialize_top_layer': 'Auto', 'early_stopping': 'False', 'early_stopping_patience': '5', 'early_stopping_min_delta': '0.0', 'dropout_rate': '0.2', 'regularizers_l2': '0.0001', 'label_smoothing': '0.1', 'image_resize_interpolation': 'bilinear', 'augmentation': 'False', 'augmentation_random_flip': 'horizontal_and_vertical', 'augmentation_random_rotation': '0.2', 'augmentation_random_zoom': '0.1', 'binary_mode': 'False', 'eval_metric': 'accuracy', 'validation_split_ratio': '0.2', 'random_seed': '123'}


### 2.3. Train with Automatic Model Tuning ([HPO](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html)) <a id='AMT'></a>
***
Amazon SageMaker automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many training jobs on your dataset using the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose. We will use a [HyperparameterTuner](https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html) object to interact with Amazon SageMaker hyperparameter tuning APIs.
***

In [22]:
from sagemaker.tuner import ContinuousParameter

# Use AMT for tuning and selecting the best model
use_amt = True

# Define objective metric per framework, based on which the best model will be selected.
metric_definitions_per_model = {
    "tensorflow": {
        "metrics": [{"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"}],
        "type": "Maximize",
    },
    "pytorch": {
        "metrics": [{"Name": "val_accuracy", "Regex": "val Acc: ([0-9\\.]+)"}],
        "type": "Maximize",
    },
}

# You can select from the hyperparameters supported by the model, and configure ranges of values to be searched for training the optimal model.(https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-define-ranges.html)
hyperparameter_ranges = {
    "adam-learning-rate": ContinuousParameter(0.0001, 0.1, scaling_type="Logarithmic")
}

# Increase the total number of training jobs run by AMT, for increased accuracy (and training time).
max_jobs = 6
# Change parallel training jobs run by AMT to reduce total training time, constrained by your account limits.
# if max_jobs=max_parallel_jobs then Bayesian search turns to Random.
max_parallel_jobs = 2

### 2.4. Start Training
***
We start by creating the estimator object with all the required assets and then launch the training job.
***

In [23]:
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base
from sagemaker.tuner import HyperparameterTuner

training_job_name = name_from_base(f"AR-{model_id}-transfer-learning")

# Create SageMaker Estimator instance
ic_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
    base_job_name=training_job_name,
)

if use_amt:
    metric_definitions = next(
        value for key, value in metric_definitions_per_model.items() if model_id.startswith(key)
    )

    hp_tuner = HyperparameterTuner(
        ic_estimator,
        metric_definitions["metrics"][0]["Name"],
        hyperparameter_ranges,
        metric_definitions["metrics"],
        max_jobs=max_jobs,
        max_parallel_jobs=max_parallel_jobs,
        objective_type=metric_definitions["type"],
        base_tuning_job_name=training_job_name,
    )

    # Launch a SageMaker Tuning job to search for the best hyperparameters
    hp_tuner.fit({"training": training_dataset_s3_path})
else:
    # Launch a SageMaker Training job by passing s3 path of the training data
    ic_estimator.fit({"training": training_dataset_s3_path}, logs=True)

No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config


..................................................................................................................................................................................................................................................................................................................................................................!


## 2.5. Deploy & run Inference on the fine-tuned model
***
A trained model does nothing on its own. We now want to use the model to perform inference. 

For this example, that means predicting the class label of an image. 

We start by retrieving the artifacts for deploying an endpoint. Then we  deploy the `ic_estimator` that we fine-tuned.
***

In [24]:
from sagemaker.utils import name_from_base
inference_instance_type = "ml.g4dn.xlarge"#"ml.p2.xlarge"

# Retrieve the inference docker container uri
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)

endpoint_name = name_from_base(f"EP-AR-AVALANCHEGUARD-FT-{model_id}-")

# Use the estimator from the previous step to deploy to a SageMaker endpoint
finetuned_predictor = (hp_tuner if use_amt else ic_estimator).deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    entry_point="inference.py",
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    endpoint_name=endpoint_name,
)


2024-07-01 06:41:17 Starting - Found matching resource for reuse
2024-07-01 06:41:17 Downloading - Downloading the training image
2024-07-01 06:41:17 Training - Training image download completed. Training in progress.
2024-07-01 06:41:17 Uploading - Uploading generated training model
2024-07-01 06:41:17 Completed - Resource retained for reuse
--------!

Now let us download images and run inference

In [25]:
s3_bucket = f"s3-avalanche-guard"
###key_prefix = "data/cv/uibk/ResNetClassify/validation/positive"
key_prefix = "data/cv/uibk/ResNetClassify/validation/negative"





def download_from_s3(images):
    for filename, image_key in images.items():
        print("Downloading {0} - {1} - {2} - {3}".format(s3_bucket,key_prefix,image_key,filename))
        boto3.client("s3").download_file(s3_bucket, f"{key_prefix}/{image_key}", filename)


avalanche_images = {
#    "img1.jpg": "2013-02-05 muehlgraben (97).jpg",
#    "img2.jpg": "2015-04-21 umbaltal (3).jpg",

#    "img1.jpg": "2013-02-05 muehlgraben (97).jpg",
#    "img2.jpg": "2015-04-21 umbaltal (3).jpg",

    "temp/img1.jpg": "2015-02-28 winnebacher weisserkogel (13).jpg",
    "temp/img2.jpg": "2016-03-04 sellrain (5).jpg"
}

download_from_s3(avalanche_images)

Downloading s3-avalanche-guard - data/cv/uibk/ResNetClassify/validation/negative - 2015-02-28 winnebacher weisserkogel (13).jpg - temp/img1.jpg
Downloading s3-avalanche-guard - data/cv/uibk/ResNetClassify/validation/negative - 2016-03-04 sellrain (5).jpg - temp/img2.jpg


In [26]:
from IPython.core.display import HTML

for image_filename in avalanche_images.keys():
    with open(image_filename, "rb") as file:
        img = file.read()
    query_response = finetuned_predictor.predict(
        img, {"ContentType": "application/x-image", "Accept": "application/json;verbose"}
    )
    model_predictions = json.loads(query_response)
    predicted_label = model_predictions["predicted_label"]
    display(
        HTML(
            f'<img src={image_filename} alt={image_filename} align="left" style="width: 250px;"/>'
            f"<figcaption>Predicted Label: {predicted_label}</figcaption>"
        )
    )

In [27]:
############### Run on the entire folder #####################
##############################################################
# Get bucket contents as a list
def list_all_s3_objects(bucket_name:str, prefix:str)-> [str]:
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    page_iterator = paginator.paginate(Bucket=bucket_name,Prefix=prefix)

    objlist = []
    for page in page_iterator:
        if 'Contents' in page:
            for obj in page['Contents']:
                objlist.append( obj["Key"].replace(prefix, ""))

    return objlist
            

def download_from_s3(images):
    for filename, image_key in images.items():
        print("Downloading {0} - {1} - {2} - {3}".format(s3_bucket,key_prefix,image_key,filename))
        boto3.client("s3").download_file(s3_bucket, f"{key_prefix}{image_key}", filename)    


s3_bucket = f"s3-avalanche-guard"
key_prefix = "data/cv/uibk/ResNetClassify/validation/negative/"



from IPython.core.display import HTML

all_images_list = list_all_s3_objects(s3_bucket,key_prefix)


# Now for each image download locally and then run inference, delete after displaying
for idx,img in enumerate(all_images_list):
    if idx>10:
        break
    
    avalanche_images = {"temp/img01.jpg": img}
    print("Downloading {0}".format(avalanche_images))

    # download the image
    download_from_s3(avalanche_images)

    for image_filename in avalanche_images.keys():
        with open(image_filename, "rb") as file:
            img = file.read()
        query_response = finetuned_predictor.predict(
            img, {"ContentType": "application/x-image", "Accept": "application/json;verbose"}
        )
        model_predictions = json.loads(query_response)
        predicted_label = model_predictions["predicted_label"]
        display(
            HTML(
                f'<img src={image_filename} alt={image_filename} align="left" style="width: 250px;"/>'
                f"<figcaption>Predicted Label: {predicted_label}</figcaption>"
            )
        )
    
    sleep(30)
    # now delete the dopwnloaded image

        
        

Downloading {'temp/img01.jpg': '2015-01-01 hafelekar (3).jpg'}
Downloading s3-avalanche-guard - data/cv/uibk/ResNetClassify/validation/negative/ - 2015-01-01 hafelekar (3).jpg - temp/img01.jpg


NameError: name 'sleep' is not defined

Delete endpoint if necessary

In [16]:
# Delete the SageMaker endpoint and the attached resources
if DELETE_END_POINT_AT_END:
    finetuned_predictor.delete_model()
    finetuned_predictor.delete_endpoint()