# Comparison of Image Classification models and algorithms in Amazon SageMaker JumpStart 

---
When you solve a business problem with machine learning (ML), you may want to train several ML algorithms and compare them to see which model gives the best results on the dimensions you care about: model accuracy, inference time, and training time.

In this notebook, we demonstrate how you can compare multiple image classification models and algorithms offered by SageMaker JumpStart on dimensions such as model accuracy, inference time, and training time. Models in JumpStart are brought from hubs such as TensorFlow Hub and PyTorch Hub, and training scripts (algorithms) were written separately for each of these frameworks. In this notebook, you can also alter some of the hyper-parameters and examine their effect on the results. 

Image classification refers to assigning an image to one of the class labels in the training dataset.

Amazon [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html) offers a large suite of ML algorithms. You can use JumpStart to solve many machine learning tasks with one click in SageMaker Studio, or through the [SageMaker JumpStart API](https://sagemaker.readthedocs.io/en/stable/overview.html#use-prebuilt-models-with-sagemaker-jumpstart).

Note: This notebook was tested on an ml.t3.medium instance in Amazon SageMaker Studio with the Python 3 (Data Science) kernel and in an Amazon SageMaker Notebook instance with the conda_python3 kernel.

---

1. [Set Up](#1.-Set-Up)
2. [Specify training and validation data paths](#2.-Specify-training-and-validation-data-paths)
3. [Set hyper-parameters](#3.-Hyper-parameters)
4. [List of models to run](#4.-Specify-models-to-run)
5. [Helper functions](#5.-Helper-functions)
6. [Run all models](#6.-Run-all-models)

## 1. Set-Up
***
Before executing the notebook, there are some initial steps required for setup. This notebook requires the latest versions of sagemaker and ipywidgets.
***

In [None]:
!pip install sagemaker ipywidgets --upgrade --quiet

In [None]:
import json, uuid
import boto3, sagemaker
import pandas as pd
from sagemaker import get_execution_role

aws_role = get_execution_role()
aws_region = boto3.Session().region_name
sess = sagemaker.Session()
s3 = boto3.client("s3")

# unique id to connect all runs
# if you run this notebook multiple times, this master id helps you 
# save each run's results as a separate csv file
master_uuid = str(uuid.uuid4())
print("master id for this run: ", master_uuid)

# Lists to store results
nameList = []
accList = []
timeList = []

## 2. Specify training and validation data paths
***
Training and validation data must be stored in the format specified below:
- A directory with as many sub-directories as the number of classes. 
    - Each sub-directory should have images belonging to that class in .jpg format. 
    
For example, if the training data contains images from two classes, roses and dandelion,
the input directory should look like this:

    input_directory
        |--roses
            |--abc.jpg
            |--def.jpg
        |--dandelion
            |--ghi.jpg
            |--jkl.jpg
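
With the layout above, the class label of an image can be recovered from the name of its parent directory. The sketch below illustrates this with a hypothetical helper, `label_from_key`, applied to made-up keys that mirror the example tree; it is not part of the JumpStart API.

```python
from collections import Counter


def label_from_key(key):
    """Return the class label encoded in an S3 key or relative path.

    Assumes the layout shown above: <prefix>/<class_name>/<image file>.
    """
    parts = key.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"no class directory in key: {key!r}")
    return parts[-2]


# Hypothetical keys that mirror the example directory tree
keys = [
    "input_directory/roses/abc.jpg",
    "input_directory/roses/def.jpg",
    "input_directory/dandelion/ghi.jpg",
    "input_directory/dandelion/jkl.jpg",
]

# Count images per class to spot empty or unbalanced classes early
images_per_class = Counter(label_from_key(k) for k in keys)
print(dict(images_per_class))
```

Counting images per class before training is a cheap way to catch a misplaced directory.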

We provide the tf_flowers dataset as an example dataset for training and validation. This is for illustration purposes only: the cell below points training and validation at the same data. When you use this notebook, replace the bucket and prefix references below with your own buckets containing separate datasets for training and validation.

The tf_flowers dataset comprises images of five types of flowers. 
The dataset has been downloaded from [TensorFlow](https://www.tensorflow.org/datasets/catalog/tf_flowers). 
[Apache 2.0 License](https://jumpstart-cache-prod-us-west-2.s3-us-west-2.amazonaws.com/licenses/Apache-License/LICENSE-2.0.txt).
Citation:
<sub><sup>
@ONLINE {tfflowers,
author = "The TensorFlow Team",
title = "Flowers",
month = "jan",
year = "2019",
url = "http://download.tensorflow.org/example_images/flower_photos.tgz" }
</sup></sub> source: [TensorFlow Hub](model_url). 
***

In [None]:
# Set references to training data
training_data_bucket = f"jumpstart-cache-prod-{aws_region}"
training_data_prefix = "training-datasets/tf_flowers"

# Set references to validation data
validation_data_bucket = f"jumpstart-cache-prod-{aws_region}"
validation_data_prefix = "training-datasets/tf_flowers"

## 3. Hyper-parameters
As explained above, you can modify the three hyper-parameters shown below and examine their effect on the results.

In [None]:
# Setting below hyper-parameters for this run

# Number of epochs
EPOCHS = "5"

# Learning rate
LR = "0.001"

# Batch size
BATCH_SIZE = "16"
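
The three values above are written as strings. Before launching several GPU training jobs, it can help to confirm that each string parses to a sensible number. The helper below, `check_hyperparameters`, is a hypothetical sketch; the ranges it enforces are illustrative assumptions, not limits imposed by JumpStart.

```python
def check_hyperparameters(epochs, learning_rate, batch_size):
    """Parse string-valued hyper-parameters and reject obviously invalid ones.

    Returns the parsed values as (int, float, int). The checks are
    illustrative assumptions, not limits imposed by JumpStart.
    """
    n_epochs = int(epochs)
    lr = float(learning_rate)
    bs = int(batch_size)
    if n_epochs < 1:
        raise ValueError("epochs must be at least 1")
    if not 0.0 < lr < 1.0:
        raise ValueError("learning rate is expected to lie in (0, 1)")
    if bs < 1:
        raise ValueError("batch size must be at least 1")
    return n_epochs, lr, bs


# Same values as the cell above
parsed = check_hyperparameters("5", "0.001", "16")
print(parsed)
```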

## 4. Specify models to run

In [None]:
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# All available models in JumpStart can be seen through this code
# We are showing only the first five models for illustration purposes

filter_value = "task == ic"
ic_models = list_jumpstart_models(filter=filter_value)

print("Total image classification models available in JumpStart: ", len(ic_models))
print()
print("Showing five image classification models from JumpStart: \n", ic_models[0:5])
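
The model IDs used in this notebook start with the framework name, for example `tensorflow-` or `pytorch-`. Assuming that convention holds for the IDs you select, a short sketch such as the one below groups a list of IDs by framework, which is useful when you want the comparison to cover both. `group_by_framework` is a hypothetical helper, not a JumpStart function.

```python
from collections import defaultdict


def group_by_framework(model_ids):
    """Group model IDs by their leading token, e.g. 'tensorflow' or 'pytorch'.

    Assumes each ID starts with the framework name followed by a hyphen,
    which holds for the four IDs used in this notebook.
    """
    groups = defaultdict(list)
    for model_id in model_ids:
        framework = model_id.split("-", 1)[0]
        groups[framework].append(model_id)
    return dict(groups)


example_models = [
    "tensorflow-ic-imagenet-mobilenet-v2-075-224-classification-4",
    "tensorflow-ic-imagenet-inception-v3-classification-4",
    "pytorch-ic-googlenet",
    "pytorch-ic-alexnet",
]

grouped = group_by_framework(example_models)
for framework, ids in grouped.items():
    print(framework, len(ids))
```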

In [None]:
# We arbitrarily picked four models. You can replace the list below with other models

# The number of models in this list shouldn't exceed the number of training and inference instances
# available to your account in SageMaker, because all these models are trained and deployed in parallel
models = ["tensorflow-ic-imagenet-mobilenet-v2-075-224-classification-4", 
          "tensorflow-ic-imagenet-inception-v3-classification-4", 
          "pytorch-ic-googlenet",
          "pytorch-ic-alexnet"]

## 5. Helper functions

In [None]:
import os
import time
import random

# Function to query the endpoint
def query_endpoint(img, endpoint_name):
    client = boto3.client('runtime.sagemaker')
    response = client.invoke_endpoint(EndpointName=endpoint_name, ContentType='application/x-image', Body=img, Accept='application/json;verbose')
    return response

# Function to parse prediction response
def parse_prediction(query_response):
    model_predictions = json.loads(query_response['Body'].read())
    predicted_label = model_predictions['predicted_label']
    labels = model_predictions['labels']
    probabilities = model_predictions['probabilities']
    return predicted_label, probabilities, labels 

# Function that returns the keys of all files under a given S3 bucket prefix
def listS3Files(bucket, prefix):
    file_prefix = []
    s3_resource = boto3.resource('s3')
    my_bucket = s3_resource.Bucket(bucket)
    for object_summary in my_bucket.objects.filter(Prefix=prefix):
        if not object_summary.key.endswith("/"): # skip directory placeholder keys
            file_prefix.append(object_summary.key)
    return file_prefix

# Function to calculate model accuracy
# It calculates validation accuracy on the validation dataset set above,
# using at most 100 randomly chosen .jpg files
from sklearn.metrics import accuracy_score
def calcModelAccuracy(endpoint_name, bucket, file_prefixes):
    jpg_files = [fp for fp in file_prefixes if fp.endswith(".jpg")]
    # nothing to score, so there is no accuracy to report
    if len(jpg_files)==0: return None
    # sample at random: S3 lists keys in sorted order, so the first
    # 100 files would most likely all belong to the same class
    size = min(100, len(jpg_files))
    actual_labels = []
    pred_labels = []
    for fp in random.sample(jpg_files, size):
        s3.download_file(bucket, fp, "temp.jpg")
        # the parent directory name is the class label
        actual_label = fp.split("/")[-2]
        with open("temp.jpg", 'rb') as file: img = file.read()
        query_response = query_endpoint(img, endpoint_name)
        predicted_label, probabilities, labels = parse_prediction(query_response)
        actual_labels.append(actual_label)
        pred_labels.append(predicted_label)
        
    acc = accuracy_score(actual_labels, pred_labels)
    os.remove("temp.jpg")
    return acc

# This function downloads validation images to help measure inference time
def downloadImages(bucket, file_prefixes, size):
    images = []
    total_files = len(file_prefixes)
    if total_files==0: return images
    # download images randomly from validation set
    count = 0
    for i in range(size):
        num = random.randrange(total_files)
        fp = file_prefixes[num]
        # find file extension to make sure it is an image
        result = fp.split(".")
        if len(result)==0: continue
        fext = result[-1].lower()
        if fext in ["jpg", "jpeg", "png", "bmp"]:
            fname = f"temp-{count}.jpg"
            s3.download_file(bucket, fp, fname)
            with open(fname, 'rb') as file: img = file.read()
            images.append(img)
            count += 1
            os.remove(fname)
    return images

# Function to measure inference-time
# This function measures the time it takes to make an inference for all the supplied images
# and reports inference time per image in milliseconds (msec)
def timeIT(images, endpoint_name):
    if len(images)==0: return None
    start_time = time.time()
    for img in images:
        query_response = query_endpoint(img, endpoint_name)
    time_taken = (time.time() - start_time)/len(images)*1000 # converting to msec
    return time_taken

# Functions to save results
# This function appends the results of one model to the lists defined during set-up
def writeResults(name, accuracy, time):
    nameList.append(name)
    accList.append(accuracy)
    timeList.append(time)
    
# This function saves the results
def saveResults():
    df = pd.DataFrame({"model-name": nameList, "accuracy": accList, "time-per-inference (msec)": timeList})
    csv_fn = f"./{master_uuid[0:8]}-{EPOCHS}-{LR}-{BATCH_SIZE}.csv"
    df.to_csv(csv_fn, index=False)
    return csv_fn
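
The `timeIT` helper above reports the mean latency per image. A mean can hide occasional slow requests, so the sketch below also reports the median, the 90th percentile, and the maximum. It takes any callable, so it can be tried without an endpoint; `latency_stats` and `fake_invoke` are hypothetical names, and in this notebook the callable would be something like `lambda img: query_endpoint(img, endpoint_name)`.

```python
import statistics
import time


def latency_stats(payloads, invoke):
    """Time invoke(payload) for each payload and return stats in milliseconds."""
    if not payloads:
        return None
    samples = []
    for payload in payloads:
        start = time.perf_counter()
        invoke(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 90th percentile: index ceil(0.9 * n) - 1, in integer arithmetic
    p90_index = (9 * len(samples) + 9) // 10 - 1
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "p90_ms": samples[p90_index],
        "max_ms": samples[-1],
    }


def fake_invoke(payload):
    """Stand-in for an endpoint call; it only sleeps briefly."""
    time.sleep(0.001)


stats = latency_stats([b"x"] * 20, fake_invoke)
print(stats)
```

Note that timings taken from a notebook include the network round trip to the endpoint, not only the model's compute time.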

In [None]:
from sagemaker import image_uris, model_uris, script_uris, hyperparameters
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base

# Function to finetune the model. It uses "ml.p3.8xlarge" as a training instance.
# Here, we retrieve the training docker container, the training algorithm source, 
# the pre-trained base model, and a python dictionary of the training hyper-parameters 
# that the algorithm accepts with their default values. Note that the model_version="*" 
# fetches the latest model. Also, we do need to specify the training_instance_type to fetch train_image_uri.
def fineTuneModel(model_id):
    model_version = "*"
    uuid = master_uuid[0:8]
    training_job_name = f"jumpstart-example-train-model-compare-{uuid}-{model_id}-FT"
    training_instance_type = "ml.p3.8xlarge"

    train_image_uri = image_uris.retrieve(
        region=None,
        framework=None,
        model_id=model_id,
        model_version=model_version,
        image_scope="training",
        instance_type=training_instance_type,
    )
    # Retrieve the training script
    train_source_uri = script_uris.retrieve(
        model_id=model_id, model_version=model_version, script_scope="training"
    )
    # Retrieve the pre-trained model tarball to further fine-tune
    train_model_uri = model_uris.retrieve(
        model_id=model_id, model_version=model_version, model_scope="training"
    )

    # Two kinds of parameters need to be set for training.
    # The first kind are the parameters of the training job. These include:
    # (i) Training data path: the S3 folder in which the input data is stored.
    # (ii) Output path: the S3 folder in which the training output is stored.
    # (iii) Training instance type: the type of machine on which to run the training.
    # Typically, we use GPU instances for this training. We defined the training instance type
    # above to fetch the correct train_image_uri.
    # The second kind are the algorithm-specific training hyper-parameters.
    
    training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}"

    output_bucket = sess.default_bucket()
    output_prefix = uuid + "-jumpstart-example-ic-training-" + model_id

    s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"

    # Retrieve the default hyper-parameters for fine-tuning the model
    hps = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

    # [Optional] Override default hyperparameters with custom values
    hps["epochs"] = EPOCHS
    hps["adam-learning-rate"] = LR
    hps["batch-size"] = BATCH_SIZE

    # Create SageMaker Estimator instance
    ic_estimator = Estimator(
        role=aws_role,
        image_uri=train_image_uri,
        source_dir=train_source_uri,
        model_uri=train_model_uri,
        entry_point="transfer_learning.py",
        instance_count=1,
        instance_type=training_instance_type,
        max_run=360000,
        hyperparameters=hps,
        output_path=s3_output_location,
        base_job_name=training_job_name,
    )

    # Launch a SageMaker Training job by passing s3 path of the training data
    ic_estimator.fit({"training": training_dataset_s3_path}, logs=True, wait=False)
        
    training_job_name = ic_estimator.latest_training_job.name
    estimator = ic_estimator
    return training_job_name, estimator
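
SageMaker limits training job and endpoint names to 63 characters, and the SDK appends a timestamp to `base_job_name`. Several JumpStart model IDs are long enough that a prefix plus the full ID cannot fit. The sketch below shows one way to build a shorter base name that still distinguishes models that share a long common prefix. `short_job_base` is a hypothetical helper, and the 24 characters reserved for the timestamp suffix are an assumption about the SDK's format.

```python
import hashlib
import re

MAX_NAME_LENGTH = 63    # SageMaker limit for training job and endpoint names
TIMESTAMP_RESERVE = 24  # assumption: room for a "-YYYY-MM-DD-HH-MM-SS-mmm" suffix


def short_job_base(prefix, model_id, run_id,
                   max_length=MAX_NAME_LENGTH - TIMESTAMP_RESERVE):
    """Build a base name that still identifies the model after shortening."""
    # A short hash of the full model ID keeps truncated names distinct
    digest = hashlib.sha1(model_id.encode("utf-8")).hexdigest()[:6]
    tail = f"-{run_id}-{digest}"
    room = max_length - len(prefix) - len(tail) - 1
    if room < 1:
        raise ValueError("prefix and run_id leave no room for the model id")
    # Keep only characters allowed in job names, then trim to the space left
    cleaned = re.sub(r"[^a-zA-Z0-9-]", "-", model_id)[:room].strip("-")
    return f"{prefix}-{cleaned}{tail}"


base_a = short_job_base(
    "js-ic", "tensorflow-ic-imagenet-mobilenet-v2-075-224-classification-4", "1a2b3c4d"
)
base_b = short_job_base(
    "js-ic", "tensorflow-ic-imagenet-inception-v3-classification-4", "1a2b3c4d"
)
print(base_a)
print(base_b)
```

The result could be passed as `base_job_name` when creating the estimator; whether that is preferable to the longer descriptive name is a judgment call.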

In [None]:
# Function to deploy a fine-tuned model. The model is deployed to an "ml.p3.2xlarge" instance.
# A trained model does nothing on its own. We now want to use the model to perform inference.
# For this example, that means predicting the class label of an image.
# We start by retrieving the artifacts for deploying an endpoint for the fine-tuned model.

def deployFineTunedModel(model_id, ic_estimator):
    model_version = "*"
    uuid = master_uuid[0:8]
    inference_instance_type = "ml.p3.2xlarge"

    # Retrieve the inference docker container uri
    deploy_image_uri = image_uris.retrieve(
        region=None,
        framework=None,
        image_scope="inference",
        model_id=model_id,
        model_version=model_version,
        instance_type=inference_instance_type,
    )
    # Retrieve the inference script uri
    deploy_source_uri = script_uris.retrieve(
        model_id=model_id, model_version=model_version, script_scope="inference"
    )

    endpoint_name = name_from_base(f"jumpstart-example-infer-model-compare-{model_id}-")

    # Use the estimator to deploy to a SageMaker endpoint
    finetuned_predictor = (ic_estimator).deploy(
        initial_instance_count=1,
        instance_type=inference_instance_type,
        entry_point="inference.py",
        image_uri=deploy_image_uri,
        source_dir=deploy_source_uri,
        endpoint_name=endpoint_name,
        wait = False
    )
    
    return endpoint_name, finetuned_predictor

## 6. Run all models

In [None]:
# run all models
import time

# get file prefixes for all validation data files and download inference images
INF_TEST_NUM_IMAGES = 100
val_file_prefixes = listS3Files(validation_data_bucket, validation_data_prefix)
images = downloadImages(validation_data_bucket, val_file_prefixes, INF_TEST_NUM_IMAGES)

def run():
    uuid = master_uuid[0:8]
    client = boto3.client('sagemaker')
        
    # fine-tuned training
    tjNames = []
    estimators = []
    for model_id in models:
        training_job_name, ic_estimator = fineTuneModel(model_id)
        tjNames.append(training_job_name)
        estimators.append(ic_estimator)
        time.sleep(10)
        
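    while(True):
        count = 0
        for tj in tjNames:
            response = client.describe_training_job(TrainingJobName=tj)
            status = response['TrainingJobStatus']
            print(status)
            if status=='Completed':
                count += 1
                print("training job completed: " + tj)
            elif status in ('Failed', 'Stopped'):
                # stop waiting, otherwise this loop would never end
                raise RuntimeError("training job " + tj + " ended with status " + status)
        if count==len(tjNames): break
        time.sleep(60)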
        
    # fine tuned deploy
    endpoints = []
    predictors = []
    for i in range(len(models)):
        model_id = models[i]
        ep, pred = deployFineTunedModel(model_id, estimators[i])
        endpoints.append(ep)
        predictors.append(pred)
        
    while(True):
        count = 0
        for ep in endpoints:
            response = client.describe_endpoint(EndpointName=ep)
            status = response['EndpointStatus']
            print(status)
            if status=='InService':
                count += 1
                print("endpoint in service: " + ep)
            elif status=='Failed':
                # stop waiting, otherwise this loop would never end
                raise RuntimeError("endpoint " + ep + " failed to deploy")
        if count==len(endpoints): break
        time.sleep(60)
            
    for i in range(len(models)):
        print("making inferences for model: " + models[i])
        name = uuid + "-" + models[i] + "-FT"
        accuracy = calcModelAccuracy(endpoints[i], validation_data_bucket, val_file_prefixes)
        mytime = timeIT(images, endpoints[i])
        writeResults(name, accuracy, mytime)
        predictors[i].delete_model()
        predictors[i].delete_endpoint()
        
    # Save results to a csv file
    csv_fn = saveResults()
    return csv_fn
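
Both waiting loops in `run()` follow the same pattern: poll a status until every resource reaches a success state. The sketch below factors that pattern into one function that also stops on a failure state or after an optional timeout. `wait_for_status` and `fake_describe` are hypothetical names; in this notebook `describe` would wrap `client.describe_training_job(...)["TrainingJobStatus"]` or `client.describe_endpoint(...)["EndpointStatus"]`.

```python
import time


def wait_for_status(describe, names, success, failures,
                    poll_seconds=60, timeout_seconds=None, sleep=time.sleep):
    """Poll describe(name) until every name reports the `success` status.

    Raises RuntimeError as soon as any name reports a status in `failures`,
    and TimeoutError if `timeout_seconds` elapses first.
    """
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    pending = set(names)
    while pending:
        for name in sorted(pending):
            status = describe(name)
            if status == success:
                pending.discard(name)
            elif status in failures:
                raise RuntimeError(f"{name} ended with status {status}")
        if not pending:
            break
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError(f"still waiting for: {sorted(pending)}")
        sleep(poll_seconds)


# Stand-in for the SageMaker describe calls: each job reports a scripted
# sequence of statuses, one per poll.
_responses = {
    "job-a": iter(["InProgress", "Completed"]),
    "job-b": iter(["InProgress", "InProgress", "Completed"]),
}


def fake_describe(name):
    return next(_responses[name])


# sleep is replaced with a no-op so the example finishes immediately
wait_for_status(fake_describe, ["job-a", "job-b"], "Completed",
                {"Failed", "Stopped"}, sleep=lambda seconds: None)
print("all jobs completed")
```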

In [None]:
csv_fn = run()

### Results are shown below

In [None]:
pd.read_csv(csv_fn)
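
Once the table is available, a common next step is to pick the fastest model that meets an accuracy target. The sketch below does this with the standard `csv` module on placeholder rows that use the same column names as `saveResults()`. The model names and numbers are made up for illustration and are not measured results; `rank_models` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows in the column layout that saveResults() writes.
# The numbers are made up for illustration and are not measured results.
sample_csv = """model-name,accuracy,time-per-inference (msec)
model-a-FT,0.90,40.0
model-b-FT,0.95,120.0
model-c-FT,0.80,25.0
"""


def rank_models(csv_text, min_accuracy=0.0):
    """Keep models at or above min_accuracy, ordered from fastest to slowest."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = [r for r in rows if float(r["accuracy"]) >= min_accuracy]
    return sorted(kept, key=lambda r: float(r["time-per-inference (msec)"]))


fastest_accurate = rank_models(sample_csv, min_accuracy=0.85)
ranked_names = [r["model-name"] for r in fastest_accurate]
print(ranked_names)
```

To apply the same idea to a real run, read the text of the file named by `csv_fn` and pass it to the function.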

## Conclusion

In this notebook, we demonstrated how to use JumpStart to build image classification models and compare them on multiple dimensions of interest, such as model accuracy, training time, and inference latency. We provided the code to run this exercise on your own dataset; you can pick any models of interest that are presently available for image classification in the JumpStart model hub. You can obtain training times from the SageMaker console under Training / Training Jobs. We encourage you to give it a try today. For more details on JumpStart, refer to [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html).