- Upload the CIFAR10 data to an S3 bucket.
- Tune 3 hyperparameters. You can use hyperparameters that are already exposed as command-line arguments in the training script, or add your own.
- Deploy the best trained model, query it, and get the result.

**hpo_deploy.ipynb** Tasks
1. Upload the data to an S3 bucket through the `sagemaker_session` object, boto3 or the AWS CLI.
2. Initialise your hyperparameters.
3. Create your `HyperparameterTuner` object.
4. Train your model.
5. Query the endpoint.

In [2]:
!pip install --upgrade sagemaker

Collecting sagemaker
  Using cached sagemaker-2.135.1.post0-py2.py3-none-any.whl
Collecting importlib-metadata<5.0,>=1.4.0
  Using cached importlib_metadata-4.13.0-py3-none-any.whl (23 kB)
Installing collected packages: importlib-metadata, sagemaker
  Attempting uninstall: importlib-metadata
    Found existing installation: importlib-metadata 6.0.0
    Uninstalling importlib-metadata-6.0.0:
      Successfully uninstalled importlib-metadata-6.0.0
  Attempting uninstall: sagemaker
    Found existing installation: sagemaker 2.132.0
    Uninstalling sagemaker-2.132.0:
      Successfully uninstalled sagemaker-2.132.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pytest-astropy 0.8.0 requires pytest-cov>=2.0, which is not installed.
pytest-astropy 0.8.0 requires pytest-filter-subpackage>=0.1, which is not installed.
docker-compose 1.29.2 requires PyYAML<6,>=3.

In [5]:
# !conda install pytorch torchvision cpuonly -c pytorch
# import torchvision
# torchvision.__version__

## Hyperparameter Tuning in SageMaker

In [6]:
import sagemaker
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)

sagemaker_session = sagemaker.Session()

bucket = sagemaker_session.default_bucket()
prefix = "sagemaker/DEMO-pytorch-cifar"

role = sagemaker.get_execution_role()

### Fetching CIFAR10 dataset and Uploading to S3

In [7]:
from torchvision.datasets import CIFAR10
from torchvision import transforms

local_dir = 'data'
# Set CIFAR10.mirrors to the SageMaker sample-files location, then download the dataset into local_dir.
# Note: the download log below shows the default torchvision URL being used.
CIFAR10.mirrors = ["https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/CIFAR10/"]
CIFAR10(
    local_dir,
    download=True,
    transform=transforms.Compose(
        [transforms.ToTensor()]
    )
)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz


HBox(children=(FloatProgress(value=0.0, max=170498071.0), HTML(value='')))


Extracting data/cifar-10-python.tar.gz to data


Dataset CIFAR10
    Number of datapoints: 50000
    Root location: data
    Split: Train
    StandardTransform
Transform: Compose(
               ToTensor()
           )

#### Upload the downloaded data to an S3 bucket

In [8]:
# TODO: Upload the data to an S3 bucket. You can use the sagemaker_session object, boto3 or the AWS CLI
inputs = sagemaker_session.upload_data(path="data", bucket=bucket, key_prefix=prefix)
print("input spec (in this case, just an S3 path): {}".format(inputs))

input spec (in this case, just an S3 path): s3://sagemaker-us-east-1-293789295245/sagemaker/DEMO-pytorch-cifar
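To make the resulting S3 layout concrete, here is a minimal sketch of how a local directory could map to object keys under `key_prefix`. It assumes each file is stored under the prefix plus its path relative to the uploaded directory. The helper name `plan_s3_keys` and the demo file contents are hypothetical, and the snippet touches neither S3 nor the SageMaker SDK.

```python
import os
import tempfile


def plan_s3_keys(local_dir, key_prefix):
    """Return (local_path, s3_key) pairs for every file under local_dir.

    Hypothetical helper: assumes key = key_prefix + "/" + path relative to local_dir.
    """
    pairs = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, local_dir)
            # S3 keys use forward slashes, whatever separator the local OS uses
            s3_key = "/".join([key_prefix] + relative.split(os.sep))
            pairs.append((local_path, s3_key))
    # Sort by key so the result does not depend on directory traversal order
    return sorted(pairs, key=lambda pair: pair[1])


# Self-contained demo on a temporary directory with placeholder files
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "cifar-10-batches-py"))
    for name in ("data_batch_1", "test_batch"):
        with open(os.path.join(tmp, "cifar-10-batches-py", name), "wb") as fh:
            fh.write(b"placeholder")
    demo_keys = [key for _path, key in plan_s3_keys(tmp, "sagemaker/DEMO-pytorch-cifar")]

print(demo_keys)
```

Listing the bucket under the prefix (for example with the AWS CLI) is a quick way to check whether the real upload follows this layout.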


### Create an estimator to train the model

In [9]:
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="cifar.py",
    role=role,
    py_version='py36',
    framework_version="1.8",
    instance_count=1,
    instance_type="ml.m5.large"
)

### Hyperparameter Ranges

In [10]:
#TODO: Initialise your hyperparameters
hyperparameter_ranges = {
    "lr": ContinuousParameter(0.001, 0.1),  # Any real value between 0.001 and 0.1
    "batch-size": CategoricalParameter([32, 64, 128, 256, 512]),  # One of the listed values
    "epochs": IntegerParameter(2, 4),  # Any integer among 2, 3 and 4
} 
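SageMaker passes each tuned hyperparameter to the entry-point script as a command-line argument named after its key (for example `--batch-size 64`). Since `cifar.py` is not shown here, the parser below is only a sketch of what the script would need to accept. The default values and the `parse_hyperparameters` name are assumptions.

```python
import argparse


def parse_hyperparameters(argv):
    """Parse the three tuned hyperparameters from a list of CLI tokens."""
    parser = argparse.ArgumentParser(description="CIFAR10 training (sketch)")
    # Argument names must match the keys used in hyperparameter_ranges
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--epochs", type=int, default=2)
    return parser.parse_args(argv)


# Example: the kind of argument list a single tuning job might receive
args = parse_hyperparameters(["--lr", "0.05", "--batch-size", "128", "--epochs", "3"])
print(args.lr, args.batch_size, args.epochs)
```

Note that argparse exposes `--batch-size` as `args.batch_size`, with the hyphen turned into an underscore.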

### Metric Optimization

In [11]:
objective_metric_name = "average test loss"
objective_type = "Minimize"
metric_definitions = [{"Name": "average test loss", "Regex": "Test set: Average loss: ([0-9\\.]+)"}]
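The tuner reads the objective metric by applying the regex to the training job's log output, so it helps to test the pattern locally before launching jobs. The sketch below uses the same pattern with Python's `re` module. The sample log line is hypothetical, because the exact text printed by `cifar.py` is not shown here.

```python
import re

# Same pattern as in metric_definitions, written as a raw string
pattern = re.compile(r"Test set: Average loss: ([0-9\.]+)")

# Hypothetical log line; the real format depends on what cifar.py prints
log_line = "Test set: Average loss: 1.2345, Accuracy: 5600/10000 (56%)"

match = pattern.search(log_line)
# The first capture group holds the numeric value of the metric
average_test_loss = float(match.group(1)) if match else None
print(average_test_loss)
```

If the script's output does not match the pattern, no metric value is captured, so a quick local check like this can save a failed tuning run.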

### HPO Tuner

In [12]:
#TODO: Create your HyperparameterTuner Object
tuner = HyperparameterTuner(
    estimator,               # Pass the estimator
    objective_metric_name,   # Pass the objective metric name to be optimized
    hyperparameter_ranges,   # Hyperparameter ranges
    metric_definitions,      # Metric definitions
    max_jobs=4,              # No. of training jobs to be carried out using combinations of hyperparameters
    max_parallel_jobs=2,     # No. of jobs to be carried out at a time using parallelism
    objective_type=objective_type,  # Type of optimization to the objective metric
)

### Fit the tuner

In [13]:
#TODO: Train your model
tuner.fit({"training": inputs})

No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config


.........................................................................!
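With `objective_type = "Minimize"`, the best training job is the one with the lowest final value of the objective metric. The following sketch illustrates that selection rule in plain Python. The `pick_best` helper, the job names and the loss values are placeholders for illustration, not results from this notebook.

```python
def pick_best(results, objective_type="Minimize"):
    """Return the (job_name, metric) pair with the best final objective value."""
    if objective_type not in ("Minimize", "Maximize"):
        raise ValueError("objective_type must be 'Minimize' or 'Maximize'")
    chooser = min if objective_type == "Minimize" else max
    return chooser(results.items(), key=lambda item: item[1])


# Placeholder job names and losses, not output from the tuning job above
example_results = {"job-a": 1.42, "job-b": 1.17, "job-c": 1.63, "job-d": 1.30}
best_job = pick_best(example_results, "Minimize")
print(best_job)
```

In the SageMaker SDK, `tuner.best_training_job()` returns the name of the best job once tuning has finished.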


### Deploy the endpoint

**`tuner.deploy` deploys only the model from the best training job.**

In [14]:
predictor = tuner.deploy(initial_instance_count=1, instance_type="ml.t2.medium")


2023-03-02 10:33:37 Starting - Found matching resource for reuse
2023-03-02 10:33:37 Downloading - Downloading input data
2023-03-02 10:33:37 Training - Training image download completed. Training in progress.
2023-03-02 10:33:37 Uploading - Uploading generated training model
2023-03-02 10:33:37 Completed - Resource retained for reuse
-------!

### Query the Endpoint

In [52]:
import pickle

import numpy as np

file = 'data/cifar-10-batches-py/data_batch_1'

def unpickle(file):
    # Load one CIFAR-10 batch file; the keys of the returned dict are bytes (e.g. b'data')
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch

data = unpickle(file)
data = np.reshape(data[b'data'][0], (3, 32, 32))  # Reshape the first image from a 3072-vector (3*32*32) to (3, 32, 32)
data = np.array(data).astype(np.float32)  # Convert the pixel values to float32
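The reshape works because each CIFAR-10 image is stored as 3072 values: 1024 red, then 1024 green, then 1024 blue, each channel in row-major order. As a small sketch of that layout, the hypothetical helper `flat_index` below maps a `(channel, row, col)` position back to its index in the flat vector.

```python
def flat_index(channel, row, col, height=32, width=32):
    """Position of pixel (channel, row, col) inside the flat 3072-value image vector."""
    return channel * height * width + row * width + col


print(flat_index(0, 0, 0))    # first red value   -> 0
print(flat_index(1, 0, 0))    # first green value -> 1024
print(flat_index(2, 31, 31))  # last blue value   -> 3071
```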

In [58]:
payload = np.expand_dims(data, axis=0)   # Add a batch dimension: (3, 32, 32) -> (1, 3, 32, 32)
response = predictor.predict(payload) # TODO: Query the endpoint
print(response)

labeled_predictions = list(zip(range(10), response[0]))
print("Labeled predictions: ")
print(labeled_predictions)
print()

# Sort by score, highest first
labeled_predictions.sort(key=lambda label_and_score: label_and_score[1], reverse=True)
print("Most likely answer: {}".format(labeled_predictions[0]))

[[ -309.66009521  -129.29052734  -358.48867798  -305.5826416
   -482.46270752  -154.92604065  -260.54373169     0.
  -1023.86523438  -209.98905945]]
Labeled predictions: 
[(0, -309.66009521484375), (1, -129.29052734375), (2, -358.4886779785156), (3, -305.5826416015625), (4, -482.46270751953125), (5, -154.92604064941406), (6, -260.5437316894531), (7, 0.0), (8, -1023.865234375), (9, -209.9890594482422)]

Most likely answer: (7, 0.0)
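The raw scores are easier to read as class names and probabilities. The sketch below assumes the endpoint returns one score per CIFAR-10 class in the standard label order, and applies a softmax, which gives valid probabilities whether the scores are logits or log-probabilities. The `to_probabilities` helper is hypothetical, and the score values are copied from the response printed above.

```python
import math

# CIFAR-10 class names in label order
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# Scores copied from the endpoint response shown in this notebook
scores = [-309.66009521, -129.29052734, -358.48867798, -305.5826416,
          -482.46270752, -154.92604065, -260.54373169, 0.0,
          -1023.86523438, -209.98905945]


def to_probabilities(raw_scores):
    """Softmax over the scores, shifted by the maximum for numerical stability."""
    top = max(raw_scores)
    exps = [math.exp(s - top) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = to_probabilities(scores)
# Index of the highest-probability class
best = max(range(len(probs)), key=probs.__getitem__)
print(CLASSES[best], probs[best])
```

Here index 7 corresponds to `horse`. The very large score gaps suggest the model is extremely confident, which may be worth checking against the preprocessing used during training.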


### Cleanup

After you have finished this exercise, remember to delete the prediction endpoint to release the instance associated with it.

In [59]:
predictor.delete_endpoint()