# Train and Deploy a Deep Learning Model on AWS

In this notebook we use SageMaker to fine-tune a pretrained model with hyperparameter tuning. Once the best hyperparameters have been found, we train an estimator with SageMaker debugging and profiling enabled. The resulting model is then deployed and tested.

In [None]:
!pip install smdebug

In [None]:
import sagemaker
import boto3

from sagemaker.pytorch import PyTorch
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)

In [None]:
session = sagemaker.Session()

bucket_sagemaker = session.default_bucket()
print("Default Bucket: {}".format(bucket_sagemaker))

region = session.boto_region_name
print("AWS Region: {}".format(region))

role = sagemaker.get_execution_role()
print("RoleArn: {}".format(role))

## Dataset
In this project we use the dog breed classification dataset to distinguish between different breeds of dogs in images.

#### Download and unzip data

In [None]:
# !wget https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip # Slower
!aws s3 cp s3://udacity-aind/dog-project/dogImages.zip ./ # Much faster!
!unzip dogImages.zip

In [None]:
! ls

How many images do we have?

In [None]:
! find dogImages/train -type f | wc -l

In [None]:
! find dogImages/test -type f | wc -l

In [None]:
! find dogImages/valid -type f | wc -l
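
The same counts can be collected in Python, which is handy when the numbers are needed later in the notebook. This is a minimal sketch, assuming the archive unpacks into a `dogImages` directory with `train`, `valid` and `test` subfolders; the helper name `count_files_per_split` is hypothetical.

```python
from pathlib import Path


def count_files_per_split(root, splits=("train", "valid", "test")):
    """Return {split: number of regular files found under root/split}."""
    root = Path(root)
    counts = {}
    for split in splits:
        split_dir = root / split
        if split_dir.is_dir():
            # Walk the split recursively and count regular files only.
            counts[split] = sum(1 for p in split_dir.rglob("*") if p.is_file())
        else:
            # A missing split folder is reported as zero files.
            counts[split] = 0
    return counts
```

Calling `count_files_per_split("dogImages")` returns one entry per split, so the three shell cells above collapse into a single dictionary.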

#### Upload data to an S3 bucket

In [None]:
BUCKET_DATA_PATH = "s3://dog-breed-classification/data/"

In [None]:
!aws s3 sync ./dogImages/ {BUCKET_DATA_PATH}

In [None]:
# Alternative:
# s3_data_path = session.upload_data(path="dogImages", bucket=bucket_sagemaker, key_prefix="data")
# print(s3_data_path)

#TODO: test this!!!
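
`upload_data` takes the bucket name and the key prefix as separate arguments, whereas `BUCKET_DATA_PATH` is a single S3 URI. The sketch below shows one way to split such a URI with the standard library; the helper name `split_s3_uri` is hypothetical and not part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {!r}".format(uri))
    # netloc holds the bucket name; the path holds the key prefix.
    return parsed.netloc, parsed.path.strip("/")
```

For the path used in this notebook, `split_s3_uri("s3://dog-breed-classification/data/")` yields the bucket `dog-breed-classification` and the prefix `data`.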

## Hyperparameter Tuning
In this section, we fine-tune a pretrained model with hyperparameter tuning.

**Note:** We will need to use the `hpo.py` script to perform hyperparameter tuning.

We start by instantiating the estimator. For the estimator we need:
- `entry_point`: The path of the training script
- `base_job_name`: The name of the job
- `instance_type`: The type of training instance you want to use
- `instance_count`: The number of training instances to use
- `framework_version`: The version of PyTorch you want in your training instance
- `py_version`: The version of Python you want in your training instance

In [2]:
estimator = PyTorch(
    entry_point="hpo.py",
    role=role,
    py_version="py36",  # the PyTorch 1.8 training containers ship with Python 3.6
    framework_version="1.8",
    instance_count=1,
    instance_type="ml.m5.large"
)

**TODO**
- How many images do we have?
- How do we justify the hyperparameters?
- Do we need to incorporate epochs?
- Do we need to incorporate the domain?

The hyperparameters we want to tune are specified in a dictionary as shown below:

In [None]:
hyperparameter_ranges = {
    "lr": ContinuousParameter(0.001, 0.1),
    "batch-size": CategoricalParameter([32, 64, 128, 256, 512]),
}
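
SageMaker passes each hyperparameter to the entry-point script as a command-line argument, so the keys above have to match the flags that the script parses. `hpo.py` is not shown in this notebook; the sketch below is a hypothetical version of its argument parsing, and the default values are assumptions.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse the tuned hyperparameters from the command line."""
    parser = argparse.ArgumentParser(description="Hypothetical argument parsing for hpo.py")
    # Flag names mirror the keys of hyperparameter_ranges.
    parser.add_argument("--lr", type=float, default=0.01, help="learning rate (assumed default)")
    parser.add_argument("--batch-size", type=int, default=64, help="batch size (assumed default)")
    # Ignore any extra arguments the platform may append.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

Note that argparse exposes `--batch-size` as `args.batch_size`, with the hyphen turned into an underscore.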

We also need to specify the metric to optimize and how SageMaker can extract it from the training logs. Since we are optimizing for loss, the objective must be minimized; a metric such as accuracy would instead be maximized.

In [None]:
objective_metric_name = "average test loss"
objective_type = "Minimize"
metric_definitions = [{"Name": "average test loss", "Regex": "Test set: Average loss: ([0-9\\.]+)"}]
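
The metric value is taken from the capture group of the regular expression, so the training script has to print a line that the pattern matches. The sketch below checks the pattern locally against a sample log line; the log line is made up for illustration, and the real format depends on what `hpo.py` prints.

```python
import re

# Same pattern as in metric_definitions.
metric_regex = "Test set: Average loss: ([0-9\\.]+)"

# Hypothetical log line; the actual output of hpo.py may differ.
sample_log_line = "Test set: Average loss: 0.6931, Accuracy: 412/836 (49%)"


def extract_metric(regex, line):
    """Return the first capture group as a float, or None if there is no match."""
    match = re.search(regex, line)
    return float(match.group(1)) if match else None


loss = extract_metric(metric_regex, sample_log_line)
```

Because the character class `[0-9\.]` also matches a period, a log line that ends the sentence directly after the number would capture the trailing dot as well. Following the value with a comma or a space avoids that.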

With these in place, we instantiate the `tuner` object:

In [None]:
tuner = HyperparameterTuner(
    estimator,
    objective_metric_name,
    hyperparameter_ranges,
    metric_definitions,
    max_jobs=4,
    max_parallel_jobs=2,
    objective_type=objective_type,
)

We fit the hyperparameter tuner:

In [None]:
tuner.fit({"training": BUCKET_DATA_PATH}) # TODO: check how this works in practice
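
The dictionary key passed to `fit` is the channel name. Inside the training container, SageMaker copies the channel's data to `/opt/ml/input/data/<channel>` and exposes that path through the environment variable `SM_CHANNEL_<CHANNEL>` with the name in upper case, so the `training` channel becomes `SM_CHANNEL_TRAINING`. A minimal sketch of how a training script might resolve the directory follows; the helper name `resolve_data_dir` is hypothetical.

```python
import os


def resolve_data_dir(channel="training", environ=None):
    """Return the local directory holding the data of the given input channel."""
    environ = os.environ if environ is None else environ
    env_var = "SM_CHANNEL_" + channel.upper()
    # Fall back to the conventional container path when the variable is unset.
    return environ.get(env_var, "/opt/ml/input/data/{}".format(channel))
```

In the script, the result of `resolve_data_dir()` would then serve as the root folder for the `train`, `valid` and `test` subdirectories.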

Get the best training job and the corresponding estimator:

In [None]:
tuner.best_training_job()

In [None]:
best_estimator = tuner.best_estimator()
best_estimator

Get the hyperparameters of the best trained model:

In [None]:
best_estimator.hyperparameters()
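
The returned dictionary holds string values and typically also contains SageMaker-internal entries next to the tuned ones, so it usually needs cleaning before it is reused for the final training job. The sketch below assumes that values may arrive wrapped in extra double quotes; the example dictionary is hypothetical, so check it against the actual output of the cell above.

```python
def clean_hyperparameters(raw):
    """Keep only the tuned keys and convert their string values to numbers."""
    return {
        "lr": float(str(raw["lr"]).strip('"')),
        "batch-size": int(str(raw["batch-size"]).strip('"')),
    }


# Hypothetical example of what the raw dictionary might look like.
raw_example = {
    "_tuning_objective_metric": '"average test loss"',
    "batch-size": '"64"',
    "lr": "0.0123",
    "sagemaker_program": '"hpo.py"',
}

best_hyperparameters = clean_hyperparameters(raw_example)
```

The cleaned dictionary can be passed as the `hyperparameters` argument of the estimator created in the next section.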

## Model Profiling and Debugging
Using the best hyperparameters, we create and fine-tune a new model.

We will need to use the `train_model.py` script to perform model profiling and debugging.

In [None]:
# TODO: Set up debugging and profiling rules and hooks

In [None]:
# TODO: Create and fit an estimator

estimator = None  # TODO: Your estimator here

In [None]:
# TODO: Plot a debugging output.

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?

In [None]:
# TODO: Display the profiler output

## Model Deploying

In [None]:
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.t2.medium")

In [None]:
# TODO: Run a prediction on the endpoint

image = None  # TODO: Your code to load and preprocess image to send to endpoint for prediction
response = predictor.predict(image)

In [None]:
# TODO: Remember to shut down and delete your endpoint once your work is done
predictor.delete_endpoint()