# Deep learning: classifying dog breeds

This project sets up infrastructure to manage the lifecycle of a machine learning model. It uses AWS resources such as SageMaker and EC2 to create, train, and deploy a pretrained model that classifies images from a dataset containing different dog breeds.

The focus of this project is the infrastructure that supports machine learning models in operation. For that reason, tools for monitoring and testing the model's performance are used in addition to the activities mentioned above.

In [None]:
#Install needed dependencies
!pip install smdebug
#!pip install sagemaker #Sagemaker instances already have this library

In [4]:
#Import libraries
import sagemaker
import boto3
import os

session = sagemaker.Session()
bucket = session.default_bucket()

## Dataset

This project uses the Stanford Cars dataset, which according to its creators:
> Contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.

In [None]:
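# Download and extract the data
import torchvision
from torchvision import transforms

# Assumption: the notebook does not define these transforms, so a minimal
# resize-and-tensor pipeline is used here as a placeholder.
training_transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
test_transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

trainset = torchvision.datasets.StanfordCars(root='./data', split='train',
        download=True, transform=training_transform)

testset = torchvision.datasets.StanfordCars(root='./data', split='test',
        download=True, transform=test_transform)

# Upload to S3
os.environ["DEFAULT_S3_BUCKET"] = bucket  # default bucket that SageMaker uses
!aws s3 sync ./data s3://${DEFAULT_S3_BUCKET}/carImages/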

#### For curious readers:

Datasets included in the PyTorch library are listed on [this page](https://pytorch.org/vision/stable/datasets.html). <br><br>
Documentation for the Stanford Cars dataset, which is used in this project, is available on the [PyTorch website](https://pytorch.org/vision/stable/generated/torchvision.datasets.StanfordCars.html#torchvision.datasets.StanfordCars). The original documentation for this dataset can be found [here](https://ai.stanford.edu/~jkrause/cars/car_dataset.html).

## Hyperparameter Tuning
**TODO:** This is the part where you will fine-tune a pretrained model with hyperparameter tuning. You have to tune a minimum of two hyperparameters, but you are encouraged to tune more. You are also encouraged to explain why you chose those particular hyperparameters and their ranges.

**Note:** You will need to use the `hpo.py` script to perform hyperparameter tuning.

In [None]:
#TODO: Declare your HP ranges, metrics etc.
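
A minimal sketch of what this cell might declare. The hyperparameter names (`lr`, `batch_size`), their ranges, the metric name, and the log format are all assumptions, since the notebook does not specify what `hpo.py` accepts or prints. The small `extract_metric` helper is only there to check that the regex captures the number from a sample log line.

```python
import re

# Assumption: hpo.py logs its evaluation loss in this format; adjust the
# regex to whatever the training script actually prints.
objective_metric_name = "average test loss"
objective_type = "Minimize"
metric_definitions = [
    {"Name": objective_metric_name, "Regex": r"Test set: Average loss: ([0-9\.]+)"}
]


def make_hyperparameter_ranges():
    """Build the search space (hypothetical names and bounds)."""
    # Imported inside the function so the regex check below does not need the SDK.
    from sagemaker.tuner import CategoricalParameter, ContinuousParameter

    return {
        "lr": ContinuousParameter(0.001, 0.1),
        "batch_size": CategoricalParameter([32, 64, 128]),
    }


def extract_metric(log_line):
    """Apply the metric regex to one log line and return the captured value."""
    match = re.search(metric_definitions[0]["Regex"], log_line)
    return float(match.group(1)) if match else None
```

Checking the regex locally against a line copied from a real training log is cheaper than discovering after a full tuning run that no metric was captured.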

In [None]:
#TODO: Create estimators for your HPs

estimator = # TODO: Your estimator here

tuner = # TODO: Your HP tuner here
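
One possible way to fill in the estimator and tuner, sketched under stated assumptions: the framework version, Python version, instance type, and job counts below are illustrative choices, not values taken from this project. The helper `build_tuner` is hypothetical; it takes the ranges and metric definitions as arguments so it does not depend on how they were declared.

```python
def build_tuner(role, hyperparameter_ranges, objective_metric_name,
                metric_definitions, max_jobs=4, max_parallel_jobs=2):
    """Create a PyTorch estimator for hpo.py and wrap it in a tuner."""
    # Local imports keep this helper self-contained.
    from sagemaker.pytorch import PyTorch
    from sagemaker.tuner import HyperparameterTuner

    estimator = PyTorch(
        entry_point="hpo.py",           # script named in the note above
        role=role,
        framework_version="1.13",       # assumption: any supported version works
        py_version="py39",
        instance_count=1,
        instance_type="ml.g4dn.xlarge",  # assumption: a single GPU instance
    )
    tuner = HyperparameterTuner(
        estimator,
        objective_metric_name,
        hyperparameter_ranges,
        metric_definitions,
        objective_type="Minimize",      # assumption: the objective is a loss
        max_jobs=max_jobs,
        max_parallel_jobs=max_parallel_jobs,
    )
    return estimator, tuner


# Example usage inside a SageMaker notebook (not executed here):
# role = sagemaker.get_execution_role()
# estimator, tuner = build_tuner(role, ranges, "average test loss", metric_definitions)
```

Keeping `max_parallel_jobs` below `max_jobs` lets the default Bayesian strategy learn from earlier jobs before launching later ones.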

In [None]:
# TODO: Fit your HP Tuner
tuner.fit() # TODO: Remember to include your data channels
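
The data channels can be sketched as a dictionary that maps a channel name to an S3 URI. The `carImages` prefix comes from the upload cell above; the channel name `training` is an assumption (SageMaker exposes each channel to the training script as an `SM_CHANNEL_<NAME>` environment variable, so `hpo.py` would read `SM_CHANNEL_TRAINING`).

```python
def training_channels(bucket, prefix="carImages"):
    """Map the (assumed) 'training' channel to the uploaded dataset in S3."""
    return {"training": f"s3://{bucket}/{prefix}/"}


# Example usage (not executed here):
# tuner.fit(training_channels(bucket), wait=True)
```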

In [None]:
# TODO: Get the best estimators and the best HPs

best_estimator = #TODO

#Get the hyperparameters of the best trained model
best_estimator.hyperparameters()
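
The dictionary returned by `hyperparameters()` typically holds string values, some wrapped in extra double quotes, plus internal keys added by SageMaker. The hypothetical helper below is a sketch of how those values might be cleaned before reuse; the key prefixes it filters and the example values are assumptions based on commonly observed output, not on this project's results.

```python
def clean_hyperparameters(raw):
    """Drop SageMaker-internal keys and convert quoted strings to numbers."""
    cleaned = {}
    for name, value in raw.items():
        # Assumption: internal keys start with these prefixes.
        if name.startswith(("sagemaker_", "_tuning_")):
            continue
        text = str(value).strip('"')
        try:
            cleaned[name] = int(text)
        except ValueError:
            try:
                cleaned[name] = float(text)
            except ValueError:
                cleaned[name] = text  # leave non-numeric values as strings
    return cleaned


# Example usage (not executed here):
# best_estimator = tuner.best_estimator()
# best_hyperparameters = clean_hyperparameters(best_estimator.hyperparameters())
```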

## Model Profiling and Debugging
**TODO:** Using the best hyperparameters, create and fine-tune a new model.

**Note:** You will need to use the `train_model.py` script to perform model profiling and debugging.

In [None]:
# TODO: Set up debugging and profiling rules and hooks
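
A hedged sketch of a debugging and profiling setup using built-in SageMaker Debugger rules. The choice of rules and the save and monitoring intervals are illustrative assumptions; `build_debug_config` is a hypothetical helper name.

```python
# Hook parameters are passed as strings; these intervals are illustrative.
HOOK_PARAMETERS = {"train.save_interval": "100", "eval.save_interval": "10"}


def build_debug_config():
    """Return (rules, hook_config, profiler_config) for an estimator."""
    # Local import keeps this helper self-contained.
    from sagemaker.debugger import (
        DebuggerHookConfig,
        ProfilerConfig,
        ProfilerRule,
        Rule,
        rule_configs,
    )

    rules = [
        Rule.sagemaker(rule_configs.vanishing_gradient()),
        Rule.sagemaker(rule_configs.overfit()),
        Rule.sagemaker(rule_configs.loss_not_decreasing()),
        ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
    ]
    hook_config = DebuggerHookConfig(hook_parameters=HOOK_PARAMETERS)
    profiler_config = ProfilerConfig(system_monitor_interval_millis=500)
    return rules, hook_config, profiler_config


# Example usage (not executed here): pass the three objects to the estimator as
# rules=rules, debugger_hook_config=hook_config, profiler_config=profiler_config
```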

In [None]:
# TODO: Create and fit an estimator

estimator = # TODO: Your estimator here

In [None]:
# TODO: Plot a debugging output.
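
One way to plot a debugging output is to read a saved tensor with `smdebug` and draw its values per step. This is a sketch: the tensor name `CrossEntropyLoss_output_0` is an assumption that depends on the loss function used in `train_model.py`, and both helper names are hypothetical.

```python
def tensor_series(trial, tensor_name, mode):
    """Return (steps, values) recorded for one tensor in the given mode."""
    tensor = trial.tensor(tensor_name)
    steps = tensor.steps(mode=mode)
    values = [tensor.value(step, mode=mode) for step in steps]
    return steps, values


def plot_loss(trial, tensor_name="CrossEntropyLoss_output_0"):
    """Plot training and evaluation curves for one tensor."""
    import matplotlib.pyplot as plt
    from smdebug.core.modes import ModeKeys

    for label, mode in (("train", ModeKeys.TRAIN), ("eval", ModeKeys.EVAL)):
        steps, values = tensor_series(trial, tensor_name, mode)
        plt.plot(steps, values, label=label)
    plt.xlabel("step")
    plt.ylabel(tensor_name)
    plt.legend()
    plt.show()


# Example usage (not executed here):
# from smdebug.trials import create_trial
# trial = create_trial(estimator.latest_job_debugger_artifacts_path())
# print(trial.tensor_names())  # check which tensors were actually saved
# plot_loss(trial)
```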

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?

In [None]:
# TODO: Display the profiler output

## Model Deploying

In [None]:
# Deploy the model to an endpoint
# Assumption: one ml.m5.large instance; adjust the instance type and count to your workload
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")

In [None]:
# TODO: Run a prediction on the endpoint

image = # TODO: Your code to load and preprocess image to send to endpoint for prediction
response = predictor.predict(image)
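
A minimal sketch of loading an image for the endpoint, assuming the inference script accepts raw image bytes with an `image/*` content type. That assumption, the helper name, and the file path in the usage comment are hypothetical; an endpoint that expects tensors or JSON would need different preprocessing.

```python
import mimetypes
from pathlib import Path


def load_image_payload(path):
    """Read an image file and guess its content type from the extension."""
    content_type, _ = mimetypes.guess_type(str(path))
    return Path(path).read_bytes(), content_type


# Example usage (not executed here; the path is a placeholder):
# from sagemaker.serializers import IdentitySerializer
# image, content_type = load_image_payload("./data/example.jpg")
# predictor.serializer = IdentitySerializer(content_type)
# response = predictor.predict(image)
```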

In [None]:
# TODO: Remember to shut down/delete your endpoint once your work is done
predictor.delete_endpoint()