# Image Classification using AWS SageMaker

This notebook lists all the steps that you need to complete this project. You will need to complete all the TODOs in this notebook, as well as those in the README and the two Python scripts included with the starter code.


In this project, I use AWS SageMaker to fine-tune a pretrained model that classifies dog breeds. Throughout the project, I use SageMaker profiling, debugging, hyperparameter tuning, and other good ML engineering practices. The goal is to perform transfer learning using a state-of-the-art (SOTA) model available in SageMaker.

In [2]:
# For instance, you will need the smdebug package
!pip install smdebug

Collecting smdebug
  Downloading smdebug-1.0.12-py2.py3-none-any.whl (270 kB)
Collecting pyinstrument==3.4.2 (from smdebug)
  Downloading pyinstrument-3.4.2-py2.py3-none-any.whl (83 kB)
Collecting pyinstrument-cext>=0.2.2 (from pyinstrument==3.4.2->smdebug)
  Downloading pyinstrument_cext-0.2.4-cp37-cp37m-manylinux2010_x86_64.whl (20 kB)
Collecting urllib3<1.27,>=1.25.4 (from botocore<1.32.0,>=1.31.26->boto3>=1.10.32->smdebug)
  Obtaining dependency information for urllib3<1.27,>=1.25.4 from https://files.pythonhosted.org/packages/c5/05/c214b32d21c0b465506f95c4f28ccbcba15022e000b043b72b3df7728471/urllib3-1.26.16-py2.py3-none-any.whl.metadata
  Downloading urllib3-1.26.16-py2.py3-none-any.whl.metadata (48 kB)

In [3]:
# For instance, you will need Boto3 and SageMaker
import sagemaker
import boto3

## Dataset
The dataset contains 8351 dog images across 133 dog breed classes, roughly 63 per class. It is split as follows:

- 6680 images for the training set
- 835 images for the validation set
- 836 images for the test set

The dataset is split into three directories: train, validation and testing. Each split has 133 subdirectories, one for each dog breed.
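
Once the archive has been downloaded and extracted, the split sizes quoted above can be checked locally. The following is a minimal sketch: the helper name `count_images` and the set of image suffixes are assumptions for illustration, not part of the starter code.

```python
from pathlib import Path

# Assumed image extensions; adjust if the archive uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Return {split_name: (num_class_dirs, num_images)} for each split under root."""
    summary = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Each split is expected to hold one subdirectory per dog breed.
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        num_images = sum(
            1
            for class_dir in class_dirs
            for f in class_dir.iterdir()
            if f.suffix.lower() in IMAGE_SUFFIXES
        )
        summary[split_dir.name] = (len(class_dirs), num_images)
    return summary
```

If the extracted archive matches the description above, `count_images("dogImages")` should report 133 class directories per split and image totals that add up to 8351.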

In [None]:
# Command to download and unzip data
!wget https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip
!unzip dogImages.zip

# Upload the extracted images to the S3 bucket
!aws s3 sync dogImages/ s3://sagemaker-studio-re6trz79jz/dogImages/

### Dataset structure
![folder_structure](images/dataset_file_structure.png)
<br>
### Uploaded dataset in S3
![image.png](images/dataInS3.png)

## Hyperparameter Tuning
**TODO:** This is the part where you will fine-tune a pretrained model with hyperparameter tuning. Remember that you have to tune a minimum of two hyperparameters; however, you are encouraged to tune more. You are also encouraged to explain why you chose to tune those particular hyperparameters and their ranges.

**Note:** You will need to use the `hpo.py` script to perform hyperparameter tuning.
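
As one possible starting point, the search space and objective metric could be declared as plain data first and converted to SageMaker objects afterwards. This is only a sketch: `hpo.py` is not shown here, so the argument names (`lr`, `batch-size`), the ranges, the metric name, and the log format matched by the regex are all assumptions. Learning rate and batch size are common first choices because they strongly affect convergence speed and gradient noise.

```python
import re

# Assumed search space: argument names and ranges are illustrative placeholders.
SEARCH_SPACE = {
    "lr": {"type": "continuous", "min": 0.001, "max": 0.1},
    "batch-size": {"type": "categorical", "values": [32, 64, 128]},
}

# Assumed log format: hpo.py is presumed to print a line like "Test Loss: 0.1234".
OBJECTIVE_METRIC_NAME = "average test loss"
METRIC_DEFINITIONS = [
    {"Name": OBJECTIVE_METRIC_NAME, "Regex": r"Test Loss: ([0-9\.]+)"}
]


def extract_metric(log_line):
    """Apply the metric regex locally to check it against a sample log line."""
    match = re.search(METRIC_DEFINITIONS[0]["Regex"], log_line)
    return float(match.group(1)) if match else None


def to_sagemaker_ranges(space):
    """Convert the plain-dict search space into SageMaker parameter objects."""
    # Imported lazily so the dictionaries above can be inspected without the SDK.
    from sagemaker.tuner import CategoricalParameter, ContinuousParameter

    ranges = {}
    for name, spec in space.items():
        if spec["type"] == "continuous":
            ranges[name] = ContinuousParameter(spec["min"], spec["max"])
        else:
            ranges[name] = CategoricalParameter(spec["values"])
    return ranges
```

Testing the regex with `extract_metric` against a real log line from the script is a cheap way to catch a metric definition that never matches, which would otherwise leave the tuner without an objective value.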

In [None]:
#TODO: Declare your HP ranges, metrics etc.

In [None]:
#TODO: Create estimators for your HPs

estimator = # TODO: Your estimator here

tuner = # TODO: Your HP tuner here

In [None]:
# TODO: Fit your HP Tuner
tuner.fit() # TODO: Remember to include your data channels
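
The estimator, tuner, and data channel from the cells above could be wired together roughly as follows. This is a hedged sketch, not the reference solution: the instance type, framework version, Python version, job counts, and the choice to minimize the objective are assumptions, and the helper names are hypothetical.

```python
def training_inputs(s3_uri):
    """Map a single 'training' channel to the dataset prefix in S3."""
    return {"training": s3_uri.rstrip("/") + "/"}


def build_tuner(role, hyperparameter_ranges, metric_definitions, objective_metric_name):
    """Create a PyTorch estimator for hpo.py and wrap it in a HyperparameterTuner."""
    # Imported lazily so this cell can be loaded without the SageMaker SDK installed.
    from sagemaker.pytorch import PyTorch
    from sagemaker.tuner import HyperparameterTuner

    estimator = PyTorch(
        entry_point="hpo.py",
        role=role,
        instance_count=1,
        instance_type="ml.g4dn.xlarge",  # assumption: pick any supported training instance
        framework_version="1.8",         # assumption
        py_version="py36",               # assumption
    )
    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name=objective_metric_name,
        hyperparameter_ranges=hyperparameter_ranges,
        metric_definitions=metric_definitions,
        objective_type="Minimize",       # assumption: the objective is a loss
        max_jobs=4,                      # assumption
        max_parallel_jobs=2,             # assumption
    )
```

A call might then look like `tuner = build_tuner(sagemaker.get_execution_role(), ranges, metrics, "average test loss")` followed by `tuner.fit(training_inputs("s3://sagemaker-studio-re6trz79jz/dogImages/"), wait=True)`. With a channel named `training`, the script can locate the data through the `SM_CHANNEL_TRAINING` environment variable.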

In [None]:
# TODO: Get the best estimators and the best HPs

best_estimator = #TODO

#Get the hyperparameters of the best trained model
best_estimator.hyperparameters()
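
`tuner.best_estimator()` returns the estimator for the best training job, and its `hyperparameters()` dictionary holds string values, some of which may carry embedded quotes and SageMaker-internal keys. A small helper can pull out only the tuned values and cast them. This is a sketch: the helper name and the example keys `lr` and `batch-size` are assumptions, and the exact string encoding may vary by SDK version.

```python
def select_tuned_hyperparameters(raw, casts):
    """Keep only the tuned keys and cast their string values.

    raw:   dict of strings, as returned by best_estimator.hyperparameters()
    casts: dict mapping each tuned key to the type it should be cast to
    """
    # Strip surrounding double quotes, e.g. '"64"' becomes '64', before casting.
    return {name: cast(raw[name].strip('"')) for name, cast in casts.items()}
```

For example, `select_tuned_hyperparameters(best_estimator.hyperparameters(), {"lr": float, "batch-size": int})` would give a clean dictionary that can be passed to the next estimator.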

## Model Profiling and Debugging
**TODO:** Using the best hyperparameters, create and fine-tune a new model.

**Note:** You will need to use the `train_model.py` script to perform model profiling and debugging.
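
One way to set up the rules and hooks is to list the built-in rule names as data and build the SageMaker objects from them. The sketch below uses built-in Debugger and profiler rules from `sagemaker.debugger.rule_configs`; which rules to enable, the save intervals, and the monitoring interval are assumptions, and `build_debug_config` is a hypothetical helper name.

```python
# Built-in Debugger rules to attach; the selection is a choice, not a requirement.
DEBUG_RULE_NAMES = [
    "vanishing_gradient",
    "overfit",
    "overtraining",
    "poor_weight_initialization",
    "loss_not_decreasing",
]
PROFILER_RULE_NAMES = ["LowGPUUtilization", "ProfilerReport"]

# Assumed save intervals, in steps; values are passed as strings.
HOOK_PARAMETERS = {"train.save_interval": "100", "eval.save_interval": "10"}


def build_debug_config():
    """Return keyword arguments for an estimator: rules, hook config, profiler config."""
    # Imported lazily so the lists above can be inspected without the SDK.
    from sagemaker.debugger import (
        DebuggerHookConfig,
        FrameworkProfile,
        ProfilerConfig,
        ProfilerRule,
        Rule,
        rule_configs,
    )

    rules = [Rule.sagemaker(getattr(rule_configs, n)()) for n in DEBUG_RULE_NAMES]
    rules += [ProfilerRule.sagemaker(getattr(rule_configs, n)()) for n in PROFILER_RULE_NAMES]
    return {
        "rules": rules,
        "debugger_hook_config": DebuggerHookConfig(hook_parameters=HOOK_PARAMETERS),
        "profiler_config": ProfilerConfig(
            system_monitor_interval_millis=500,  # assumption
            framework_profile_params=FrameworkProfile(num_steps=10),  # assumption
        ),
    }
```

The returned dictionary can be unpacked into the estimator constructor, for example `PyTorch(entry_point="train_model.py", ..., **build_debug_config())`, since its keys match the estimator's `rules`, `debugger_hook_config`, and `profiler_config` arguments.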

In [None]:
# TODO: Set up debugging and profiling rules and hooks

In [None]:
# TODO: Create and fit an estimator

estimator = # TODO: Your estimator here

In [None]:
# TODO: Plot a debugging output.
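
A debugging plot usually starts by reading one saved tensor, such as the loss, out of the Debugger trial. The helper below is a sketch that relies only on the `tensor(...)`, `steps(mode=...)`, and `value(step, mode=...)` methods of an smdebug trial; the name `tensor_series` is hypothetical.

```python
def tensor_series(trial, tensor_name, mode):
    """Collect (steps, values) for one saved tensor in the given mode."""
    tensor = trial.tensor(tensor_name)
    steps = list(tensor.steps(mode=mode))
    # One saved value per recorded step, in step order.
    values = [tensor.value(step, mode=mode) for step in steps]
    return steps, values
```

With smdebug installed, the trial can be created with `create_trial(estimator.latest_job_debugger_artifacts_path())` from `smdebug.trials`, and the mode taken from `ModeKeys` in `smdebug.core.modes`, for example `tensor_series(trial, "CrossEntropyLoss_output_0", ModeKeys.TRAIN)`. The tensor name is an assumption that depends on the loss module used in `train_model.py`; `trial.tensor_names()` lists what was actually saved. The two returned lists can be passed straight to a plotting call for the training and evaluation curves.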

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?

In [None]:
# TODO: Display the profiler output
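
The profiler report is written to S3 alongside the other rule outputs of the training job. The helpers below sketch how to locate it, assuming the estimator uses its default output location; both function names are hypothetical.

```python
def rule_output_path(output_path, job_name):
    """Build the S3 prefix that holds the rule outputs of a training job."""
    return output_path.rstrip("/") + "/" + job_name + "/rule-output"


def profiler_report_dir(rule_summaries):
    """Pick the profiler report rule name from a list of rule job summaries."""
    names = [
        summary["RuleConfigurationName"]
        for summary in rule_summaries
        if "Profiler" in summary["RuleConfigurationName"]
    ]
    return names[0] if names else None
```

In the notebook, the inputs would come from `estimator.output_path`, `estimator.latest_training_job.job_name`, and `estimator.latest_training_job.rule_job_summary()`. After copying the prefix locally with `aws s3 cp <rule_output_path> ./ --recursive`, the report at `<report_dir>/profiler-output/profiler-report.html` can be rendered with `IPython.display.HTML(filename=...)`.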

## Model Deploying

In [None]:
# TODO: Deploy your model to an endpoint

predictor = estimator.deploy()  # TODO: Add your deployment configuration like instance type and number of instances
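
The deployment configuration mentioned in the TODO comes down to the `initial_instance_count` and `instance_type` arguments of `deploy`. A minimal sketch, with an assumed instance type and a hypothetical helper name:

```python
# Assumed deployment settings: one small CPU instance is often enough for a demo endpoint.
DEPLOY_CONFIG = {"initial_instance_count": 1, "instance_type": "ml.m5.large"}


def deploy_model(estimator, config=None):
    """Deploy the estimator with the given settings and return the predictor."""
    settings = dict(DEPLOY_CONFIG if config is None else config)
    return estimator.deploy(**settings)
```

Keeping the settings in one dictionary makes it easy to record in the README which instance type the endpoint ran on.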

In [None]:
# TODO: Run a prediction on the endpoint

image = # TODO: Your code to load and preprocess image to send to endpoint for prediction
response = predictor.predict(image)
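
One simple option is to send the raw JPEG bytes and let the endpoint handle decoding and resizing. The sketch below assumes the endpoint's inference script accepts the `image/jpeg` content type and returns one score per class; neither is confirmed by the starter code, and the helper names are hypothetical.

```python
from pathlib import Path


def load_image_bytes(image_path):
    """Read an image file as raw bytes, leaving preprocessing to the endpoint."""
    return Path(image_path).read_bytes()


def top_class(scores):
    """Return the index of the highest score in a flat sequence of class scores."""
    return max(range(len(scores)), key=lambda i: scores[i])


def predict_image(predictor, image_path):
    """Send raw JPEG bytes to the endpoint and return its response."""
    # Imported lazily so the helpers above work without the SageMaker SDK installed.
    from sagemaker.serializers import IdentitySerializer

    # Pass the bytes through unchanged and label them as a JPEG image.
    predictor.serializer = IdentitySerializer("image/jpeg")
    return predictor.predict(load_image_bytes(image_path))
```

If the response is a nested list such as `[[...133 scores...]]`, `top_class(response[0])` would give the predicted class index, which maps back to a breed through the sorted subdirectory names.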

In [None]:
# TODO: Remember to shutdown/delete your endpoint once your work is done
predictor.delete_endpoint()