# Training and Deploying a Custom Image Classifier with Amazon SageMaker

## Configure where to fetch our training data

All of our images live inside an S3 bucket, organized into folders in a structure similar to this:

```
my_training_classes
├── person
│   ├── han.jpg
│   ├── leia.jpg
│   ├── luke.jpg
│   └── . . .
├── ship
│   ├── millennium_falcon.jpg
│   ├── tie-fighter.jpg
│   ├── x-wing.jpg
│   └── . . .
└── . . .
```
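To make the folder-per-class convention concrete, here is a minimal sketch that groups image paths by their top-level folder. The helper name `group_by_class` and the sample paths are illustrative assumptions and are not part of the original notebook.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_class(keys):
    """Group relative image paths by their top-level folder (the label class)."""
    classes = defaultdict(list)
    for key in keys:
        parts = PurePosixPath(key).parts
        # Skip anything that does not sit inside a class folder
        if len(parts) < 2:
            continue
        classes[parts[0]].append(parts[-1])
    return dict(classes)


# Hypothetical sample paths mirroring the tree above
sample_keys = [
    "person/han.jpg",
    "person/leia.jpg",
    "ship/x-wing.jpg",
    "README.txt",
]
print(group_by_class(sample_keys))
```

Each top-level folder name becomes one label class, and files outside a class folder are ignored.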

In [12]:
pwd

'/Users/seanbrown/projects/WATSneakers'

In [13]:
import sys
import logging

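logger = logging.getLogger()
# INFO keeps boto3/botocore DEBUG chatter (including signed request headers) out of the notebook output
logger.setLevel(logging.INFO)
# Attach the stdout handler only once so re-running this cell doesn't duplicate every message
if not logger.handlers:
    logger.addHandler(logging.StreamHandler(sys.stdout))
logger.info('that should work :-)')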

that should work :-)


In [14]:
# An S3 Bucket Name
data_bucket_name='sagemakertestwat'

# A prefix name inside the S3 bucket containing sub-folders of images (one per label class)
dataset_name = 'INC_DATA' 

logger.info("data bucket name is: " + data_bucket_name)
logger.info("dataset name is: " + dataset_name)

data bucket name is: sagemakertestwat
dataset name is: INC_DATA


## Setting up the environment
Here we set up the connection and authentication to AWS services:

- The IAM role that gives training and hosting access to your data. It is obtained automatically from the role used to start the notebook.
- A `sess` variable (a SageMaker `Session`) that holds configuration state for interacting with SageMaker from Python and provides methods for preparing input data.
- A reference to the Amazon SageMaker image classification Docker image.

More info about the SageMaker built-in Image Classification algorithm here: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html

In [15]:
import sagemaker
from sagemaker import get_execution_role
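from sagemaker import image_uris

# get_execution_role() only works when the notebook runs under an IAM role
# (for example inside SageMaker); run as an IAM user, it raises the ValueError shown below
role = get_execution_role()
sess = sagemaker.Session()

# image_uris.retrieve is the SageMaker Python SDK v2 replacement for the deprecated get_image_uri
training_image = image_uris.retrieve(framework='image-classification', region=sess.boto_region_name)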

logger.info(training_image)



ValueError: The current AWS identity is not a role: arn:aws:iam::814180570813:user/sean.brown, therefore it cannot be used as a SageMaker execution role

## Preparing data for our model
Before we can train our model, we need to:

- Create some files that will teach SageMaker about the images in each of our classes
- Upload these additional files to S3
- Configure our model to use these files for training and validating

### Find the im2rec.py script on this system
The SageMaker image classification algorithm needs to know which images belong to which classes. We provide this information using either LST or RecordIO files, and we'll use a Python script called `im2rec.py` to create them.

More info here: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html#IC-inputoutput

In [21]:
# Download im2rec.py and set up some other variables in our environment

import os
import urllib.request

base_dir = '/tmp'

%env BASE_DIR=$base_dir
%env S3_DATA_BUCKET_NAME=$data_bucket_name
%env DATASET_NAME=$dataset_name

# Download im2rec.py into ~/mxnet/tools (created if missing).
# expanduser() resolves "~", and the target is a file path rather than a directory named im2rec.py
tools_dir = os.path.expanduser('~/mxnet/tools')
os.makedirs(tools_dir, exist_ok=True)
im2rec = os.path.join(tools_dir, 'im2rec.py')
if not os.path.isfile(im2rec):
    urllib.request.urlretrieve(
        'https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py',
        im2rec,
    )
logger.info("im2rec is: " + im2rec)
%env IM2REC=$im2rec

env: BASE_DIR=/tmp
env: S3_DATA_BUCKET_NAME=sagemakertestwat
env: DATASET_NAME=INC_DATA


### Get our training images from S3
To create training and validation RecordIO files, we first need to download our images to the local filesystem.

In [None]:
# Pull our images from S3
!aws s3 sync s3://$S3_DATA_BUCKET_NAME/$DATASET_NAME $BASE_DIR/$DATASET_NAME --quiet

### Create RecordIO files from our training images
The `im2rec.py` script can create LST files and/or RecordIO files from our training data. 

More info here: https://mxnet.incubator.apache.org/versions/master/faq/recordio.html

In [None]:
%%bash
# Use the IM2REC script to convert our images into RecordIO files

# Clean up any LST and REC files left in our working dir by a previous run
cd $BASE_DIR
rm -f ${DATASET_NAME}_*.rec ${DATASET_NAME}_*.lst

# First we need to create two LST files (training and test lists), noting the correct label class for each image
# We'll also save the output of the LST files command, since it includes a list of all of our label classes
echo "Creating LST files"
python $IM2REC --list --recursive --pass-through --test-ratio=0.3 --train-ratio=0.7 $DATASET_NAME $DATASET_NAME > ${DATASET_NAME}_classes

echo "Label classes:"
cat ${DATASET_NAME}_classes

# Then we create RecordIO files from the LST files
echo "Creating RecordIO files"
python $IM2REC --num-thread=4 ${DATASET_NAME}_train.lst $DATASET_NAME
python $IM2REC --num-thread=4 ${DATASET_NAME}_test.lst $DATASET_NAME
ls -lh *.rec
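
To show roughly what an LST file contains, here is a minimal sketch that builds tab-separated rows of the form index, label, relative path from a folder-per-class tree. The helper `build_lst_lines` and the throwaway sample tree are assumptions made for illustration. The real `im2rec.py` also shuffles and splits the list, so its output will differ in order and formatting.

```python
import os
import tempfile


def build_lst_lines(root):
    """Return (lines, classes): rows of the form index<TAB>label<TAB>relative_path."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    lines = []
    index = 0
    for label, cls in enumerate(classes):
        for name in sorted(os.listdir(os.path.join(root, cls))):
            # Only image files become rows
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                lines.append(f"{index}\t{label}\t{cls}/{name}")
                index += 1
    return lines, classes


# Build a throwaway folder tree so the sketch can run anywhere
with tempfile.TemporaryDirectory() as root:
    for cls, names in {"person": ["han.jpg"], "ship": ["x-wing.jpg"]}.items():
        os.makedirs(os.path.join(root, cls))
        for name in names:
            open(os.path.join(root, cls, name), "wb").close()
    lines, classes = build_lst_lines(root)
    print(classes)
    print("\n".join(lines))
```

The position of each class in the sorted list is its numeric label, which is why the list of label classes is worth saving alongside the LST files.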

### Upload our training and test data RecordIO files so we can train with them
Now that we have our training and test `.rec` files, we upload them to S3 so SageMaker can use them for training.

In [None]:
# Upload our train and test RecordIO files to S3 in the bucket that our sagemaker session is using
bucket = sess.default_bucket()

s3train_path = 's3://{}/{}/train/'.format(bucket, dataset_name)
s3validation_path = 's3://{}/{}/validation/'.format(bucket, dataset_name)

logger.info("s3 file training path is: "+s3train_path)
logger.info("s3 file validation path is: "+s3validation_path)

# Clean up any existing data
!aws s3 rm s3://{bucket}/{dataset_name}/train --recursive
!aws s3 rm s3://{bucket}/{dataset_name}/validation --recursive

# Upload the rec files to the train and validation channels
!aws s3 cp /tmp/{dataset_name}_train.rec $s3train_path
!aws s3 cp /tmp/{dataset_name}_test.rec $s3validation_path

### Configure the data for our model training to use
Finally, we tell SageMaker where to find these RecordIO files to use for training

In [None]:
from sagemaker.inputs import TrainingInput

# TrainingInput is the SageMaker Python SDK v2 name for the former s3_input class
train_data = TrainingInput(
    s3train_path,
    distribution='FullyReplicated',
    content_type='application/x-recordio',
    s3_data_type='S3Prefix'
)

validation_data = TrainingInput(
    s3validation_path,
    distribution='FullyReplicated',
    content_type='application/x-recordio',
    s3_data_type='S3Prefix'
)

data_channels = {'train': train_data, 'validation': validation_data}

## Training
Now it's time to train our model!

### Create an image classifier object with some base configuration
More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator

In [None]:
s3_output_location = 's3://{}/{}/output'.format(bucket, dataset_name)

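# instance_count / instance_type are the SDK v2 argument names,
# matching the incremental-training estimator created further down
image_classifier = sagemaker.estimator.Estimator(
    training_image,
    role,
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path=s3_output_location,
    sagemaker_session=sess
)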


### Set some training hyperparameters

Finally, before we train, we provide some additional configuration parameters for the training.

More info here: https://docs.aws.amazon.com/sagemaker/latest/dg/IC-Hyperparameter.html

In [None]:
num_classes=! ls -l {base_dir}/{dataset_name} | wc -l
num_classes=int(num_classes[0]) - 1

logger.info("number of classes is: "+str(num_classes))

num_training_samples=! cat {base_dir}/{dataset_name}_train.lst | wc -l
num_training_samples = int(num_training_samples[0])

logger.info("number of training samples is: "+str(num_training_samples))

# Learn more about the Sagemaker built-in Image Classifier hyperparameters here: https://docs.aws.amazon.com/sagemaker/latest/dg/IC-Hyperparameter.html

# These hyperparameters we won't want to change, as they define things like
# the size of the images we'll be sending for input, the number of training classes we have, etc.
base_hyperparameters=dict(
    use_pretrained_model=1,
    num_layers=34,
    image_shape='3,215,215',
    resize=430,
    epochs=440,
    augmentation_type='crop',
    optimizer='sgd',
    num_classes=num_classes,
    num_training_samples=num_training_samples,
)

#logger.info("base parameters are: "+base_hyperparameters)

# These are hyperparameters we may want to tune, as they can affect the model training success:
hyperparameters={
    **base_hyperparameters, 
    **dict(
        learning_rate=0.001,
        mini_batch_size=5,
    )
}


image_classifier.set_hyperparameters(**hyperparameters)

#logger.info("hyperparameters params are: "+hyperparameters)

hyperparameters
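
The shell pipelines above work, but `ls -l | wc -l` counts every entry in the folder, so a stray file would inflate `num_classes`. As a hedged alternative, the sketch below counts only sub-folders and non-empty LST lines in plain Python. The helper names `count_classes` and `count_samples` and the temporary sample data are assumptions for illustration.

```python
import os
import tempfile


def count_classes(dataset_dir):
    """Count sub-folders of dataset_dir; each sub-folder is one label class."""
    return sum(1 for entry in os.scandir(dataset_dir) if entry.is_dir())


def count_samples(lst_path):
    """Count non-empty lines in an LST file; each line is one training image."""
    with open(lst_path) as f:
        return sum(1 for line in f if line.strip())


# Throwaway data so the sketch runs without the real dataset
with tempfile.TemporaryDirectory() as tmp:
    dataset_dir = os.path.join(tmp, "dataset")
    for cls in ("person", "ship"):
        os.makedirs(os.path.join(dataset_dir, cls))
    lst_path = os.path.join(tmp, "dataset_train.lst")
    with open(lst_path, "w") as f:
        f.write("0\t0\tperson/han.jpg\n1\t1\tship/x-wing.jpg\n")
    print(count_classes(dataset_dir), count_samples(lst_path))
```

In the notebook these would be called with `os.path.join(base_dir, dataset_name)` and the path of the `_train.lst` file.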

### Start the training
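Now we can train our model.

This takes some time: SageMaker provisions a new container runtime, runs the training, uploads the trained model to S3, and then shuts the container down. The cells below lower the learning rate and start an incremental training job that resumes from a previously trained model (passed as `model_uri`).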

More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator.fit

In [None]:
hyperparameters={
    **base_hyperparameters, 
    **dict(
        learning_rate=0.0001,
        mini_batch_size=5,
    )
}

In [None]:
# Given the base estimator, create a new one for incremental training
incr_ic = sagemaker.estimator.Estimator(training_image,
                                        role,
                                        instance_count=1,
                                        instance_type='ml.p3.2xlarge',
                                        volume_size=50,
                                        max_run=360000,
                                        input_mode='File',
                                        output_path=s3_output_location,
                                        sagemaker_session=sess,
                                        hyperparameters=hyperparameters,
                                        model_uri="s3://sagemaker-us-east-1-814180570813/dev/output/IC-dev-1673238974/output/model.tar.gz") # This parameter will ingest the previous job's model as a new channel


incr_ic.fit(inputs=data_channels, logs=True)

In [None]:
%pip install boto3

In [None]:
pip install s3-client

In [None]:
import boto3

s3 = boto3.resource('s3')
bucket = 'sagemakertestwat'
bucketdev = 'sagemakertestwat-dev'
src_dataset_name = 'INC_DATA'
dst_dataset_name = 'data'

s3srcpath = 's3://{}/{}/'.format(bucket, src_dataset_name)
s3dstpath = 's3://{}/{}/'.format(bucket, dst_dataset_name)

# Move the images currently under INC_DATA/ into data/
# (`aws s3 mv --recursive` moves every object under the prefix)
!aws s3 mv $s3srcpath $s3dstpath --recursive

# Clean up INC_DATA
#!aws s3 rm $s3srcpath --recursive

# boto3 and the AWS CLI read credentials from the environment or ~/.aws/credentials,
# so access keys must never be hard-coded in a notebook


dst_dataset_name = src_dataset_name

count = 0
s3_client = boto3.client('s3')
response = s3_client.list_objects_v2(Bucket=bucketdev, Prefix='', Delimiter='/') 

foldersToAdd = []

# Folders that already exist under data/ in the destination bucket.
# Note: list_objects_v2 returns at most 1,000 entries per call; use a paginator for larger buckets.
incdataresponse = s3_client.list_objects_v2(Bucket=bucket, Prefix='data/', Delimiter='/')
existing_folders = [c.get('Prefix') for c in incdataresponse.get('CommonPrefixes', [])]

# Loop through the top-level folders in sagemakertestwat-dev and collect
# up to 200 that are not already present in the destination
for content in response.get('CommonPrefixes', []):
    folder = content.get('Prefix')
    print(folder)

    # Skip folders whose name already appears under data/
    if any(folder in existing for existing in existing_folders):
        print("match found " + folder)
        continue

    foldersToAdd.append(folder)
    print("adding: " + folder)
    count += 1
    print(count)

    if count >= 200:
        break

for fldr in foldersToAdd:

    s3srcfolder = 's3://{}/{}'.format(bucketdev, fldr)

    # Replace spaces and strip parentheses so the folder name is safe to use in shell commands
    newfldr = fldr.replace(" ", "_").replace("(", "").replace(")", "")

    s3newsrcfolder = 's3://{}/{}'.format(bucketdev, newfldr)

    # Only rename in the dev bucket when the name actually changed
    if newfldr != fldr:
        !aws s3 mv "$s3srcfolder" "$s3newsrcfolder" --recursive

    s3dstpath = 's3://{}/{}/{}'.format(bucket, dst_dataset_name, newfldr)

    print(s3newsrcfolder)
    print(s3dstpath)

    !aws s3 sync "$s3newsrcfolder" "$s3dstpath"



## Deploy the trained model
Once a model has been trained, we can use the estimator that ran the training job (`incr_ic` in this notebook) to create a deployed, fully-managed endpoint.

More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator.deploy

In [None]:
%%time
# Deploying a model to an endpoint takes a few minutes to complete.
# deploy() must be called on an estimator that has completed a training job.

deployed_endpoint = incr_ic.deploy(
    initial_instance_count = 1,
    instance_type = 'ml.t2.medium'
)

### Clean up

When we're done with the endpoint, we can delete it to release the backing instances, for example by calling `deployed_endpoint.delete_endpoint()`.

## Calling a deployed endpoint from Python code

If you want to try using a deployed endpoint from Python, here's a function that you can use. It takes in a path to the image you'd like to classify, and a list of all the classes used for training.

In [None]:
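import json
import numpy as np
from sagemaker.serializers import IdentitySerializer

def classify_deployed(file_name, classes):
    # Read the image as raw bytes
    with open(file_name, 'rb') as f:
        payload = f.read()

    # Send the bytes unchanged with the image content type
    # (in SDK v2 the content type is set through the serializer)
    deployed_endpoint.serializer = IdentitySerializer(content_type='application/x-image')
    # The response is a JSON list with one probability per class
    result = json.loads(deployed_endpoint.predict(payload))
    best_prob_index = int(np.argmax(result))
    return (classes[best_prob_index], result[best_prob_index])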
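
The function above returns only the single best class. When several classes look alike, it can help to see the runners-up as well. The sketch below ranks a probability list against the class names; `top_k_classes` and the sample values are hypothetical, and the class list is assumed to be in the same order as the numeric labels used for training.

```python
def top_k_classes(probabilities, classes, k=3):
    """Return the k most likely (class_name, probability) pairs, best first."""
    if len(probabilities) != len(classes):
        raise ValueError("expected one probability per class")
    ranked = sorted(
        zip(classes, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Hypothetical endpoint response for a two-class model
print(top_k_classes([0.12, 0.88], ["person", "ship"], k=1))
```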



## (Optional) Perform Hyperparameter Tuning

Often, you might not know which values for hyperparameters like `learning_rate` and `mini_batch_size` will yield acceptable results. Traditionally, this meant manually running many training jobs with different hyperparameter values, comparing each trained model's performance, and then picking a winner.

This kind of manual tuning is _very_ time-consuming, so you can automate it with SageMaker automatic model tuning. Here's some example code that shows how to start a tuning job using the SageMaker Python SDK.

More info here about automatic model tuning: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html

More info about model tuning for the Image Classification algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/IC-tuning.html

In [None]:
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter
hyperparameter_ranges = {'optimizer': CategoricalParameter(['sgd', 'adam']),
                         'learning_rate': ContinuousParameter(0.0001, 0.1),
                         'mini_batch_size': IntegerParameter(2, 32),
                        }

objective_metric_name = 'validation:accuracy'

tuner = HyperparameterTuner(image_classifier,
                            objective_metric_name,
                            hyperparameter_ranges,
                            max_jobs=50,
                            max_parallel_jobs=3)

tuner.fit(inputs=data_channels, logs=True, include_cls_metadata=False)
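
For comparison, the manual process described above amounts to drawing candidate values from each range and running one training job per candidate. The sketch below draws such candidates from the same ranges as the tuner configuration. It is plain random sampling for illustration only and is not how SageMaker's tuner chooses candidates; the helper `sample_hyperparameters` is hypothetical.

```python
import math
import random


def sample_hyperparameters(rng):
    """Draw one candidate from the same ranges used for the tuner above."""
    return {
        "optimizer": rng.choice(["sgd", "adam"]),
        # Sample the learning rate on a log scale between 1e-4 and 1e-1
        "learning_rate": 10 ** rng.uniform(math.log10(0.0001), math.log10(0.1)),
        "mini_batch_size": rng.randint(2, 32),
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
candidates = [sample_hyperparameters(rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Sampling the learning rate on a log scale spreads candidates evenly across orders of magnitude instead of clustering them near the upper bound.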

## Great resources to continue your Deep Learning journey

[3Blue1Brown’s YouTube series on Neural Networks ~ 60 Minutes](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi)

[Fast.ai’s Practical Deep Learning for Coders ~ 14 Hours](http://www.fast.ai/)
    
[Amazon's Machine Learning University ~ More than 45 hours of courses, videos, and labs](https://aws.amazon.com/training/learning-paths/machine-learning/)
    
[Neural Networks and Deep Learning, by Michael Nielsen ~ 6 Chapter Book](http://neuralnetworksanddeeplearning.com/)

[Amazon SageMaker - Fully-managed Platform](https://aws.amazon.com/sagemaker/)
    
[@gabehollombe's](https://twitter.com/gabehollombe) deep learning tools and demos
- [Jupyter Notebooks](https://github.com/gabehollombe-aws/jupyter-notebooks)
- [Webcam S3 Uploader Tool](https://github.com/gabehollombe-aws/webcam-s3-uploader)
- [SageMaker Inference Web Tool](https://github.com/gabehollombe-aws/webcam-sagemaker-inference)
