# ID Classification Model

In [2]:
# SOCOFing Bucket Upload

#example
#data_bucket_name='SomeBucketName'

#SOCOFing Bucket
data_bucket_name='s3://zacbenson-csrp-socofing'

# A prefix name inside the S3 bucket containing sub-folders of images (one per label class)
dataset_name = 'socofing-snn' 

## Setting up the environment
Here we set up the linkage and authentication to AWS services:

- The IAM role that gives training and hosting access to your data. It is obtained automatically from the role used to start the notebook.
- A `sess` variable (a SageMaker `Session`) that holds configuration state for interacting with SageMaker from Python and provides some methods for preparing input data.
- A reference to the Amazon SageMaker image classification Docker image.

More info about the SageMaker built-in Image Classification algorithm here: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html

In [6]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker import image_uris

role = get_execution_role()
sess = sagemaker.Session()


training_image = sagemaker.image_uris.retrieve('image-classification', sess.boto_region_name)

## Preparing data for our model
Before we can train our model, we need to:

- Create some files that will teach SageMaker about the images in each of our classes
- Upload these additional files to S3
- Configure our model to use these files for training and validating

### Find the im2rec.py script on this system
The SageMaker image classifier algorithm needs to know which images belong to which classes. We provide this data using either LST or RecordIO files. We'll use a Python script called `im2rec.py` to create these files.

More info here: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html#IC-inputoutput
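
To make the LST format concrete, here is a minimal sketch of how such a listing could be built by hand. Each line holds an image index, a class label index, and a relative image path, separated by tabs. The helper `build_lst_lines` and the file names are hypothetical, and `im2rec.py` may format the label column differently from this sketch.

```python
from typing import Dict, List


def build_lst_lines(files_by_class: Dict[str, List[str]]) -> List[str]:
    """Return LST-style lines: image index, label index, relative path (tab-separated)."""
    lines = []
    index = 0
    # Sort class names so label indices are assigned deterministically
    for label, class_name in enumerate(sorted(files_by_class)):
        for file_name in sorted(files_by_class[class_name]):
            lines.append(f"{index}\t{label}\t{class_name}/{file_name}")
            index += 1
    return lines


# Hypothetical folders and files, for illustration only
example = {"id-1": ["a.BMP", "b.BMP"], "id-10": ["c.BMP"]}
for line in build_lst_lines(example):
    print(line)
```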

In [4]:
# Find im2rec in our environment and set up some other variables in our environment

base_dir='/tmp'

%env BASE_DIR=$base_dir
%env S3_DATA_BUCKET_NAME = $data_bucket_name
%env DATASET_NAME = $dataset_name

import sys,os

suffix='/mxnet/tools/im2rec.py'
im2rec = list(filter( (lambda x: os.path.isfile(x + suffix )), sys.path))[0] + suffix
%env IM2REC=$im2rec

env: BASE_DIR=/tmp
env: S3_DATA_BUCKET_NAME=s3://zacbenson-csrp-socofing
env: DATASET_NAME=socofing-snn
env: IM2REC=/home/ec2-user/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/tools/im2rec.py


### Get our training images from S3
In order to create training and validation RecordIO files, we need to download our images to our local filesystem.

In [5]:
# Pull our images from S3
print('Download Started.')
!aws s3 sync $S3_DATA_BUCKET_NAME/$DATASET_NAME $BASE_DIR/$DATASET_NAME --quiet && echo Download Complete || echo Download Failed

Download Started.
Download Complete


### Create RecordIO files from our training images
The `im2rec.py` script can create LST files and/or RecordIO files from our training data. 

More info here: https://mxnet.incubator.apache.org/versions/master/faq/recordio.html
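
The 70/30 split requested through `--train-ratio` and `--test-ratio` can be pictured with a small sketch. This is not the code `im2rec.py` runs; it is a simplified illustration that shuffles a list of file names and cuts it at the training ratio. The function name and the seed are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(items: Sequence[str], train_ratio: float = 0.7,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the items and split them into train and test lists."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


train, test = split_train_test([f"img_{i}.BMP" for i in range(10)])
print(len(train), len(test))
```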

In [6]:
%%bash
# Use the IM2REC script to convert our images into RecordIO files

# Clean up our working dir of existing LST and REC files
cd $BASE_DIR
# -f avoids an error when no matching files exist yet (for example, on the first run)
rm -f *.rec *.lst

# First we need to create two LST files (training and test lists), noting the correct label class for each image
# We'll also save the output of the LST files command, since it includes a list of all of our label classes
echo "Creating LST files"
python $IM2REC --list --recursive --pass-through --test-ratio=0.3 --train-ratio=0.7 $DATASET_NAME $DATASET_NAME > ${DATASET_NAME}_classes

echo "Label classes:"
cat ${DATASET_NAME}_classes

# Then we create RecordIO files from the LST files
echo "Creating RecordIO files"
python $IM2REC --num-thread=4 ${DATASET_NAME}_train.lst $DATASET_NAME
python $IM2REC --num-thread=4 ${DATASET_NAME}_test.lst $DATASET_NAME
ls -lh *.rec

Creating LST files
Label classes:
id-1 0
id-10 1
id-100 2
id-101 3
id-102 4
id-103 5
id-104 6
id-105 7
id-106 8
id-107 9
id-108 10
id-109 11
id-11 12
id-110 13
id-111 14
id-112 15
id-113 16
id-114 17
id-115 18
id-116 19
id-117 20
id-118 21
id-119 22
id-12 23
id-120 24
id-121 25
id-122 26
id-123 27
id-124 28
id-125 29
id-126 30
id-127 31
id-128 32
id-129 33
id-13 34
id-130 35
id-131 36
id-132 37
id-133 38
id-134 39
id-135 40
id-136 41
id-137 42
id-138 43
id-139 44
id-14 45
id-140 46
id-141 47
id-142 48
id-143 49
id-144 50
id-145 51
id-146 52
id-147 53
id-148 54
id-149 55
id-15 56
id-150 57
id-151 58
id-152 59
id-153 60
id-154 61
id-155 62
id-156 63
id-157 64
id-158 65
id-159 66
id-16 67
id-160 68
id-161 69
id-162 70
id-163 71
id-164 72
id-165 73
id-166 74
id-167 75
id-168 76
id-169 77
id-17 78
id-170 79
id-171 80
id-172 81
id-173 82
id-174 83
id-175 84
id-176 85
id-177 86
id-178 87
id-179 88
id-18 89
id-180 90
id-181 91
id-182 92
id-183 93
id-184 94
id-185 95
id-186 96
id-187 97
id-188 
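
The class listing above maps each folder name to a label index. A list of class names ordered by that index is useful later when interpreting predictions. The following is a minimal sketch, assuming the listing is available as text with one `name index` pair per line; the helper name `parse_class_listing` is hypothetical.

```python
from typing import List


def parse_class_listing(text: str) -> List[str]:
    """Return class names ordered by label index from 'name index' lines."""
    pairs = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or incomplete lines (for example, truncated output)
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        pairs.append((int(parts[1]), parts[0]))
    return [name for _, name in sorted(pairs)]


sample = "id-1 0\nid-10 1\nid-100 2\nid-188 \n"
classes = parse_class_listing(sample)
print(classes)
```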

### Upload our training and test data RecordIO files so we can train with them
Now that we have our training and test .rec files, we upload them to S3 so SageMaker can use them for training and validation.

In [7]:
# Upload our train and test RecordIO files to S3 in the bucket that our sagemaker session is using
bucket = sess.default_bucket()

s3train_path = 's3://{}/{}/train/'.format(bucket, dataset_name)
s3validation_path = 's3://{}/{}/validation/'.format(bucket, dataset_name)

# Clean up any existing data
!aws s3 rm s3://{bucket}/{dataset_name}/train --recursive
!aws s3 rm s3://{bucket}/{dataset_name}/validation --recursive

# Upload the rec files to the train and validation channels
!aws s3 cp /tmp/{dataset_name}_train.rec $s3train_path
!aws s3 cp /tmp/{dataset_name}_test.rec $s3validation_path

delete: s3://sagemaker-us-east-2-951232522638/socofing-snn/train/socofing-snn_train.rec
delete: s3://sagemaker-us-east-2-951232522638/socofing-snn/validation/socofing-snn_test.rec
upload: ../../../../tmp/socofing-snn_train.rec to s3://sagemaker-us-east-2-951232522638/socofing-snn/train/socofing-snn_train.rec
upload: ../../../../tmp/socofing-snn_test.rec to s3://sagemaker-us-east-2-951232522638/socofing-snn/validation/socofing-snn_test.rec


### Configure the data for our model training to use
Finally, we tell SageMaker where to find these RecordIO files to use for training and validation.

In [8]:
train_data = sagemaker.inputs.TrainingInput(
    s3train_path, 
    distribution='FullyReplicated', 
    content_type='application/x-recordio', 
    s3_data_type='S3Prefix'
)
validation_data = sagemaker.inputs.TrainingInput(
    s3validation_path, 
    distribution='FullyReplicated', 
    content_type='application/x-recordio', 
    s3_data_type='S3Prefix'
)

data_channels = {'train': train_data, 'validation': validation_data}

## Training
Now it's time to train our model!

### Create an image classifier object with some base configuration
More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator

In [9]:
s3_output_location = 's3://{}/{}/output'.format(bucket, dataset_name)

image_classifier = sagemaker.estimator.Estimator(
    training_image,
    role, 
    instance_count=1, 
    instance_type='ml.p3.2xlarge',
    output_path=s3_output_location,
    sagemaker_session=sess
)

### Set some training hyperparameters

Finally, before we train, we provide some additional configuration parameters for the training.

More info here: https://docs.aws.amazon.com/sagemaker/latest/dg/IC-Hyperparameter.html

In [10]:
num_classes=! ls -l {base_dir}/{dataset_name} | wc -l
num_classes=int(num_classes[0]) - 1

num_training_samples=! cat {base_dir}/{dataset_name}_train.lst | wc -l
num_training_samples = int(num_training_samples[0])

# Learn more about the Sagemaker built-in Image Classifier hyperparameters here: https://docs.aws.amazon.com/sagemaker/latest/dg/IC-Hyperparameter.html

# These hyperparameters we won't want to change, as they define things like
# the size of the images we'll be sending for input, the number of training classes we have, etc.
base_hyperparameters=dict(
    use_pretrained_model=1,
    image_shape='3,224,208',
    num_classes=num_classes,
    num_training_samples=num_training_samples,
)

# These are hyperparameters we may want to tune, as they can affect the model training success:
hyperparameters={
    **base_hyperparameters, 
    **dict(
        learning_rate=0.001,
        mini_batch_size=5,
    )
}


image_classifier.set_hyperparameters(**hyperparameters)

hyperparameters

{'use_pretrained_model': 1,
 'image_shape': '3,224,208',
 'num_classes': 600,
 'num_training_samples': 24439,
 'learning_rate': 0.001,
 'mini_batch_size': 5}
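
Counting classes with `ls -l | wc -l` relies on subtracting the `total` line and would miscount if stray files sat next to the class folders. A pure-Python alternative can be sketched as follows. The helper names are hypothetical, and the sketch assumes one sub-folder per class and one image per non-empty LST line.

```python
import os


def count_classes(dataset_dir: str) -> int:
    """Count the class sub-folders directly under dataset_dir."""
    with os.scandir(dataset_dir) as entries:
        return sum(1 for entry in entries if entry.is_dir())


def count_samples(lst_path: str) -> int:
    """Count the non-empty lines (one per image) in an LST file."""
    with open(lst_path) as f:
        return sum(1 for line in f if line.strip())


# Usage sketch with the notebook's variables:
# num_classes = count_classes(os.path.join(base_dir, dataset_name))
# num_training_samples = count_samples(os.path.join(base_dir, dataset_name + "_train.lst"))
```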

### Start the training
Train our model!

This will take some time. SageMaker first provisions a new container runtime, then runs the actual training, and finally uploads the trained model to S3 and shuts the container down.

More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator.fit
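
The training cell builds a job name from the dataset name and a timestamp, replacing underscores with hyphens. A slightly more defensive version can be sketched as below. It assumes that training job names may contain only letters, digits, and hyphens and are limited to 63 characters; treat both limits as assumptions to check against the SageMaker documentation. The helper `make_job_name` is hypothetical.

```python
import re
import time


def make_job_name(prefix: str, dataset_name: str, max_len: int = 63) -> str:
    """Build a job name from a prefix, a dataset name, and a Unix timestamp."""
    raw = f"{prefix}-{dataset_name}"
    # Replace anything other than letters, digits, and hyphens with a hyphen
    cleaned = re.sub(r"[^A-Za-z0-9-]+", "-", raw).strip("-")
    suffix = "-" + str(int(time.time()))
    # Trim the base so the timestamp suffix always fits within max_len
    return cleaned[: max_len - len(suffix)].rstrip("-") + suffix


print(make_job_name("IC", "socofing_snn"))
```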

In [None]:
%%time

import time
now = str(int(time.time()))
training_job_name = 'IC-' + dataset_name.replace('_', '-') + '-' + now

image_classifier.fit(inputs=data_channels, job_name=training_job_name, logs=True)

job = image_classifier.latest_training_job
model_path = f"{base_dir}/{job.name}"

print(f"\n\n Finished training! The model is available for download at: {image_classifier.output_path}/{job.name}/output/model.tar.gz")

2021-06-28 20:39:23 Starting - Starting the training job...
2021-06-28 20:39:47 Starting - Launching requested ML instancesProfilerReport-1624912763: InProgress
...
2021-06-28 20:40:17 Starting - Preparing the instances for training.........
2021-06-28 20:41:51 Downloading - Downloading input data...
2021-06-28 20:42:08 Training - Downloading the training image...
2021-06-28 20:42:53 Training - Training image download completed. Training in progress..
Docker entrypoint called with argument(s): train
[06/28/2021 20:42:57 INFO 140273260308288] Reading default configuration from /opt/amazon/lib/python3.7/site-packages/image_classification/default-input.json: {'use_pretrained_model': 0, 'num_layers': 152, 'epochs': 30, 'learning_rate': 0.1, 'lr_scheduler_factor': 0.1, 'optimizer': 'sgd', 'momentum': 0, 'weight_decay': 0.0001, 'beta_1': 0.9, 'beta_2': 0.999, 'eps': 1e-08, 'gamma': 0.9, 'mini_batch_size': 32, 'image_shape': '3,224,224', 'precision_dtype': 'float32'}
[06

## Deploy the trained model
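Once a model has been trained, we can deploy it to a fully managed endpoint. The same `image_classifier` object can do this through its `deploy` method; the cells below instead use the boto3 SageMaker client to create the model, the endpoint configuration, and the endpoint step by step.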

More info here: https://sagemaker.readthedocs.io/en/stable/estimators.html#sagemaker.estimator.Estimator.deploy

Create Model:

In [7]:
%%time
import time
import boto3

sage = boto3.Session().client(service_name="sagemaker")

model_name = "ID-classification-model" + time.strftime(
    "-%Y-%m-%d-%H-%M-%S", time.gmtime()
)
print(model_name)
info = sage.describe_training_job(TrainingJobName='IC-socofing-snn-1624908567')
model_data = info["ModelArtifacts"]["S3ModelArtifacts"]
print(model_data)

#hosting_image = get_image_uri(boto3.Session().region_name, "image-classification")
hosting_image = sagemaker.image_uris.retrieve('image-classification', sess.boto_region_name)
primary_container = {
    "Image": hosting_image,
    "ModelDataUrl": model_data,
}

create_model_response = sage.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)

print(create_model_response["ModelArn"])

ID-classification-model-2021-06-29-14-47-57
s3://sagemaker-us-east-2-951232522638/socofing-snn/output/IC-socofing-snn-1624908567/output/model.tar.gz
arn:aws:sagemaker:us-east-2:951232522638:model/id-classification-model-2021-06-29-14-47-57
CPU times: user 86.9 ms, sys: 7.32 ms, total: 94.2 ms
Wall time: 518 ms


Endpoint Configuration

In [8]:
import time

job_name_prefix = "ID-classification-model"

timestamp = time.strftime("-%Y-%m-%d-%H-%M-%S", time.gmtime())
endpoint_config_name = job_name_prefix + "-epc-" + timestamp
endpoint_config_response = sage.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint configuration name: {}".format(endpoint_config_name))
print("Endpoint configuration arn:  {}".format(endpoint_config_response["EndpointConfigArn"]))

Endpoint configuration name: ID-classification-model-epc--2021-06-29-14-48-03
Endpoint configuration arn:  arn:aws:sagemaker:us-east-2:951232522638:endpoint-config/id-classification-model-epc--2021-06-29-14-48-03


Create Endpoint

In [9]:
%%time
import time

timestamp = time.strftime("-%Y-%m-%d-%H-%M-%S", time.gmtime())
endpoint_name = job_name_prefix + "-ep-" + timestamp
print("Endpoint name: {}".format(endpoint_name))

endpoint_params = {
    "EndpointName": endpoint_name,
    "EndpointConfigName": endpoint_config_name,
}
endpoint_response = sage.create_endpoint(**endpoint_params)
print("EndpointArn = {}".format(endpoint_response["EndpointArn"]))

Endpoint name: ID-classification-model-ep--2021-06-29-14-48-09
EndpointArn = arn:aws:sagemaker:us-east-2:951232522638:endpoint/id-classification-model-ep--2021-06-29-14-48-09
CPU times: user 5.7 ms, sys: 518 µs, total: 6.22 ms
Wall time: 157 ms


Get the status of the endpoint

In [10]:
# describe_endpoint and get_waiter are methods of the boto3 SageMaker client (`sage`),
# not of the `sagemaker` module
response = sage.describe_endpoint(EndpointName=endpoint_name)
status = response["EndpointStatus"]
print("EndpointStatus = {}".format(status))


# wait until the endpoint is in service
sage.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)


# print the status of the endpoint
endpoint_response = sage.describe_endpoint(EndpointName=endpoint_name)
status = endpoint_response["EndpointStatus"]
print("Endpoint creation ended with EndpointStatus = {}".format(status))

if status != "InService":
    raise Exception("Endpoint creation failed.")
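
The waiting step can also be written as an explicit polling loop, which makes the status transitions easier to follow. This is a minimal sketch, not the waiter's actual implementation. It assumes `InService` and `Failed` are the only states worth stopping on, and the function name, polling interval, and attempt limit are hypothetical.

```python
import time
from typing import Callable, Dict


def wait_for_endpoint(describe: Callable[[], Dict], poll_seconds: float = 30.0,
                      max_attempts: int = 120) -> str:
    """Poll describe() until the endpoint reports InService or Failed."""
    for _ in range(max_attempts):
        status = describe()["EndpointStatus"]
        if status in ("InService", "Failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Endpoint did not reach a final state in time")


# Usage sketch (assumes `sage` is the boto3 SageMaker client created earlier):
# status = wait_for_endpoint(lambda: sage.describe_endpoint(EndpointName=endpoint_name))
```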

## Calling a deployed endpoint from Python code

If you want to try using a deployed endpoint from Python, here's a function that you can use. It takes in a path to the image you'd like to classify, and a list of all the classes used for training.

In [None]:
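import json
import boto3
import numpy as np

# Runtime client used to invoke the endpoint created above (`endpoint_name`)
runtime = boto3.Session().client(service_name="sagemaker-runtime")

def classify_deployed(file_name, classes):
    # Read the image as raw bytes
    with open(file_name, 'rb') as f:
        payload = f.read()

    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/x-image',
        Body=payload,
    )
    # The response body is a JSON list with one probability per class
    result = json.loads(response['Body'].read())
    best_prob_index = np.argmax(result)
    return (classes[best_prob_index], result[best_prob_index])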
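
With several hundred classes, the single best match can hide close runners-up. The sketch below returns the top few candidates from a list of per-class probabilities. It assumes the endpoint returns one probability per class in label-index order; the helper `top_k_classes` and the sample values are hypothetical.

```python
def top_k_classes(probabilities, classes, k=3):
    """Return the k most probable (class name, probability) pairs, best first."""
    # Rank label indices by probability, highest first
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(classes[i], probabilities[i]) for i in ranked[:k]]


# Illustrative values only, not real model output
print(top_k_classes([0.1, 0.7, 0.2], ["id-1", "id-10", "id-100"], k=2))
```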



### Clean up

When we're done with the endpoint, we can delete it and the backing instances will be released. Run the following cell to delete the endpoint.

In [None]:
# The endpoint was created through the boto3 client, so it is deleted the same way
sage.delete_endpoint(EndpointName=endpoint_name)
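
Because the model, the endpoint configuration, and the endpoint were created as separate resources, a fuller cleanup may remove all three. The following is a minimal sketch, assuming `client` is a boto3 SageMaker client such as `sage`; the helper name is hypothetical.

```python
def delete_endpoint_resources(client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model."""
    # Deleting the endpoint releases the backing instances
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)


# Usage sketch with the names created earlier in this notebook:
# delete_endpoint_resources(sage, endpoint_name, endpoint_config_name, model_name)
```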