<div style="text-align: right"> &uarr;   Ensure Kernel is set to  &uarr;  </div><br><div style="text-align: right"> 
conda_python3  </div>

# PyTorch Estimator: Bring Your Own Script

In this notebook we will train and run a PyTorch model that classifies the junctions as priority, signal, or roundabout, as seen in the data-prep notebook.

The outline of this notebook is:

1. Prepare a training script (provided).

2. Use the AWS-provided PyTorch container and pass our script to it.

3. Run training.

4. Deploy the model to an endpoint.

5. Test the endpoint using images, in a couple of possible ways.

Upgrade the SageMaker SDK so we can access the latest containers.

In [None]:
!pip install -U sagemaker

Next, we import the libraries and set up the initial variables used in this lab.

In [None]:
import os
import sagemaker
import numpy as np
from sagemaker.pytorch import PyTorch

ON_SAGEMAKER_NOTEBOOK = False

sagemaker_session = sagemaker.Session()
if ON_SAGEMAKER_NOTEBOOK:
    role = sagemaker.get_execution_role()
else:
    role = "arn:aws:iam::ACCOUNTNUMHERE:role/service-role/AmazonSageMaker-ExecutionRole"

import boto3
client = boto3.client('sagemaker-runtime')

In the cell below, set `bucket` to the name of the bucket you created in the data-prep notebook.

In [None]:
bucket = "labeled-images"
# key = "data-folder"   (in case you structure your data as your-bucket/data-folder) 
training_data_uri="s3://{}".format(bucket)
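
The commented-out `key` line hints at data stored under a folder inside the bucket. A minimal sketch of building the training URI for both layouts follows; `build_s3_uri` is a hypothetical helper, not part of the SageMaker SDK, and the folder name is only an example.

```python
def build_s3_uri(bucket, key=None):
    """Return s3://bucket or s3://bucket/key (hypothetical helper)."""
    if not key:
        return "s3://{}".format(bucket)
    # Strip stray slashes so the URI never contains "//" after the bucket name
    return "s3://{}/{}".format(bucket, key.strip("/"))


print(build_s3_uri("labeled-images"))
print(build_s3_uri("labeled-images", "data-folder/"))
```

The result can be passed to `estimator.fit` in place of `training_data_uri`.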

### PyTorch Estimator

Use the open source containers provided by AWS. These containers can be extended by starting from the image provided by AWS and adding additional installs in a Dockerfile,

or you can use a requirements.txt in source_dir to install additional libraries.

The code below sets up the PyTorch estimator.


In [None]:
estimator = PyTorch(entry_point='ptModelCode.py',
                    role=role,
                    framework_version='1.8',
                    instance_count=1,
                    instance_type='ml.p3.2xlarge',
                    py_version='py3',
                    # available hyperparameters: emsize, nhid, nlayers, lr, clip, epochs, batch_size,
                    #                            bptt, dropout, tied, seed, log_interval
                    )

Now we call the estimator's `fit` method with the URI of the training data to start the training. <br>
**Note:** This cell takes approximately **20 mins** to run.

In [None]:
%%time
estimator.fit(training_data_uri)

## **NOTE:** <br>
If at this point your kernel disconnects from the server (you can tell because the kernel in the top right-hand corner will say **No Kernel**),<br>you can reattach to the training job (so you don't have to start the training job again).<br>Follow the steps below:
1. Scroll to the top of your notebook and set the kernel to the recommended kernel specified in the top right-hand corner of the notebook
2. Go to your SageMaker console, go to Training Jobs, and copy the name of the training job you were disconnected from
3. Scroll to the bottom of this notebook and paste your training job name in place of **your-training-job-name** in the cell
4. Replace **your-unique-bucket-name** with the name of the bucket you created in the data-prep notebook
5. Run the edited cell
6. Return to this cell and continue executing the rest of this notebook

We can read the estimator's `model_data` attribute to find the S3 location of the trained model artifacts.

In [None]:
estimator.model_data
latest_model = estimator.model_data

In [None]:
estimator

In [None]:

latest_model = "s3://sagemaker-us-west-2-ACCOUNTNUMBER/pytorch-training-2023-05-30-18-32-57-416/output/model.tar.gz"

#### Deploying a model
Once trained, deploying a model is a simple call.

**Note:** Make sure `latest_model` holds the model artifact URI of your own training job, as shown by the cells above.

In [None]:
from sagemaker.pytorch import PyTorchModel
pytorch_model = PyTorchModel(model_data=latest_model, 
                             role=role, 
                             entry_point='ptInfCode.py', 
                             framework_version='1.8',
                             py_version='py3')
predictor = pytorch_model.deploy(instance_type='ml.m5.2xlarge', initial_instance_count=1)

Now let's get the endpoint name from the predictor.

In [None]:
print(predictor.endpoint_name)

Now that our endpoint is up and running, let's test it with all of our unseen images and see how well it does.


In [None]:
print("Bucket:", bucket)
s3_client = boto3.client('s3')
test_files = []
response = s3_client.list_objects_v2(
    Bucket=bucket,
    Prefix='test'
)
for item in response['Contents']:
    test_files.append(item['Key'])
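
`list_objects_v2` returns at most 1,000 keys per call, so a larger test set would be cut short. The sketch below uses the boto3 paginator to collect every key, assuming the same bucket and prefix as the cell above. `list_keys` is a hypothetical helper name, and it takes the S3 client as an argument so that it can be exercised without AWS access.

```python
def list_keys(s3_client, bucket, prefix):
    """Collect every object key under a prefix, following pagination."""
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no matching objects
        for item in page.get("Contents", []):
            keys.append(item["Key"])
    return keys

# Usage in this notebook (requires AWS credentials):
# test_files = list_keys(boto3.client("s3"), bucket, "test")
```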


In [None]:
test_files

In [None]:
# Test manual model
import io
import json
import tempfile
import pandas as pd


s3 = boto3.resource('s3', region_name='us-west-2')
s3_bucket = s3.Bucket(bucket)

endpoint_name = predictor.endpoint_name

# Each row: file key, labeled category, fight probability, no-fight probability
inference_data = []
df = None
for file_object in test_files:
    # Skip folder placeholder keys such as "test/", which hold no image data
    if file_object.endswith('/'):
        continue
    s3_object = s3_bucket.Object(file_object)

    # Download the image into memory instead of a temporary file
    buffer = io.BytesIO()
    s3_object.download_fileobj(buffer)

    # Send the raw image bytes to the endpoint
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/x-image',
        Body=buffer.getvalue())
    result = json.loads(response['Body'].read().decode("utf-8"))
    # The second path component of the key is the labeled category (Fight or NoFight)
    inf_data_row = [file_object, file_object.split('/')[1], result[0]['Fight'], result[0]['No Fight']]
    inference_data.append(inf_data_row)

df = pd.DataFrame(inference_data, columns=['File','Category','FProb','NoFProb'])

# clean up inference instance
predictor.delete_endpoint()
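
The loop above takes the labeled category from the second path component of each key, which assumes a layout like `test/<Category>/<file>`. The sketch below makes that assumption explicit and returns `None` for keys that do not fit. `category_from_key` is a hypothetical helper and the example keys are made up.

```python
def category_from_key(key):
    """Return the category folder from a key shaped like test/<Category>/<file>."""
    parts = key.split("/")
    # Need at least a prefix, a category, and a non-empty file name
    if len(parts) < 3 or not parts[1] or not parts[-1]:
        return None
    return parts[1]


for key in ["test/Fight/img1.jpg", "test/NoFight/img2.jpg", "test/"]:
    print(key, "->", category_from_key(key))
```

Filtering out keys that map to `None` before calling the endpoint avoids sending empty folder placeholders for inference.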

In [None]:
df
df['FProb'] = pd.to_numeric(df['FProb'], errors='coerce')
df['NoFProb'] = pd.to_numeric(df['NoFProb'], errors='coerce')
df.groupby('Category')['FProb'].plot(legend=True, figsize=(10,10))

In [None]:
pd.set_option('display.max_rows', None)
pd.options.display.max_colwidth = 100
df
#show file names where fight was greater than 30%
df.loc[df["FProb"]>0.3]


In [None]:
POS_THRESHHOLD = 0.213
NEG_THRESHHOLD = 0.735

# convert probabilities to float
#df['FProb'] = pd.to_numeric(df['FProb'], errors='coerce')
#df['NoFProb'] = pd.to_numeric(df['NoFProb'], errors='coerce')

# separate frames into fight and no-fight labeled photos
fight_fl = df['Category']=='Fight'
no_fight_fl = df['Category']=='NoFight'

fight_df = df[fight_fl]
no_fight_df = df[no_fight_fl]

fight_detected_fl = fight_df['FProb']>=POS_THRESHHOLD
fight_detected_df = fight_df[fight_detected_fl]

no_fight_detected_fl = no_fight_df['NoFProb']>=NEG_THRESHHOLD
no_fight_detected_df = no_fight_df[no_fight_detected_fl]

true_positive = fight_detected_df['Category'].count()
true_negative = no_fight_detected_df['Category'].count()
false_positive = no_fight_df['Category'].count()-true_negative
false_negative = fight_df['Category'].count()-true_positive

print("Labeled fights:", fight_df['Category'].count(), "Labeled No Fights:", no_fight_df['Category'].count())
print("True Positive:",true_positive,"True Negative:",true_negative, "False Negative:",false_negative, "False Positive:",false_positive)

precision=true_positive/(true_positive+false_positive)
recall=true_positive/(true_positive+false_negative)
accuracy=(true_positive+true_negative)/(df['Category'].count())
print("Precision:",precision)
print("Recall:",recall)
print("Accuracy:",accuracy)
print("F Score:", 2*(precision*recall)/(precision+recall))
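
The metric formulas in the cell above divide by zero if, for example, no image ends up counted as a positive. The sketch below applies the same formulas in plain Python with that case guarded. `classification_metrics` is a hypothetical helper, it assumes every image falls into exactly one of the four counts, and the numbers in the usage line are illustrative, not results from this model.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, accuracy, and F1 from confusion counts."""
    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# Illustrative counts only
print(classification_metrics(tp=8, tn=6, fp=2, fn=4))
```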

Let's run the same test set through Amazon Rekognition Custom Labels inference to check whether Custom Labels is in fact doing a much better job.

In [None]:
#Custom Labels
import io
import json
import pandas as pd

IMAGE_MODEL_ARN = "arn:aws:rekognition:us-west-2:ACCOUNTNUMNBER:project/fight-detection-ratio-adjusted-1"

s3 = boto3.resource('s3', region_name='us-west-2')
s3_bucket = s3.Bucket(bucket)
s3_client = boto3.client('s3')
model_client = boto3.client('rekognition')

# Each row: file key, labeled category, fight probability, no-fight probability
inference_data = []
for file_object in test_files:
    response = model_client.detect_custom_labels(Image={'S3Object': {'Bucket': bucket, 'Name': file_object}},
                                                 MinConfidence=1,
                                                 ProjectVersionArn=IMAGE_MODEL_ARN)
    no_fight_prob = 0
    fight_prob = 0
    # Example response (metadata omitted):
    # {'CustomLabels': [{'Name': 'No Fight', 'Confidence': 98.4469985961914}, {'Name': 'Fight', 'Confidence': 1.5530000925064087}]}
    # 'CustomLabels' holds one or more dictionaries, each with a class name and a confidence in percent
    for classes_found in response['CustomLabels']:
        # We don't know in advance which classes will be returned
        if classes_found["Name"] == "No Fight":
            no_fight_prob = classes_found["Confidence"]/100
        if classes_found["Name"] == "Fight":
            fight_prob = classes_found["Confidence"]/100

    # The second path component of the key is the labeled category (Fight or NoFight)
    inf_data_row = [file_object,file_object.split('/')[1], fight_prob, no_fight_prob]
    inference_data.append(inf_data_row)
cl_df = pd.DataFrame(inference_data, columns=['File','Category','FProb','NoFProb'])
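
The label-parsing step can be pulled out into a function that works on the response dictionary alone, which makes it easy to check without calling Rekognition. The sketch below assumes the `CustomLabels` shape shown in the sample response in the cell above. `label_probabilities` is a hypothetical helper.

```python
def label_probabilities(response, labels=("Fight", "No Fight")):
    """Map each expected label to its confidence as a fraction in [0, 1]."""
    # Labels missing from the response default to 0.0
    probs = {label: 0.0 for label in labels}
    for found in response.get("CustomLabels", []):
        if found["Name"] in probs:
            probs[found["Name"]] = found["Confidence"] / 100
    return probs


sample = {"CustomLabels": [{"Name": "No Fight", "Confidence": 98.4469985961914},
                           {"Name": "Fight", "Confidence": 1.5530000925064087}]}
print(label_probabilities(sample))
```

Defaulting missing labels to 0.0 mirrors the `no_fight_prob = 0` and `fight_prob = 0` initialization in the loop.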



In [None]:
POS_THRESHHOLD = 0.213
NEG_THRESHHOLD = 0.735

# convert probabilities to float
cl_df['FProb'] = pd.to_numeric(cl_df['FProb'], errors='coerce')
cl_df['NoFProb'] = pd.to_numeric(cl_df['NoFProb'], errors='coerce')

# separate frames into fight and no-fight labeled photos
cl_fight_fl = cl_df['Category']=='Fight'
cl_no_fight_fl = cl_df['Category']=='NoFight'

cl_fight_df = cl_df[cl_fight_fl]
cl_no_fight_df = cl_df[cl_no_fight_fl]

cl_fight_detected_fl = cl_fight_df['FProb']>=POS_THRESHHOLD
cl_fight_detected_df = cl_fight_df[cl_fight_detected_fl]

cl_no_fight_detected_fl = cl_no_fight_df['NoFProb']>=NEG_THRESHHOLD
cl_no_fight_detected_df = cl_no_fight_df[cl_no_fight_detected_fl]

cl_true_positive = cl_fight_detected_df['Category'].count()
cl_true_negative = cl_no_fight_detected_df['Category'].count()
cl_false_positive = cl_no_fight_df['Category'].count()-cl_true_negative
cl_false_negative = cl_fight_df['Category'].count()-cl_true_positive

print("Labeled fights:", cl_fight_df['Category'].count(), "Labeled No Fights:", cl_no_fight_df['Category'].count())
print("True Positive:",cl_true_positive,"True Negative:",cl_true_negative, "False Negative:",cl_false_negative, "False Positive:",cl_false_positive)

cl_precision=cl_true_positive/(cl_true_positive+cl_false_positive)
cl_recall=cl_true_positive/(cl_true_positive+cl_false_negative)
cl_accuracy=(cl_true_positive+cl_true_negative)/(cl_df['Category'].count())
print("CL Precision:",cl_precision)
print("CL Recall:",cl_recall)
print("CL Accuracy:",cl_accuracy)
print("CL F Score:", 2*(cl_precision*cl_recall)/(cl_precision+cl_recall))

In [None]:
cl_df.to_csv('custom_lables.csv')

Now let us plot the fight probability by category for the Custom Labels results.

In [None]:
cl_df.groupby('Category')['FProb'].plot(legend=True, figsize=(10,10))



In [None]:
pd.set_option('display.max_rows', None)
#fight_df
no_fight_detected_df.count()
#fight_df

### Clean up

When we're done with the endpoint, we can just delete it and the backing instances will be released. The inference cell above already does this by calling `predictor.delete_endpoint()` after the test loop.

### Attach to a training job that has been left to run 

If your kernel becomes disconnected after your training has already started, you can reattach to the training job.<br>
In the cell below, replace **your-unique-bucket-name** with the name of the bucket you created in the data-prep notebook.<br>
Look up the training job name, use it to replace **your-training-job-name**, and then run the cell below. <br>
Once the training job has finished, you can continue with the cells after the training cell.

In [None]:
import sagemaker
import boto3
from sagemaker.pytorch import PyTorch

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
client = boto3.client('sagemaker-runtime')

bucket = "your-unique-bucket-name"

training_job_name = 'your-training-job-name'

if 'your-training' not in training_job_name:
    estimator = sagemaker.estimator.Estimator.attach(training_job_name=training_job_name, sagemaker_session=sess)