# PostScriptML: Modeling Notebook
### by Dolci Key 

By following this notebook, you should be able to run a training job on an Amazon SageMaker notebook instance. The notebook mainly acts as a driver: I used it to run the model scripts (the model code and specifications).
 
Please review the Python scripts in the SCRIPTS folder of the repo for further insight into the model parameters. The code draws on Paul Breton's code-along for AWS SageMaker on Medium.com. He suggests using scripts, which helps keep models separated. AWS also stores each model in the S3 bucket after it finishes running.

This modeling process was iterative: I started with the vanilla baseline script, made improvements, added further updates, ran model_script_ii, and then iterated on each later script from that one.

## Import Libraries

In [1]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline 

import keras  # standalone Keras; this import prints "Using TensorFlow backend."
import tensorflow as tf
from tensorflow import keras  # rebinds `keras` to tf.keras for the rest of the notebook
from tensorflow.keras.preprocessing import image
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.layers import BatchNormalization
from PIL import Image

import os
import gc

import pickle
from timeit import default_timer as timer

Using TensorFlow backend.





## Starting an AWS SageMaker Instance
The libraries below must be imported in an AWS SageMaker Notebook Instance. This will not work in a normal Jupyter Notebook environment. Once the libraries are imported, you start a session and connect your data.

In [27]:
# Requires an AWS SageMaker Notebook Instance

import sagemaker
from sagemaker.tensorflow import TensorFlow
from tensorflow.python.keras.preprocessing.image import load_img

In [28]:
sagemaker_session = sagemaker.Session()

In [29]:
role = sagemaker.get_execution_role()

## S3 Bucket Connection
Here I connect my S3 bucket. The bucket name MUST start with 'sagemaker-' for this to work. Note that a bucket cannot be renamed once it is created; if you make this mistake, you can move the data from one bucket to another.

A special thanks to Aren Carpenter for helping troubleshoot issues with the S3 buckets. 


In [30]:
bucket = 'sagemaker-postscriptml' # Name of the AWS S3 bucket holding the dataset
train_instance_type = 'ml.m4.xlarge' # AWS EC2 Instance used for training
deploy_instance_type = 'ml.m4.xlarge' # AWS EC2 Instance used for deployment
hyperparameters = {'learning_rate': 0.001, 'decay': 0.0001}

In [31]:
train_input_path = 's3://{}/TRAIN'.format(bucket) # Path to training data 
test_input_path = 's3://{}/TEST'.format(bucket) # Path to test data
validation_input_path = 's3://{}/VALIDATION'.format(bucket) # Path to validation data 
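
The naming rule and the path construction can be checked before a job is launched. This is a minimal sketch, assuming the `sagemaker-` prefix requirement described in this notebook; `build_channel_paths` is a hypothetical helper and not part of the SageMaker SDK.

```python
def build_channel_paths(bucket, prefixes, required_prefix="sagemaker-"):
    """Return {prefix: 's3://bucket/prefix'} after checking the bucket name.

    `required_prefix` reflects the naming rule described in this notebook;
    it is an assumption about the account setup, not a universal S3 rule.
    """
    if not bucket.startswith(required_prefix):
        raise ValueError(
            "Bucket {!r} does not start with {!r}".format(bucket, required_prefix)
        )
    return {p: "s3://{}/{}".format(bucket, p) for p in prefixes}


paths = build_channel_paths("sagemaker-postscriptml", ["TRAIN", "TEST", "VALIDATION"])
print(paths["TRAIN"])  # s3://sagemaker-postscriptml/TRAIN
```

Failing early on a misnamed bucket is cheaper than waiting for a training job to start and then fail on data access.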

## Running Models with Scripts 

Once my data was set up, I created the model using the TensorFlow estimator. These steps vary between Python 2.7 and Python 3. With 2.7 I could pass the training and evaluation steps directly as arguments; with Python 3, the Python version had to be specified and the training and evaluation steps had to be moved into my hyperparameter dictionary.
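
As a rough illustration of the Python 3 case, the step counts can be merged into the hyperparameter dictionary before it is passed to the estimator. This is only a sketch: the key names `training_steps` and `evaluation_steps` mirror the estimator arguments used in this notebook, and whether a given training script reads them under those names is an assumption. `with_step_counts` is a hypothetical helper.

```python
def with_step_counts(hyperparameters, training_steps, evaluation_steps):
    """Return a copy of `hyperparameters` that also carries the step counts."""
    merged = dict(hyperparameters)  # copy so the original dict is untouched
    merged["training_steps"] = training_steps      # assumed key name
    merged["evaluation_steps"] = evaluation_steps  # assumed key name
    return merged


base = {"learning_rate": 0.001, "decay": 0.0001}
py3_hyperparameters = with_step_counts(base, training_steps=100, evaluation_steps=30)
print(py3_hyperparameters)
```

Copying the dictionary keeps the Python 2.7 configuration reusable alongside the Python 3 one.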

I read in the script from the SCRIPTS folder of my repo, where you can find each script that I tested. For each script I logged the accuracy and the highest step accuracy from the evaluation to keep track of my modeling runs.

I utilized the tutorial on Amazon SageMaker and scripting from Paul Breton to base my scripts and models on for AWS. 

## Metrics

I used binary cross-entropy as the loss function and binary accuracy as the metric for this CNN. I worked to minimize the loss as much as I could, because the baseline had a decent accuracy but a loss of 1.0.

So far the best model is model_script_iii, which has a loss of 0.343 and a binary accuracy of 0.9375.

My next goals after the MVP are to incorporate a sigmoid activation function to give feedback on the image, and to code additional metrics such as recall that will help minimize false negatives.
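
To make the metrics concrete, here is a small worked sketch in plain Python that computes binary cross-entropy, binary accuracy, and recall from predicted probabilities. The labels and probabilities are made up for illustration and are not results from this project. The 0.5 threshold is also an assumption.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that land on the correct side of the threshold."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def recall(y_true, y_prob, threshold=0.5):
    """True positives / (true positives + false negatives)."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p > threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p <= threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.4, 0.8]

loss = binary_cross_entropy(y_true, y_prob)
acc = binary_accuracy(y_true, y_prob)
rec = recall(y_true, y_prob)
print(round(loss, 3), acc, round(rec, 3))
```

In this toy example the positive sample scored at 0.4 is a false negative. It lowers recall to 2/3 while accuracy stays at 0.75, which is why recall is worth tracking separately when false negatives matter.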


In [37]:
model = TensorFlow(
  entry_point="SCRIPTS/vanilla_ii.py", # Your entry script, relative to the notebook's working directory
  role=role,
  framework_version='1.12.0', # TensorFlow version
  training_steps=100,
  evaluation_steps=30,
  hyperparameters=hyperparameters, # For Python 3, specify the training and evaluation steps in this dictionary instead
  train_instance_count=1, # Number of training instances to use
  train_instance_type=train_instance_type,
)

## Running the model 

After running the code block, you will see: 
    
Training ...
2020-09-23 05:07:46 Starting - Starting the training job...
2020-09-23 05:07:49 Starting - Launching requested ML instances......
2020-09-23 05:09:11 Starting - Preparing the instances for training......
2020-09-23 05:10:07 Downloading - Downloading input data...
2020-09-23 05:10:47 Training - Downloading the training image...
2020-09-23 05:11:07 Training - Training image download completed. Training in progress.
        
These messages show the job starting up, after which the model begins to run. If the job errors out, the traceback appears at the end, so keep watching the output until you see training steps running.
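
When a job fails, the most useful part of the exception is the text after `Reason:`. The sketch below pulls that line out of an error message, assuming the message follows the `Error for Training job <name>: Failed. Reason: <text>` pattern seen in this notebook's own failure log. `extract_failure_reason` is a hypothetical helper, not a SageMaker SDK function.

```python
import re


def extract_failure_reason(message):
    """Return the rest of the line after 'Reason:', or None if it is absent."""
    match = re.search(r"Reason:\s*(.+)", message)
    return match.group(1).strip() if match else None


# Message text copied from this notebook's failed run.
sample = (
    "Error for Training job sagemaker-tensorflow-2020-09-29-00-12-57-911: "
    "Failed. Reason: AlgorithmError: uncaught exception during training: "
    "'DType' object has no attribute 'reshape'\n"
    "Traceback (most recent call last):"
)
print(extract_failure_reason(sample))
```

In a notebook, one way to use this would be to wrap the `model.fit(...)` call in a `try`/`except` and pass `str(e)` to the helper, so the reason is logged next to the script name.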


In [38]:
print("Training ...")
model.fit({'training': train_input_path, 'evaluation': validation_input_path})

Training ...
2020-09-29 00:12:58 Starting - Starting the training job...
2020-09-29 00:13:00 Starting - Launching requested ML instances......
2020-09-29 00:14:08 Starting - Preparing the instances for training......
2020-09-29 00:14:59 Downloading - Downloading input data...
2020-09-29 00:15:45 Training - Downloading the training image..
2020-09-29 00:16:04,966 INFO - root - running container entrypoint
2020-09-29 00:16:04,967 INFO - root - starting train task
2020-09-29 00:16:04,997 INFO - container_support.training - Training starting
Downloading s3://sagemaker-us-east-2-997425579135/sagemaker-tensorflow-2020-09-29-00-12-57-911/source/sourcedir.tar.gz to /tmp/script.tar.gz
2020-09-29 00:16:08,968 INFO - tf_container - ----------------------TF_CONFIG--------------------------
2020-09-29 00:16:08,968 INFO - tf_container - {"environment": "cloud", "cluster": {"master": ["algo-1:2222"]}, "task": {"index": 0, "type": "master"}}
20

UnexpectedStatusException: Error for Training job sagemaker-tensorflow-2020-09-29-00-12-57-911: Failed. Reason: AlgorithmError: uncaught exception during training: 'DType' object has no attribute 'reshape'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
    fw.train()
  File "/usr/local/lib/python2.7/dist-packages/tf_container/train_entry_point.py", line 177, in train
    train_wrapper.train()
  File "/usr/local/lib/python2.7/dist-packages/tf_container/trainer.py", line 73, in train
    tf.estimator.train_and_evaluate(estimator=estimator, train_spec=train_spec, eval_spec=eval_spec)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 637, in run
    getattr(self, task_to_run)()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 674, in run_master
    self._start_distributed_training(saving_listeners=saving_lis