### Training and deploying a simple LSTM model for binary text classification with SageMaker


In [1]:
# sagemaker provides get_execution_role(), which returns the IAM role needed for training
import sagemaker

In [2]:
role = sagemaker.get_execution_role()

In [3]:
# the training inputs are assumed to live on S3
bucket = 'tweet-train'
# must contain the training data in a pandas-readable format, along with model.pkl, which holds the GloVe embeddings
input_dir = 's3://{}/inputs/'.format(bucket)
# the outputs of the job are stored here, e.g. artifacts (in our case the TensorFlow Serving model)
output_dir = 's3://{}/outputs'.format(bucket)

In [4]:
from sagemaker.tensorflow.estimator import TensorFlow
# This is the heart of the application: after building and training the model, it produces a
# TensorFlow 2 SavedModel directory, which ends up on S3.
# The training data also lives on S3 for convenience; more on this later.
estimator = TensorFlow(entry_point="train.py", # the script that actually trains the model
                    source_dir="train", # directory containing the entry point and its dependencies
                    output_path=output_dir, # if specified, artifacts (the SavedModel directory) are written here
                    role=role, # IAM role
                    py_version="py3",
                    framework_version='2.1.0',
                    train_instance_count=1,
                    train_instance_type='ml.p2.xlarge', # a GPU-backed instance
                    hyperparameters={
                        'epochs': 1,
                        'hidden_dim': 200,
                        'pad_length' : 30, # we are training on sentences, this specifies the length to pad to 
                        'batch_size' : 20 
                    })

In [6]:
# launch the training job; the 'training' channel is mounted at /opt/ml/input/data/training in the container
estimator.fit({'training': input_dir})

'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.
's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


2020-09-06 19:25:48 Starting - Starting the training job...
2020-09-06 19:25:50 Starting - Launching requested ML instances......
2020-09-06 19:27:18 Starting - Preparing the instances for training......
2020-09-06 19:28:17 Downloading - Downloading input data...
2020-09-06 19:28:40 Training - Downloading the training image.........
2020-09-06 19:30:14 Training - Training image download completed. Training in progress..
2020-09-06 19:30:18,394 sagemaker-containers INFO     Imported framework sagemaker_tensorflow_container.training
2020-09-06 19:30:18,835 sagemaker-containers INFO     Invoking user script

Training Env:

{
    "additional_framework_parameters": {},
    "channel_input_dirs": {
        "training": "/opt/ml/input/data/training"
    },
    "current_host": "algo-1",
    "framework_module": "sagemaker_tensorflow_container.training:main",
    "hosts": [
        "algo-1"
    ],
    "hyperparameters": {
        "pad_length": 30,
        "batch_size

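The log above shows how SageMaker puts everything together: it launches the instance, downloads the input data and the training image, and then invokes the user script with the training environment shown. The output that follows contains a lot of noise related to the embeddings; ignore it and scroll down until readable text resumes.

The important lines are: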

Number of negative samples 4342

Number of positive samples 3271

Resampling negative as fraction 0.7533394748963611

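These lines come from our training code, which undersamples the negative class to balance it against the positive class: the fraction 0.7533... is 3271 / 4342, so roughly 75% of the negative samples are kept. The training instance writes the resulting model locally to /opt/ml/model/1, and SageMaker then copies it to the specified output location on S3.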
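
The resampling fraction in the log can be reproduced with a few lines. The following is only a minimal sketch of undersampling, not the code in train.py (which is not shown here); the function name `undersample_negatives`, the 0/1 label encoding, and the fixed seed are assumptions made for illustration.

```python
import random

def undersample_negatives(labels, seed=0):
    """Return indices of a class-balanced subset, plus the fraction of negatives kept.

    Assumes labels are 0 (negative) or 1 (positive) and that negatives
    outnumber positives, as in the log above.
    """
    neg = [i for i, y in enumerate(labels) if y == 0]
    pos = [i for i, y in enumerate(labels) if y == 1]
    frac = len(pos) / len(neg)  # fraction of negatives to keep
    kept_neg = random.Random(seed).sample(neg, round(frac * len(neg)))
    return sorted(kept_neg + pos), frac

# Same class counts as reported by the training job
labels = [0] * 4342 + [1] * 3271
indices, frac = undersample_negatives(labels)
print(frac)          # fraction of negative samples kept
print(len(indices))  # size of the balanced subset
```

After resampling, both classes contribute 3271 samples each.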


In [7]:
# the S3 location of the model data; in our case an archive of the entire SavedModel directory
estimator.model_data

's3://tweet-train/outputs/tensorflow-training-2020-09-06-19-25-48-160/output/model.tar.gz'

As you can see, the artifact is stored under the output path passed to the estimator, i.e. **s3://tweet-train/outputs/**
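
Since the training code writes to /opt/ml/model/1, the archive should contain a top-level directory named 1 holding the SavedModel. The sketch below shows how such an archive could be inspected with the standard library once downloaded. To stay self-contained it builds a small stand-in archive in memory; the file names are assumed from the standard SavedModel layout and the payloads are placeholders, not the real artifact.

```python
import io
import tarfile

def list_archive(data: bytes):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return sorted(tar.getnames())

# Build a stand-in for model.tar.gz in memory (placeholder contents only)
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name in ("1/saved_model.pb", "1/variables/variables.index"):
        payload = b"placeholder"
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

names = list_archive(buf.getvalue())
print(names)
```

For the real artifact, the bytes would come from the file at estimator.model_data after downloading it from S3.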

In [9]:
# calling deploy on the same estimator uses estimator.model_data to stand up a
# TensorFlow Serving endpoint
predictor = estimator.deploy(initial_instance_count = 1, instance_type='ml.p2.xlarge')

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


----------------!

The output above indicates that the deployment was successful. We can now send individual sentences to the endpoint and get predictions back; in our case, whether a tweet describes a disaster or not.

Note that feeding raw sentences to a model, rather than embeddings, is unusual unless the model's first layer is an embedding layer. That is not the case here: this model is a vanilla LSTM that relies on externally trained GloVe embeddings, so the network itself expects 50-dimensional float vectors.
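
To make the expected input concrete, here is a rough sketch of turning a sentence into a padded sequence of 50-dimensional vectors. It is not the code from tweet_model.py: the lowercase whitespace tokenization, the zero vector for unknown words and padding, and the toy embedding table are all assumptions for illustration. Only the pad length of 30 (the pad_length hyperparameter) and the embedding size of 50 come from this notebook.

```python
PAD_LENGTH = 30  # 'pad_length' hyperparameter passed to the estimator
EMBED_DIM = 50   # embedding size stated in the text

def embed_sentence(sentence, embeddings, pad_length=PAD_LENGTH, dim=EMBED_DIM):
    """Map a sentence to pad_length vectors of size dim (hypothetical helper)."""
    zero = [0.0] * dim
    # Assumed tokenization: lowercase and split on whitespace;
    # words missing from the table map to the zero vector.
    vectors = [embeddings.get(tok, zero) for tok in sentence.lower().split()]
    vectors = vectors[:pad_length]                    # truncate long sentences
    vectors += [zero] * (pad_length - len(vectors))   # pad short ones
    return vectors

# Toy embedding table standing in for the GloVe vectors in model.pkl
toy_embeddings = {"horse": [0.1] * EMBED_DIM, "man": [0.2] * EMBED_DIM}
matrix = embed_sentence("the horse man is go", toy_embeddings)
print(len(matrix), len(matrix[0]))  # 30 50
```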

This behavior is set up in tweet_model.py, where the model's predict function is decorated with the following signature:

    @tf.function(input_signature=[tf.TensorSpec(shape=[1],dtype=tf.string)])

Combined with

    signatures = {"serving_default": predict,"predict": predict}

and tf.saved_model.save(model,model_path, signatures=signatures), this makes the default serving function accept only a single string and route it through the predict function defined in the model, so the conversion from text to GloVe vectors has to happen inside predict.

Had we not done this, TensorFlow Serving would have attempted to apply the model directly to the raw input value, yielding an error.
 

In [10]:
predictor.predict("the horse man is go")

{'predictions': [[0.291848689]]}
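
The response is a nested list holding a single score. A small helper can turn it into a class label. This is a sketch only: the 0.5 threshold and the name `to_label` are assumptions, and which class counts as positive depends on how the labels were encoded in the training data.

```python
def to_label(response, threshold=0.5):
    """Extract the score from a prediction response and threshold it."""
    score = response["predictions"][0][0]
    return score, int(score >= threshold)

score, label = to_label({'predictions': [[0.291848689]]})
print(score, label)  # 0.291848689 0
```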