I have received a couple of requests to provide a more concrete demonstration of how to use a TensorFlow SavedModel. The following is some example code that shows how to put this together.

### Imports and General Setup

The imports section is essentially the same as what was reviewed in previous tutorials.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
!pip install --upgrade pip

Requirement already up-to-date: pip in /home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages (19.3.1)


In [3]:
!yes | pip uninstall tensorflow
!pip install tensorflow-gpu==2.0.0

yes: standard output: Broken pipe
yes: write error


In [4]:
!pip install -q transformers==2.1.1

In [5]:
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification, glue_convert_examples_to_features

In [6]:
tf.__version__

'2.0.0'

In [7]:
# XLA (Accelerated Linear Algebra) is TensorFlow's optimizing compiler for machine learning.
# It can potentially increase speed by 15% with no source code changes.
USE_XLA = False

# Automatic mixed precision (AMP) can help to reduce training time.
# Mixed-precision results are reported at https://github.com/huggingface/transformers/tree/master/examples
USE_AMP = False

In [8]:
tf.config.optimizer.set_jit(USE_XLA)
tf.config.optimizer.set_experimental_options({"auto_mixed_precision": USE_AMP})

Although predictions can be done using a CPU, I am running this notebook using a GPU (in order to reduce the time needed to make predictions).

In [9]:
# GPU USAGE
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available:  1


### Step 1 - Use an example for testing

In the previous tutorial, we looked at building a sentiment classifier for Yelp reviews. At the end of that exercise we saved our model so that we can reuse it.

As seen in the output of the command below, the SavedModel requires attention_mask, input_ids, and token_type_ids as inputs. These are the inputs required by the Google BERT model that we are using. Luckily, the HuggingFace Transformers library can convert a sentence into the required inputs.



In [10]:
!saved_model_cli show --dir /home/ec2-user/SageMaker/tensorflow-tutorials/20191227 --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['attention_mask'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_attention_mask:0
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_input_ids:0
  inputs['token_type_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_token_type_ids:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
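
To make the three inputs in the signature above more concrete, here is a minimal pure-Python sketch of how a single sentence could be laid out as fixed-length (128) input_ids, attention_mask, and token_type_ids. The helper name build_bert_inputs and the word-piece IDs are made up for illustration, and the special-token IDs (101 for [CLS], 102 for [SEP], 0 for [PAD]) are assumed to follow the standard BERT vocabulary. In practice, the tokenizer and glue_convert_examples_to_features do this work for us.

```python
MAX_LENGTH = 128                      # matches the (-1, 128) shape in the signature
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # assumed standard BERT special-token IDs


def build_bert_inputs(token_ids, max_length=MAX_LENGTH):
    """Hypothetical helper: lay out one sentence as the three BERT inputs."""
    # Reserve two positions for [CLS] and [SEP], truncating if necessary.
    body = list(token_ids)[: max_length - 2]
    ids = [CLS_ID] + body + [SEP_ID]
    padding = max_length - len(ids)
    return {
        # Real tokens first, then padding up to the fixed length.
        "input_ids": ids + [PAD_ID] * padding,
        # 1 marks a real token, 0 marks padding that the model should ignore.
        "attention_mask": [1] * len(ids) + [0] * padding,
        # A single-sentence input uses segment 0 everywhere.
        "token_type_ids": [0] * max_length,
    }


features = build_bert_inputs([1188, 1110, 1103, 1436, 2984])  # made-up word-piece IDs
print(len(features["input_ids"]), sum(features["attention_mask"]))
```

Every input has the same length of 128, and the attention mask records how many of those positions hold real tokens.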


The following commands load our model and the tokenizer, which converts words into the integer token IDs that BERT expects.

In [11]:
savedmodel = tf.saved_model.load('/home/ec2-user/SageMaker/tensorflow-tutorials/20191227')

In [12]:
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')

The next step in this process is to create something that the Transformers library can process. For our example, we are going to create a dictionary with the required tensors, feed that dictionary into a data pipeline, and have the Transformers library generate the model inputs from that pipeline.


In [13]:
example = {'idx': tf.constant(1, dtype=tf.int64), 'label': tf.constant(0, dtype=tf.int64) ,
           'sentence': tf.constant('This is the best store that I have ever visited', dtype=tf.string)}

In [14]:
example

{'idx': <tf.Tensor: id=42271, shape=(), dtype=int64, numpy=1>,
 'label': <tf.Tensor: id=42272, shape=(), dtype=int64, numpy=0>,
 'sentence': <tf.Tensor: id=42273, shape=(), dtype=string, numpy=b'This is the best store that I have ever visited'>}

If you are using the model for prediction, simply replace the sentence above with the sentence whose sentiment you want to predict.

In [15]:
ds = tf.data.Dataset.from_tensors(example)
feature_ds = glue_convert_examples_to_features(ds, tokenizer, max_length=128, task='sst-2')
feature_dataset = feature_ds.batch(1)

Great! Now we have features in the format required by the Google BERT model. The following function is going to convert these features into an actual prediction.

In [16]:
def predict_dataset(feature_dataset, savedmodel):
    """
    :param feature_dataset: Batched dataset of (features, label) pairs in the format needed by BERT.
    :param savedmodel: The SavedModel that was trained and saved in a separate process.
    :return: A list of dictionaries holding the predicted classification.
    """
    
    json_examples = []
    for feature_batch in feature_dataset.take(-1):
        # Each batch is a (features, labels) tuple; only the features are needed here.
        feature_example = feature_batch[0]

        # The SavedModel generates logits (unnormalized scores) for whether the sentence
        # is negative (0) or positive (1).
        logits = savedmodel.signatures["serving_default"](attention_mask=feature_example['attention_mask'],
                            input_ids=feature_example['input_ids'],
                            token_type_ids=feature_example['token_type_ids'])['output_1']
        print(f"logits {logits}")
        
        # It is more helpful to have actual probabilities. The TensorFlow softmax
        # function converts the logits into probabilities.
        probs = tf.nn.softmax(logits)
        
        # At this point we have the probabilities (probs) that the sentence is negative or positive.
        # These probabilities (by definition) always sum to 1 (100%).
        
        # It is more convenient, though, to report only the class with the higher probability.
        # This is done with the argmax function.
        
        prediction = tf.math.argmax(probs, axis=1)

        print(f"probs {probs}")
        print(f"prediction {prediction}")

        # Record one prediction per sentence in the batch (a batch may hold more than one).
        for predicted_class in prediction.numpy():
            json_examples.append({"SENTIMENT_PREDICTION": str(predicted_class)})

    return json_examples
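
The function above reads the predicted class from the softmax probabilities. Because softmax preserves the ordering of the logits, the same class can be read from the logits directly, and when a batch holds several sentences there is one row of logits (and so one prediction) per sentence. A minimal pure-Python sketch, using made-up logits and a hypothetical helper named argmax_per_row:

```python
def argmax_per_row(batch_logits):
    """Hypothetical helper: index of the largest logit in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in batch_logits]


# Made-up logits for a batch of three sentences: [negative score, positive score]
batch_logits = [[-1.7, 1.3], [1.2, -1.2], [0.1, 0.4]]
predictions = argmax_per_row(batch_logits)
print(predictions)  # [1, 0, 1]
```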

In [17]:
predict_dataset(feature_dataset, savedmodel)

logits [[-1.6893245  1.3287338]]
probs [[0.04661669 0.9533833 ]]
prediction [1]


[{'SENTIMENT_PREDICTION': '1'}]

As seen in the above example, we took the sentence "This is the best store that I have ever visited" and ran it through the SavedModel.

The SavedModel assigns a 95% probability to this sentence having positive sentiment, so the predicted class is positive ("1").
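
The 95% figure can be checked by hand. The sketch below recomputes the softmax of the logits printed above using only the Python standard library. The softmax helper is written here for illustration; the notebook itself uses tf.nn.softmax.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-1.6893245, 1.3287338]  # logits printed by the notebook above
probs = softmax(logits)
prediction = probs.index(max(probs))  # index of the most likely class
print([round(p, 4) for p in probs], prediction)
```

The positive class (index 1) receives roughly 0.95 of the probability mass, which agrees with the probs line in the notebook output.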

Before I complete the tutorial, I am going to examine a negative sentence and put some of the code that we just used into one function. That code is as follows:

In [23]:
negative_example = {'idx': tf.constant(1, dtype=tf.int64), 'label': tf.constant(0, dtype=tf.int64) ,
                    'sentence': tf.constant('This store is absolutely horrible and I hate it!!',
                                            dtype=tf.string)}

In [24]:
negative_example

{'idx': <tf.Tensor: id=42333, shape=(), dtype=int64, numpy=1>,
 'label': <tf.Tensor: id=42334, shape=(), dtype=int64, numpy=0>,
 'sentence': <tf.Tensor: id=42335, shape=(), dtype=string, numpy=b'This store is absolutely horrible and I hate it!!'>}

In [25]:
def predict(example, tokenizer, savedmodel):
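    """
    :param example: A single dictionary of tensors that contains an idx, a label, and a sentence.
    :param tokenizer: The BertTokenizer used to convert the sentence into BERT inputs.
    :param savedmodel: The loaded SavedModel used to make the prediction.
    :return: A list of prediction dictionaries. "1" is positive, and "0" is negative.
    """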
    # The Transformers glue_convert_examples_to_features works well with datasets. 
    # It does not work well with a dictionary of examples. 
    ds = tf.data.Dataset.from_tensors(example)
    
    # Use the transformers library in order to convert an English sentence into something that 
    # BERT recognizes.
    
    # The conversion requires a label (even if we don't have one). The easiest way to get around
    # this is to assign a default label of zero when you don't have a label.
    
    feature_ds = glue_convert_examples_to_features(ds, tokenizer, max_length=128, task='sst-2')

    feature_dataset = feature_ds.batch(64)
    json_examples = predict_dataset(feature_dataset, savedmodel)

    return json_examples

In [26]:
json_result = predict(negative_example, tokenizer, savedmodel)

logits [[ 1.2307757 -1.2191985]]
probs [[0.9205595  0.07944044]]
prediction [0]


In [27]:
json_result

[{'SENTIMENT_PREDICTION': '0'}]

As seen above, the SavedModel successfully predicts that the sentence "This store is absolutely horrible and I hate it!!" is negative ("0").
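
Note that json_result is a Python list of dictionaries rather than a JSON string. If a JSON string is needed (for example, to send the result to another service), the standard-library json module can serialize it. The LABEL_NAMES mapping below is a hypothetical addition for readability, based on the convention used in this tutorial that 0 is negative and 1 is positive.

```python
import json

# Result as returned by predict() for the negative sentence.
json_result = [{"SENTIMENT_PREDICTION": "0"}]

# Hypothetical mapping from class index to a readable label.
LABEL_NAMES = {"0": "negative", "1": "positive"}

# Add a readable label next to each predicted class.
readable = [
    {**item, "SENTIMENT_LABEL": LABEL_NAMES[item["SENTIMENT_PREDICTION"]]}
    for item in json_result
]

# Serialize the list of dictionaries into a JSON string.
payload = json.dumps(readable)
print(payload)
```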

Congratulations. You made it to the end of the tutorial and now you have a process that uses a SavedModel in order to make a prediction.

Also, for the first time in 10 years, I am back on the job market looking for consulting opportunities or full-time employment. If you think I can be of help to you, feel free to reach out. I am on Twitter at [@ralphbrooks](https://twitter.com/ralphbrooks).