In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Emotion prediction with GoEmotions and PRADO



<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/models/blob/master/research/seq_flow_lite/demo/colab/emotion_colab.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/models/blob/master/research/seq_flow_lite/demo/colab/emotion_colab.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

In this tutorial, we work through training a neural emotion prediction model using the tensorflow-models pip package and Bazel.

This tutorial uses GoEmotions, an emotion prediction dataset available on [TensorFlow Datasets (TFDS)](https://www.tensorflow.org/datasets/catalog/goemotions). We train a sequence projection model architecture named PRADO, available in the [TensorFlow Model Garden](https://github.com/tensorflow/models/blob/master/research/seq_flow_lite/models/prado.py). Finally, we examine an application of emotion prediction: suggesting emojis for a piece of text.

## Setup

### Install the TensorFlow Datasets pip package

`tfds-nightly` is the nightly TensorFlow Datasets package, created automatically each day. We install it with pip.

In [None]:
!pip install tfds-nightly

### Install the Sequence Projection Models package

Install Bazel: This will allow us to build custom TensorFlow ops used by the PRADO architecture.

In [None]:
!sudo apt install curl gnupg
!curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
!echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
!sudo apt update
!sudo apt install bazel

Install the library:
* `seq_flow_lite` includes the PRADO architecture and custom ops.
* We download the code from GitHub, and then build and install the TF and TFLite ops used by the model.


In [None]:
!git clone https://www.github.com/tensorflow/models
!models/research/seq_flow_lite/demo/colab/setup_workspace.sh
!pip install models/research/seq_flow_lite
!rm -rf models/research/seq_flow_lite/tf_ops
!rm -rf models/research/seq_flow_lite/tflite_ops

## Training an Emotion Prediction Model

* First, we load the GoEmotions data from TFDS.
* Next, we prepare the PRADO model for training. We set up the model configuration, including hyperparameters and labels. We also prepare the dataset, which involves projecting the inputs from the dataset and passing the projections to the model. This is needed because a model training on TPU cannot handle string inputs.
* Finally, we train and evaluate the model and produce model-level and per-label metrics.

***Start here on Runtime reset***, once the packages above are properly installed:
* Go to the `seq_flow_lite` directory.

In [None]:
%cd models/research/seq_flow_lite

* Import the TensorFlow and TensorFlow Datasets libraries.

In [None]:
import tensorflow as tf
import tensorflow_datasets as tfds

### The data: GoEmotions
In this tutorial, we use the [GoEmotions dataset from TFDS](https://www.tensorflow.org/datasets/catalog/goemotions).

GoEmotions is a corpus of comments extracted from Reddit, annotated by humans with 27 emotion categories or Neutral.

*   Number of labels: 27 emotion categories, plus Neutral.
*   Size of training dataset: 43,410.
*   Size of evaluation dataset: 5,427.
*   Maximum sequence length in training and evaluation datasets: 30.

The emotion categories are admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise.


Load the data from TFDS:

In [None]:
ds = tfds.load('goemotions', split='train')

Print 5 sample data elements from the dataset:

In [None]:
for element in ds.take(5):
  print(element)

### The model: PRADO

We train an Emotion Prediction model, based on the [PRADO architecture](https://github.com/tensorflow/models/blob/master/research/seq_flow_lite/models/prado.py) from the [Sequence Projection Models package](https://github.com/tensorflow/models/tree/master/research/seq_flow_lite).

PRADO projects input sequences to fixed-size features. The idea behind this approach is to build embedding-free models that minimize the model size. Instead of using an embedding table to look up embeddings, sequence projection models compute them on the fly, resulting in space-efficient models.
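
To make the idea of computing features on the fly more concrete, here is a minimal sketch of a hash-based projection. It is an illustration only: `toy_token_projection` and `toy_sequence_projection` are hypothetical helpers, the use of SHA-256 and of values in {-1, 0, 1} is an assumption made for this example, and none of it reproduces the custom op that `seq_flow_lite` actually builds.

```python
import hashlib

import numpy as np


def toy_token_projection(token: str, feature_size: int = 8) -> np.ndarray:
    """Maps a token to a fixed-size vector with values in {-1, 0, 1}.

    Illustration only: the real projection is a custom TensorFlow op in
    seq_flow_lite, and its hashing scheme is not reproduced here.
    feature_size must be at most 32 (the SHA-256 digest length in bytes).
    """
    # Hash the token to a deterministic byte string; no lookup table is stored.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Map each of the first `feature_size` bytes to -1, 0 or 1.
    values = np.frombuffer(digest[:feature_size], dtype=np.uint8)
    return (values % 3).astype(np.float32) - 1.0


def toy_sequence_projection(text: str, feature_size: int = 8) -> np.ndarray:
    """Projects whitespace-separated tokens to [num_tokens, feature_size]."""
    tokens = text.split()
    return np.stack([toy_token_projection(t, feature_size) for t in tokens])


features = toy_sequence_projection("thanks so much thanks")
print(features.shape)  # (4, 8)
```

The same token always maps to the same vector (rows 0 and 3 above are identical), so nothing has to be stored per vocabulary entry.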

In this section, we prepare the PRADO model for training.

The GoEmotions dataset cannot be fed directly into the PRADO model, so below we also handle the necessary preprocessing by providing a dataset builder.

Prepare the model configuration:
* Enumerate the labels expected to be found in the GoEmotions dataset.
* Prepare the `MODEL_CONFIG` dictionary which includes training parameters for the model. See sample configs for the PRADO model [here](https://github.com/tensorflow/models/tree/master/research/seq_flow_lite/configs).

In [None]:
LABELS = [
    'admiration',
    'amusement',
    'anger',
    'annoyance',
    'approval',
    'caring',
    'confusion',
    'curiosity',
    'desire',
    'disappointment',
    'disapproval',
    'disgust',
    'embarrassment',
    'excitement',
    'fear',
    'gratitude',
    'grief',
    'joy',
    'love',
    'nervousness',
    'optimism',
    'pride',
    'realization',
    'relief',
    'remorse',
    'sadness',
    'surprise',
    'neutral',
]

# Model training parameters.
CONFIG = {
    'name': 'models.prado',
    'batch_size': 1024,
    'train_steps': 10000,
    'learning_rate': 0.0006,
    'learning_rate_decay_steps': 340,
    'learning_rate_decay_rate': 0.7,
}

# Limits the amount of logging output produced by the training run, in order to
# avoid browser slowdowns.
CONFIG['save_checkpoints_steps'] = int(CONFIG['train_steps'] / 10)

MODEL_CONFIG = {
    'labels': LABELS,
    'multilabel': True,
    'quantize': False,
    'max_seq_len': 128,
    'max_seq_len_inference': 128,
    'exclude_nonalphaspace_unicodes': False,
    'split_on_space': True,
    'embedding_regularizer_scale': 0.035,
    'embedding_size': 64,
    'bigram_channels': 64,
    'trigram_channels': 64,
    'feature_size': 512,
    'network_regularizer_scale': 0.0001,
    'keep_prob': 0.5,
    'word_novelty_bits': 0,
    'doc_size_levels': 0,
    'add_bos_tag': False,
    'add_eos_tag': False,
    'pre_logits_fc_layers': [],
    'text_distortion_probability': 0.0,
}

CONFIG['model_config'] = MODEL_CONFIG

Write a function that builds the datasets for the model.  It will load the data, handle batching, and generate projections for the input text.

In [None]:
from layers import base_layers
from layers import projection_layers

def build_dataset(mode, inspect=False):
  if mode == base_layers.TRAIN:
    split = 'train'
    count = None
  elif mode == base_layers.EVAL:
    split = 'test'
    count = 1
  else:
    raise ValueError('mode={}, must be TRAIN or EVAL'.format(mode))

  batch_size = CONFIG['batch_size']
  if inspect:
    batch_size = 1

  # Convert examples from their dataset format into the model format.
  def process_input(features):
    # Generate the projection for each comment_text input.  The final tensor 
    # will have the shape [batch_size, number of tokens, feature size].
    # Additionally, we generate a tensor containing the number of tokens for
    # each comment_text (seq_length).  This is needed because the projection
    # tensor is a full tensor, and we are not using EOS tokens.
    text = features['comment_text']
    text = tf.reshape(text, [batch_size])
    projection_layer = projection_layers.ProjectionLayer(MODEL_CONFIG, mode)
    projection, seq_length = projection_layer(text)

    # Convert the labels into an indicator tensor, using the LABELS indices.
    label = tf.stack([features[label] for label in LABELS], axis=-1)
    label = tf.cast(label, tf.float32)
    label = tf.reshape(label, [batch_size, len(LABELS)])

    model_features = ({'projection': projection, 'sequence_length': seq_length}, label)

    if inspect:
      model_features = (model_features[0], model_features[1], features)

    return model_features

  ds = tfds.load('goemotions', split=split)
  ds = ds.repeat(count=count)
  ds = ds.shuffle(buffer_size=batch_size * 2)
  ds = ds.batch(batch_size, drop_remainder=True)
  ds = ds.map(process_input,
              num_parallel_calls=tf.data.experimental.AUTOTUNE,
              deterministic=False)
  ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
  return ds

train_dataset = build_dataset(base_layers.TRAIN)
test_dataset = build_dataset(base_layers.EVAL)
inspect_dataset = build_dataset(base_layers.TRAIN, inspect=True)

Print a batch of examples in model format.  This will consist of:
* the projection tensors (projection and seq_length)
* the label tensor (second tuple value)

The projection tensor is a **[batch_size, max_seq_len, feature_size]** floating point tensor.  The **[b, i]** vector is the feature vector of the **i**th token of the **b**th comment_text.  The rest of the tensor is zero-padded, and the seq_length tensor indicates the number of feature vectors for each comment_text.
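
The padding layout can be sketched with NumPy. The sizes and feature values below are made up and much smaller than the tutorial's configuration; the sketch only shows how a zero-padded tensor and a per-example token count fit together.

```python
import numpy as np

max_seq_len = 5   # hypothetical, much smaller than the tutorial's 128
feature_size = 3  # hypothetical, much smaller than the tutorial's 512

# Two "comments" with 2 and 4 tokens; each token has a made-up feature vector.
token_features = [np.ones((2, feature_size)), 2 * np.ones((4, feature_size))]

# Zero-pad each comment to max_seq_len and record the true token count.
projection = np.zeros(
    (len(token_features), max_seq_len, feature_size), dtype=np.float32)
seq_length = np.zeros(len(token_features), dtype=np.float32)
for b, feats in enumerate(token_features):
    projection[b, :len(feats)] = feats
    seq_length[b] = len(feats)

# A mask that is True for real tokens and False for padding.
mask = np.arange(max_seq_len)[None, :] < seq_length[:, None]
print(projection.shape)  # (2, 5, 3)
print(mask.astype(int))
```

Without the token counts, a consumer of the padded tensor could not tell padding apart from a token whose features happen to be zero.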

The label tensor is an indicator tensor of the set of true labels for the example.
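
As a small illustration of an indicator (multi-hot) encoding, the sketch below uses a shortened, hypothetical label list and a hypothetical helper `to_indicator`; the tutorial itself builds the tensor by stacking the dataset's per-label boolean features.

```python
import numpy as np

# A shortened, hypothetical label list; the tutorial's LABELS has 28 entries.
labels = ["admiration", "amusement", "anger", "neutral"]


def to_indicator(true_labels, labels):
    """Returns a float vector with 1.0 at the index of every true label."""
    indicator = np.zeros(len(labels), dtype=np.float32)
    for name in true_labels:
        indicator[labels.index(name)] = 1.0
    return indicator


print(to_indicator(["admiration", "anger"], labels))  # [1. 0. 1. 0.]
```

Because an example can carry several emotions at once, more than one position may be set to 1.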

In [None]:
example = next(iter(train_dataset))
print("inputs = {}".format(example[0]))
print("labels = {}".format(example[1]))

In the `inspect_dataset` version of the dataset, the original example has been added as the third element of the tuple.

In [None]:
example = next(iter(inspect_dataset))
print("inputs = {}".format(example[0]))
print("labels = {}".format(example[1]))
print("original example = {}".format(example[2]))

### Train and Evaluate

First we define a function to build the model.  We vary the model inputs depending on the task.  For training and evaluation, we take the projection and sequence length as inputs.  Otherwise, we take strings as inputs.

In [None]:
from models import prado

def build_model(mode):
  # First we define our inputs.
  inputs = []
  if mode == base_layers.TRAIN or mode == base_layers.EVAL:
    # For TRAIN and EVAL, we'll be getting dataset examples,
    # so we'll get projections and sequence_lengths.
    projection = tf.keras.Input(
        shape=(MODEL_CONFIG['max_seq_len'], MODEL_CONFIG['feature_size']),
        name='projection',
        dtype='float32')

    sequence_length = tf.keras.Input(
        shape=(), name='sequence_length', dtype='float32')
    inputs = [projection, sequence_length]
  else:
    # Otherwise, we get string inputs which we need to project.
    text_input = tf.keras.Input(shape=(), name='input', dtype='string')
    projection_layer = projection_layers.ProjectionLayer(MODEL_CONFIG, mode)
    projection, sequence_length = projection_layer(text_input)
    inputs = [text_input]

  # Next we add the model layer.
  model_layer = prado.Encoder(MODEL_CONFIG, mode)
  logits = model_layer(projection, sequence_length)

  # Finally we add an activation layer.
  if MODEL_CONFIG['multilabel']:
    activation = tf.keras.layers.Activation('sigmoid', name='predictions')
  else:
    activation = tf.keras.layers.Activation('softmax', name='predictions')
  predictions = activation(logits)

  model = tf.keras.Model(
      inputs=inputs,
      outputs=[predictions])
  
  return model
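
The choice of activation above depends on whether the task is multilabel. The following NumPy sketch, with made-up logits, shows the difference: a sigmoid scores each label independently, whereas a softmax makes the labels compete for a total probability of 1.

```python
import numpy as np


def sigmoid(logits):
    """Element-wise logistic function: each label is scored independently."""
    return 1.0 / (1.0 + np.exp(-logits))


def softmax(logits):
    """Normalizes scores so that they sum to 1 across labels."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)


logits = np.array([2.0, 1.5, -3.0])  # made-up logits for three labels
print(sigmoid(logits))  # two labels can both score above 0.5
print(softmax(logits))  # scores compete and sum to 1
```

With a sigmoid, a comment that expresses both joy and gratitude can receive a high score for each, which is what a multilabel dataset such as GoEmotions needs.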


Train the model:

In [None]:
# Remove any previous training data.
!rm -rf model

model = build_model(base_layers.TRAIN)

# Create the optimizer.
learning_rate = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=CONFIG['learning_rate'],
    decay_rate=CONFIG['learning_rate_decay_rate'],
    decay_steps=CONFIG['learning_rate_decay_steps'],
    staircase=True)

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

# Define the loss function.
loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)

model.compile(optimizer=optimizer, loss=loss)

epochs = int(CONFIG['train_steps'] / CONFIG['save_checkpoints_steps'])
model.fit(
    x=train_dataset,
    epochs=epochs,
    validation_data=test_dataset,
    steps_per_epoch=CONFIG['save_checkpoints_steps'])

model.save_weights('model/model_checkpoint')
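
The learning-rate schedule configured above can be reproduced with plain arithmetic. This is a sketch of the staircase exponential decay formula using the values from `CONFIG`; `staircase_learning_rate` is a hypothetical helper, not part of Keras.

```python
def staircase_learning_rate(step, initial=0.0006, decay_rate=0.7,
                            decay_steps=340):
    """Learning rate after `step` optimizer steps with staircase decay."""
    # The rate is multiplied by decay_rate once per completed decay_steps.
    return initial * decay_rate ** (step // decay_steps)


for step in (0, 339, 340, 1000, 10000):
    print(step, staircase_learning_rate(step))
```

The rate stays at 0.0006 for the first 340 steps and then drops by 30% every 340 steps. By step 10,000 it has been decayed 29 times, so the last epochs change the weights very little.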

Load a training checkpoint and evaluate:

In [None]:
model = build_model(base_layers.EVAL)

# Define metrics over each category.
metrics = []
for i, label in enumerate(LABELS):
  metric = tf.keras.metrics.Precision(
      thresholds=[0.5],
      class_id=i,
      name='precision@0.5/{}'.format(label))
  metrics.append(metric)
  metric = tf.keras.metrics.Recall(
      thresholds=[0.5],
      class_id=i,
      name='recall@0.5/{}'.format(label))
  metrics.append(metric)

# Define metrics over the entire task.
metric = tf.keras.metrics.Precision(thresholds=[0.5], name='precision@0.5/all')
metrics.append(metric)
metric = tf.keras.metrics.Recall(thresholds=[0.5], name='recall@0.5/all')
metrics.append(metric)

model.compile(metrics=metrics)
model.load_weights('model/model_checkpoint')
result = model.evaluate(x=test_dataset, return_dict=True)

Print evaluation metrics for the model, as well as per emotion label:

In [None]:
for label in LABELS:
  precision_key = 'precision@0.5/{}'.format(label)
  recall_key = 'recall@0.5/{}'.format(label)
  if precision_key in result and recall_key in result:
    print('{}: (precision@0.5: {}, recall@0.5: {})'.format(
        label, result[precision_key], result[recall_key]))
    
precision_key = 'precision@0.5/all'
recall_key = 'recall@0.5/all'
if precision_key in result and recall_key in result:
  print('all: (precision@0.5: {}, recall@0.5: {})'.format(
      result[precision_key], result[recall_key]))
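
To make the per-label numbers concrete, here is a minimal NumPy sketch of precision and recall at a 0.5 threshold for a single label column. The ground truth and scores are made up, and `precision_recall` is a hypothetical helper rather than the Keras implementation.

```python
import numpy as np


def precision_recall(y_true, y_score, threshold=0.5):
    """Precision and recall for one label column at a fixed score threshold."""
    y_pred = y_score > threshold
    y_true = y_true.astype(bool)
    true_positives = np.sum(y_pred & y_true)
    predicted_positives = np.sum(y_pred)
    actual_positives = np.sum(y_true)
    # Return 0.0 when a denominator is zero.
    precision = (true_positives / predicted_positives
                 if predicted_positives else 0.0)
    recall = true_positives / actual_positives if actual_positives else 0.0
    return float(precision), float(recall)


y_true = np.array([1, 0, 1, 1, 0])             # made-up ground truth
y_score = np.array([0.9, 0.6, 0.4, 0.8, 0.1])  # made-up sigmoid outputs
print(precision_recall(y_true, y_score))
```

In this example three scores exceed the threshold and two of them are correct, and two of the three true positives are found, so both values are 2/3. The `all` metrics apply the same counting after pooling every label column together.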

## Suggest Emojis using an Emotion Prediction model

In this section, we apply the Emotion Prediction model trained above to suggest emojis relevant to input text.

Refer to our [GoEmotions Model Card](https://github.com/google-research/google-research/blob/master/goemotions/goemotions_model_card.pdf) for additional uses of the model and considerations and limitations for using the GoEmotions data.

Map each emotion label to a relevant emoji:
* Emotions are subtle and multi-faceted. In many cases, no one emoji can truly capture the full complexity of the human experience behind each emotion.
* For the purpose of this exercise, we will select an emoji that captures at least one facet that is conveyed by an emotion label.

In [None]:
EMOJI_MAP = {
    'admiration': '👏',
    'amusement': '😂',
    'anger': '😡',
    'annoyance': '😒',
    'approval': '👍',
    'caring': '🤗',
    'confusion': '😕',
    'curiosity': '🤔',
    'desire': '😍',
    'disappointment': '😞',
    'disapproval': '👎',
    'disgust': '🤮',
    'embarrassment': '😳',
    'excitement': '🤩',
    'fear': '😨',
    'gratitude': '🙏',
    'grief': '😢',
    'joy': '😃',
    'love': '❤️',
    'nervousness': '😬',
    'optimism': '🤞',
    'pride': '😌',
    'realization': '💡',
    'relief': '😅',
    'remorse': '',
    'sadness': '😞',
    'surprise': '😲',
    'neutral': '',
}

Select sample inputs:

In [None]:
PREDICT_TEXT = [
  b'Good for you!',
  b'Happy birthday!',
  b'I love you.',
]

Run inference for the selected examples:

In [None]:
import numpy as np

model = build_model(base_layers.PREDICT)
model.load_weights('model/model_checkpoint')

for text in PREDICT_TEXT:
  results = model.predict(x=[text])
  print('')
  print('{}:'.format(text))
  labels = np.flip(np.argsort(results[0]))
  for x in range(3):
    label = LABELS[labels[x]]
    label = EMOJI_MAP[label] if EMOJI_MAP[label] else label
    print('{}: {}'.format(label, results[0][labels[x]]))
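
The ranking step in the loop above can be isolated into a small function. The sketch below uses a shortened, hypothetical label list, emoji map and score vector, so it runs without a trained model; `suggest_emojis` is a hypothetical helper.

```python
import numpy as np

# Hypothetical, shortened label list and emoji map for illustration.
labels = ["gratitude", "joy", "love", "remorse"]
emoji_map = {"gratitude": "🙏", "joy": "😃", "love": "❤️", "remorse": ""}


def suggest_emojis(scores, labels, emoji_map, k=3):
    """Returns the k highest-scoring labels as (emoji or label name, score)."""
    top = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    # Fall back to the label name when no emoji is mapped to the label.
    return [(emoji_map[labels[i]] or labels[i], float(scores[i])) for i in top]


scores = np.array([0.10, 0.25, 0.90, 0.40])  # made-up model outputs
print(suggest_emojis(scores, labels, emoji_map, k=2))
```

Labels without an emoji, such as `remorse` here, are reported by name, which mirrors the fallback used in the inference loop.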