# Sentence Pair Classification with BERT using a Cloud TPU

## Overview

**BERT**, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. The academic paper can be found here: https://arxiv.org/abs/1810.04805.

This Colab demonstrates how to fine-tune sentence and sentence-pair classification tasks built on top of pretrained BERT models (from [TF Hub](https://www.tensorflow.org/hub)) and how to run predictions on the tuned model.

*This notebook requires a Google account and access to a Google Cloud Storage (GCS) bucket.*

## Setting up Verta for logging

In [None]:
# Install verta; restart your notebook if prompted
try:
    import verta
except ModuleNotFoundError:
    !pip install verta

In [None]:
HOST = "demo.app.verta.ai"
PROJECT_NAME = "BERT-Classification"
EXPERIMENT_NAME = "Sentence-Pair-Clf"

In [None]:
import os
os.environ['VERTA_EMAIL'] =  ""
os.environ['VERTA_DEV_KEY'] = ""

In [None]:
from verta import Client
from verta.utils import ModelAPI

client = Client(HOST,
                use_git=False)

proj = client.set_project(PROJECT_NAME)
expt = client.set_experiment(EXPERIMENT_NAME)
run = client.set_experiment_run()

## Set up your TPU environment



In this section, you perform the following tasks:

*   Set up a Colab TPU running environment
*   Verify that you are connected to a TPU device
*   Upload your credentials to the TPU so it can access your GCS bucket.

In [None]:
import datetime
import json
import os
import pprint
import random
import string
import sys
import tensorflow as tf
from tensorflow.contrib import predictor

assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
print('TPU address is', TPU_ADDRESS)

from google.colab import auth
auth.authenticate_user()
with tf.Session(TPU_ADDRESS) as session:
    print('TPU devices:')
    pprint.pprint(session.list_devices())

    # Upload credentials to TPU.
    with open('/content/adc.json', 'r') as f:
        auth_info = json.load(f)
        tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
    # Now credentials are set for all future sessions on this TPU.

### Prepare and import BERT modules

With your environment configured, you can now prepare and import the BERT modules. The following step clones the source code from GitHub and imports the modules from the source.

In [None]:
import sys

# bert repository is created and managed by google research
!test -d bert_repo || git clone https://github.com/google-research/bert bert_repo
if 'bert_repo' not in sys.path:
    sys.path += ['bert_repo']

# import python modules defined by BERT
import modeling
import optimization
import run_classifier
import run_classifier_with_tfhub
import tokenization

# import tfhub 
import tensorflow_hub as hub

### Prepare for training

This next section of code performs the following tasks:

*  Specify task and download training data.
    - We use the Microsoft Research Paraphrase Corpus (MRPC), one of the standard GLUE tasks, for our example. Find more information [here](https://aclweb.org/aclwiki/Paraphrase_Identification_(State_of_the_art)).
*  Specify BERT pretrained model
*  Specify the GCS bucket where the model and TF Hub cache are stored
*  Create output directory for model checkpoints and eval results.




In [None]:
TASK = 'MRPC'

# Download glue data.
!test -d download_glue_repo || git clone https://gist.github.com/60c2bdb54d156a41194446737ce03e2e.git download_glue_repo
!python download_glue_repo/download_glue_data.py --data_dir='glue_data' --tasks=$TASK

TASK_DATA_DIR = 'glue_data/' + TASK
print('***** Task data directory: {} *****'.format(TASK_DATA_DIR))
!ls $TASK_DATA_DIR

BUCKET = 'YOUR_BUCKET-NAME'
assert BUCKET, 'Must specify an existing GCS bucket name'
OUTPUT_DIR = 'gs://{}/bert-tfhub/models/{}'.format(BUCKET, TASK)
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))

# Available pretrained model checkpoints:
#   uncased_L-12_H-768_A-12: uncased BERT base model
#   uncased_L-24_H-1024_A-16: uncased BERT large model
#   cased_L-12_H-768_A-12: cased BERT base model
BERT_MODEL = 'uncased_L-12_H-768_A-12'
BERT_MODEL_HUB = 'https://tfhub.dev/google/bert_' + BERT_MODEL + '/1'

Now let's load the tokenizer module from TF Hub and try it out.

In [None]:
tokenizer = run_classifier_with_tfhub.create_tokenizer_from_hub_module(BERT_MODEL_HUB)
tokenizer.tokenize("This here's an example of using the BERT tokenizer")
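
To make the tokenizer's output easier to read, here is a minimal sketch of greedy longest-match-first WordPiece splitting, the general scheme applied to each word after basic whitespace and punctuation splitting. The vocabulary is a toy assumption (not the real BERT vocabulary), and `wordpiece_tokenize` is a hypothetical helper written for illustration, not the function from the BERT repository.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one lowercase word into word pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry a "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            # No piece matched: the whole word maps to the unknown token.
            return [unk_token]
        pieces.append(current)
        start = end
    return pieces


# Toy vocabulary (an assumption for illustration only).
toy_vocab = {"token", "##izer", "##s", "bert", "using"}

print(wordpiece_tokenize("tokenizer", toy_vocab))
print(wordpiece_tokenize("tokenizers", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```

Pieces prefixed with `##` in the real tokenizer's output can be read the same way: they continue the preceding piece rather than start a new word.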

Next, we initialize our hyperparameters, prepare the training data, and initialize the TPU config.

In [None]:
TRAIN_BATCH_SIZE = 32
EVAL_BATCH_SIZE = 8
PREDICT_BATCH_SIZE = 8
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
MAX_SEQ_LENGTH = 128
# Warmup is a period at the start of training during which the learning
# rate is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 1000
SAVE_SUMMARY_STEPS = 500

run.log_hyperparameter("train_batch_size", TRAIN_BATCH_SIZE)
run.log_hyperparameter("eval_batch_size", EVAL_BATCH_SIZE)
run.log_hyperparameter("predict_batch_size", PREDICT_BATCH_SIZE)
run.log_hyperparameter("learning_rate", LEARNING_RATE)
run.log_hyperparameter("num_train_epochs", NUM_TRAIN_EPOCHS)
run.log_hyperparameter("warmup_proportion", WARMUP_PROPORTION)
run.log_hyperparameter("save_checkpoint_steps", SAVE_CHECKPOINTS_STEPS)
run.log_hyperparameter("save_summary_steps", SAVE_SUMMARY_STEPS)
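
The effect of `WARMUP_PROPORTION` can be sketched in plain Python. The function below assumes a linear warmup followed by a linear decay to zero, which is the usual shape for BERT fine-tuning; `learning_rate_at` and the round step counts are hypothetical and chosen only for illustration, so treat this as a sketch rather than the exact schedule used by `optimization.py`.

```python
def learning_rate_at(step, base_lr, num_train_steps, num_warmup_steps):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        # Warmup: ramp from 0 up towards base_lr.
        return base_lr * step / num_warmup_steps
    # After warmup: decay linearly so the rate reaches 0 at num_train_steps.
    remaining = max(num_train_steps - step, 0)
    return base_lr * remaining / num_train_steps


# Hypothetical run: 1000 training steps, 10% of them used for warmup.
for step in (0, 50, 100, 500, 1000):
    print(step, learning_rate_at(step, 2e-5, 1000, 100))
```

With these assumptions, the rate never quite reaches `base_lr`: by the time warmup ends, the decay has already lowered it slightly.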

In [None]:
processors = {
  "cola": run_classifier.ColaProcessor,
  "mnli": run_classifier.MnliProcessor,
  "mrpc": run_classifier.MrpcProcessor,
}
processor = processors[TASK.lower()]()
label_list = processor.get_labels()
print("LABELS: ", label_list)

# Compute number of train and warmup steps from batch size
train_examples = processor.get_train_examples(TASK_DATA_DIR)
num_train_steps = int(len(train_examples) / TRAIN_BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

# Setup TPU related config
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)
NUM_TPU_CORES = 8
ITERATIONS_PER_LOOP = 1000

def get_run_config(output_dir):
    return tf.contrib.tpu.RunConfig(
        cluster=tpu_cluster_resolver,
        model_dir=output_dir,
        save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS,
        tpu_config=tf.contrib.tpu.TPUConfig(
            iterations_per_loop=ITERATIONS_PER_LOOP,
            num_shards=NUM_TPU_CORES,
            per_host_input_for_training=tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2))
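
As a worked example of the step computation above, the sketch below repeats the same arithmetic outside TensorFlow. The example count of 3,668 is an assumption (the size usually reported for the MRPC training split); check `len(train_examples)` in your own run. `compute_steps` is a hypothetical helper.

```python
def compute_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    """Mirror the notebook's train-step and warmup-step arithmetic."""
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps


# Assumed 3,668 training pairs with the hyperparameters set earlier.
steps, warmup = compute_steps(3668, 32, 3.0, 0.1)
print(steps, warmup)  # 343 34
```

Both values are truncated by `int()`, so a partial final batch does not add a training step.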

## Fine-tune and Run Predictions on a pretrained BERT Model from TF Hub

This section demonstrates fine-tuning from a pre-trained BERT TF Hub module and running predictions.


In [None]:
# Force TF Hub writes to the GS bucket we provide.
os.environ['TFHUB_CACHE_DIR'] = OUTPUT_DIR

model_fn = run_classifier_with_tfhub.model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps,
  use_tpu=True,
  bert_hub_module_handle=BERT_MODEL_HUB
)

estimator_from_tfhub = tf.contrib.tpu.TPUEstimator(
  use_tpu=True,
  model_fn=model_fn,
  config=get_run_config(OUTPUT_DIR),
  train_batch_size=TRAIN_BATCH_SIZE,
  eval_batch_size=EVAL_BATCH_SIZE,
  predict_batch_size=PREDICT_BATCH_SIZE,
)


You can now fine-tune the model, evaluate it, and run predictions on it.

In [None]:
# Train the model
def model_train(estimator):
    print('MRPC/CoLA on BERT base model normally takes about 2-3 minutes. Please wait...')
    # We'll set sequences to be at most 128 tokens long.
    train_features = run_classifier.convert_examples_to_features(
        train_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
    print('***** Started training at {} *****'.format(datetime.datetime.now()))
    print('  Num examples = {}'.format(len(train_examples)))
    print('  Batch size = {}'.format(TRAIN_BATCH_SIZE))
    tf.logging.info("  Num steps = %d", num_train_steps)
    train_input_fn = run_classifier.input_fn_builder(
        features=train_features,
        seq_length=MAX_SEQ_LENGTH,
        is_training=True,
        drop_remainder=True)
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
    print('***** Finished training at {} *****'.format(datetime.datetime.now()))

In [None]:
model_train(estimator_from_tfhub)

In [None]:
def model_eval(estimator):
    # Eval the model.
    eval_examples = processor.get_dev_examples(TASK_DATA_DIR)
    eval_features = run_classifier.convert_examples_to_features(
        eval_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
    print('***** Started evaluation at {} *****'.format(datetime.datetime.now()))
    print('  Num examples = {}'.format(len(eval_examples)))
    print('  Batch size = {}'.format(EVAL_BATCH_SIZE))

    # Eval will be slightly WRONG on the TPU because it will truncate
    # the last batch.
    eval_steps = int(len(eval_examples) / EVAL_BATCH_SIZE)
    eval_input_fn = run_classifier.input_fn_builder(
        features=eval_features,
        seq_length=MAX_SEQ_LENGTH,
        is_training=False,
        drop_remainder=True)
    result = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps)
    print('***** Finished evaluation at {} *****'.format(datetime.datetime.now()))
    output_eval_file = os.path.join(OUTPUT_DIR, "eval_results.txt")
    with tf.gfile.GFile(output_eval_file, "w") as writer:
        print("***** Eval results *****")
        for key in sorted(result.keys()):
            print('  {} = {}'.format(key, str(result[key])))
            writer.write("%s = %s\n" % (key, str(result[key])))
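
The comment about truncating the last batch can be made concrete with a little arithmetic. The sketch below shows how many examples are left out when the example count is not a multiple of the batch size; the counts are made up for illustration and `eval_coverage` is a hypothetical helper.

```python
def eval_coverage(num_examples, batch_size):
    """Return (steps, evaluated, dropped) when the last partial batch is dropped."""
    steps = int(num_examples / batch_size)  # same formula as eval_steps above
    evaluated = steps * batch_size
    dropped = num_examples - evaluated
    return steps, evaluated, dropped


# Made-up example counts, evaluated with a batch size of 8.
print(eval_coverage(1000, 8))  # (125, 1000, 0)
print(eval_coverage(1003, 8))  # (125, 1000, 3)
```

When the example count divides evenly by `EVAL_BATCH_SIZE`, nothing is dropped and the reported metrics cover the full set.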


In [None]:
model_eval(estimator_from_tfhub)

In [None]:
def model_predict(estimator):
    # Make predictions on a subset of eval examples
    prediction_examples = processor.get_dev_examples(TASK_DATA_DIR)[:PREDICT_BATCH_SIZE]
    input_features = run_classifier.convert_examples_to_features(prediction_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
    predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=True)
    predictions = estimator.predict(predict_input_fn)

    for example, prediction in zip(prediction_examples, predictions):
        print('text_a: %s\ntext_b: %s\nlabel:%s\nprediction:%s\n' % (example.text_a, example.text_b, str(example.label), prediction['probabilities']))

In [None]:
model_predict(estimator_from_tfhub) 
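
The printed `probabilities` hold one value per entry in `label_list`. A minimal sketch of turning them into a predicted label follows; the probability values are invented for illustration, `probabilities_to_label` is a hypothetical helper, and the label list `["0", "1"]` is assumed to match what `processor.get_labels()` prints for MRPC.

```python
def probabilities_to_label(probabilities, label_list):
    """Pick the label with the highest probability and return it with its score."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return label_list[best_index], probabilities[best_index]


# Invented probabilities for one sentence pair.
label, confidence = probabilities_to_label([0.12, 0.88], ["0", "1"])
print(label, confidence)  # 1 0.88
```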

## Saving and Deploying the BERT model
This section performs the following tasks:
* Saving the model for predictions
* Creating a wrapper for deployment
* Logging models on Verta for deployment
* Predicting on text from the deployed model

TensorFlow's SavedModel format is an efficient way to serialize an inference graph, and it works well with the rest of the ecosystem, such as TensorFlow Serving. Because `tf.estimator` is designed as an easy-to-use, high-level API, exporting an Estimator as a SavedModel is simple.

We first need to define a special `input_fn`. Then, reloading and serializing the estimator is straightforward.

In [None]:
def serving_input_fn():
    label_ids = tf.placeholder(tf.int32, [None], name='label_ids')
    input_ids = tf.placeholder(tf.int32, [None, MAX_SEQ_LENGTH], name='input_ids')
    input_mask = tf.placeholder(tf.int32, [None, MAX_SEQ_LENGTH], name='input_mask')
    segment_ids = tf.placeholder(tf.int32, [None, MAX_SEQ_LENGTH], name='segment_ids')
    input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
        'label_ids': label_ids,
        'input_ids': input_ids,
        'input_mask': input_mask,
        'segment_ids': segment_ids,
    })()
    return input_fn
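
The three `[None, MAX_SEQ_LENGTH]` placeholders correspond to the features BERT expects for a sentence pair. The sketch below illustrates the general layout (`[CLS] a [SEP] b [SEP]`, then zero padding) with a toy vocabulary. `pack_pair` and the vocabulary are hypothetical; unlike `convert_examples_to_features`, this sketch raises an error on long inputs instead of truncating them.

```python
def pack_pair(tokens_a, tokens_b, vocab, max_seq_length):
    """Build input_ids, input_mask, and segment_ids for one sentence pair."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    if len(tokens) > max_seq_length:
        raise ValueError("pair is longer than max_seq_length")
    # Segment 0 covers [CLS], sentence A, and the first [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [vocab[token] for token in tokens]
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(input_ids)
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Toy vocabulary (an assumption for illustration only).
toy_vocab = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2, "hello": 3, "world": 4, "hi": 5}

input_ids, input_mask, segment_ids = pack_pair(["hello", "world"], ["hi"], toy_vocab, 8)
print(input_ids)    # [1, 3, 4, 2, 5, 2, 0, 0]
print(input_mask)   # [1, 1, 1, 1, 1, 1, 0, 0]
print(segment_ids)  # [0, 0, 0, 0, 1, 1, 0, 0]
```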

In [None]:
estimator_from_tfhub._export_to_tpu = False
estimator_from_tfhub.export_saved_model('path/to/save/model', serving_input_fn) # saved to your local file system
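
`export_saved_model` writes each export into a timestamped subdirectory of the base path, which is why later cells point at a directory whose name is a number. A small sketch for locating the most recent export on the local file system follows; `latest_export_dir` is a hypothetical helper, not part of TensorFlow, and it assumes the base path is a local directory.

```python
import os


def latest_export_dir(export_dir_base):
    """Return the numerically largest (most recent) timestamped subdirectory."""
    candidates = [
        name for name in os.listdir(export_dir_base)
        if name.isdigit() and os.path.isdir(os.path.join(export_dir_base, name))
    ]
    if not candidates:
        raise FileNotFoundError("no timestamped export found in " + export_dir_base)
    return os.path.join(export_dir_base, max(candidates, key=int))
```

For example, `latest_export_dir('path/to/save/model')` could be used in place of a hard-coded timestamp when loading the model.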

In [None]:
run.log_model('trained_model', 'path/to/save/model')

### Creating a wrapper class
Verta deployment expects a particular interface for its models.
We must expose a predict() function based on the code we wrote earlier. This becomes a thin wrapper around our model.

In [None]:
# Testing prediction
import numpy as np

dummy_example = {"input_ids": np.zeros((1, 128), dtype=int).tolist(),
                 "input_mask": np.zeros((1, 128), dtype=int).tolist(),
                 "label_ids": [0],
                 "segment_ids": np.zeros((1, 128), dtype=int).tolist()}

# Timestamped export directory; replace with the path of your own export
latest = '/content/saved_model/1564784206'
predict_fn = predictor.from_saved_model(latest)
results = predict_fn(dummy_example)['probabilities']

In [None]:
# `in_sentences` is defined in the example cell further down; run that cell first
model_api = ModelAPI(in_sentences, results)

In [None]:
import six
import numpy as np  # needed for np.__version__ below
# this is like open("requirements.txt"), but without creating a new file
requirements = six.StringIO('\n'.join([
    "tensorflow=={}".format(tf.__version__),
    "tensorflow-hub=={}".format(hub.__version__),
    "bert-tensorflow==1.0.1",
    "numpy=={}".format(np.__version__)
]))

In [None]:
from tensorflow.contrib import predictor
import numpy as np

class BertSentenceClf:
    def __init__(self, path):
        self.predict_fn = predictor.from_saved_model(path)
    
    def predict(self, in_sentences):
        # label="0" is just a dummy label required by the input format
        input_examples = [
            run_classifier.InputExample(guid="", text_a=x[0], text_b=x[1], label="0")
            for x in in_sentences
        ]
        input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
        predictions = []
        for i in input_features:
            # The serving placeholders have shape [None, MAX_SEQ_LENGTH],
            # so each feature is wrapped in a batch dimension of size 1.
            res = self.predict_fn({"input_ids": [i.input_ids],
                                   "input_mask": [i.input_mask],
                                   "label_ids": [i.label_id],
                                   "segment_ids": [i.segment_ids]})
            predictions.append(res['probabilities'])
        return predictions

In [None]:
# Predict on an example sentence pair
in_sentences = [("This integrates with Rational PurifyPlus and allows developers to work in supported versions of Java , Visual C # and Visual Basic .NET.",
                   "IBM said the Rational products were also integrated with Rational PurifyPlus , which allows developers to work in Java , Visual C # and VisualBasic .Net.")]
model = BertSentenceClf(path= '/path/to/saved/model/')  
model.predict(in_sentences)


In [None]:
# Currently fails with: TypeError: can't pickle _thread.RLock objects
# (likely because the model instance holds a live predictor)
run.log_model_for_deployment(model, model_api, requirements)
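
One possible way around a pickling error of this kind, assuming it comes from the predictor object stored on the instance, is to keep only the model path in the pickled state and load the predictor lazily on first use. The sketch below shows that pattern with a stand-in loader: `load_predictor` is a hypothetical placeholder for `predictor.from_saved_model`, and `LazyModel` is an illustrative class, not part of Verta. Whether this is enough for `log_model_for_deployment` has not been verified here; the real wrapper also relies on notebook globals such as `tokenizer` and `label_list`, which may need similar treatment.

```python
import threading


def load_predictor(path):
    """Hypothetical stand-in for predictor.from_saved_model(path)."""
    lock = threading.RLock()  # mimics an unpicklable resource held by the predictor

    def predict_fn(inputs):
        with lock:
            return {"echo": inputs}

    return predict_fn


class LazyModel:
    def __init__(self, path):
        self.path = path
        self._predict_fn = None  # loaded on first predict() call

    def predict(self, inputs):
        if self._predict_fn is None:
            self._predict_fn = load_predictor(self.path)
        return self._predict_fn(inputs)

    def __getstate__(self):
        # Drop the loaded predictor so only picklable state is serialized.
        state = self.__dict__.copy()
        state["_predict_fn"] = None
        return state


model = LazyModel("path/to/saved/model")
print(model.predict([1, 2, 3]))
print(model.__getstate__())
```

After unpickling, the restored object starts with `_predict_fn` set to `None` and reloads the predictor from `path` on its first `predict()` call, so the saved model must be reachable at that path in the deployment environment.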