# BERT fine tuning and batch inference on Cloud TPU 




**BERT**, or **B**idirectional **E**ncoder **R**epresentations from **T**ransformers, is a method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. The academic paper can be found here: https://arxiv.org/abs/1810.04805.

This Colab demonstrates using a free Colab Cloud TPU to fine-tune sentence and sentence-pair classification tasks built on top of pretrained BERT models.

**Note:** You will need a GCP (Google Cloud Platform) account and a GCS (Google Cloud Storage) bucket for this Colab to run.



**Firstly**, we need to set up the Colab TPU runtime environment, verify that a TPU device is successfully connected, and upload credentials to the TPU for GCS bucket access.

In [0]:
#!pip install tensorflow
! pip uninstall -y tensorflow
! pip install -U tf-nightly
import tensorflow as tf

Uninstalling tensorflow-1.14.0:
  Successfully uninstalled tensorflow-1.14.0
Collecting tf-nightly
  Downloading https://files.pythonhosted.org/packages/d7/26/f8dcac360c7caaa0f53c65fb62c363a936bee1478408838718f0a7b07423/tf_nightly-1.15.0.dev20190703-cp36-cp36m-manylinux1_x86_64.whl (100.7MB)
     |████████████████████████████████| 100.7MB 151kB/s 
Collecting opt-einsum>=2.3.2 (from tf-nightly)
  Downloading https://files.pythonhosted.org/packages/f6/d6/44792ec668bcda7d91913c75237314e688f70415ab2acd7172c845f0b24f/opt_einsum-2.3.2.tar.gz (59kB)
     |████████████████████████████████| 61kB 21.4MB/s 
Collecting tf-estimator-nightly (from tf-nightly)
  Downloading https://files.pythonhosted.org/packages/f6/c5/2ce00403369520d5098ccefa2439b4278387ba6924f2dd74cc771fcef2d5/tf_estimator_nightly-1.14.0.dev2019070301-py2.py3-none-any.whl (499kB)
     |████████████████████████████████| 501kB 40.1MB/s 
Collecting tb-nightly<1.15.0a0,>=1.14.0a0 (from tf-nightly)

In [0]:
import datetime
import json
import os
import pprint
import random
import string
import sys
import tensorflow as tf
import pandas as pd
import urllib.request

assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
print('TPU address is', TPU_ADDRESS)

from google.colab import auth
auth.authenticate_user()
with tf.Session(TPU_ADDRESS) as session:
  print('TPU devices:')
  pprint.pprint(session.list_devices())

  # Upload credentials to TPU.
  with open('/content/adc.json', 'r') as f:
    auth_info = json.load(f)
  tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
  # Now credentials are set for all future sessions on this TPU.

TPU address is grpc://10.76.5.58:8470


**Secondly**, prepare and import the BERT modules, and export the data-processing Jupyter notebook to a Python file. Note that this uses a custom fork of the BERT repo.

In [0]:
import sys

!test -d bert_repo || git clone https://github.com/BrittonWinterrose/bert.git bert_repo
if 'bert_repo' not in sys.path:
  sys.path += ['bert_repo']

Cloning into 'bert_repo'...
remote: Enumerating objects: 365, done.
remote: Total 365 (delta 0), reused 0 (delta 0), pack-reused 365
Receiving objects: 100% (365/365), 26.69 MiB | 18.49 MiB/s, done.
Resolving deltas: 100% (207/207), done.


In [0]:
sys.path += ['bert']

**Thirdly**, prepare for training:

*  Specify the task and download training data.
*  Specify the BERT pretrained model.
*  Specify the GCS bucket and create an output directory for model checkpoints and eval results.
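
As a rough sketch of the second and third bullets: the pretrained-model files and the output directory are ordinary `gs://` paths assembled from the model name, the bucket, and the task. The helper name `bert_paths` is hypothetical (it is not part of the BERT repo); the bucket and model names are the ones used in this notebook.

```python
import posixpath

# Public checkpoint location used in this notebook.
CHECKPOINT_ROOT = 'gs://cloud-tpu-checkpoints/bert'


def bert_paths(model_name, bucket, task):
    """Hypothetical helper: derive the GCS paths for a model, bucket, and task."""
    pretrained_dir = posixpath.join(CHECKPOINT_ROOT, model_name)
    return {
        'vocab_file': posixpath.join(pretrained_dir, 'vocab.txt'),
        'config_file': posixpath.join(pretrained_dir, 'bert_config.json'),
        'init_checkpoint': posixpath.join(pretrained_dir, 'bert_model.ckpt'),
        'output_dir': 'gs://{}/bert/models/{}'.format(bucket, task),
        # Uncased checkpoints expect lower-cased input text.
        'do_lower_case': model_name.startswith('uncased'),
    }


paths = bert_paths('uncased_L-24_H-1024_A-16', 'mr_bert_bucket', 'multilabel')
print(paths['vocab_file'])
print(paths['output_dir'])
```

Using `posixpath.join` rather than `os.path.join` keeps the forward slashes that GCS expects regardless of the host operating system.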



In [0]:
TASK = 'multilabel'

TASK_DATA_DIR = 'bert_repo/dataset'
print('***** Task data directory: {} *****'.format(TASK_DATA_DIR))


# Available pretrained model checkpoints:
#   uncased_L-12_H-768_A-12: uncased BERT base model
#   uncased_L-24_H-1024_A-16: uncased BERT large model
#   cased_L-12_H-768_A-12: cased BERT base model
BERT_MODEL = 'uncased_L-24_H-1024_A-16' #@param {type:"string"}
BERT_PRETRAINED_DIR = 'gs://cloud-tpu-checkpoints/bert/' + BERT_MODEL
print('***** BERT pretrained directory: {} *****'.format(BERT_PRETRAINED_DIR))
!gsutil ls $BERT_PRETRAINED_DIR

BUCKET = 'mr_bert_bucket' #@param {type:"string"}
assert BUCKET, 'Must specify an existing GCS bucket name'
OUTPUT_DIR = 'gs://{}/bert/models/{}'.format(BUCKET, TASK)
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Task data directory: bert_repo/dataset *****
***** BERT pretrained directory: gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16 *****
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_config.json
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.data-00000-of-00001
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.index
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt.meta
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/checkpoint
gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/vocab.txt
***** Model output directory: gs://mr_bert_bucket/bert/models/multilabel *****


## Configure for Task


In [0]:
# Setup task specific model and TPU running config

import modeling
import optimization
import run_multilabels_classifier
import tokenization


# Model Hyper Parameters
NUM_TRAIN_EPOCHS = 5.0
MAX_SEQ_LENGTH = 220

TRAIN_BATCH_SIZE = 64
EVAL_BATCH_SIZE = 8
PREDICT_BATCH_SIZE = 8

LEARNING_RATE = 2e-5
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 1000

VOCAB_FILE = os.path.join(BERT_PRETRAINED_DIR, 'vocab.txt')
CONFIG_FILE = os.path.join(BERT_PRETRAINED_DIR, 'bert_config.json')
INIT_CHECKPOINT = os.path.join(BERT_PRETRAINED_DIR, 'bert_model.ckpt')
DO_LOWER_CASE = BERT_MODEL.startswith('uncased')

processors = {
  "multilabel": run_multilabels_classifier.MultiLabelTextProcessor,
}
processor = processors[TASK.lower()]()
# NOTE: the label list is hardcoded for the toxic comments dataset.
label_list = ['toxic',
              'severe_toxic',
              'obscene',
              'threat',
              'insult',
              'identity_hate']

tokenizer = tokenization.FullTokenizer(vocab_file=VOCAB_FILE, do_lower_case=DO_LOWER_CASE)

# TPU Config
USE_TPU = True
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)
ITERATIONS_PER_LOOP = 1000
NUM_TPU_CORES = 8

# Run TPU Config 
run_config = tf.contrib.tpu.RunConfig(
    cluster=tpu_cluster_resolver,
    model_dir=OUTPUT_DIR,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS,
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=ITERATIONS_PER_LOOP,
        num_shards=NUM_TPU_CORES,
        per_host_input_for_training=tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2))

# Compute the number of train and warmup steps from batch size. 
train_examples = processor.get_train_examples(TASK_DATA_DIR)
num_train_steps = int(len(train_examples) / TRAIN_BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

# Configure our model's parameters
model_fn = run_multilabels_classifier.model_fn_builder(
    bert_config=modeling.BertConfig.from_json_file(CONFIG_FILE),
    num_labels=len(label_list),
    init_checkpoint=INIT_CHECKPOINT,
    learning_rate=LEARNING_RATE,
    num_train_steps=num_train_steps,
    num_warmup_steps=num_warmup_steps,
    use_tpu=USE_TPU,
    #may need to set false
    use_one_hot_embeddings=True,
    #do_serve=False
)
# use_one_hot_embeddings selects how word embeddings are looked up in the
# embedding table. If False, tf.gather is used; otherwise the ids are one-hot
# encoded and a tf.matmul produces the requested embeddings. The upstream BERT
# code notes that the one-hot path is faster on TPU, while tf.gather is faster
# on CPU/GPU, so False is usually preferred off-TPU and True fits this setup.

# Configure the Estimator's parameters
estimator = tf.contrib.tpu.TPUEstimator(
    use_tpu=USE_TPU,
    model_fn=model_fn,
    config=run_config,
    train_batch_size=TRAIN_BATCH_SIZE,
    eval_batch_size=EVAL_BATCH_SIZE,
    predict_batch_size=PREDICT_BATCH_SIZE)

INFO:tensorflow:Using config: {'_model_dir': 'gs://mr_bert_bucket/bert/models/multilabel', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
cluster_def {
  job {
    name: "worker"
    tasks {
      key: 0
      value: "10.92.4.122:8470"
    }
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f830756eb38>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': 'grpc://10.92.4.122:8470', '_evaluation_master': 'grpc://10.92.4.122:8470', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cor
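
To illustrate the `use_one_hot_embeddings` comment in the cell above, here is a minimal pure-Python sketch showing that a gather-style lookup and a one-hot matrix product return the same embedding rows. The tiny table and the function names `gather_lookup` and `one_hot_lookup` are made up for illustration; the real model does this with TensorFlow ops on much larger tensors.

```python
def gather_lookup(table, ids):
    """Index rows directly, analogous to tf.gather."""
    return [table[i] for i in ids]


def one_hot_lookup(table, ids):
    """One-hot encode the ids, then multiply by the table (a matmul)."""
    vocab_size, dim = len(table), len(table[0])
    one_hot = [[1.0 if v == i else 0.0 for v in range(vocab_size)] for i in ids]
    return [
        [sum(row[v] * table[v][d] for v in range(vocab_size)) for d in range(dim)]
        for row in one_hot
    ]


# Toy embedding table: 3 "tokens", 2 dimensions each (illustrative values).
table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
ids = [2, 0, 2]

print(gather_lookup(table, ids) == one_hot_lookup(table, ids))
```

The two paths differ only in cost: the matmul does work proportional to the vocabulary size, which suits hardware built around dense matrix multiplication, while the gather touches only the requested rows.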

In [0]:
# Write the training examples to a TFRecord file.
train_file = os.path.join(OUTPUT_DIR, "train.tf_record")
make_train_features = run_multilabels_classifier.file_based_convert_examples_to_features(
    train_examples, label_list, MAX_SEQ_LENGTH, tokenizer, train_file)

# Train the model.
print('Toxic Comments on BERT... Please wait...')
print('***** Started training at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(len(train_examples)))
print('  Batch size = {}'.format(TRAIN_BATCH_SIZE))
tf.logging.info("  Num steps = %d", num_train_steps)

train_input_fn = run_multilabels_classifier.file_based_input_fn_builder(
    input_file=train_file,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=True)
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print('***** Finished training at {} *****'.format(datetime.datetime.now()))

INFO:tensorflow:Writing example 0 of 127655
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 000103f0d9cfb60f
INFO:tensorflow:tokens: [CLS] d ' aw ##w ! he matches this background colour i ' m seemingly stuck with . thanks . ( talk ) 21 : 51 , january 11 , 2016 ( utc ) [SEP]
INFO:tensorflow:input_ids: 101 1040 1005 22091 2860 999 2002 3503 2023 4281 6120 1045 1005 1049 9428 5881 2007 1012 4283 1012 1006 2831 1007 2538 1024 4868 1010 2254 2340 1010 2355 1006 11396 1007 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
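
The log above reports 127655 training examples, which fixes the step counts computed in the configuration cell. The sketch below reproduces that arithmetic and adds a learning-rate schedule of linear warmup followed by linear decay to zero. That schedule is an assumption: it is what the upstream BERT `optimization.py` uses, and the custom fork may differ. The function name `learning_rate_at` is hypothetical.

```python
# Values taken from this notebook's configuration and training log.
NUM_EXAMPLES = 127655
TRAIN_BATCH_SIZE = 64
NUM_TRAIN_EPOCHS = 5.0
WARMUP_PROPORTION = 0.1
LEARNING_RATE = 2e-5

num_train_steps = int(NUM_EXAMPLES / TRAIN_BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)


def learning_rate_at(step):
    """Assumed schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    return LEARNING_RATE * (1.0 - step / num_train_steps)


print(num_train_steps, num_warmup_steps)  # 9973 997
for step in (0, num_warmup_steps, num_train_steps):
    print(step, learning_rate_at(step))
```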

# Eval

In [0]:
# My validation dataset is called Test in this repo.
!mv bert_repo/dataset/test.csv bert_repo/dataset/val.csv

In [0]:
# Eval the model.
eval_examples = processor.get_dev_examples(TASK_DATA_DIR)
num_actual_eval_examples = len(eval_examples)
#while len(eval_examples) % EVAL_BATCH_SIZE != 0:
  #eval_examples.append(run_multilabels_classifier.PaddingInputExample())

eval_file = os.path.join(OUTPUT_DIR, "eval.tf_record")
eval_features = run_multilabels_classifier.file_based_convert_examples_to_features(eval_examples, label_list, MAX_SEQ_LENGTH, tokenizer, eval_file)

print('***** Started evaluation at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(len(eval_examples)))
print('  Batch size = {}'.format(EVAL_BATCH_SIZE))
# Eval will be slightly WRONG on the TPU because it will truncate
# the last batch.
eval_steps = int(len(eval_examples) / EVAL_BATCH_SIZE)
eval_input_fn = run_multilabels_classifier.file_based_input_fn_builder(
    input_file=eval_file,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=True)
result = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps)
print('***** Finished evaluation at {} *****'.format(datetime.datetime.now()))
output_eval_file = os.path.join(OUTPUT_DIR, "eval_results.txt")
with tf.gfile.GFile(output_eval_file, "w") as writer:
  print("***** Eval results *****")
  for key in sorted(result.keys()):
    print('  {} = {}'.format(key, str(result[key])))
    writer.write("%s = %s\n" % (key, str(result[key])))

INFO:tensorflow:Writing example 0 of 31913
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: aac73bf42ef22ff9
INFO:tensorflow:tokens: [CLS] " w ##p articles are not genealogical entries or trees , says w ##p : not . furthermore , this article is about jo ##gai ##la , and that one is about w ##ła ##dy ##sław ii of poland . our readers may find this lack of conform ##ity disturbing . - т ##р ##е ##п - " [SEP]
INFO:tensorflow:input_ids: 101 1000 1059 2361 4790 2024 2025 29606 10445 2030 3628 1010 2758 1059 2361 1024 2025 1012 7297 1010 2023 3720 2003 2055 8183 23805 2721 1010 1998 2008 2028 2003 2055 1059 22972 5149 23305 2462 1997 3735 1012 2256 8141 2089 2424 2023 3768 1997 23758 3012 14888 1012 1011 1197 16856 15290 29746 1011 1000 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
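
The comment about evaluation being slightly wrong on the TPU can be made concrete with the example count from the log above (31913) and the eval batch size of 8. This is only arithmetic on those two numbers; it also shows how many padding examples would be needed to avoid dropping anything, as the commented-out padding loop in the cell would do.

```python
# Values taken from this notebook's configuration and eval log.
NUM_EVAL_EXAMPLES = 31913
EVAL_BATCH_SIZE = 8

# With drop_remainder=True only full batches are evaluated.
eval_steps = NUM_EVAL_EXAMPLES // EVAL_BATCH_SIZE
evaluated = eval_steps * EVAL_BATCH_SIZE
dropped = NUM_EVAL_EXAMPLES - evaluated

# Number of fake examples needed to reach the next multiple of the batch size.
padding_needed = -NUM_EVAL_EXAMPLES % EVAL_BATCH_SIZE

print(eval_steps, evaluated, dropped, padding_needed)  # 3989 31912 1 7
```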

# Check against Kaggle Test

# Predict - Run predictions on multiple TPUs in multiple notebooks!

In [0]:
# Files to predict in the root of the folder or specify a directory with /dir
# Source Bucket
PREDICT_BUCKET = 'mr_bert_bucket' #@param {type:"string"}
assert PREDICT_BUCKET, 'Must specify an existing GCS bucket name'
PREDICT_SOURCE_FOLDER = 'data4' #@param {type:"string"}
PREDICT_BUCKET_DIR = 'gs://{}'.format(PREDICT_BUCKET)
PREDICT_FOLDER_DIR = '{}/{}'.format(PREDICT_BUCKET_DIR, PREDICT_SOURCE_FOLDER)

print('***** Target Prediction Source Files: {} *****'.format(PREDICT_FOLDER_DIR))
!gsutil ls $PREDICT_FOLDER_DIR

***** Target Prediction Source Files: gs://mr_bert_bucket/data4 *****
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a1.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a10.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a11.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a2.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a3.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a4.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a5.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a6.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a7.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a8.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a9.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_small_test.csv
gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_small_test_loop.csv
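
One simple way to spread the files listed above across several notebooks is a round-robin split, where each notebook takes every N-th file. This is a sketch under assumptions rather than the method used here: the function name `shard` and the choice of three notebooks are illustrative.

```python
def shard(files, num_workers, worker_index):
    """Return the files assigned to one worker under a round-robin split."""
    if not 0 <= worker_index < num_workers:
        raise ValueError('worker_index must be in [0, num_workers)')
    return files[worker_index::num_workers]


# File stems a1..a11, matching the bucket listing above.
files = ['hn_cleaned_ready_for_bert_a{}'.format(i) for i in range(1, 12)]

NUM_NOTEBOOKS = 3  # illustrative choice
for index in range(NUM_NOTEBOOKS):
    print(index, shard(files, NUM_NOTEBOOKS, index))
```

Each notebook would then set `predict_these` to its own shard. Notebooks that share a bucket should also write their intermediate files to distinct paths so they do not overwrite one another.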


In [0]:
# Going to run a test "prediction" on my validation data to make sure this is configured correctly
#!mv bert_repo/dataset/val.csv bert_repo/dataset/test.csv

#predict_these = ["hn_cleaned_ready_for_bert_small_test_loop", "hn_cleaned_ready_for_bert_small_test"]

predict_these = [ 'hn_cleaned_ready_for_bert_a11']

tf.logging.set_verbosity(tf.logging.DEBUG)
for file in predict_these:
  FILE_TO_PREDICT = '{}.csv'.format(file)
  print(FILE_TO_PREDICT)
  PREDICT_FILEPATH = '{}/{}'.format(PREDICT_FOLDER_DIR, FILE_TO_PREDICT)
  
  PREDICT_OUTPUT_FOLDER_NAME = 'HackerSaltPredictions' #@param {type:"string"}
  PRED_OUTPUT_DIR = '{}/{}'.format(PREDICT_BUCKET_DIR, PREDICT_OUTPUT_FOLDER_NAME)
  tf.gfile.MakeDirs(PRED_OUTPUT_DIR)
  
  SAVE_OUTPUT_AS = '{}_predictions'.format(file)
  SAVE_OUTPUT_FP = '{}.tsv'.format(SAVE_OUTPUT_AS)

  print('***** File to predict: {} *****'.format(PREDICT_FILEPATH))
  print('***** Model output directory: {} *****'.format(PRED_OUTPUT_DIR))

  DWNLD_PRED = '{}/test.csv'.format(TASK_DATA_DIR)

  # Download the file.
  !gsutil cp $PREDICT_FILEPATH $DWNLD_PRED

  # Print the result to make sure the transfer worked.
  !head -10 $DWNLD_PRED

  # Run predictions
  tf.logging.set_verbosity(tf.logging.DEBUG)
  predict_examples = processor.get_test_examples(TASK_DATA_DIR)
  num_actual_predict_examples = len(predict_examples)
  if USE_TPU:
    # TPU requires a fixed batch size for all batches, therefore the number
    # of examples must be a multiple of the batch size, or else examples
    # will get dropped. So we pad with fake examples which are ignored
    # later on.
    while len(predict_examples) % PREDICT_BATCH_SIZE != 0:
      predict_examples.append(run_multilabels_classifier.PaddingInputExample())

  predict_file = os.path.join(PRED_OUTPUT_DIR, "predict.tf_record")
  run_multilabels_classifier.file_based_convert_examples_to_features(predict_examples, label_list,
                                          MAX_SEQ_LENGTH, tokenizer,
                                          predict_file)

  tf.logging.info("***** Running prediction*****")
  tf.logging.info("  Num examples = %d (%d actual, %d padding)",
                  len(predict_examples), num_actual_predict_examples,
                  len(predict_examples) - num_actual_predict_examples)
  tf.logging.info("  Batch size = %d", PREDICT_BATCH_SIZE)
  tf.logging.set_verbosity(tf.logging.WARN)
  predict_drop_remainder = USE_TPU
  predict_input_fn =  run_multilabels_classifier.file_based_input_fn_builder(
      input_file=predict_file,
      seq_length=MAX_SEQ_LENGTH,
      is_training=False,
      drop_remainder=predict_drop_remainder)

  result = estimator.predict(input_fn=predict_input_fn)

  output_predict_file = os.path.join(PRED_OUTPUT_DIR, SAVE_OUTPUT_FP)
  with tf.gfile.GFile(output_predict_file, "w") as writer:
      num_written_lines = 0
      tf.logging.set_verbosity(tf.logging.INFO)
      tf.logging.info("***** Writing Predict results *****")
      tf.logging.set_verbosity(tf.logging.WARN)
      for (i, prediction) in enumerate(result):
          probabilities = prediction["probabilities"]
          if i >= num_actual_predict_examples:
              break
          output_line = ",".join(
              str(class_probability)
              for class_probability in probabilities) + "\n"
          writer.write(str(predict_examples[i].guid) + ',' + str(output_line))
          num_written_lines += 1
      tf.logging.set_verbosity(tf.logging.INFO)
  assert num_written_lines == num_actual_predict_examples
  print("{} - Predictions Completed & Saved".format(SAVE_OUTPUT_AS))

hn_cleaned_ready_for_bert_a11.csv
***** File to predict: gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a11.csv *****
***** Model output directory: gs://mr_bert_bucket/HackerSaltPredictions *****
Copying gs://mr_bert_bucket/data4/hn_cleaned_ready_for_bert_a11.csv...
/ [1 files][531.7 MiB/531.7 MiB]                                                
Operation completed over 1 objects/531.7 MiB.                                    
commentid,cleaned_comment
17118327,"For which companies are those driver driving post-Uber? There aren't many choices, SF specifically. Talking about turnover. Any ideas how that's for Uber compared to Lyft et al.?"
14423201,"I agree with you, but as a Senior Developer, I am expected to pick up the tech stack quickly, and help the team learn good engineering practices, which are pretty much applicable to all stacks (DRY, SOLID, KISS etc.)"
14061069,"I use mine for work and personal stuff regularly, in preference of a mbp 15 retina. The otherwise amazing scree
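
The prediction cell writes one line per example: the `guid` followed by six comma-separated probabilities, in the order of `label_list` (the file is comma-separated despite its `.tsv` extension). A minimal sketch of reading that format back is shown below. The function name `parse_predictions`, the 0.5 threshold, and the sample line are assumptions for illustration; the sample values are made up and are not model output.

```python
import csv
import io

# Same order as label_list in the configuration cell.
LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']


def parse_predictions(lines, threshold=0.5):
    """Parse 'guid,p1,...,p6' lines into dicts with per-label probabilities."""
    rows = []
    for record in csv.reader(lines):
        guid = record[0]
        probs = [float(p) for p in record[1:]]
        rows.append({
            'guid': guid,
            'probabilities': dict(zip(LABELS, probs)),
            # Labels whose probability meets the (assumed) threshold.
            'flagged': [label for label, p in zip(LABELS, probs) if p >= threshold],
        })
    return rows


# Made-up sample line for illustration only.
sample = io.StringIO('12345,0.91,0.05,0.62,0.01,0.40,0.02\n')
rows = parse_predictions(sample)
print(rows[0]['guid'], rows[0]['flagged'])  # 12345 ['toxic', 'obscene']
```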

# Export for Serving

In [0]:
## Freezing & serving TensorFlow BERT: https://hanxiao.github.io/2019/01/02/Serving-Google-BERT-in-Production-using-Tensorflow-and-ZeroMQ/

# input_tensors = [input_ids, input_mask, input_type_ids]
# output_tensors = [pooled]


from tensorflow.python.tools.optimize_for_inference_lib import optimize_for_inference
from tensorflow.graph_util import convert_variables_to_constants

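# Get the graph. NOTE: input_tensors and output_tensors (see the commented
# lines above) must already be defined from a built BERT graph.
tmp_g = tf.get_default_graph().as_graph_def()

sess = tf.Session()
# Initialize variables, then freeze. NOTE: the initializer assigns fresh
# values; restore the fine-tuned checkpoint instead (e.g. with tf.train.Saver)
# to freeze the trained weights.
sess.run(tf.global_variables_initializer())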
tmp_g = convert_variables_to_constants(sess, tmp_g, [n.name[:-2] for n in output_tensors])

# pruning
dtypes = [n.dtype for n in input_tensors]
tmp_g = optimize_for_inference(tmp_g, [n.name[:-2] for n in input_tensors],
    [n.name[:-2] for n in output_tensors],
    [dtype.as_datatype_enum for dtype in dtypes], False)
    
with tf.gfile.GFile('optimized.graph', 'wb') as f:
    f.write(tmp_g.SerializeToString())



In [0]:
# Export the model
def serving_input_fn():
  with tf.variable_scope("foo"):
    feature_spec = {
        "input_ids": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "input_mask": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "segment_ids": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "label_ids": tf.FixedLenFeature([6], tf.int64),
      }
    serialized_tf_example = tf.placeholder(dtype=tf.string,
                                           shape=[None],
                                           name='input_example_tensor')
    receiver_tensors = {'examples': serialized_tf_example}
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

EXPORT_DIR = 'gs://{}/bert/export/{}'.format(BUCKET, TASK)
estimator._export_to_tpu = False  # important: export only the CPU graph for serving
path = estimator.export_savedmodel(EXPORT_DIR, serving_input_fn)
print(path)


INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running infer on CPU
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = input_ids, shape = (?, 220)
INFO:tensorflow:  name = input_mask, shape = (?, 220)
INFO:tensorflow:  name = label_ids, shape = (?, 6)
INFO:tensorflow:  name = segment_ids, shape = (?, 220)
INFO:tensorflow:num_labels:6;logits:Tensor("loss/BiasAdd:0", shape=(?, 6), dtype=float32);labels:Tensor("loss/Cast:0", shape=(?, 6), dtype=float32)
INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow:  name = bert/embeddings/word_embeddings:0, shape = (30522, 1024), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/token_type_embeddings:0, shape = (2, 1024), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/position_embeddings:0, shape = (512, 1024), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/beta:0, shape = (1024,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/gamma:0, shape = (1024,), *INIT_FROM_CKPT

In [0]:
# See saved model details: 
!saved_model_cli show --all --dir gs://mr_bert_bucket/bert/export/multilabel/1557428031


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: foo/input_example_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probabilities'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 6)
        name: loss/Sigmoid:0
  Method name is: tensorflow/serving/predict


In [0]:
# Execute model to understand model input format
!saved_model_cli run --dir gs://mr_bert_bucket/bert/export/multilabel/1557428031 --tag_set serve --signature_def serving_default \
--input_examples 'examples=[{"input_ids":np.ones((220), dtype=int).tolist(),"input_mask":np.zeros((220), dtype=int).tolist(),"label_ids":[0,0,0,0,0,0],"segment_ids":np.zeros((220), dtype=int).tolist()}]'

2019-05-09 18:56:31.902228: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-05-09 18:56:31.905578: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x56015321a7e0 executing computations on platform Host. Devices:
2019-05-09 18:56:31.905657: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Result for output key probabilities:
[[0.9450909  0.14628963 0.47554603 0.02327371 0.44758207 0.17049994]]
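
The `saved_model_cli` call above feeds the model four fixed-length integer features. As a sketch of how a real tokenized input maps onto that format, the hypothetical helper `pad_features` below pads a list of token ids to `MAX_SEQ_LENGTH` and builds the matching mask, following the padding pattern visible in the training log (ids padded with 0, mask set to 1 for real tokens). The three token ids are taken from that log purely as an illustration, and the all-zero `segment_ids` assume a single-sentence input.

```python
MAX_SEQ_LENGTH = 220
NUM_LABELS = 6


def pad_features(token_ids, max_seq_length=MAX_SEQ_LENGTH, num_labels=NUM_LABELS):
    """Hypothetical helper: pad token ids and build the mask for serving."""
    token_ids = list(token_ids)
    if len(token_ids) > max_seq_length:
        raise ValueError('sequence is longer than max_seq_length')
    num_padding = max_seq_length - len(token_ids)
    return {
        'input_ids': token_ids + [0] * num_padding,
        'input_mask': [1] * len(token_ids) + [0] * num_padding,
        'segment_ids': [0] * max_seq_length,  # single-sentence input assumed
        'label_ids': [0] * num_labels,        # dummy labels, required by the feature spec
    }


# [CLS] this [SEP], using ids that appear in the training log above.
features = pad_features([101, 2023, 102])
print({name: len(values) for name, values in features.items()})
```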


In [0]:
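# Template only: `x` and `softmax` stand for the model's input and output
# tensors, which are not defined in this notebook.
tf.saved_model.simple_save(sess, "./saved_model", 
                           inputs={"x": x, }, outputs={"softmax": softmax, })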

## Attempt at making the model smaller using TF Lite


In [0]:
# Quantize as tf.lite model
converter = tf.lite.TFLiteConverter.from_saved_model("gs://mr_bert_bucket/bert/export/multilabel/1557428031")
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()

Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
INFO:tensorflow:Restoring parameters from gs://mr_bert_bucket/bert/export/multilabel/1557428031/variables/variables
INFO:tensorflow:The given SavedModel MetaGraphDef contains SignatureDefs with the following keys: {'serving_default'}
INFO:tensorflow:input tensors info: 
INFO:tensorflow:Tensor's key in saved_model's tensor_map: examples
INFO:tensorflow: tensor name: foo/input_example_tensor:0, shape: (-1), type: DT_STRING
INFO:tensorflow:output tensors info: 
INFO:tensorflow:Tensor's key in saved_model's tensor_map: probabilities
INFO:tensorflow: tensor name: loss/Sigmoid:0, shape: (-1, 6), type: DT_FLOAT
INFO:tensorflow:Restoring parameters from gs://mr_bert_bucket/bert/export/multilabel/1557428031/variables/variables
Instructions for updati

AttributeError: ignored

In [0]:
! pip uninstall -y tf-nightly
# Quote the requirement so the shell does not treat ">=" as an output redirect.
! pip install -U "tf-nightly-2.0-preview>=2.0.0.dev20190502"


Uninstalling tf-nightly-1.14.1.dev20190509:
  Successfully uninstalled tf-nightly-1.14.1.dev20190509


In [0]:
import tensorflow as tf
tf.enable_eager_execution()


# Shell command: the leading "!" is required in a notebook cell; without it
# Python raises a SyntaxError.
!tensorflowjs_converter \
    --input_format=tf_saved_model \
    --signature_name=serving_default \
    --saved_model_tags=serve \
    gs://mr_bert_bucket/bert/export/multilabel/1557428031 \
    /web_model