<a href="https://colab.research.google.com/github/graulef/bert/blob/master/Colab_Predicting_Story_Cloze_with_BERT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Story Cloze task with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing Tensorflow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMO and GloVE. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

In [1]:
!pip list | grep tensorflow
!python --version

mesh-tensorflow          0.0.5                
tensorflow               1.13.1               
tensorflow-estimator     1.13.0               
tensorflow-hub           0.4.0                
tensorflow-metadata      0.13.0               
tensorflow-probability   0.6.0                
Python 3.6.7


In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

import os
cwd = os.getcwd()
print(cwd)

W0529 13:37:13.431505 140384064227200 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


/content


In addition to the standard libraries we imported above, we'll need to install BERT's python package.

In [3]:
!pip install bert-tensorflow

Collecting bert-tensorflow
[?25l  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
[K     |████████████████████████████████| 71kB 8.4MB/s 
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to rewrite the OUTPUT_DIR if it exists. Otherwise, Tensorflow will load existing model checkpoints from that directory (if they exist).

In [5]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_story_cloze'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_story_cloze *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re
import csv

PATH_EVAL_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv"
PATH_SENT_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv"
PATH_RAND_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv"
#PATH_EVAL_DATA = "glue_data/StoryCloze/cloze_test_val_spring2016.csv"
#PATH_RAND_NN_DATA = "glue_data/StoryCloze/train_stories_rand_combined.csv"
#PATH_SENT_NN_DATA = "glue_data/StoryCloze/train_stories_nearest_story_sent2vec_combined.csv"

# Load all files from a directory in a DataFrame.
def load_data(path):
  data = {}
  data["label"] = []
  data["id_1"] = []
  data["id_2"] = []
  data["context"] = []
  data["ending"] = []
  print(path)
  with open(path) as f:
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
      if line_count == 0:
        #print("Columns = " + str(row))
        line_count += 1
      else:
        line_count += 1
        
        # Create two lines from one in order to have same label layout as 
        # MRPC task
        seperator = ' '
        data["id_1"].append(row[0])
        data["id_2"].append(row[0] + "_end_bli")
        data["context"].append(str(seperator.join(row[1:4])))
        data["id_1"].append(row[0])
        data["id_2"].append(row[0] + "_end_bla")
        data["context"].append(str(seperator.join(row[1:4])))
        
        if row[7] == 1: # First ending is the correct one
          data["ending"].append(row[5])
          data["label"].append(1)
          data["ending"].append(row[6])
          data["label"].append(0)
        else: # Second ending is the correct one
          data["ending"].append(row[6])
          data["label"].append(1)
          data["ending"].append(row[5])
          data["label"].append(0)    
    return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_validation_only(eval_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_data_df.reset_index(drop=True)
    eval_split = 1/3
    test_df = eval_data_df.iloc[:int(total_eval * eval_split), :]
    train_df = eval_data_df.iloc[int(total_eval * eval_split):, :]
    return train_df, test_df

def load_augmented(eval_file, random_nn_file, sent_nn_file, ):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_data_df.reset_index(drop=True)
    eval_split = 2/10
    # Eval split defines the ratio of data going into the training set
    train_df = eval_data_df.iloc[:int(total_eval * eval_split), :]
    # The rest of the validation data is used as test set
    test_df = eval_data_df.iloc[int(total_eval * eval_split):, :]   
    
    random_nn_df = load_data(random_nn_file)
    total_random_nn = random_nn_df.shape[0]
    random_nn_df.reset_index(drop=True)
    random_nn_split = 1/10
    ext_df = random_nn_df.iloc[:int(total_random_nn * random_nn_split), :]
    train_df = train_df.append(ext_df, ignore_index=True)
    
    sent_nn_split = 7/10
    sent_nn_df = load_data(sent_nn_file)
    total_sent_nn = sent_nn_df.shape[0]
    sent_nn_df.reset_index(drop=True)
    ext_df = sent_nn_df.iloc[:int(total_sent_nn * sent_nn_split), :]
    train_df = train_df.append(ext_df, ignore_index=True)
    
    train_df.reset_index(drop=True)
    
    return train_df, test_df

# Download and process the dataset files.
def download_and_load_eval_datasets(force_download=False):
  validation = tf.keras.utils.get_file(
      fname="validation", 
      origin=PATH_EVAL_DATA)
  random_nn = tf.keras.utils.get_file(
    fname="rand_nn", 
    origin=PATH_RAND_NN_DATA)
  sent_nn = tf.keras.utils.get_file(
    fname="sent_nn", 
    origin=PATH_SENT_NN_DATA)

  #train_df, test_df = load_validation_only(PATH_EVAL_DATA)
  train_df, test_df = load_augmented(validation, random_nn, sent_nn)
  
  return train_df, test_df


In [7]:
train, test = download_and_load_eval_datasets()
print(train.shape)
print(test.shape)

print(train.iloc[0])
print(test.iloc[0])

Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv
/root/.keras/datasets/validation
/root/.keras/datasets/rand_nn
/root/.keras/datasets/sent_nn
(141805, 5)
(2994, 5)
label                                                      1
id_1                    138d5bfb-05cc-41e3-bf2c-fa85ebad14e2
id_2            138d5bfb-05cc-41e3-bf2c-fa85ebad14e2_end_bli
context    Rick grew up in a troubled household. He never...
ending                                     He joined a gang.
Name: 0, dtype: object
label                                                      1
id_1                    5bf65083-150d-4e08-a822-fd215ecd23fe
id_2            5bf65083-150d-4e08-a822-fd215ecd23fe_end_bli
context    Kira always loved Japanese culture.

Quick check whether dataset are fully disjoint (takes really long obviously)


In [8]:
'''
train.shape, test.shape
for j in range(train.shape[0]):
    query = train.iloc[j]['ending']
    for i in range(test.shape[0]):
      tmp = test.iloc[i]['ending']
      if tmp == query:
        print("Found something equal")
        print(tmp)
'''

'\ntrain.shape, test.shape\nfor j in range(train.shape[0]):\n    query = train.iloc[j][\'ending\']\n    for i in range(test.shape[0]):\n      tmp = test.iloc[i][\'ending\']\n      if tmp == query:\n        print("Found something equal")\n        print(tmp)\n'

For us, our input data are the 'context' and 'ending' column and our label is the 'label' column (0, 1 for negative and positive, respecitvely)

In [0]:
CONTEXT_COLUMN = 'context'
ENDING_COLUMN = 'ending'
LABEL_COLUMN = 'label'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create  `InputExample`'s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case, is the `Request` field in our Dataframe. For us, this is the context of the story.
- `text_b` is used if we're training a model to understand the relationship between sentences (i.e. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This is the ending in our case
- `label` is the label for our example, i.e. True, False

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (i.e. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (i.e. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [11]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

Instructions for updating:
Colocations handled automatically by placer.


W0529 13:37:50.877794 140384064227200 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0529 13:37:52.830511 140384064227200 saver.py:1483] Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [12]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [13]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 141805


I0529 13:38:00.042713 140384064227200 run_classifier.py:774] Writing example 0 of 141805


INFO:tensorflow:*** Example ***


I0529 13:38:00.048554 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:38:00.052779 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] rick grew up in a troubled household . he never found good support in family , and turned to gangs . it wasn ' t long before rick got shot in a robbery . [SEP] he joined a gang . [SEP]


I0529 13:38:00.055821 140384064227200 run_classifier.py:464] tokens: [CLS] rick grew up in a troubled household . he never found good support in family , and turned to gangs . it wasn ' t long before rick got shot in a robbery . [SEP] he joined a gang . [SEP]


INFO:tensorflow:input_ids: 101 6174 3473 2039 1999 1037 11587 4398 1012 2002 2196 2179 2204 2490 1999 2155 1010 1998 2357 2000 18542 1012 2009 2347 1005 1056 2146 2077 6174 2288 2915 1999 1037 13742 1012 102 2002 2587 1037 6080 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.058183 140384064227200 run_classifier.py:465] input_ids: 101 6174 3473 2039 1999 1037 11587 4398 1012 2002 2196 2179 2204 2490 1999 2155 1010 1998 2357 2000 18542 1012 2009 2347 1005 1056 2146 2077 6174 2288 2915 1999 1037 13742 1012 102 2002 2587 1037 6080 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.061143 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.064149 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:38:00.067102 140384064227200 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0529 13:38:00.071633 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:38:00.074569 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] rick grew up in a troubled household . he never found good support in family , and turned to gangs . it wasn ' t long before rick got shot in a robbery . [SEP] he is happy now . [SEP]


I0529 13:38:00.077002 140384064227200 run_classifier.py:464] tokens: [CLS] rick grew up in a troubled household . he never found good support in family , and turned to gangs . it wasn ' t long before rick got shot in a robbery . [SEP] he is happy now . [SEP]


INFO:tensorflow:input_ids: 101 6174 3473 2039 1999 1037 11587 4398 1012 2002 2196 2179 2204 2490 1999 2155 1010 1998 2357 2000 18542 1012 2009 2347 1005 1056 2146 2077 6174 2288 2915 1999 1037 13742 1012 102 2002 2003 3407 2085 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.079999 140384064227200 run_classifier.py:465] input_ids: 101 6174 3473 2039 1999 1037 11587 4398 1012 2002 2196 2179 2204 2490 1999 2155 1010 1998 2357 2000 18542 1012 2009 2347 1005 1056 2146 2077 6174 2288 2915 1999 1037 13742 1012 102 2002 2003 3407 2085 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.082988 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.086217 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0529 13:38:00.089125 140384064227200 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0529 13:38:00.093808 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:38:00.096641 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] la ##vern ##e needs to prepare something for her friend ' s party . she decides to ba ##ke a batch of brown ##ies . she chooses a recipe and follows it closely . [SEP] la ##vern ##e doesn ' t go to her friend ' s party . [SEP]


I0529 13:38:00.099480 140384064227200 run_classifier.py:464] tokens: [CLS] la ##vern ##e needs to prepare something for her friend ' s party . she decides to ba ##ke a batch of brown ##ies . she chooses a recipe and follows it closely . [SEP] la ##vern ##e doesn ' t go to her friend ' s party . [SEP]


INFO:tensorflow:input_ids: 101 2474 23062 2063 3791 2000 7374 2242 2005 2014 2767 1005 1055 2283 1012 2016 7288 2000 8670 3489 1037 14108 1997 2829 3111 1012 2016 15867 1037 17974 1998 4076 2009 4876 1012 102 2474 23062 2063 2987 1005 1056 2175 2000 2014 2767 1005 1055 2283 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.101917 140384064227200 run_classifier.py:465] input_ids: 101 2474 23062 2063 3791 2000 7374 2242 2005 2014 2767 1005 1055 2283 1012 2016 7288 2000 8670 3489 1037 14108 1997 2829 3111 1012 2016 15867 1037 17974 1998 4076 2009 4876 1012 102 2474 23062 2063 2987 1005 1056 2175 2000 2014 2767 1005 1055 2283 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.104827 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.107698 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:38:00.110552 140384064227200 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0529 13:38:00.115133 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:38:00.117963 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] la ##vern ##e needs to prepare something for her friend ' s party . she decides to ba ##ke a batch of brown ##ies . she chooses a recipe and follows it closely . [SEP] the brown ##ies are so delicious la ##vern ##e eats two of them . [SEP]


I0529 13:38:00.120323 140384064227200 run_classifier.py:464] tokens: [CLS] la ##vern ##e needs to prepare something for her friend ' s party . she decides to ba ##ke a batch of brown ##ies . she chooses a recipe and follows it closely . [SEP] the brown ##ies are so delicious la ##vern ##e eats two of them . [SEP]


INFO:tensorflow:input_ids: 101 2474 23062 2063 3791 2000 7374 2242 2005 2014 2767 1005 1055 2283 1012 2016 7288 2000 8670 3489 1037 14108 1997 2829 3111 1012 2016 15867 1037 17974 1998 4076 2009 4876 1012 102 1996 2829 3111 2024 2061 12090 2474 23062 2063 20323 2048 1997 2068 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.123241 140384064227200 run_classifier.py:465] input_ids: 101 2474 23062 2063 3791 2000 7374 2242 2005 2014 2767 1005 1055 2283 1012 2016 7288 2000 8670 3489 1037 14108 1997 2829 3111 1012 2016 15867 1037 17974 1998 4076 2009 4876 1012 102 1996 2829 3111 2024 2061 12090 2474 23062 2063 20323 2048 1997 2068 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.126047 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.130265 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0529 13:38:00.135912 140384064227200 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0529 13:38:00.141054 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:38:00.144117 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] sarah had been dreaming of visiting europe for years . she had finally saved enough for the trip . she landed in spain and traveled east across the continent . [SEP] sarah decided that she preferred her home over europe . [SEP]


I0529 13:38:00.146420 140384064227200 run_classifier.py:464] tokens: [CLS] sarah had been dreaming of visiting europe for years . she had finally saved enough for the trip . she landed in spain and traveled east across the continent . [SEP] sarah decided that she preferred her home over europe . [SEP]


INFO:tensorflow:input_ids: 101 4532 2018 2042 12802 1997 5873 2885 2005 2086 1012 2016 2018 2633 5552 2438 2005 1996 4440 1012 2016 5565 1999 3577 1998 6158 2264 2408 1996 9983 1012 102 4532 2787 2008 2016 6871 2014 2188 2058 2885 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.149204 140384064227200 run_classifier.py:465] input_ids: 101 4532 2018 2042 12802 1997 5873 2885 2005 2086 1012 2016 2018 2633 5552 2438 2005 1996 4440 1012 2016 5565 1999 3577 1998 6158 2264 2408 1996 9983 1012 102 4532 2787 2008 2016 6871 2014 2188 2058 2885 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.151608 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:38:00.153491 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:38:00.157798 140384064227200 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 141805


I0529 13:38:08.756908 140384064227200 run_classifier.py:774] Writing example 10000 of 141805


INFO:tensorflow:Writing example 20000 of 141805


I0529 13:38:16.024828 140384064227200 run_classifier.py:774] Writing example 20000 of 141805


INFO:tensorflow:Writing example 30000 of 141805


I0529 13:38:23.447078 140384064227200 run_classifier.py:774] Writing example 30000 of 141805


INFO:tensorflow:Writing example 40000 of 141805


I0529 13:38:30.747739 140384064227200 run_classifier.py:774] Writing example 40000 of 141805


INFO:tensorflow:Writing example 50000 of 141805


I0529 13:38:38.282143 140384064227200 run_classifier.py:774] Writing example 50000 of 141805


INFO:tensorflow:Writing example 60000 of 141805


I0529 13:38:45.513153 140384064227200 run_classifier.py:774] Writing example 60000 of 141805


INFO:tensorflow:Writing example 70000 of 141805


I0529 13:38:52.706895 140384064227200 run_classifier.py:774] Writing example 70000 of 141805


INFO:tensorflow:Writing example 80000 of 141805


I0529 13:38:59.936125 140384064227200 run_classifier.py:774] Writing example 80000 of 141805


INFO:tensorflow:Writing example 90000 of 141805


I0529 13:39:07.673694 140384064227200 run_classifier.py:774] Writing example 90000 of 141805


INFO:tensorflow:Writing example 100000 of 141805


I0529 13:39:14.930725 140384064227200 run_classifier.py:774] Writing example 100000 of 141805


INFO:tensorflow:Writing example 110000 of 141805


I0529 13:39:22.629422 140384064227200 run_classifier.py:774] Writing example 110000 of 141805


INFO:tensorflow:Writing example 120000 of 141805


I0529 13:39:30.722825 140384064227200 run_classifier.py:774] Writing example 120000 of 141805


INFO:tensorflow:Writing example 130000 of 141805


I0529 13:39:38.600321 140384064227200 run_classifier.py:774] Writing example 130000 of 141805


INFO:tensorflow:Writing example 140000 of 141805


I0529 13:39:45.788559 140384064227200 run_classifier.py:774] Writing example 140000 of 141805


INFO:tensorflow:Writing example 0 of 2994


I0529 13:39:47.102785 140384064227200 run_classifier.py:774] Writing example 0 of 2994


INFO:tensorflow:*** Example ***


I0529 13:39:47.105491 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:39:47.107015 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] kira always loved japanese culture . many of her favorite movies were in japanese . she decided that she would travel there . [SEP] it was the best trip of her life . [SEP]


I0529 13:39:47.108584 140384064227200 run_classifier.py:464] tokens: [CLS] kira always loved japanese culture . many of her favorite movies were in japanese . she decided that she would travel there . [SEP] it was the best trip of her life . [SEP]


INFO:tensorflow:input_ids: 101 15163 2467 3866 2887 3226 1012 2116 1997 2014 5440 5691 2020 1999 2887 1012 2016 2787 2008 2016 2052 3604 2045 1012 102 2009 2001 1996 2190 4440 1997 2014 2166 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.110125 140384064227200 run_classifier.py:465] input_ids: 101 15163 2467 3866 2887 3226 1012 2116 1997 2014 5440 5691 2020 1999 2887 1012 2016 2787 2008 2016 2052 3604 2045 1012 102 2009 2001 1996 2190 4440 1997 2014 2166 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.114144 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.118864 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:39:47.122896 140384064227200 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0529 13:39:47.126408 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:39:47.128323 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] kira always loved japanese culture . many of her favorite movies were in japanese . she decided that she would travel there . [SEP] she spent all the money on shoes . [SEP]


I0529 13:39:47.131447 140384064227200 run_classifier.py:464] tokens: [CLS] kira always loved japanese culture . many of her favorite movies were in japanese . she decided that she would travel there . [SEP] she spent all the money on shoes . [SEP]


INFO:tensorflow:input_ids: 101 15163 2467 3866 2887 3226 1012 2116 1997 2014 5440 5691 2020 1999 2887 1012 2016 2787 2008 2016 2052 3604 2045 1012 102 2016 2985 2035 1996 2769 2006 6007 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.135444 140384064227200 run_classifier.py:465] input_ids: 101 15163 2467 3866 2887 3226 1012 2116 1997 2014 5440 5691 2020 1999 2887 1012 2016 2787 2008 2016 2052 3604 2045 1012 102 2016 2985 2035 1996 2769 2006 6007 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.137776 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.141482 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0529 13:39:47.144168 140384064227200 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0529 13:39:47.150078 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:39:47.154012 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] glen was in the mood for a walk . he put on his jacket and went outside . while walking on the sidewalk , he tripped and busted his knee open . [SEP] glen swore he ' d never walk again . [SEP]


I0529 13:39:47.158825 140384064227200 run_classifier.py:464] tokens: [CLS] glen was in the mood for a walk . he put on his jacket and went outside . while walking on the sidewalk , he tripped and busted his knee open . [SEP] glen swore he ' d never walk again . [SEP]


INFO:tensorflow:input_ids: 101 8904 2001 1999 1996 6888 2005 1037 3328 1012 2002 2404 2006 2010 6598 1998 2253 2648 1012 2096 3788 2006 1996 11996 1010 2002 21129 1998 23142 2010 6181 2330 1012 102 8904 12860 2002 1005 1040 2196 3328 2153 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.163552 140384064227200 run_classifier.py:465] input_ids: 101 8904 2001 1999 1996 6888 2005 1037 3328 1012 2002 2404 2006 2010 6598 1998 2253 2648 1012 2096 3788 2006 1996 11996 1010 2002 21129 1998 23142 2010 6181 2330 1012 102 8904 12860 2002 1005 1040 2196 3328 2153 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.170066 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.174819 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:39:47.181126 140384064227200 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0529 13:39:47.186660 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:39:47.189417 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] glen was in the mood for a walk . he put on his jacket and went outside . while walking on the sidewalk , he tripped and busted his knee open . [SEP] glen ran his best marathon time ever . [SEP]


I0529 13:39:47.193383 140384064227200 run_classifier.py:464] tokens: [CLS] glen was in the mood for a walk . he put on his jacket and went outside . while walking on the sidewalk , he tripped and busted his knee open . [SEP] glen ran his best marathon time ever . [SEP]


INFO:tensorflow:input_ids: 101 8904 2001 1999 1996 6888 2005 1037 3328 1012 2002 2404 2006 2010 6598 1998 2253 2648 1012 2096 3788 2006 1996 11996 1010 2002 21129 1998 23142 2010 6181 2330 1012 102 8904 2743 2010 2190 8589 2051 2412 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.197043 140384064227200 run_classifier.py:465] input_ids: 101 8904 2001 1999 1996 6888 2005 1037 3328 1012 2002 2404 2006 2010 6598 1998 2253 2648 1012 2096 3788 2006 1996 11996 1010 2002 21129 1998 23142 2010 6181 2330 1012 102 8904 2743 2010 2190 8589 2051 2412 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.201662 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.205652 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0529 13:39:47.209483 140384064227200 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0529 13:39:47.214724 140384064227200 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0529 13:39:47.216753 140384064227200 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] jesus spent the day polish ##ing the furniture . as night approached he surveyed the work he ' d done . everything looked shiny and new . [SEP] his friends polished his furniture . [SEP]


I0529 13:39:47.218948 140384064227200 run_classifier.py:464] tokens: [CLS] jesus spent the day polish ##ing the furniture . as night approached he surveyed the work he ' d done . everything looked shiny and new . [SEP] his friends polished his furniture . [SEP]


INFO:tensorflow:input_ids: 101 4441 2985 1996 2154 3907 2075 1996 7390 1012 2004 2305 5411 2002 12876 1996 2147 2002 1005 1040 2589 1012 2673 2246 12538 1998 2047 1012 102 2010 2814 12853 2010 7390 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.220959 140384064227200 run_classifier.py:465] input_ids: 101 4441 2985 1996 2154 3907 2075 1996 7390 1012 2004 2305 5411 2002 12876 1996 2147 2002 1005 1040 2589 1012 2673 2246 12538 1998 2047 1012 102 2010 2814 12853 2010 7390 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.223308 140384064227200 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0529 13:39:47.225492 140384064227200 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0529 13:39:47.227588 140384064227200 run_classifier.py:468] label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of using a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_outputs" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for politeness data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the probabiltiies.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for TPUEstimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Compute train and warmup steps from batch size
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where hte learning rate 
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute # train and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

In [0]:
# Specify outpit directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [19]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert_story_cloze', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fad3c7e94a8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0529 13:40:00.812263 140384064227200 estimator.py:201] Using config: {'_model_dir': 'bert_story_cloze', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fad3c7e94a8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and produces a generator. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)

Now we train our model! For me, using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

In [0]:
print(f'Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0529 13:41:22.104464 140384064227200 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0529 13:41:26.155544 140384064227200 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0529 13:41:26.283044 140384064227200 deprecation.py:506] From <ipython-input-14-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0529 13:41:26.328684 140384064227200 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0529 13:41:26.410748 140384064227200 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0529 13:41:34.887732 140384064227200 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


I0529 13:41:37.292865 140384064227200 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0529 13:41:37.297000 140384064227200 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0529 13:41:44.985288 140384064227200 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0529 13:41:50.900420 140384064227200 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0529 13:41:51.154701 140384064227200 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert_story_cloze/model.ckpt.


I0529 13:42:55.153994 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 0 into bert_story_cloze/model.ckpt.


INFO:tensorflow:loss = 0.6906431, step = 0


I0529 13:43:19.068607 140384064227200 basic_session_run_hooks.py:249] loss = 0.6906431, step = 0


INFO:tensorflow:global_step/sec: 1.02543


I0529 13:44:56.587396 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.02543


INFO:tensorflow:loss = 0.47049823, step = 100 (97.522 sec)


I0529 13:44:56.590123 140384064227200 basic_session_run_hooks.py:247] loss = 0.47049823, step = 100 (97.522 sec)


INFO:tensorflow:global_step/sec: 1.18016


I0529 13:46:21.321544 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18016


INFO:tensorflow:loss = 0.09735179, step = 200 (84.737 sec)


I0529 13:46:21.327526 140384064227200 basic_session_run_hooks.py:247] loss = 0.09735179, step = 200 (84.737 sec)


INFO:tensorflow:global_step/sec: 1.18349


I0529 13:47:45.817185 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18349


INFO:tensorflow:loss = 0.16628619, step = 300 (84.493 sec)


I0529 13:47:45.820544 140384064227200 basic_session_run_hooks.py:247] loss = 0.16628619, step = 300 (84.493 sec)


INFO:tensorflow:global_step/sec: 1.18179


I0529 13:49:10.434874 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18179


INFO:tensorflow:loss = 0.080648385, step = 400 (84.617 sec)


I0529 13:49:10.437426 140384064227200 basic_session_run_hooks.py:247] loss = 0.080648385, step = 400 (84.617 sec)


INFO:tensorflow:Saving checkpoints for 500 into bert_story_cloze/model.ckpt.


I0529 13:50:34.044511 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04741


I0529 13:50:45.908385 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04741


INFO:tensorflow:loss = 0.07805068, step = 500 (95.477 sec)


I0529 13:50:45.914474 140384064227200 basic_session_run_hooks.py:247] loss = 0.07805068, step = 500 (95.477 sec)


INFO:tensorflow:global_step/sec: 1.18052


I0529 13:52:10.616926 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18052


INFO:tensorflow:loss = 0.30474588, step = 600 (84.706 sec)


I0529 13:52:10.620023 140384064227200 basic_session_run_hooks.py:247] loss = 0.30474588, step = 600 (84.706 sec)


INFO:tensorflow:global_step/sec: 1.18138


I0529 13:53:35.263916 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18138


INFO:tensorflow:loss = 0.32295176, step = 700 (84.647 sec)


I0529 13:53:35.267080 140384064227200 basic_session_run_hooks.py:247] loss = 0.32295176, step = 700 (84.647 sec)


INFO:tensorflow:global_step/sec: 1.18447


I0529 13:54:59.690156 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18447


INFO:tensorflow:loss = 0.26161125, step = 800 (84.427 sec)


I0529 13:54:59.693627 140384064227200 basic_session_run_hooks.py:247] loss = 0.26161125, step = 800 (84.427 sec)


INFO:tensorflow:global_step/sec: 1.18212


I0529 13:56:24.283633 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18212


INFO:tensorflow:loss = 0.30596566, step = 900 (84.598 sec)


I0529 13:56:24.291316 140384064227200 basic_session_run_hooks.py:247] loss = 0.30596566, step = 900 (84.598 sec)


INFO:tensorflow:Saving checkpoints for 1000 into bert_story_cloze/model.ckpt.


I0529 13:57:47.999534 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 1000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04545


I0529 13:57:59.935896 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04545


INFO:tensorflow:loss = 0.42821532, step = 1000 (95.650 sec)


I0529 13:57:59.941348 140384064227200 basic_session_run_hooks.py:247] loss = 0.42821532, step = 1000 (95.650 sec)


INFO:tensorflow:global_step/sec: 1.1786


I0529 13:59:24.782130 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1786


INFO:tensorflow:loss = 0.27458388, step = 1100 (84.844 sec)


I0529 13:59:24.784997 140384064227200 basic_session_run_hooks.py:247] loss = 0.27458388, step = 1100 (84.844 sec)


INFO:tensorflow:global_step/sec: 1.1832


I0529 14:00:49.298474 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1832


INFO:tensorflow:loss = 0.23631138, step = 1200 (84.519 sec)


I0529 14:00:49.304425 140384064227200 basic_session_run_hooks.py:247] loss = 0.23631138, step = 1200 (84.519 sec)


INFO:tensorflow:global_step/sec: 1.18321


I0529 14:02:13.814080 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18321


INFO:tensorflow:loss = 0.28968468, step = 1300 (84.516 sec)


I0529 14:02:13.820110 140384064227200 basic_session_run_hooks.py:247] loss = 0.28968468, step = 1300 (84.516 sec)


INFO:tensorflow:global_step/sec: 1.18402


I0529 14:03:38.272281 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18402


INFO:tensorflow:loss = 0.41471004, step = 1400 (84.461 sec)


I0529 14:03:38.280606 140384064227200 basic_session_run_hooks.py:247] loss = 0.41471004, step = 1400 (84.461 sec)


INFO:tensorflow:Saving checkpoints for 1500 into bert_story_cloze/model.ckpt.


I0529 14:05:01.894569 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 1500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04237


I0529 14:05:14.207900 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04237


INFO:tensorflow:loss = 0.57643485, step = 1500 (95.931 sec)


I0529 14:05:14.211711 140384064227200 basic_session_run_hooks.py:247] loss = 0.57643485, step = 1500 (95.931 sec)


INFO:tensorflow:global_step/sec: 1.18016


I0529 14:06:38.942460 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18016


INFO:tensorflow:loss = 0.28022203, step = 1600 (84.737 sec)


I0529 14:06:38.948909 140384064227200 basic_session_run_hooks.py:247] loss = 0.28022203, step = 1600 (84.737 sec)


INFO:tensorflow:global_step/sec: 1.18346


I0529 14:08:03.440727 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18346


INFO:tensorflow:loss = 0.35846657, step = 1700 (84.499 sec)


I0529 14:08:03.448193 140384064227200 basic_session_run_hooks.py:247] loss = 0.35846657, step = 1700 (84.499 sec)


INFO:tensorflow:global_step/sec: 1.18161


I0529 14:09:28.071160 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18161


INFO:tensorflow:loss = 0.4192232, step = 1800 (84.633 sec)


I0529 14:09:28.081497 140384064227200 basic_session_run_hooks.py:247] loss = 0.4192232, step = 1800 (84.633 sec)


INFO:tensorflow:global_step/sec: 1.18398


I0529 14:10:52.531721 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18398


INFO:tensorflow:loss = 0.21514031, step = 1900 (84.453 sec)


I0529 14:10:52.534656 140384064227200 basic_session_run_hooks.py:247] loss = 0.21514031, step = 1900 (84.453 sec)


INFO:tensorflow:Saving checkpoints for 2000 into bert_story_cloze/model.ckpt.


I0529 14:12:16.186382 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 2000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04808


I0529 14:12:27.944385 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04808


INFO:tensorflow:loss = 0.29235572, step = 2000 (95.414 sec)


I0529 14:12:27.948787 140384064227200 basic_session_run_hooks.py:247] loss = 0.29235572, step = 2000 (95.414 sec)


INFO:tensorflow:global_step/sec: 1.18049


I0529 14:13:52.655091 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18049


INFO:tensorflow:loss = 0.32277521, step = 2100 (84.709 sec)


I0529 14:13:52.657747 140384064227200 basic_session_run_hooks.py:247] loss = 0.32277521, step = 2100 (84.709 sec)


INFO:tensorflow:global_step/sec: 1.18284


I0529 14:15:17.197070 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18284


INFO:tensorflow:loss = 0.47312737, step = 2200 (84.545 sec)


I0529 14:15:17.202272 140384064227200 basic_session_run_hooks.py:247] loss = 0.47312737, step = 2200 (84.545 sec)


INFO:tensorflow:global_step/sec: 1.18251


I0529 14:16:41.763197 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18251


INFO:tensorflow:loss = 0.21176818, step = 2300 (84.564 sec)


I0529 14:16:41.765804 140384064227200 basic_session_run_hooks.py:247] loss = 0.21176818, step = 2300 (84.564 sec)


INFO:tensorflow:global_step/sec: 1.1826


I0529 14:18:06.322530 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1826


INFO:tensorflow:loss = 0.33834806, step = 2400 (84.560 sec)


I0529 14:18:06.325678 140384064227200 basic_session_run_hooks.py:247] loss = 0.33834806, step = 2400 (84.560 sec)


INFO:tensorflow:Saving checkpoints for 2500 into bert_story_cloze/model.ckpt.


I0529 14:19:29.895869 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 2500 into bert_story_cloze/model.ckpt.


Instructions for updating:
Use standard file APIs to delete files with this prefix.


W0529 14:19:37.868280 140384064227200 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.


INFO:tensorflow:global_step/sec: 1.04615


I0529 14:19:41.911420 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04615


INFO:tensorflow:loss = 0.17537302, step = 2500 (95.594 sec)


I0529 14:19:41.920057 140384064227200 basic_session_run_hooks.py:247] loss = 0.17537302, step = 2500 (95.594 sec)


INFO:tensorflow:global_step/sec: 1.18012


I0529 14:21:06.648817 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18012


INFO:tensorflow:loss = 0.48691767, step = 2600 (84.731 sec)


I0529 14:21:06.651445 140384064227200 basic_session_run_hooks.py:247] loss = 0.48691767, step = 2600 (84.731 sec)


INFO:tensorflow:global_step/sec: 1.18128


I0529 14:22:31.302559 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18128


INFO:tensorflow:loss = 0.31117213, step = 2700 (84.656 sec)


I0529 14:22:31.307883 140384064227200 basic_session_run_hooks.py:247] loss = 0.31117213, step = 2700 (84.656 sec)


INFO:tensorflow:global_step/sec: 1.18177


I0529 14:23:55.921206 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18177


INFO:tensorflow:loss = 0.29809934, step = 2800 (84.616 sec)


I0529 14:23:55.924128 140384064227200 basic_session_run_hooks.py:247] loss = 0.29809934, step = 2800 (84.616 sec)


INFO:tensorflow:global_step/sec: 1.18481


I0529 14:25:20.323047 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18481


INFO:tensorflow:loss = 0.4558136, step = 2900 (84.403 sec)


I0529 14:25:20.326936 140384064227200 basic_session_run_hooks.py:247] loss = 0.4558136, step = 2900 (84.403 sec)


INFO:tensorflow:Saving checkpoints for 3000 into bert_story_cloze/model.ckpt.


I0529 14:26:43.991274 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 3000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04823


I0529 14:26:55.722024 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04823


INFO:tensorflow:loss = 0.34696215, step = 3000 (95.400 sec)


I0529 14:26:55.726718 140384064227200 basic_session_run_hooks.py:247] loss = 0.34696215, step = 3000 (95.400 sec)


INFO:tensorflow:global_step/sec: 1.17935


I0529 14:28:20.514675 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17935


INFO:tensorflow:loss = 0.48437244, step = 3100 (84.795 sec)


I0529 14:28:20.521243 140384064227200 basic_session_run_hooks.py:247] loss = 0.48437244, step = 3100 (84.795 sec)


INFO:tensorflow:global_step/sec: 1.1829


I0529 14:29:45.052477 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1829


INFO:tensorflow:loss = 0.41545248, step = 3200 (84.539 sec)


I0529 14:29:45.060163 140384064227200 basic_session_run_hooks.py:247] loss = 0.41545248, step = 3200 (84.539 sec)


INFO:tensorflow:global_step/sec: 1.18322


I0529 14:31:09.567699 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18322


INFO:tensorflow:loss = 0.21999538, step = 3300 (84.517 sec)


I0529 14:31:09.576836 140384064227200 basic_session_run_hooks.py:247] loss = 0.21999538, step = 3300 (84.517 sec)


INFO:tensorflow:global_step/sec: 1.18287


I0529 14:32:34.108051 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18287


INFO:tensorflow:loss = 0.32591617, step = 3400 (84.539 sec)


I0529 14:32:34.115967 140384064227200 basic_session_run_hooks.py:247] loss = 0.32591617, step = 3400 (84.539 sec)


INFO:tensorflow:Saving checkpoints for 3500 into bert_story_cloze/model.ckpt.


I0529 14:33:57.750109 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 3500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04373


I0529 14:34:09.918444 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04373


INFO:tensorflow:loss = 0.3496073, step = 3500 (95.811 sec)


I0529 14:34:09.926813 140384064227200 basic_session_run_hooks.py:247] loss = 0.3496073, step = 3500 (95.811 sec)


INFO:tensorflow:global_step/sec: 1.17946


I0529 14:35:34.702822 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17946


INFO:tensorflow:loss = 0.5303535, step = 3600 (84.786 sec)


I0529 14:35:34.712661 140384064227200 basic_session_run_hooks.py:247] loss = 0.5303535, step = 3600 (84.786 sec)


INFO:tensorflow:global_step/sec: 1.18236


I0529 14:36:59.279561 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18236


INFO:tensorflow:loss = 0.3119651, step = 3700 (84.579 sec)


I0529 14:36:59.292038 140384064227200 basic_session_run_hooks.py:247] loss = 0.3119651, step = 3700 (84.579 sec)


INFO:tensorflow:global_step/sec: 1.18015


I0529 14:38:24.014654 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18015


INFO:tensorflow:loss = 0.4281534, step = 3800 (84.727 sec)


I0529 14:38:24.018664 140384064227200 basic_session_run_hooks.py:247] loss = 0.4281534, step = 3800 (84.727 sec)


INFO:tensorflow:global_step/sec: 1.18289


I0529 14:39:48.553606 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18289


INFO:tensorflow:loss = 0.49178058, step = 3900 (84.540 sec)


I0529 14:39:48.559302 140384064227200 basic_session_run_hooks.py:247] loss = 0.49178058, step = 3900 (84.540 sec)


INFO:tensorflow:Saving checkpoints for 4000 into bert_story_cloze/model.ckpt.


I0529 14:41:12.211977 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 4000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03759


I0529 14:41:24.930426 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.03759


INFO:tensorflow:loss = 0.49581784, step = 4000 (96.381 sec)


I0529 14:41:24.938937 140384064227200 basic_session_run_hooks.py:247] loss = 0.49581784, step = 4000 (96.381 sec)


INFO:tensorflow:global_step/sec: 1.18063


I0529 14:42:49.630795 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18063


INFO:tensorflow:loss = 0.4282238, step = 4100 (84.697 sec)


I0529 14:42:49.636154 140384064227200 basic_session_run_hooks.py:247] loss = 0.4282238, step = 4100 (84.697 sec)


INFO:tensorflow:global_step/sec: 1.18258


I0529 14:44:14.191748 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18258


INFO:tensorflow:loss = 0.37396842, step = 4200 (84.560 sec)


I0529 14:44:14.196471 140384064227200 basic_session_run_hooks.py:247] loss = 0.37396842, step = 4200 (84.560 sec)


INFO:tensorflow:global_step/sec: 1.18293


I0529 14:45:38.727373 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18293


INFO:tensorflow:loss = 0.29706037, step = 4300 (84.537 sec)


I0529 14:45:38.732990 140384064227200 basic_session_run_hooks.py:247] loss = 0.29706037, step = 4300 (84.537 sec)


INFO:tensorflow:global_step/sec: 1.18184


I0529 14:47:03.341022 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18184


INFO:tensorflow:loss = 0.45250458, step = 4400 (84.611 sec)


I0529 14:47:03.343737 140384064227200 basic_session_run_hooks.py:247] loss = 0.45250458, step = 4400 (84.611 sec)


INFO:tensorflow:Saving checkpoints for 4500 into bert_story_cloze/model.ckpt.


I0529 14:48:26.876431 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 4500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04354


I0529 14:48:39.168359 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04354


INFO:tensorflow:loss = 0.14300229, step = 4500 (95.827 sec)


I0529 14:48:39.170902 140384064227200 basic_session_run_hooks.py:247] loss = 0.14300229, step = 4500 (95.827 sec)


INFO:tensorflow:global_step/sec: 1.17988


I0529 14:50:03.923042 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17988


INFO:tensorflow:loss = 0.09556192, step = 4600 (84.755 sec)


I0529 14:50:03.925500 140384064227200 basic_session_run_hooks.py:247] loss = 0.09556192, step = 4600 (84.755 sec)


INFO:tensorflow:global_step/sec: 1.18316


I0529 14:51:28.442170 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18316


INFO:tensorflow:loss = 0.093704976, step = 4700 (84.523 sec)


I0529 14:51:28.448879 140384064227200 basic_session_run_hooks.py:247] loss = 0.093704976, step = 4700 (84.523 sec)


INFO:tensorflow:global_step/sec: 1.18344


I0529 14:52:52.941381 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18344


INFO:tensorflow:loss = 0.0056730118, step = 4800 (84.498 sec)


I0529 14:52:52.946963 140384064227200 basic_session_run_hooks.py:247] loss = 0.0056730118, step = 4800 (84.498 sec)


INFO:tensorflow:global_step/sec: 1.18227


I0529 14:54:17.524499 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18227


INFO:tensorflow:loss = 0.005613307, step = 4900 (84.586 sec)


I0529 14:54:17.533142 140384064227200 basic_session_run_hooks.py:247] loss = 0.005613307, step = 4900 (84.586 sec)


INFO:tensorflow:Saving checkpoints for 5000 into bert_story_cloze/model.ckpt.


I0529 14:55:41.231669 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 5000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04217


I0529 14:55:53.478007 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04217


INFO:tensorflow:loss = 0.0029019183, step = 5000 (95.949 sec)


I0529 14:55:53.481670 140384064227200 basic_session_run_hooks.py:247] loss = 0.0029019183, step = 5000 (95.949 sec)


INFO:tensorflow:global_step/sec: 1.17948


I0529 14:57:18.261319 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17948


INFO:tensorflow:loss = 0.21919905, step = 5100 (84.787 sec)


I0529 14:57:18.269007 140384064227200 basic_session_run_hooks.py:247] loss = 0.21919905, step = 5100 (84.787 sec)


INFO:tensorflow:global_step/sec: 1.18241


I0529 14:58:42.834508 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18241


INFO:tensorflow:loss = 0.22988437, step = 5200 (84.568 sec)


I0529 14:58:42.837200 140384064227200 basic_session_run_hooks.py:247] loss = 0.22988437, step = 5200 (84.568 sec)


INFO:tensorflow:global_step/sec: 1.18139


I0529 15:00:07.480367 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18139


INFO:tensorflow:loss = 0.31234434, step = 5300 (84.649 sec)


I0529 15:00:07.486297 140384064227200 basic_session_run_hooks.py:247] loss = 0.31234434, step = 5300 (84.649 sec)


INFO:tensorflow:global_step/sec: 1.18307


I0529 15:01:32.006234 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18307


INFO:tensorflow:loss = 0.09498214, step = 5400 (84.523 sec)


I0529 15:01:32.008879 140384064227200 basic_session_run_hooks.py:247] loss = 0.09498214, step = 5400 (84.523 sec)


INFO:tensorflow:Saving checkpoints for 5500 into bert_story_cloze/model.ckpt.


I0529 15:02:55.605901 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 5500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04377


I0529 15:03:07.812833 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04377


INFO:tensorflow:loss = 0.07676785, step = 5500 (95.809 sec)


I0529 15:03:07.818074 140384064227200 basic_session_run_hooks.py:247] loss = 0.07676785, step = 5500 (95.809 sec)


INFO:tensorflow:global_step/sec: 1.17987


I0529 15:04:32.567800 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17987


INFO:tensorflow:loss = 0.36640105, step = 5600 (84.755 sec)


I0529 15:04:32.572777 140384064227200 basic_session_run_hooks.py:247] loss = 0.36640105, step = 5600 (84.755 sec)


INFO:tensorflow:global_step/sec: 1.18203


I0529 15:05:57.168061 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18203


INFO:tensorflow:loss = 0.26934457, step = 5700 (84.601 sec)


I0529 15:05:57.173871 140384064227200 basic_session_run_hooks.py:247] loss = 0.26934457, step = 5700 (84.601 sec)


INFO:tensorflow:global_step/sec: 1.1825


I0529 15:07:21.734875 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1825


INFO:tensorflow:loss = 0.14467677, step = 5800 (84.567 sec)


I0529 15:07:21.741266 140384064227200 basic_session_run_hooks.py:247] loss = 0.14467677, step = 5800 (84.567 sec)


INFO:tensorflow:global_step/sec: 1.1822


I0529 15:08:46.322577 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.1822


INFO:tensorflow:loss = 0.19376108, step = 5900 (84.586 sec)


I0529 15:08:46.328238 140384064227200 basic_session_run_hooks.py:247] loss = 0.19376108, step = 5900 (84.586 sec)


INFO:tensorflow:Saving checkpoints for 6000 into bert_story_cloze/model.ckpt.


I0529 15:10:10.018418 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 6000 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04671


I0529 15:10:21.859867 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04671


INFO:tensorflow:loss = 0.31247887, step = 6000 (95.539 sec)


I0529 15:10:21.866680 140384064227200 basic_session_run_hooks.py:247] loss = 0.31247887, step = 6000 (95.539 sec)


INFO:tensorflow:global_step/sec: 1.17988


I0529 15:11:46.614525 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17988


INFO:tensorflow:loss = 0.21861848, step = 6100 (84.753 sec)


I0529 15:11:46.619372 140384064227200 basic_session_run_hooks.py:247] loss = 0.21861848, step = 6100 (84.753 sec)


INFO:tensorflow:global_step/sec: 1.18332


I0529 15:13:11.122543 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18332


INFO:tensorflow:loss = 0.35126114, step = 6200 (84.506 sec)


I0529 15:13:11.125218 140384064227200 basic_session_run_hooks.py:247] loss = 0.35126114, step = 6200 (84.506 sec)


INFO:tensorflow:global_step/sec: 1.18168


I0529 15:14:35.747796 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18168


INFO:tensorflow:loss = 0.18864006, step = 6300 (84.629 sec)


I0529 15:14:35.753829 140384064227200 basic_session_run_hooks.py:247] loss = 0.18864006, step = 6300 (84.629 sec)


INFO:tensorflow:global_step/sec: 1.18466


I0529 15:16:00.159907 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18466


INFO:tensorflow:loss = 0.116259605, step = 6400 (84.412 sec)


I0529 15:16:00.166044 140384064227200 basic_session_run_hooks.py:247] loss = 0.116259605, step = 6400 (84.412 sec)


INFO:tensorflow:Saving checkpoints for 6500 into bert_story_cloze/model.ckpt.


I0529 15:17:23.874912 140384064227200 basic_session_run_hooks.py:594] Saving checkpoints for 6500 into bert_story_cloze/model.ckpt.


INFO:tensorflow:global_step/sec: 1.04692


I0529 15:17:35.677900 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.04692


INFO:tensorflow:loss = 0.47800213, step = 6500 (95.515 sec)


I0529 15:17:35.680627 140384064227200 basic_session_run_hooks.py:247] loss = 0.47800213, step = 6500 (95.515 sec)


INFO:tensorflow:global_step/sec: 1.17993


I0529 15:19:00.428674 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.17993


INFO:tensorflow:loss = 0.2803661, step = 6600 (84.752 sec)


I0529 15:19:00.432731 140384064227200 basic_session_run_hooks.py:247] loss = 0.2803661, step = 6600 (84.752 sec)


INFO:tensorflow:global_step/sec: 1.18242


I0529 15:20:25.001225 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18242


INFO:tensorflow:loss = 0.33243307, step = 6700 (84.577 sec)


I0529 15:20:25.009683 140384064227200 basic_session_run_hooks.py:247] loss = 0.33243307, step = 6700 (84.577 sec)


INFO:tensorflow:global_step/sec: 1.18459


I0529 15:21:49.418431 140384064227200 basic_session_run_hooks.py:680] global_step/sec: 1.18459


INFO:tensorflow:loss = 0.25222337, step = 6800 (84.411 sec)


I0529 15:21:49.420975 140384064227200 basic_session_run_hooks.py:247] loss = 0.25222337, step = 6800 (84.411 sec)


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-12T21:04:20Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://bert-tfhub/aclImdb_v1/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-12-21:06:05
INFO:tensorflow:Saving dict for global step 468: auc = 0.86659324, eval_accuracy = 0.8664, f1_score = 0.8659711, false_negatives = 375.0, false_positives = 293.0, global_step = 468, loss = 0.51870537, precision = 0.880457, recall = 0.8519542, true_negatives = 2174.0, true_positives = 2158.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://bert-tfhub/aclImdb_v1/model.ckpt-468


{'auc': 0.86659324,
 'eval_accuracy': 0.8664,
 'f1_score': 0.8659711,
 'false_negatives': 375.0,
 'false_positives': 293.0,
 'global_step': 468,
 'loss': 0.51870537,
 'precision': 0.880457,
 'recall': 0.8519542,
 'true_negatives': 2174.0,
 'true_positives': 2158.0}

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences] # here, "" is just a dummy label
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Voila! We have a sentiment classifier!

In [0]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]