<a href="https://colab.research.google.com/github/graulef/bert/blob/master/Predicting_Story_Cloze_with_BERT_usc_nlp_nn_only.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Story Cloze task with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

In [2]:
!pip list | grep tensorflow
!python --version

mesh-tensorflow          0.0.5                
tensorflow               1.13.1               
tensorflow-estimator     1.13.0               
tensorflow-hub           0.4.0                
tensorflow-metadata      0.13.0               
tensorflow-probability   0.6.0                
Python 3.6.7


In [3]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

import os
cwd = os.getcwd()
print(cwd)

W0603 23:17:29.318079 140229920782208 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


/content


In addition to the standard libraries we imported above, we'll need to install BERT's python package.

In [4]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if there are any).
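
The cell below only sets the two parameters, so here is a minimal sketch of what acting on DO_DELETE could look like for a local directory. The helper name `prepare_output_dir` is hypothetical and not part of this notebook or the BERT package; a GCP bucket would need a different file API than the standard-library calls used here.

```python
import os
import shutil

def prepare_output_dir(output_dir, do_delete=False):
    """Create output_dir, optionally wiping an existing one first (local paths only)."""
    if do_delete and os.path.isdir(output_dir):
        # Remove old checkpoints so training starts from scratch.
        shutil.rmtree(output_dir)
    # exist_ok=True leaves existing checkpoints in place when do_delete is False.
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

With this sketch, `prepare_output_dir(OUTPUT_DIR, DO_DELETE)` would be called once before training starts.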

In [6]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_story_cloze_usc_nlp'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_story_cloze_usc_nlp *****


#Data

In [0]:
from tensorflow import keras
import os
import re
import csv

PATH_EVAL_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv"
PATH_SENT_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv"
PATH_RAND_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv"
PATH_USC_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_usc_combined.csv"
PATH_USC_NLP_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/06/train_stories_nearest_story_usc_with_nlp_features_combined.csv"
#PATH_EVAL_DATA = "glue_data/StoryCloze/cloze_test_val_spring2016.csv"
#PATH_RAND_NN_DATA = "glue_data/StoryCloze/train_stories_rand_combined.csv"
#PATH_SENT_NN_DATA = "glue_data/StoryCloze/train_stories_nearest_story_sent2vec_combined.csv"

# Load one Story Cloze CSV file into a DataFrame with two rows per story
# (one row per candidate ending).
def load_data(path):
  data_1 = {}
  data_1["label"] = []
  data_1["id_1"] = []
  data_1["id_2"] = []
  data_1["context"] = []
  data_1["ending"] = []
  
  data_2 = {}
  data_2["label"] = []
  data_2["id_1"] = []
  data_2["id_2"] = []
  data_2["context"] = []
  data_2["ending"] = []
  
  print(path)
  with open(path) as f:
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
      if line_count == 0:
        #print("Columns = " + str(row))
        line_count += 1
      else:
        line_count += 1
        
        # Create two lines from one in order to have same label layout as 
        # MRPC task
        separator = ' '
        data_1["id_1"].append(row[0])
        data_1["id_2"].append(row[0] + "_end_bli")
        data_1["context"].append(str(separator.join(row[1:5])))
        
        data_2["id_1"].append(row[0])
        data_2["id_2"].append(row[0] + "_end_bla")
        data_2["context"].append(str(separator.join(row[1:5])))
        
        if row[7] == "1": # First ending is the correct one
          data_1["ending"].append(row[5])
          data_1["label"].append(1)
          data_2["ending"].append(row[6])
          data_2["label"].append(0)
        else: # Second ending is the correct one
          data_1["ending"].append(row[6])
          data_1["label"].append(1)
          data_2["ending"].append(row[5])
          data_2["label"].append(0) 
          
    data_df_1 = pd.DataFrame.from_dict(data_1)
    data_df_2 = pd.DataFrame.from_dict(data_2)
    data = pd.concat([data_df_1, data_df_2])      
    return data     

# Shuffle the validation data and split it: 30% for testing, 70% for training.
def load_validation_only(eval_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_split = 0.3
    eval_num = int(total_eval * eval_split)
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    test_df = eval_data_df.iloc[:eval_num, :]
    train_df = eval_data_df.iloc[eval_num:, :]
    return train_df, test_df

def load_augmented(eval_file, random_nn_file, sent_nn_file):
    # Note: random_nn_file is accepted but not used in this configuration.
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    # eval_split is the fraction of the validation data kept out of the test set.
    # With 0, the entire validation set is used for testing.
    eval_split = 0
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    #train_df = eval_data_df.iloc[:int(total_eval * eval_split), :]
    test_df = eval_data_df.iloc[int(total_eval * eval_split):, :]

    # The training data comes entirely from the augmented nearest-neighbour file.
    usc_nn_df = load_data(sent_nn_file)
    usc_nn_df = usc_nn_df.sample(frac=1).reset_index(drop=True)
    total_usc_nn = usc_nn_df.shape[0]
    # usc_nn_split is the fraction of the augmented data used for training.
    usc_nn_split = 1
    train_df = usc_nn_df.iloc[:int(total_usc_nn * usc_nn_split), :].reset_index(drop=True)

    return train_df, test_df

# Download and process the dataset files.
def download_and_load_eval_datasets(force_download=False):
  validation = tf.keras.utils.get_file(
      fname="validation", 
      origin=PATH_EVAL_DATA)
  random_nn = tf.keras.utils.get_file(
    fname="rand_nn", 
    origin=PATH_RAND_NN_DATA)
  sent_nn = tf.keras.utils.get_file(
    fname="sent_nn", 
    origin=PATH_USC_NLP_NN_DATA)

  #train_df, test_df = load_validation_only(validation)
  train_df, test_df = load_augmented(validation, random_nn, sent_nn)
  
  return train_df, test_df


In [8]:
train, test = download_and_load_eval_datasets()

print("\nTrain data")
print(train.shape)
for i in range(5):
  print(train.iloc[i]['label'])
  print(train.iloc[i]['context'])
  print(train.iloc[i]['ending'])

print("\nTest data")
print(test.shape)
for i in range(5):
  print(test.iloc[i]['label'])
  print(test.iloc[i]['context'])
  print(test.iloc[i]['ending'])

Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/06/train_stories_nearest_story_usc_with_nlp_features_combined.csv
/root/.keras/datasets/validation
/root/.keras/datasets/sent_nn

Train data
(176322, 5)
0
Maria was still mad at Jake for cancelling their date yesterday. He tried to call her in the afternoon to see how she's doing. She rejected all of his phone calls. He also drove to her house.
She left his house and went home to mend her broken heart.
1
John was the only boy in his dance class. He wasn't sure he would get along with all of the girls. The teacher said it was time to find a partner. All of the girls wanted John as a partner.
John liked being the only boy in his dance class.
1
The big day had finally come, Sam was going to be married. After getting re

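Quick check of whether the train and test sets are disjoint. Comparing every pair takes a very long time, so the cell below only checks the endings of the first 10 training examples against every test ending.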
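
If a check over the full data is wanted, a set intersection avoids the nested loop. This is a sketch rather than part of the original notebook: `find_shared_endings` is a hypothetical helper, and it assumes that "equal" means an exact string match, as in the loop in the next cell.

```python
def find_shared_endings(train_endings, test_endings):
    """Return the endings that occur in both collections (exact string match)."""
    # Set membership is O(1) on average, so the whole check is linear
    # in the total number of endings instead of quadratic.
    return set(train_endings) & set(test_endings)
```

It could be called as `find_shared_endings(train['ending'], test['ending'])`; an empty set would mean no ending appears in both splits.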


In [0]:
train.shape, test.shape
for j in range(10):
    query = train.iloc[j]['ending']
    for i in range(test.shape[0]):
      tmp = test.iloc[i]['ending']
      if tmp == query:
        print("Found something equal")
        print(tmp)

For us, the input data are the 'context' and 'ending' columns, and the label is the 'label' column (0 for negative and 1 for positive).
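
To make the label layout concrete, here is a small sketch of how one CSV row turns into a positive and a negative example, mirroring what `load_data` does above. The helper `row_to_pairs` and the example row are made up for illustration; the column layout is the one `load_data` assumes.

```python
def row_to_pairs(row):
    """Turn one CSV row into a (positive, negative) pair of examples.

    Assumed layout, as in load_data: row[0] is the story id, row[1:5] are the
    four context sentences, row[5] and row[6] are the two candidate endings,
    and row[7] is "1" when the first ending is the correct one.
    """
    context = " ".join(row[1:5])
    if row[7] == "1":
        right, wrong = row[5], row[6]
    else:
        right, wrong = row[6], row[5]
    positive = {"context": context, "ending": right, "label": 1}
    negative = {"context": context, "ending": wrong, "label": 0}
    return positive, negative

# Hypothetical row, not taken from the dataset.
example_row = ["story-1", "S1.", "S2.", "S3.", "S4.", "Good end.", "Bad end.", "1"]
for pair in row_to_pairs(example_row):
    print(pair["label"], pair["ending"])
```

Every story therefore contributes exactly one example with label 1 and one with label 0, which is why the classes are balanced.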

In [0]:
CONTEXT_COLUMN = 'context'
ENDING_COLUMN = 'ending'
LABEL_COLUMN = 'label'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the first text segment. For us, this is the context of the story (the `context` column of our DataFrame).
- `text_b` is used if we're training a model to understand the relationship between two texts (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This is the ending in our case.
- `label` is the label for our example, i.e. 0 or 1.

In [11]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(train_InputExamples.shape)
test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(test_InputExamples.shape)

(176322,)
(3742,)


Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
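
For readers who do want to see the details, steps 5 and 6 can be sketched in plain Python. This is a simplified illustration, not the library's code: `build_bert_inputs` is a hypothetical helper, it works on already-tokenized input, and it skips the mapping from tokens to vocabulary indexes.

```python
def build_bert_inputs(tokens_a, tokens_b, max_seq_length):
    """Assemble tokens, input_mask and segment_ids for a text pair.

    Assumes max_seq_length is at least 3 (room for [CLS] and two [SEP]).
    """
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    # Reserve three positions for the special tokens and trim the
    # longer of the two sequences one token at a time.
    while len(tokens_a) + len(tokens_b) > max_seq_length - 3:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], text_a and the first [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    # The mask is 1 for real tokens and 0 for padding positions.
    input_mask = [1] * len(tokens)

    # Pad the mask and segment ids with zeros up to the fixed length.
    padding = max_seq_length - len(tokens)
    segment_ids += [0] * padding
    input_mask += [0] * padding
    return tokens, input_mask, segment_ids
```

The same pattern is visible in the log output of the conversion step further down: `segment_ids` switches from 0 to 1 right after the first `[SEP]`, and both `input_mask` and `segment_ids` are zero-padded to the maximum sequence length.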




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [15]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [16]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']
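
The split of "tokenizer" into `token` and `##izer` comes from a greedy longest-match-first lookup against the vocabulary. The following is a rough sketch of that idea with a tiny made-up vocabulary; `wordpiece_tokenize` is a hypothetical helper, and the real tokenizer also handles lowercasing, punctuation and a much larger vocab file.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split a single word into word pieces by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces

# Toy vocabulary, assumed for illustration only.
toy_vocab = {"token", "##izer", "call", "##ing"}
print(wordpiece_tokenize("tokenizer", toy_vocab))
print(wordpiece_tokenize("calling", toy_vocab))
```

Because unknown words fall back to pieces rather than a single out-of-vocabulary token, rare words such as "bridesmaids" still get a usable representation.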

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [17]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 176322


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] maria was still mad at jake for cancel ##ling their date yesterday . he tried to call her in the afternoon to see how she ' s doing . she rejected all of his phone calls . he also drove to her house . [SEP] she left his house and went home to men ##d her broken heart . [SEP]


INFO:tensorflow:input_ids: 101 3814 2001 2145 5506 2012 5180 2005 17542 2989 2037 3058 7483 1012 2002 2699 2000 2655 2014 1999 1996 5027 2000 2156 2129 2016 1005 1055 2725 1012 2016 5837 2035 1997 2010 3042 4455 1012 2002 2036 5225 2000 2014 2160 1012 102 2016 2187 2010 2160 1998 2253 2188 2000 2273 2094 2014 3714 2540 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] john was the only boy in his dance class . he wasn ' t sure he would get along with all of the girls . the teacher said it was time to find a partner . all of the girls wanted john as a partner . [SEP] john liked being the only boy in his dance class . [SEP]


INFO:tensorflow:input_ids: 101 2198 2001 1996 2069 2879 1999 2010 3153 2465 1012 2002 2347 1005 1056 2469 2002 2052 2131 2247 2007 2035 1997 1996 3057 1012 1996 3836 2056 2009 2001 2051 2000 2424 1037 4256 1012 2035 1997 1996 3057 2359 2198 2004 1037 4256 1012 102 2198 4669 2108 1996 2069 2879 1999 2010 3153 2465 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] the big day had finally come , sam was going to be married . after getting ready , her bride ##sma ##ids met her in the hotel lobby . waiting , the limo was over an hour late . after a few calls , the limo had been parked just out of sight . [SEP] after entering the limo , sam screamed at the driver for being late . [SEP]


INFO:tensorflow:input_ids: 101 1996 2502 2154 2018 2633 2272 1010 3520 2001 2183 2000 2022 2496 1012 2044 2893 3201 1010 2014 8959 26212 9821 2777 2014 1999 1996 3309 9568 1012 3403 1010 1996 23338 2001 2058 2019 3178 2397 1012 2044 1037 2261 4455 1010 1996 23338 2018 2042 9083 2074 2041 1997 4356 1012 102 2044 5738 1996 23338 1010 3520 7210 2012 1996 4062 2005 2108 2397 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] lauren decides to have friends over to eat br ##un ##ch . she invites a few of her friends to her home . together they eat eggs , bacon , and fruit . it is a great br ##un ##ch . [SEP] lauren is happy she had some friends over for br ##un ##ch . [SEP]


INFO:tensorflow:input_ids: 101 10294 7288 2000 2031 2814 2058 2000 4521 7987 4609 2818 1012 2016 18675 1037 2261 1997 2014 2814 2000 2014 2188 1012 2362 2027 4521 6763 1010 11611 1010 1998 5909 1012 2009 2003 1037 2307 7987 4609 2818 1012 102 10294 2003 3407 2016 2018 2070 2814 2058 2005 7987 4609 2818 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] one day , a sailor left new york to go to london . three hours after leaving , a storm arose suddenly . the sailor ' s boat nearly caps ##ized as he fought the winds . suddenly , the sailor had no clue where he was and drifted for days . [SEP] this made him seas ##ick and he up ##chu ##cked over the side of the boat . [SEP]


INFO:tensorflow:input_ids: 101 2028 2154 1010 1037 11803 2187 2047 2259 2000 2175 2000 2414 1012 2093 2847 2044 2975 1010 1037 4040 10375 3402 1012 1996 11803 1005 1055 4049 3053 9700 3550 2004 2002 4061 1996 7266 1012 3402 1010 1996 11803 2018 2053 9789 2073 2002 2001 1998 10070 2005 2420 1012 102 2023 2081 2032 11915 6799 1998 2002 2039 20760 18141 2058 1996 2217 1997 1996 4049 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 10000 of 176322


INFO:tensorflow:Writing example 20000 of 176322


INFO:tensorflow:Writing example 30000 of 176322


INFO:tensorflow:Writing example 40000 of 176322


INFO:tensorflow:Writing example 50000 of 176322


INFO:tensorflow:Writing example 60000 of 176322


INFO:tensorflow:Writing example 70000 of 176322


INFO:tensorflow:Writing example 80000 of 176322


INFO:tensorflow:Writing example 90000 of 176322


INFO:tensorflow:Writing example 100000 of 176322


INFO:tensorflow:Writing example 110000 of 176322


INFO:tensorflow:Writing example 120000 of 176322


INFO:tensorflow:Writing example 130000 of 176322


INFO:tensorflow:Writing example 140000 of 176322


INFO:tensorflow:Writing example 150000 of 176322


INFO:tensorflow:Writing example 160000 of 176322


INFO:tensorflow:Writing example 170000 of 176322


INFO:tensorflow:Writing example 0 of 3742


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] kay was moving back in with her mom . she sadly packed her things and drove to her mom ' s house . her mother helped her un ##pack her car . there was no enough room for kay ' s things . [SEP] kay had to put some of her things in storage . [SEP]


INFO:tensorflow:input_ids: 101 10905 2001 3048 2067 1999 2007 2014 3566 1012 2016 13718 8966 2014 2477 1998 5225 2000 2014 3566 1005 1055 2160 1012 2014 2388 3271 2014 4895 23947 2014 2482 1012 2045 2001 2053 2438 2282 2005 10905 1005 1055 2477 1012 102 10905 2018 2000 2404 2070 1997 2014 2477 1999 5527 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] kelly was reading about hybrid animals online . and she saw pictures of li ##gers . she was impressed by their size and strength . but she wondered if they could live in the wild . [SEP] kelly remembered when her grandmother showed her the flowers . [SEP]


INFO:tensorflow:input_ids: 101 5163 2001 3752 2055 8893 4176 3784 1012 1998 2016 2387 4620 1997 5622 15776 1012 2016 2001 7622 2011 2037 2946 1998 3997 1012 2021 2016 4999 2065 2027 2071 2444 1999 1996 3748 1012 102 5163 4622 2043 2014 7133 3662 2014 1996 4870 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] ni ##la ' s mom was diagnosed with stage four cancer . ni ##la knew her mom did not have a long time to live . ni ##la made sure to provide her mom with a good life until her death . ni ##la foster ##ed many good memories with her mom and they were happy . [SEP] ni ##la loved her mom . [SEP]


INFO:tensorflow:input_ids: 101 9152 2721 1005 1055 3566 2001 11441 2007 2754 2176 4456 1012 9152 2721 2354 2014 3566 2106 2025 2031 1037 2146 2051 2000 2444 1012 9152 2721 2081 2469 2000 3073 2014 3566 2007 1037 2204 2166 2127 2014 2331 1012 9152 2721 6469 2098 2116 2204 5758 2007 2014 3566 1998 2027 2020 3407 1012 102 9152 2721 3866 2014 3566 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] kay ##lee always wanted a puppy . on her birthday her parents took her to a farm . there were lots of bea ##gle pup ##pies there . her parents told her she could pick a puppy for her birthday . [SEP] kay ##lee was thrilled ! [SEP]


INFO:tensorflow:input_ids: 101 10905 10559 2467 2359 1037 17022 1012 2006 2014 5798 2014 3008 2165 2014 2000 1037 3888 1012 2045 2020 7167 1997 26892 9354 26781 13046 2045 1012 2014 3008 2409 2014 2016 2071 4060 1037 17022 2005 2014 5798 1012 102 10905 10559 2001 16082 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] randy ' s friend gave him directions to his house . he was supposed to take a right after the white house . randy continued to get lost . he back ##tra ##cked continuously until he finally found it . [SEP] randy saw the house but kept on driving and didn ' t come back . [SEP]


INFO:tensorflow:input_ids: 101 9744 1005 1055 2767 2435 2032 7826 2000 2010 2160 1012 2002 2001 4011 2000 2202 1037 2157 2044 1996 2317 2160 1012 9744 2506 2000 2131 2439 1012 2002 2067 6494 18141 10843 2127 2002 2633 2179 2009 1012 102 9744 2387 1996 2160 2021 2921 2006 4439 1998 2134 1005 1056 2272 2067 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our Story Cloze task (i.e. classifying whether a candidate ending correctly completes a story). This strategy of starting from a pre-trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sequence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the Story Cloze data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
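
To make the math in the new layer concrete, here is a minimal NumPy sketch of the same head: a linear layer on top of the pooled output, a log-softmax, and a one-hot cross-entropy loss. It is an illustration only, not the code the notebook runs. The function name `classification_head` and the random stand-in for BERT's `pooled_output` are assumptions made for this example, and dropout is left out.

```python
import numpy as np


def classification_head(pooled, weights, bias, labels):
    """NumPy mirror of the head in create_model (dropout omitted)."""
    # Linear layer: [batch, hidden] x [hidden, num_labels] + [num_labels]
    logits = pooled @ weights.T + bias
    # Numerically stable log-softmax over the label axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # One-hot cross-entropy, averaged over the batch
    one_hot = np.eye(weights.shape[0])[labels]
    per_example_loss = -(one_hot * log_probs).sum(axis=-1)
    predicted = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted, log_probs


# Hypothetical inputs: 4 examples, hidden size 8, 2 labels.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 8))               # stand-in for pooled_output
weights = rng.normal(scale=0.02, size=(2, 8))  # small random init, as above
bias = np.zeros(2)
labels = np.array([1, 0, 1, 1])

loss, predicted, log_probs = classification_head(pooled, weights, bias, labels)
print(loss, predicted)
```

With all-zero weights both labels get probability 0.5 and the loss is ln 2 ≈ 0.693, which is close to the first loss logged during training (0.7044).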


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
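
For reference, here is a small plain-Python sketch of how the count-based metrics above relate to each other. The helper `binary_metrics` is hypothetical and only illustrates the textbook definitions. The TensorFlow metric ops are streaming and accumulate across batches, and as far as I know `tf.contrib.metrics.f1_score` searches over thresholds, so its value may not match the simple F1 below exactly.

```python
def binary_metrics(labels, predictions):
    """Accuracy, precision, recall and F1 from 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    # Guard against empty denominators
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Made-up labels and predictions, just to exercise the function.
metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(metrics)
```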


In [0]:
# Fine-tuning hyperparameters.
# These values are copied from this Colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
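
To give a feel for what warmup does, here is a rough sketch of the schedule as I understand it from `bert.optimization.create_optimizer`: the learning rate rises linearly during warmup and then decays linearly to zero. The function `learning_rate_at` is a hypothetical helper written for this illustration, and the step counts in the usage lines are example values, not values read from the optimizer.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Approximate learning rate at a given global step (assumed schedule)."""
    if step < num_warmup_steps:
        # Linear warmup from 0 up to init_lr
        return init_lr * step / num_warmup_steps
    # Linear decay from init_lr at step 0 down to 0 at num_train_steps
    progress = min(step, num_train_steps) / num_train_steps
    return init_lr * (1.0 - progress)


# Example values only: 16530 training steps, 10% of them warmup.
for step in (0, 800, 1653, 8265, 16530):
    print(step, learning_rate_at(step, 2e-5, 16530, 1653))
```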

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
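
As a worked example of this arithmetic, the sketch below plugs in 176,322 training examples. That count is taken from the "Writing example … of 176322" log lines above and is assumed here to equal `len(train_features)`. The helper `compute_steps` is hypothetical and just repeats the two formulas.

```python
def compute_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    """Same formulas as the cell above, wrapped in a function."""
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps


# 176322 examples (assumed), batch size 32, 3 epochs, 10% warmup.
train_steps, warmup_steps = compute_steps(176322, 32, 3.0, 0.1)
print(train_steps, warmup_steps)
```

So under that assumption one epoch is about 5,510 steps, and the learning rate warms up over roughly the first 1,653 steps.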

In [0]:
# Specify the output directory and how often to save checkpoints and summaries
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [23]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert_story_cloze_usc_nlp', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f8958e8ac50>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we use an input builder function to turn our training feature set (`train_features`) into an input function, which the Estimator calls to get batches of examples. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
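
To illustrate what `is_training` and `drop_remainder` control, here is a plain-Python sketch of batching. It is not how `input_fn_builder` is implemented (as far as I know the real input function builds a `tf.data.Dataset` and also repeats the data when training); the generator `batches` and its `seed` parameter are hypothetical.

```python
import random


def batches(features, batch_size, is_training, drop_remainder, seed=0):
    """Yield one pass of batches over `features`."""
    order = list(features)
    if is_training:
        # Shuffle so consecutive batches are not in dataset order
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            # Skip the final short batch (TPUs need fixed batch shapes)
            break
        yield batch


# Ten toy "features", batch size 4, no shuffling.
kept = list(batches(range(10), 4, is_training=False, drop_remainder=False))
dropped = list(batches(range(10), 4, is_training=False, drop_remainder=True))
print(kept)
print(dropped)
```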

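Now we train our model! With 176,322 training examples, a batch size of 32, and 3 epochs, this run has 16,530 training steps. At the roughly 1.15 steps per second that the log below reports on a Colab GPU, expect a full run to take about four hours.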
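
A rough way to sanity-check a training-time estimate is to divide the number of training steps by the throughput TensorFlow logs as `global_step/sec`. The sketch below does that with two assumed inputs: 16,530 steps (176,322 examples at batch size 32 for 3 epochs) and about 1.15 steps per second, the typical value in this run's log. The helper `estimated_training_time` is hypothetical.

```python
from datetime import timedelta


def estimated_training_time(num_train_steps, steps_per_second):
    """Estimate wall-clock training time from step count and throughput."""
    return timedelta(seconds=num_train_steps / steps_per_second)


# Assumed inputs taken from this run's configuration and log output.
estimate = estimated_training_time(num_train_steps=16530, steps_per_second=1.15)
print("Estimated training time:", estimate)
```

The estimate ignores checkpoint saving, which the log shows slowing things down slightly every 500 steps.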

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


W0603 23:25:29.830688 140229920782208 deprecation.py:506] From <ipython-input-18-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0603 23:25:29.874909 140229920782208 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0603 23:25:29.951170 140229920782208 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


W0603 23:25:38.169581 140229920782208 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:loss = 0.7044437, step = 0


INFO:tensorflow:global_step/sec: 1.01692


INFO:tensorflow:loss = 0.55289143, step = 101 (98.342 sec)


INFO:tensorflow:global_step/sec: 1.14506


INFO:tensorflow:loss = 0.35184205, step = 200 (87.327 sec)


INFO:tensorflow:global_step/sec: 1.14646


INFO:tensorflow:loss = 0.42252517, step = 300 (87.226 sec)


INFO:tensorflow:global_step/sec: 1.14751


INFO:tensorflow:loss = 0.269329, step = 400 (87.145 sec)


INFO:tensorflow:Saving checkpoints for 500 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02141


INFO:tensorflow:loss = 0.5897054, step = 500 (97.904 sec)


INFO:tensorflow:global_step/sec: 1.14639


INFO:tensorflow:loss = 0.3109292, step = 600 (87.230 sec)


INFO:tensorflow:global_step/sec: 1.14923


INFO:tensorflow:loss = 0.3734576, step = 700 (87.020 sec)


INFO:tensorflow:global_step/sec: 1.15147


INFO:tensorflow:loss = 0.43055096, step = 800 (86.843 sec)


INFO:tensorflow:global_step/sec: 1.148


INFO:tensorflow:loss = 0.26735997, step = 900 (87.106 sec)


INFO:tensorflow:Saving checkpoints for 1000 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.01622


INFO:tensorflow:loss = 0.4388312, step = 1000 (98.407 sec)


INFO:tensorflow:global_step/sec: 1.14753


I0603 23:44:02.985359 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14753


INFO:tensorflow:loss = 0.3897897, step = 1100 (87.141 sec)


I0603 23:44:02.987717 140229920782208 basic_session_run_hooks.py:247] loss = 0.3897897, step = 1100 (87.141 sec)


INFO:tensorflow:global_step/sec: 1.14974


I0603 23:45:29.961405 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14974


INFO:tensorflow:loss = 0.45484707, step = 1200 (86.977 sec)


I0603 23:45:29.964778 140229920782208 basic_session_run_hooks.py:247] loss = 0.45484707, step = 1200 (86.977 sec)


INFO:tensorflow:global_step/sec: 1.15543


I0603 23:46:56.509646 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15543


INFO:tensorflow:loss = 0.37804532, step = 1300 (86.549 sec)


I0603 23:46:56.513918 140229920782208 basic_session_run_hooks.py:247] loss = 0.37804532, step = 1300 (86.549 sec)


INFO:tensorflow:global_step/sec: 1.15147


I0603 23:48:23.354881 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15147


INFO:tensorflow:loss = 0.32833454, step = 1400 (86.845 sec)


I0603 23:48:23.359032 140229920782208 basic_session_run_hooks.py:247] loss = 0.32833454, step = 1400 (86.845 sec)


INFO:tensorflow:Saving checkpoints for 1500 into bert_story_cloze_usc_nlp/model.ckpt.


I0603 23:49:49.367968 140229920782208 basic_session_run_hooks.py:594] Saving checkpoints for 1500 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02225


I0603 23:50:01.178151 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.02225


INFO:tensorflow:loss = 0.30923972, step = 1500 (97.828 sec)


I0603 23:50:01.186716 140229920782208 basic_session_run_hooks.py:247] loss = 0.30923972, step = 1500 (97.828 sec)


INFO:tensorflow:global_step/sec: 1.14903


I0603 23:51:28.207974 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14903


INFO:tensorflow:loss = 0.18754536, step = 1600 (87.024 sec)


I0603 23:51:28.210527 140229920782208 basic_session_run_hooks.py:247] loss = 0.18754536, step = 1600 (87.024 sec)


INFO:tensorflow:global_step/sec: 1.15085


I0603 23:52:55.100059 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15085


INFO:tensorflow:loss = 0.21218066, step = 1700 (86.894 sec)


I0603 23:52:55.104107 140229920782208 basic_session_run_hooks.py:247] loss = 0.21218066, step = 1700 (86.894 sec)


INFO:tensorflow:global_step/sec: 1.15226


I0603 23:54:21.886346 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15226


INFO:tensorflow:loss = 0.3272484, step = 1800 (86.788 sec)


I0603 23:54:21.892406 140229920782208 basic_session_run_hooks.py:247] loss = 0.3272484, step = 1800 (86.788 sec)


INFO:tensorflow:global_step/sec: 1.15444


I0603 23:55:48.508448 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15444


INFO:tensorflow:loss = 0.30445635, step = 1900 (86.622 sec)


I0603 23:55:48.514602 140229920782208 basic_session_run_hooks.py:247] loss = 0.30445635, step = 1900 (86.622 sec)


INFO:tensorflow:Saving checkpoints for 2000 into bert_story_cloze_usc_nlp/model.ckpt.


I0603 23:57:14.285772 140229920782208 basic_session_run_hooks.py:594] Saving checkpoints for 2000 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.01842


I0603 23:57:26.699633 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.01842


INFO:tensorflow:loss = 0.26378748, step = 2000 (98.189 sec)


I0603 23:57:26.704007 140229920782208 basic_session_run_hooks.py:247] loss = 0.26378748, step = 2000 (98.189 sec)


INFO:tensorflow:global_step/sec: 1.14717


I0603 23:58:53.870664 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14717


INFO:tensorflow:loss = 0.26358238, step = 2100 (87.170 sec)


I0603 23:58:53.873670 140229920782208 basic_session_run_hooks.py:247] loss = 0.26358238, step = 2100 (87.170 sec)


INFO:tensorflow:global_step/sec: 1.15059


I0604 00:00:20.782702 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15059


INFO:tensorflow:loss = 0.28967175, step = 2200 (86.911 sec)


I0604 00:00:20.784870 140229920782208 basic_session_run_hooks.py:247] loss = 0.28967175, step = 2200 (86.911 sec)


INFO:tensorflow:global_step/sec: 1.15362


I0604 00:01:47.466555 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15362


INFO:tensorflow:loss = 0.27771923, step = 2300 (86.684 sec)


I0604 00:01:47.468744 140229920782208 basic_session_run_hooks.py:247] loss = 0.27771923, step = 2300 (86.684 sec)


INFO:tensorflow:global_step/sec: 1.15196


I0604 00:03:14.274761 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15196


INFO:tensorflow:loss = 0.31198627, step = 2400 (86.810 sec)


I0604 00:03:14.278968 140229920782208 basic_session_run_hooks.py:247] loss = 0.31198627, step = 2400 (86.810 sec)


INFO:tensorflow:Saving checkpoints for 2500 into bert_story_cloze_usc_nlp/model.ckpt.


I0604 00:04:40.258496 140229920782208 basic_session_run_hooks.py:594] Saving checkpoints for 2500 into bert_story_cloze_usc_nlp/model.ckpt.


W0604 00:04:48.303866 140229920782208 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.


INFO:tensorflow:global_step/sec: 1.01917


I0604 00:04:52.393802 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.01917


INFO:tensorflow:loss = 0.2813922, step = 2500 (98.118 sec)


I0604 00:04:52.397198 140229920782208 basic_session_run_hooks.py:247] loss = 0.2813922, step = 2500 (98.118 sec)


INFO:tensorflow:global_step/sec: 1.14679


I0604 00:06:19.593719 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14679


INFO:tensorflow:loss = 0.33376563, step = 2600 (87.202 sec)


I0604 00:06:19.598768 140229920782208 basic_session_run_hooks.py:247] loss = 0.33376563, step = 2600 (87.202 sec)


INFO:tensorflow:global_step/sec: 1.15105


I0604 00:07:46.470574 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15105


INFO:tensorflow:loss = 0.2877571, step = 2700 (86.876 sec)


I0604 00:07:46.474534 140229920782208 basic_session_run_hooks.py:247] loss = 0.2877571, step = 2700 (86.876 sec)


INFO:tensorflow:global_step/sec: 1.15505


I0604 00:09:13.047120 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15505


INFO:tensorflow:loss = 0.48978725, step = 2800 (86.577 sec)


I0604 00:09:13.052006 140229920782208 basic_session_run_hooks.py:247] loss = 0.48978725, step = 2800 (86.577 sec)


INFO:tensorflow:global_step/sec: 1.15174


I0604 00:10:39.872496 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15174


INFO:tensorflow:loss = 0.48326728, step = 2900 (86.823 sec)


I0604 00:10:39.874533 140229920782208 basic_session_run_hooks.py:247] loss = 0.48326728, step = 2900 (86.823 sec)


INFO:tensorflow:Saving checkpoints for 3000 into bert_story_cloze_usc_nlp/model.ckpt.


I0604 00:12:05.787767 140229920782208 basic_session_run_hooks.py:594] Saving checkpoints for 3000 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.01852


I0604 00:12:18.054246 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.01852


INFO:tensorflow:loss = 0.27979583, step = 3000 (98.188 sec)


I0604 00:12:18.062445 140229920782208 basic_session_run_hooks.py:247] loss = 0.27979583, step = 3000 (98.188 sec)


INFO:tensorflow:global_step/sec: 1.14807


I0604 00:13:45.157176 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14807


INFO:tensorflow:loss = 0.27832466, step = 3100 (87.102 sec)


I0604 00:13:45.164247 140229920782208 basic_session_run_hooks.py:247] loss = 0.27832466, step = 3100 (87.102 sec)


INFO:tensorflow:global_step/sec: 1.15147


I0604 00:15:12.002531 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15147


INFO:tensorflow:loss = 0.5693831, step = 3200 (86.840 sec)


I0604 00:15:12.004523 140229920782208 basic_session_run_hooks.py:247] loss = 0.5693831, step = 3200 (86.840 sec)


INFO:tensorflow:global_step/sec: 1.15407


I0604 00:16:38.652509 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15407


INFO:tensorflow:loss = 0.32295296, step = 3300 (86.650 sec)


I0604 00:16:38.654522 140229920782208 basic_session_run_hooks.py:247] loss = 0.32295296, step = 3300 (86.650 sec)


INFO:tensorflow:global_step/sec: 1.15176


I0604 00:18:05.475974 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15176


INFO:tensorflow:loss = 0.28968143, step = 3400 (86.826 sec)


I0604 00:18:05.481007 140229920782208 basic_session_run_hooks.py:247] loss = 0.28968143, step = 3400 (86.826 sec)


INFO:tensorflow:Saving checkpoints for 3500 into bert_story_cloze_usc_nlp/model.ckpt.


I0604 00:19:31.466199 140229920782208 basic_session_run_hooks.py:594] Saving checkpoints for 3500 into bert_story_cloze_usc_nlp/model.ckpt.


INFO:tensorflow:global_step/sec: 1.01554


I0604 00:19:43.946130 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.01554


INFO:tensorflow:loss = 0.34296602, step = 3500 (98.468 sec)


I0604 00:19:43.949456 140229920782208 basic_session_run_hooks.py:247] loss = 0.34296602, step = 3500 (98.468 sec)


INFO:tensorflow:global_step/sec: 1.14659


I0604 00:21:11.161109 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.14659


INFO:tensorflow:loss = 0.14398074, step = 3600 (87.214 sec)


I0604 00:21:11.163284 140229920782208 basic_session_run_hooks.py:247] loss = 0.14398074, step = 3600 (87.214 sec)


INFO:tensorflow:global_step/sec: 1.15167


I0604 00:22:37.991473 140229920782208 basic_session_run_hooks.py:680] global_step/sec: 1.15167


INFO:tensorflow:loss = 0.5000293, step = 3700 (86.830 sec)


I0604 00:22:37.993590 140229920782208 basic_session_run_hooks.py:247] loss = 0.5000293, step = 3700 (86.830 sec)
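
The per-step loss in the log above is noisy (for example 0.14 at step 3600 and then 0.50 at step 3700), so the trend is easier to judge after smoothing. The following is a minimal sketch, not part of the original notebook: the helper names `parse_losses` and `moving_average` are hypothetical, and the excerpt is just the first few loss lines copied from the output above. It matches only the `INFO:tensorflow:` lines, so the duplicated absl-style lines (the ones starting with `I0603`) are not counted twice.

```python
import re
from statistics import mean

# First few loss lines copied from the training log above.
LOG_EXCERPT = """\
INFO:tensorflow:loss = 0.7044437, step = 0
I0603 23:27:31.297042 140229920782208 basic_session_run_hooks.py:249] loss = 0.7044437, step = 0
INFO:tensorflow:loss = 0.55289143, step = 101 (98.342 sec)
INFO:tensorflow:loss = 0.35184205, step = 200 (87.327 sec)
INFO:tensorflow:loss = 0.42252517, step = 300 (87.226 sec)
INFO:tensorflow:loss = 0.269329, step = 400 (87.145 sec)
"""

# Match only the "INFO:tensorflow:" form so the duplicated absl lines are skipped.
LOSS_LINE = re.compile(r"^INFO:tensorflow:loss = ([0-9.eE+-]+), step = (\d+)")


def parse_losses(log_text):
    """Return a list of (step, loss) pairs found in the log text (hypothetical helper)."""
    pairs = []
    for line in log_text.splitlines():
        match = LOSS_LINE.match(line)
        if match:
            pairs.append((int(match.group(2)), float(match.group(1))))
    return pairs


def moving_average(values, window):
    """Trailing moving average; early entries average over the values seen so far."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


pairs = parse_losses(LOG_EXCERPT)
smoothed = moving_average([loss for _, loss in pairs], window=3)
for (step, loss), avg in zip(pairs, smoothed):
    print(f"step {step:>4}  loss {loss:.4f}  smoothed {avg:.4f}")
```

To apply this to the full run, paste the complete cell output into `LOG_EXCERPT` (or read it from a saved file) and pick a larger window.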


Now let's use our held-out test data to see how well the fine-tuned model did:

In [0]:
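# Build the evaluation input function from the test features.
# is_training=False turns off the repeat/shuffle used for training, and
# drop_remainder=False keeps the final partial batch so every example is scored.
test_input_fn = bert.run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)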

In [0]:
# steps=None evaluates until the input function is exhausted (one full pass over the test set).
estimator.evaluate(input_fn=test_input_fn, steps=None)