<a href="https://colab.research.google.com/github/graulef/bert/blob/master/Predicting_Story_Cloze_with_BERT_random_nn_only.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Story Cloze task with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that has reshaped the state of the art for NLP tasks such as text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

In [2]:
!pip list | grep tensorflow
!python --version

mesh-tensorflow          0.0.5                
tensorflow               1.13.1               
tensorflow-estimator     1.13.0               
tensorflow-hub           0.4.0                
tensorflow-metadata      0.13.0               
tensorflow-probability   0.6.0                
Python 3.6.7


In [3]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

import os
cwd = os.getcwd()
print(cwd)

W0531 15:42:04.114859 139844217653120 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


/content


In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [4]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [6]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_story_cloze_aug'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_story_cloze_aug *****
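For a local directory, the two settings above could be applied roughly as follows. This is a minimal sketch using only the standard library; `prepare_output_dir` is a hypothetical helper (not part of the BERT package), and it does not cover GCP buckets.

```python
import os
import shutil


def prepare_output_dir(output_dir, do_delete=False):
    """Create output_dir, wiping it first when do_delete is True.

    Hypothetical helper for local paths only.
    """
    if do_delete and os.path.isdir(output_dir):
        # Remove old checkpoints so training starts from scratch.
        shutil.rmtree(output_dir)
    # exist_ok=True keeps existing checkpoints when do_delete is False.
    os.makedirs(output_dir, exist_ok=True)
    return os.path.abspath(output_dir)
```

It would be called as `prepare_output_dir(OUTPUT_DIR, DO_DELETE)` before training starts.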


#Data

In [0]:
from tensorflow import keras
import os
import re
import csv

PATH_EVAL_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv"
PATH_SENT_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv"
PATH_RAND_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv"
#PATH_EVAL_DATA = "glue_data/StoryCloze/cloze_test_val_spring2016.csv"
#PATH_RAND_NN_DATA = "glue_data/StoryCloze/train_stories_rand_combined.csv"
#PATH_SENT_NN_DATA = "glue_data/StoryCloze/train_stories_nearest_story_sent2vec_combined.csv"

# Load one Story Cloze CSV file into a DataFrame with two rows per story:
# the correct ending (label 1) and the wrong ending (label 0).
def load_data(path):
  data_1 = {}
  data_1["label"] = []
  data_1["id_1"] = []
  data_1["id_2"] = []
  data_1["context"] = []
  data_1["ending"] = []
  
  data_2 = {}
  data_2["label"] = []
  data_2["id_1"] = []
  data_2["id_2"] = []
  data_2["context"] = []
  data_2["ending"] = []
  
  print(path)
  with open(path) as f:
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
      if line_count == 0:
        #print("Columns = " + str(row))
        line_count += 1
      else:
        line_count += 1
        
        # Create two lines from one in order to have same label layout as 
        # MRPC task
        separator = ' '
        data_1["id_1"].append(row[0])
        data_1["id_2"].append(row[0] + "_end_bli")
        data_1["context"].append(separator.join(row[1:5]))

        data_2["id_1"].append(row[0])
        data_2["id_2"].append(row[0] + "_end_bla")
        data_2["context"].append(separator.join(row[1:5]))
        
        if row[7] == "1": # First ending is the correct one
          data_1["ending"].append(row[5])
          data_1["label"].append(1)
          data_2["ending"].append(row[6])
          data_2["label"].append(0)
        else: # Second ending is the correct one
          data_1["ending"].append(row[6])
          data_1["label"].append(1)
          data_2["ending"].append(row[5])
          data_2["label"].append(0) 
          
    data_df_1 = pd.DataFrame.from_dict(data_1)
    data_df_2 = pd.DataFrame.from_dict(data_2)
    data = pd.concat([data_df_1, data_df_2])      
    return data     

# Shuffle the validation data and split it 30/70 into test and train sets.
def load_validation_only(eval_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_split = 0.3
    eval_num = int(total_eval * eval_split)
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    test_df = eval_data_df.iloc[:eval_num, :]
    train_df = eval_data_df.iloc[eval_num:, :]
    return train_df, test_df

def load_augmented(eval_file, random_nn_file, sent_nn_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    # Fraction of the validation data moved into the training set
    # (0 = none, so all of it is kept as the test set)
    eval_split = 0
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    #train_df = eval_data_df.iloc[:int(total_eval * eval_split), :]
    # The rest of the validation data is used as test set
    test_df = eval_data_df.iloc[int(total_eval * eval_split):, :]

    # Training data comes from random_nn_file
    random_nn_df = load_data(random_nn_file)
    random_nn_df = random_nn_df.sample(frac=1).reset_index(drop=True)
    total_random_nn = random_nn_df.shape[0]
    train_df = pd.DataFrame()
    # Fraction of the random_nn data to train on (1 = all of it)
    random_nn_split = 1
    ext_df = random_nn_df.iloc[:int(total_random_nn * random_nn_split), :]
    train_df = train_df.append(ext_df, ignore_index=True)
    
    #sent_nn_split = 7/10
    #sent_nn_df = load_data(sent_nn_file)
    #sent_nn_df = sent_nn_df.sample(frac=1).reset_index(drop=True)
    #total_sent_nn = sent_nn_df.shape[0]
    #sent_nn_df.reset_index(drop=True)
    #ext_df = sent_nn_df.iloc[:int(total_sent_nn * sent_nn_split), :]
    #train_df = train_df.append(ext_df, ignore_index=True)
    
    return train_df, test_df

# Download and process the dataset files.
def download_and_load_eval_datasets(force_download=False):
  validation = tf.keras.utils.get_file(
      fname="validation", 
      origin=PATH_EVAL_DATA)
  random_nn = tf.keras.utils.get_file(
    fname="rand_nn", 
    origin=PATH_RAND_NN_DATA)
  sent_nn = tf.keras.utils.get_file(
    fname="sent_nn", 
    origin=PATH_SENT_NN_DATA)

  #train_df, test_df = load_validation_only(validation)
  train_df, test_df = load_augmented(validation, random_nn, sent_nn)
  
  return train_df, test_df
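To make the row layout concrete, here is a small sketch of how a single CSV row turns into one positive and one negative example. It follows the same column positions as `load_data` (id, four context sentences, two endings, index of the correct ending). The helper name `expand_row`, the header names, and the sample story are made up for illustration.

```python
import csv
import io


def expand_row(row):
    """Turn one Story Cloze CSV row into two labeled examples."""
    context = " ".join(row[1:5])      # the four context sentences
    first, second = row[5], row[6]    # the two candidate endings
    # row[7] == "1" means the first ending is the correct one.
    right, wrong = (first, second) if row[7] == "1" else (second, first)
    return [
        {"context": context, "ending": right, "label": 1},
        {"context": context, "ending": wrong, "label": 0},
    ]


# Made-up row, only for illustration (not taken from the dataset).
sample_csv = io.StringIO(
    "id,s1,s2,s3,s4,ending1,ending2,answer\n"
    "story-0,Ann was hungry.,She opened the fridge.,It was empty.,"
    "She grabbed her keys.,Ann drove to the store.,Ann went to sleep full.,1\n"
)
reader = csv.reader(sample_csv)
next(reader)  # skip the header row
examples = [example for row in reader for example in expand_row(row)]
for example in examples:
    print(example["label"], example["ending"])
```

Each story therefore contributes exactly two rows, which is why the classes are balanced before shuffling.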


In [16]:
train, test = download_and_load_eval_datasets()

print("\nTrain data")
print(train.shape)
for i in range(5):
  print(train.iloc[i]['label'])
  print(train.iloc[i]['context'])
  print(train.iloc[i]['ending'])

print("\nTest data")
print(test.shape)
for i in range(5):
  print(test.iloc[i]['label'])
  print(test.iloc[i]['context'])
  print(test.iloc[i]['ending'])

/root/.keras/datasets/validation
/root/.keras/datasets/rand_nn

Train data
(176322, 5)
1
Jill had a gambling problem. She had spent her savings on lottery tickets. She thought she would be able to pay off her credit card if she won. Jill lost all her money through the lottery tickets.
Jill got treatment for her gambling problem the next day.
0
Sally bought a coffee at her local coffee shop. On the receipt was a survey. For completing it, Sally got a free donut-which she redeemed at lunch. Then she was given another receipt with a survey on it!
Running to catch up, he saw Bill's fiance leaving a stranger's house.
1
Will's car wouldn't start after work. Will didn't live far from the office. So Will decided just to walk home, as the evening was nice. Will began doing this on a regular basis.
Now Bill walks to and from work every single day!
1
Patricia figured she knew everything that needed to be known. Someone asked her where Colombia was. She said it was in South Carolina, easy enough. 

Quick check of whether the datasets are fully disjoint. A full comparison takes a long time, so only the first 10 training endings are checked against the test set.


In [0]:
train.shape, test.shape
for j in range(10):
    query = train.iloc[j]['ending']
    for i in range(test.shape[0]):
      tmp = test.iloc[i]['ending']
      if tmp == query:
        print("Found something equal")
        print(tmp)
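If the nested loop is too slow for a full comparison, a set intersection is one possible alternative. The sketch below works on plain lists of strings; `shared_endings` is a hypothetical helper, and the two short lists are toy stand-ins for `train['ending']` and `test['ending']`.

```python
def shared_endings(train_endings, test_endings):
    """Return the endings that occur in both collections."""
    return set(train_endings) & set(test_endings)


overlap = shared_endings(
    ["She went home.", "He smiled."],       # toy stand-in for train['ending']
    ["He smiled.", "They left early."],     # toy stand-in for test['ending']
)
print(overlap)
```

Building the two sets takes time proportional to the number of endings, so this should scale to the full training set, assuming exact string equality is the notion of overlap you care about.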

Our input data are the 'context' and 'ending' columns, and our label is the 'label' column (0 for negative, 1 for positive).

In [0]:
CONTEXT_COLUMN = 'context'
ENDING_COLUMN = 'ending'
LABEL_COLUMN = 'label'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the first text segment. For us, this is the context of the story (the `context` column in our DataFrame).
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). For us, this is the ending.
- `label` is the label for our example, here 0 or 1.

In [19]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(train_InputExamples.shape)
test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(test_InputExamples.shape)

(176322,)
(3742,)


Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
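For intuition, the steps above can be sketched in plain Python with a toy vocabulary. This is only an illustration under simplifying assumptions: whitespace tokenization, a tiny made-up vocabulary, and no truncation. The names `TOY_VOCAB`, `wordpiece` and `encode_pair` are hypothetical and are not the BERT library's API.

```python
# Made-up vocabulary; the real one has roughly 30,000 entries.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "sally": 4, "says": 5, "hi": 6, "call": 7, "##ing": 8}


def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into WordPieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word continuation
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no split exists, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def encode_pair(text_a, text_b, vocab, max_seq_length):
    """Build input_ids, input_mask and segment_ids for a sentence pair."""
    tokens_a = [p for w in text_a.lower().split() for p in wordpiece(w, vocab)]
    tokens_b = [p for w in text_b.lower().split() for p in wordpiece(w, vocab)]
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    if len(tokens) > max_seq_length:
        raise ValueError("pair does not fit; truncation is not handled here")
    # Segment 0 covers [CLS], text_a and the first [SEP]; segment 1 the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [vocab[t] for t in tokens]
    input_mask = [1] * len(input_ids)  # 1 = real token, 0 = padding
    padding = [0] * (max_seq_length - len(input_ids))
    return {"tokens": tokens,
            "input_ids": input_ids + padding,
            "input_mask": input_mask + padding,
            "segment_ids": segment_ids + padding}


features = encode_pair("sally says hi", "calling", TOY_VOCAB, max_seq_length=10)
for name, value in features.items():
    print(name, value)
```

The three integer lists all have length `max_seq_length`, which is the layout the notebook's conversion logs show for each example.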




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [20]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0531 15:47:05.334389 139844217653120 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore



Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [21]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']
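The `##` prefix marks a piece that continues the previous token, so the original words can be recovered by gluing such pieces back on. A minimal sketch, where `merge_wordpieces` is a hypothetical helper and not part of the BERT package:

```python
def merge_wordpieces(tokens):
    """Join '##' continuation pieces onto the token that precedes them."""
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            words[-1] += token[2:]  # drop the '##' marker and append
        else:
            words.append(token)
    return words


pieces = ['this', 'here', "'", 's', 'an', 'example', 'of', 'using',
          'the', 'bert', 'token', '##izer']
print(merge_wordpieces(pieces))
```

Note that this only undoes the WordPiece split. Lowercasing and the punctuation split (`here`, `'`, `s`) are not reversed.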

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
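The conversion takes a maximum sequence length (128 in this notebook), so a context and ending that are too long together have to be shortened before the special tokens are added. Below is a minimal sketch of one common heuristic, trimming the longer segment first. `truncate_pair` is a hypothetical name, and this is an illustration rather than a copy of the library's code.

```python
def truncate_pair(tokens_a, tokens_b, max_tokens):
    """Drop tokens from the end of the longer list until both fit."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    while len(tokens_a) + len(tokens_b) > max_tokens:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()
    return tokens_a, tokens_b


MAX_SEQ_LENGTH = 128
budget = MAX_SEQ_LENGTH - 3  # leave room for [CLS], [SEP] and [SEP]
context_tokens, ending_tokens = truncate_pair(["ctx"] * 200, ["end"] * 20, budget)
print(len(context_tokens), len(ending_tokens))
```

Trimming the longer side first keeps short endings intact, which matters here because the ending is the part that distinguishes the two examples of a story.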

In [22]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 176322


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] jill had a gambling problem . she had spent her savings on lottery tickets . she thought she would be able to pay off her credit card if she won . jill lost all her money through the lottery tickets . [SEP] jill got treatment for her gambling problem the next day . [SEP]


INFO:tensorflow:input_ids: 101 10454 2018 1037 12219 3291 1012 2016 2018 2985 2014 10995 2006 15213 9735 1012 2016 2245 2016 2052 2022 2583 2000 3477 2125 2014 4923 4003 2065 2016 2180 1012 10454 2439 2035 2014 2769 2083 1996 15213 9735 1012 102 10454 2288 3949 2005 2014 12219 3291 1996 2279 2154 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] sally bought a coffee at her local coffee shop . on the receipt was a survey . for completing it , sally got a free don ##ut - which she red ##eem ##ed at lunch . then she was given another receipt with a survey on it ! [SEP] running to catch up , he saw bill ' s fiance leaving a stranger ' s house . [SEP]


INFO:tensorflow:input_ids: 101 8836 4149 1037 4157 2012 2014 2334 4157 4497 1012 2006 1996 24306 2001 1037 5002 1012 2005 7678 2009 1010 8836 2288 1037 2489 2123 4904 1011 2029 2016 2417 21564 2098 2012 6265 1012 2059 2016 2001 2445 2178 24306 2007 1037 5002 2006 2009 999 102 2770 2000 4608 2039 1010 2002 2387 3021 1005 1055 19154 2975 1037 7985 1005 1055 2160 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] will ' s car wouldn ' t start after work . will didn ' t live far from the office . so will decided just to walk home , as the evening was nice . will began doing this on a regular basis . [SEP] now bill walks to and from work every single day ! [SEP]


INFO:tensorflow:input_ids: 101 2097 1005 1055 2482 2876 1005 1056 2707 2044 2147 1012 2097 2134 1005 1056 2444 2521 2013 1996 2436 1012 2061 2097 2787 2074 2000 3328 2188 1010 2004 1996 3944 2001 3835 1012 2097 2211 2725 2023 2006 1037 3180 3978 1012 102 2085 3021 7365 2000 1998 2013 2147 2296 2309 2154 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] patricia figured she knew everything that needed to be known . someone asked her where colombia was . she said it was in south carolina , easy enough . they said they meant the country . [SEP] patricia said she didn ' t know where it was , so it wasn ' t worth knowing . [SEP]


INFO:tensorflow:input_ids: 101 10717 6618 2016 2354 2673 2008 2734 2000 2022 2124 1012 2619 2356 2014 2073 7379 2001 1012 2016 2056 2009 2001 1999 2148 3792 1010 3733 2438 1012 2027 2056 2027 3214 1996 2406 1012 102 10717 2056 2016 2134 1005 1056 2113 2073 2009 2001 1010 2061 2009 2347 1005 1056 4276 4209 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] cathy had a crush on bill . she found out that bill liked blonde hair . cathy decided to dye her hair to en ##tic ##e bill . cathy dyed her hair herself , but it came out orange . [SEP] allie shook her head in disbelief . [SEP]


INFO:tensorflow:input_ids: 101 18305 2018 1037 10188 2006 3021 1012 2016 2179 2041 2008 3021 4669 9081 2606 1012 18305 2787 2000 18554 2014 2606 2000 4372 4588 2063 3021 1012 18305 28432 2014 2606 2841 1010 2021 2009 2234 2041 4589 1012 102 16944 3184 2014 2132 1999 12537 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 10000 of 176322


INFO:tensorflow:Writing example 20000 of 176322


INFO:tensorflow:Writing example 30000 of 176322


INFO:tensorflow:Writing example 40000 of 176322


INFO:tensorflow:Writing example 50000 of 176322


INFO:tensorflow:Writing example 60000 of 176322


INFO:tensorflow:Writing example 70000 of 176322


INFO:tensorflow:Writing example 80000 of 176322


INFO:tensorflow:Writing example 90000 of 176322


INFO:tensorflow:Writing example 100000 of 176322


INFO:tensorflow:Writing example 110000 of 176322


INFO:tensorflow:Writing example 120000 of 176322


INFO:tensorflow:Writing example 130000 of 176322


INFO:tensorflow:Writing example 140000 of 176322


INFO:tensorflow:Writing example 150000 of 176322


INFO:tensorflow:Writing example 160000 of 176322


INFO:tensorflow:Writing example 170000 of 176322


I0531 15:49:47.824938 139844217653120 run_classifier.py:774] Writing example 170000 of 176322


INFO:tensorflow:Writing example 0 of 3742


I0531 15:49:53.196523 139844217653120 run_classifier.py:774] Writing example 0 of 3742


INFO:tensorflow:*** Example ***


I0531 15:49:53.200554 139844217653120 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 15:49:53.204488 139844217653120 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] cindy cooked and cleaned every single day . she grew tired of doing all the cooking and cleaning by herself . cindy said she was not going to cook until she got help cleaning . the family had beans every night until they realized she was serious . [SEP] the family agreed to sell their house . [SEP]


I0531 15:49:53.208304 139844217653120 run_classifier.py:464] tokens: [CLS] cindy cooked and cleaned every single day . she grew tired of doing all the cooking and cleaning by herself . cindy said she was not going to cook until she got help cleaning . the family had beans every night until they realized she was serious . [SEP] the family agreed to sell their house . [SEP]


INFO:tensorflow:input_ids: 101 15837 12984 1998 12176 2296 2309 2154 1012 2016 3473 5458 1997 2725 2035 1996 8434 1998 9344 2011 2841 1012 15837 2056 2016 2001 2025 2183 2000 5660 2127 2016 2288 2393 9344 1012 1996 2155 2018 13435 2296 2305 2127 2027 3651 2016 2001 3809 1012 102 1996 2155 3530 2000 5271 2037 2160 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.211390 139844217653120 run_classifier.py:465] input_ids: 101 15837 12984 1998 12176 2296 2309 2154 1012 2016 3473 5458 1997 2725 2035 1996 8434 1998 9344 2011 2841 1012 15837 2056 2016 2001 2025 2183 2000 5660 2127 2016 2288 2393 9344 1012 1996 2155 2018 13435 2296 2305 2127 2027 3651 2016 2001 3809 1012 102 1996 2155 3530 2000 5271 2037 2160 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.214363 139844217653120 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.217295 139844217653120 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 15:49:53.220004 139844217653120 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 15:49:53.224477 139844217653120 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 15:49:53.227005 139844217653120 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] juan wanted to make ta ##cos for his family tonight . he went to the grocery store and bought the ingredients . as he was preparing dinner he realized he forgot the tor ##till ##as ! he wanted to go back to the store but it was closed . [SEP] juan went back to the store . [SEP]


I0531 15:49:53.229676 139844217653120 run_classifier.py:464] tokens: [CLS] juan wanted to make ta ##cos for his family tonight . he went to the grocery store and bought the ingredients . as he was preparing dinner he realized he forgot the tor ##till ##as ! he wanted to go back to the store but it was closed . [SEP] juan went back to the store . [SEP]


INFO:tensorflow:input_ids: 101 5348 2359 2000 2191 11937 13186 2005 2010 2155 3892 1012 2002 2253 2000 1996 13025 3573 1998 4149 1996 12760 1012 2004 2002 2001 8225 4596 2002 3651 2002 9471 1996 17153 28345 3022 999 2002 2359 2000 2175 2067 2000 1996 3573 2021 2009 2001 2701 1012 102 5348 2253 2067 2000 1996 3573 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.232496 139844217653120 run_classifier.py:465] input_ids: 101 5348 2359 2000 2191 11937 13186 2005 2010 2155 3892 1012 2002 2253 2000 1996 13025 3573 1998 4149 1996 12760 1012 2004 2002 2001 8225 4596 2002 3651 2002 9471 1996 17153 28345 3022 999 2002 2359 2000 2175 2067 2000 1996 3573 2021 2009 2001 2701 1012 102 5348 2253 2067 2000 1996 3573 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.235316 139844217653120 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.237743 139844217653120 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 15:49:53.240486 139844217653120 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 15:49:53.244987 139844217653120 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 15:49:53.247733 139844217653120 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] amy was excited it was christmas eve . she had been very good all year . she gave the list of her favorite toys to santa at the mall . he told her she would receive everything she wanted . [SEP] amy did not like christmas at all . [SEP]


I0531 15:49:53.250134 139844217653120 run_classifier.py:464] tokens: [CLS] amy was excited it was christmas eve . she had been very good all year . she gave the list of her favorite toys to santa at the mall . he told her she would receive everything she wanted . [SEP] amy did not like christmas at all . [SEP]


INFO:tensorflow:input_ids: 101 6864 2001 7568 2009 2001 4234 6574 1012 2016 2018 2042 2200 2204 2035 2095 1012 2016 2435 1996 2862 1997 2014 5440 10899 2000 4203 2012 1996 6670 1012 2002 2409 2014 2016 2052 4374 2673 2016 2359 1012 102 6864 2106 2025 2066 4234 2012 2035 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.252932 139844217653120 run_classifier.py:465] input_ids: 101 6864 2001 7568 2009 2001 4234 6574 1012 2016 2018 2042 2200 2204 2035 2095 1012 2016 2435 1996 2862 1997 2014 5440 10899 2000 4203 2012 1996 6670 1012 2002 2409 2014 2016 2052 4374 2673 2016 2359 1012 102 6864 2106 2025 2066 4234 2012 2035 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.255333 139844217653120 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.257447 139844217653120 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 15:49:53.260155 139844217653120 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 15:49:53.263460 139844217653120 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 15:49:53.266194 139844217653120 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] kelly got a new bracelet yesterday . she was in love with it . yet today when she went shopping , she ended up losing it . she searched everywhere for it . [SEP] try as she might , kelley couldn ' t find her dog . [SEP]


I0531 15:49:53.268993 139844217653120 run_classifier.py:464] tokens: [CLS] kelly got a new bracelet yesterday . she was in love with it . yet today when she went shopping , she ended up losing it . she searched everywhere for it . [SEP] try as she might , kelley couldn ' t find her dog . [SEP]


INFO:tensorflow:input_ids: 101 5163 2288 1037 2047 19688 7483 1012 2016 2001 1999 2293 2007 2009 1012 2664 2651 2043 2016 2253 6023 1010 2016 3092 2039 3974 2009 1012 2016 9022 7249 2005 2009 1012 102 3046 2004 2016 2453 1010 19543 2481 1005 1056 2424 2014 3899 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.271809 139844217653120 run_classifier.py:465] input_ids: 101 5163 2288 1037 2047 19688 7483 1012 2016 2001 1999 2293 2007 2009 1012 2664 2651 2043 2016 2253 6023 1010 2016 3092 2039 3974 2009 1012 2016 9022 7249 2005 2009 1012 102 3046 2004 2016 2453 1010 19543 2481 1005 1056 2424 2014 3899 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.274322 139844217653120 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.277126 139844217653120 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 15:49:53.279945 139844217653120 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 15:49:53.284348 139844217653120 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 15:49:53.287127 139844217653120 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] reilly loved going to the fair . the pet ##ting zoo was her favorite part . she had fed a small goat before feeding the cow . the goat was trying to get more food from her but she ignored it . [SEP] reilly fed the goat extra food . [SEP]


I0531 15:49:53.289580 139844217653120 run_classifier.py:464] tokens: [CLS] reilly loved going to the fair . the pet ##ting zoo was her favorite part . she had fed a small goat before feeding the cow . the goat was trying to get more food from her but she ignored it . [SEP] reilly fed the goat extra food . [SEP]


INFO:tensorflow:input_ids: 101 13875 3866 2183 2000 1996 4189 1012 1996 9004 3436 9201 2001 2014 5440 2112 1012 2016 2018 7349 1037 2235 13555 2077 8521 1996 11190 1012 1996 13555 2001 2667 2000 2131 2062 2833 2013 2014 2021 2016 6439 2009 1012 102 13875 7349 1996 13555 4469 2833 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.292380 139844217653120 run_classifier.py:465] input_ids: 101 13875 3866 2183 2000 1996 4189 1012 1996 9004 3436 9201 2001 2014 5440 2112 1012 2016 2018 7349 1037 2235 13555 2077 8521 1996 11190 1012 1996 13555 2001 2667 2000 2131 2062 2833 2013 2014 2021 2016 6439 2009 1012 102 13875 7349 1996 13555 4469 2833 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.294851 139844217653120 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 15:49:53.297663 139844217653120 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 15:49:53.300471 139844217653120 run_classifier.py:468] label: 0 (id = 0)
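
To make the logged features easier to read, here is a minimal sketch of how `input_mask` and `segment_ids` relate to the token sequence. The helper name `build_features` and the toy tokens are hypothetical; the layout (segment 0 for `[CLS]`, the story, and the first `[SEP]`; segment 1 for the ending and the final `[SEP]`; zeros for padding) is read off the examples above. Producing `input_ids` additionally needs BERT's vocabulary file, so that step is left out.

```python
def build_features(tokens_a, tokens_b, max_seq_length):
    """Assemble BERT-style pair inputs (illustrative helper, not the library code)."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    if len(tokens) > max_seq_length:
        raise ValueError("pair is longer than max_seq_length; truncate it first")

    # Segment 0: [CLS] + story + first [SEP]. Segment 1: ending + final [SEP].
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(tokens)

    # Pad both lists with zeros up to the fixed sequence length.
    padding = [0] * (max_seq_length - len(tokens))
    return tokens, input_mask + padding, segment_ids + padding


# Toy example (made-up tokens, short sequence length for readability).
story = "kelly got a new bracelet".split()
ending = "kelly lost her bracelet .".split()
tokens, input_mask, segment_ids = build_features(story, ending, max_seq_length=16)
print(tokens)
print(input_mask)
print(segment_ids)
```

The number of ones in `input_mask` equals the number of real tokens, which is a quick way to sanity-check any of the logged examples.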


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our Story Cloze task (i.e. classifying whether a candidate ending is the right or wrong continuation of a story). This strategy of starting from a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the Story Cloze data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
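
As a rough illustration of what the new output layer computes, the following pure-Python sketch reproduces the linear layer, log-softmax, and one-hot cross-entropy for a single example. It is a simplified stand-in under stated assumptions: dropout is omitted, the function names are hypothetical, and the input vector and weights are made-up numbers rather than real BERT outputs.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def head_forward(pooled, weights, bias, label):
    """Linear layer + log-softmax + cross-entropy for one example.

    pooled:  hidden_size floats (stands in for BERT's pooled output).
    weights: num_labels rows of hidden_size floats, like output_weights.
    """
    logits = [sum(w * h for w, h in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    # One-hot encode the label and take the negative log-likelihood.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(logits))]
    loss = -sum(t * lp for t, lp in zip(one_hot, log_probs))
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    return loss, predicted, log_probs


pooled = [0.5, -1.0, 2.0]             # made-up pooled vector
weights = [[0.1, 0.2, -0.3],          # one row per label
           [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = head_forward(pooled, weights, bias, label=1)
print(loss, predicted, [math.exp(lp) for lp in log_probs])
```

Note that the values returned as `log_probs` are log-probabilities; exponentiating them, as in the final `print`, gives probabilities that sum to 1.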


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns a `model_fn` closure for `tf.estimator.Estimator`."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` passed to `tf.estimator.Estimator`."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

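      # Note: 'probabilities' holds log-probabilities (the log-softmax output);
      # apply exp() to recover probabilities.
      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }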
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
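
The evaluation metrics in `metric_fn` all derive from the four confusion-matrix counts. The sketch below shows those relationships in plain Python; `binary_metrics` is a hypothetical helper and the labels are made-up. Unlike the `tf.metrics` ops, which accumulate counts across batches, it computes everything in one pass.

```python
def binary_metrics(labels, predictions):
    """Accuracy, precision, recall and F1 from hard 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "eval_accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Made-up labels and predictions for illustration.
labels = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(labels, predictions)
print(metrics)
```

Because `predicted_labels` are hard 0/1 decisions rather than scores, an AUC computed from them likely reflects a single operating point, so it is best read alongside precision and recall rather than on its own.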


In [0]:
# Training hyperparameters.
# These are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
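
To illustrate what warmup means in practice, here is a small sketch of a learning-rate schedule with linear warmup followed by linear decay to zero. This is an assumption about the shape of the schedule (it matches my understanding of `bert.optimization.create_optimizer`, but check the library source before relying on it), and `learning_rate_at` is a hypothetical helper.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if num_warmup_steps and step < num_warmup_steps:
        # Warmup: ramp up from 0 towards init_lr.
        return init_lr * step / num_warmup_steps
    # Decay: shrink linearly so the rate reaches 0 at num_train_steps.
    remaining = max(num_train_steps - step, 0)
    return init_lr * remaining / num_train_steps


# Example with round numbers: 1000 steps in total, 100 of them warmup.
for step in (0, 50, 100, 500, 1000):
    print(step, learning_rate_at(step, 2e-5, 1000, 100))
```

With `WARMUP_PROPORTION = 0.1`, the first tenth of training is spent ramping up to `LEARNING_RATE`.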

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
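
As a worked example of this arithmetic, the sketch below plugs in the hyperparameters above. The training-set size of 176,322 is an assumption: it is the example count reported in the feature-conversion log earlier, which I take to be the training set.

```python
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

# Assumption: the 176322 examples in the conversion log are the training set.
num_train_examples = 176322

# Steps per epoch is examples / batch size; multiply by epochs and truncate.
num_train_steps = int(num_train_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

print(num_train_steps, num_warmup_steps)  # 16530 1653
```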

In [0]:
# Specify the output directory and how often to save summaries and checkpoints
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [28]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert_story_cloze_aug', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f2f8947f940>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0531 15:49:57.178346 139844217653120 estimator.py:201] Using config: {'_model_dir': 'bert_story_cloze_aug', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f2f8947f940>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and produces an input function, which feeds batches of features to the Estimator. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
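
To show what the `is_training` and `drop_remainder` options mean, here is a simplified pure-Python stand-in for one pass over the data. It is not the library's implementation (which builds a `tf.data` pipeline); `batch_iterator` and its `seed` parameter are hypothetical, and the real training input also repeats the data across epochs.

```python
import random


def batch_iterator(features, batch_size, is_training, drop_remainder, seed=0):
    """Yield batches for one pass over `features` (illustrative only)."""
    order = list(range(len(features)))
    if is_training:
        # Training shuffles the examples; evaluation keeps the original order.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = [features[i] for i in order[start:start + batch_size]]
        if drop_remainder and len(batch) < batch_size:
            break  # discard the final, smaller batch
        yield batch


features = list(range(10))  # stand-in for a list of feature objects
kept = list(batch_iterator(features, 4, is_training=False, drop_remainder=False))
dropped = list(batch_iterator(features, 4, is_training=False, drop_remainder=True))
print([len(b) for b in kept])     # [4, 4, 2]
print([len(b) for b in dropped])  # [4, 4]
```

Dropping the remainder keeps every batch the same shape, which is why it matters on TPUs; on a GPU the smaller final batch can simply be kept.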

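Now we train our model! For me, using a Colab notebook running on Google's GPUs, training ran at roughly 1.1 steps per second, so the first 1,000 steps alone took about 15 minutes (see the log below); a full run takes correspondingly longer.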

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0531 15:51:30.046745 139844217653120 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0531 15:51:33.072453 139844217653120 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0531 15:51:33.188959 139844217653120 deprecation.py:506] From <ipython-input-23-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0531 15:51:33.232484 139844217653120 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0531 15:51:33.308983 139844217653120 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0531 15:51:42.381572 139844217653120 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


I0531 15:51:44.648185 139844217653120 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0531 15:51:44.651416 139844217653120 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0531 15:51:51.515347 139844217653120 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0531 15:51:56.095332 139844217653120 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0531 15:51:56.302936 139844217653120 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert_story_cloze_aug/model.ckpt.


I0531 15:53:13.719104 139844217653120 basic_session_run_hooks.py:594] Saving checkpoints for 0 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:loss = 1.0121429, step = 0


I0531 15:53:37.222926 139844217653120 basic_session_run_hooks.py:249] loss = 1.0121429, step = 0


INFO:tensorflow:global_step/sec: 0.998215


I0531 15:55:17.401101 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 0.998215


INFO:tensorflow:loss = 0.6549957, step = 100 (100.183 sec)


I0531 15:55:17.406113 139844217653120 basic_session_run_hooks.py:247] loss = 0.6549957, step = 100 (100.183 sec)


INFO:tensorflow:global_step/sec: 1.11837


I0531 15:56:46.816619 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11837


INFO:tensorflow:loss = 0.19092628, step = 200 (89.413 sec)


I0531 15:56:46.819012 139844217653120 basic_session_run_hooks.py:247] loss = 0.19092628, step = 200 (89.413 sec)


INFO:tensorflow:global_step/sec: 1.11825


I0531 15:58:16.242201 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11825


INFO:tensorflow:loss = 0.24012099, step = 300 (89.428 sec)


I0531 15:58:16.247165 139844217653120 basic_session_run_hooks.py:247] loss = 0.24012099, step = 300 (89.428 sec)


INFO:tensorflow:global_step/sec: 1.11712


I0531 15:59:45.758171 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11712


INFO:tensorflow:loss = 0.07342526, step = 400 (89.514 sec)


I0531 15:59:45.761642 139844217653120 basic_session_run_hooks.py:247] loss = 0.07342526, step = 400 (89.514 sec)


INFO:tensorflow:Saving checkpoints for 500 into bert_story_cloze_aug/model.ckpt.


I0531 16:01:14.516445 139844217653120 basic_session_run_hooks.py:594] Saving checkpoints for 500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 0.99464


I0531 16:01:26.297097 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 0.99464


INFO:tensorflow:loss = 0.06010526, step = 500 (100.540 sec)


I0531 16:01:26.301294 139844217653120 basic_session_run_hooks.py:247] loss = 0.06010526, step = 500 (100.540 sec)


INFO:tensorflow:global_step/sec: 1.11581


I0531 16:02:55.918133 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11581


INFO:tensorflow:loss = 0.012843341, step = 600 (89.619 sec)


I0531 16:02:55.920779 139844217653120 basic_session_run_hooks.py:247] loss = 0.012843341, step = 600 (89.619 sec)


INFO:tensorflow:global_step/sec: 1.1184


I0531 16:04:25.331443 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.1184


INFO:tensorflow:loss = 0.06416676, step = 700 (89.415 sec)


I0531 16:04:25.335968 139844217653120 basic_session_run_hooks.py:247] loss = 0.06416676, step = 700 (89.415 sec)


INFO:tensorflow:global_step/sec: 1.11981


I0531 16:05:54.632420 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11981


INFO:tensorflow:loss = 0.05096917, step = 800 (89.306 sec)


I0531 16:05:54.641511 139844217653120 basic_session_run_hooks.py:247] loss = 0.05096917, step = 800 (89.306 sec)


INFO:tensorflow:global_step/sec: 1.115


I0531 16:07:24.318254 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.115


INFO:tensorflow:loss = 0.13138919, step = 900 (89.685 sec)


I0531 16:07:24.326045 139844217653120 basic_session_run_hooks.py:247] loss = 0.13138919, step = 900 (89.685 sec)


INFO:tensorflow:Saving checkpoints for 1000 into bert_story_cloze_aug/model.ckpt.


I0531 16:08:53.003407 139844217653120 basic_session_run_hooks.py:594] Saving checkpoints for 1000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 0.98713


I0531 16:09:05.622053 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 0.98713


INFO:tensorflow:loss = 0.08120134, step = 1000 (101.304 sec)


I0531 16:09:05.630449 139844217653120 basic_session_run_hooks.py:247] loss = 0.08120134, step = 1000 (101.304 sec)


INFO:tensorflow:global_step/sec: 1.11472


I0531 16:10:35.330670 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11472


INFO:tensorflow:loss = 0.2777952, step = 1100 (89.703 sec)


I0531 16:10:35.333176 139844217653120 basic_session_run_hooks.py:247] loss = 0.2777952, step = 1100 (89.703 sec)


INFO:tensorflow:global_step/sec: 1.11733


I0531 16:12:04.829961 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11733


INFO:tensorflow:loss = 0.06410207, step = 1200 (89.503 sec)


I0531 16:12:04.836594 139844217653120 basic_session_run_hooks.py:247] loss = 0.06410207, step = 1200 (89.503 sec)


INFO:tensorflow:global_step/sec: 1.11538


I0531 16:13:34.485738 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11538


INFO:tensorflow:loss = 0.11081226, step = 1300 (89.654 sec)


I0531 16:13:34.490500 139844217653120 basic_session_run_hooks.py:247] loss = 0.11081226, step = 1300 (89.654 sec)


INFO:tensorflow:global_step/sec: 1.11718


I0531 16:15:03.996627 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11718


INFO:tensorflow:loss = 0.12819085, step = 1400 (89.511 sec)


I0531 16:15:04.001394 139844217653120 basic_session_run_hooks.py:247] loss = 0.12819085, step = 1400 (89.511 sec)


INFO:tensorflow:Saving checkpoints for 1500 into bert_story_cloze_aug/model.ckpt.


I0531 16:16:32.519145 139844217653120 basic_session_run_hooks.py:594] Saving checkpoints for 1500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 0.992411


I0531 16:16:44.761348 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 0.992411


INFO:tensorflow:loss = 0.199106, step = 1500 (100.763 sec)


I0531 16:16:44.764249 139844217653120 basic_session_run_hooks.py:247] loss = 0.199106, step = 1500 (100.763 sec)


INFO:tensorflow:global_step/sec: 1.11337


I0531 16:18:14.578786 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11337


INFO:tensorflow:loss = 0.044776462, step = 1600 (89.817 sec)


I0531 16:18:14.581005 139844217653120 basic_session_run_hooks.py:247] loss = 0.044776462, step = 1600 (89.817 sec)


INFO:tensorflow:global_step/sec: 1.11837


I0531 16:19:43.994436 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11837


INFO:tensorflow:loss = 0.17419273, step = 1700 (89.420 sec)


I0531 16:19:44.000655 139844217653120 basic_session_run_hooks.py:247] loss = 0.17419273, step = 1700 (89.420 sec)


INFO:tensorflow:global_step/sec: 1.11727


I0531 16:21:13.497916 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11727


INFO:tensorflow:loss = 0.15598023, step = 1800 (89.500 sec)


I0531 16:21:13.500357 139844217653120 basic_session_run_hooks.py:247] loss = 0.15598023, step = 1800 (89.500 sec)


INFO:tensorflow:global_step/sec: 1.11616


I0531 16:22:43.090841 139844217653120 basic_session_run_hooks.py:680] global_step/sec: 1.11616


INFO:tensorflow:loss = 0.16317563, step = 1900 (89.597 sec)


I0531 16:22:43.097181 139844217653120 basic_session_run_hooks.py:247] loss = 0.16317563, step = 1900 (89.597 sec)
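
The loss logged every 100 steps above fluctuates quite a bit (for example 0.0128 at step 600 and 0.278 at step 1100), most likely because each value reflects a single training batch rather than an average. A minimal sketch for pulling the `(step, loss)` pairs out of the raw log text and smoothing them is shown below. The helper names `parse_loss_log` and `moving_average` are hypothetical and not part of the notebook or of TensorFlow; the parser only matches the `INFO:tensorflow:` copy of each message so the duplicated `I0531 ...` lines are not counted twice.

```python
import re

# Matches the loss lines emitted by the Estimator logging hook, e.g.
# "INFO:tensorflow:loss = 0.6549957, step = 100 (100.183 sec)".
LOSS_LINE = re.compile(
    r"^INFO:tensorflow:loss = (?P<loss>[0-9.eE+-]+), step = (?P<step>\d+)"
)


def parse_loss_log(log_text):
    """Return a list of (step, loss) pairs found in raw training-log text."""
    pairs = []
    for line in log_text.splitlines():
        match = LOSS_LINE.match(line.strip())
        if match:
            pairs.append((int(match.group("step")), float(match.group("loss"))))
    return pairs


def moving_average(values, window=5):
    """Trailing moving average; early entries average over fewer points."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# A few lines copied from the output above.
sample_log = """\
INFO:tensorflow:loss = 1.0121429, step = 0
I0531 15:53:37.222926 139844217653120 basic_session_run_hooks.py:249] loss = 1.0121429, step = 0
INFO:tensorflow:global_step/sec: 0.998215
INFO:tensorflow:loss = 0.6549957, step = 100 (100.183 sec)
INFO:tensorflow:loss = 0.19092628, step = 200 (89.413 sec)
"""

history = parse_loss_log(sample_log)
losses = [loss for _, loss in history]
smoothed = moving_average(losses, window=2)
print(history)
print(smoothed)
```

Passing the full captured output instead of `sample_log` should give one pair per logged step; the smoothed series is usually easier to read than the raw per-batch values when judging whether training has plateaued.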


Now let's use our test data to see how well our model did:

In [0]:
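# Build the evaluation input function. With is_training=False the data is
# neither shuffled nor repeated, and drop_remainder=False keeps the final
# partial batch so every test example is evaluated.
test_input_fn = bert.run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)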

In [0]:
# steps=None evaluates until the input function is exhausted (one full pass).
estimator.evaluate(input_fn=test_input_fn, steps=None)
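
`estimator.evaluate` returns a plain dict mapping metric names to values, including a `global_step` entry alongside whatever metrics the model function defines. As a rough sketch, the helper below prints such a dict in an aligned, readable form. `format_metrics` is a hypothetical name, the key `eval_accuracy` is an assumption about how the model function names its metric, and the numbers are placeholders for illustration, not results from this run.

```python
def format_metrics(metrics):
    """Render an evaluation-metrics dict as aligned 'name  value' lines."""
    width = max(len(name) for name in metrics)
    lines = []
    for name in sorted(metrics):
        value = metrics[name]
        # global_step is an integer counter; other metrics are treated as floats.
        if name == "global_step":
            lines.append(f"{name:<{width}}  {int(value)}")
        else:
            lines.append(f"{name:<{width}}  {float(value):.4f}")
    return "\n".join(lines)


# Placeholder values for illustration only; these are NOT results from this run.
example_metrics = {"eval_accuracy": 0.5, "loss": 1.0, "global_step": 1000}
report = format_metrics(example_metrics)
print(report)
```

In the notebook, this could be used as `print(format_metrics(estimator.evaluate(input_fn=test_input_fn, steps=None)))`, assuming every metric value is a scalar.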