<a href="https://colab.research.google.com/github/graulef/bert/blob/master/Predicting_Story_Cloze_with_BERT_random_nn_only.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Story Cloze task with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

In [2]:
!pip list | grep tensorflow
!python --version

mesh-tensorflow          0.0.5                
tensorflow               1.13.1               
tensorflow-estimator     1.13.0               
tensorflow-hub           0.4.0                
tensorflow-metadata      0.13.0               
tensorflow-probability   0.6.0                
Python 3.6.7


In [3]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

import os
cwd = os.getcwd()
print(cwd)

W0531 22:10:42.987919 139625747429248 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


/content


In addition to the libraries we imported above, we'll need to install BERT's Python package.

In [4]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
     |████████████████████████████████| 71kB 3.4MB/s
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. The cell below only exposes OUTPUT_DIR (there is no separate BUCKET field here), so you'd point OUTPUT_DIR at a path inside your bucket.

Set DO_DELETE to clear OUTPUT_DIR and start fresh if the directory already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [6]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_story_cloze_aug'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_story_cloze_aug *****
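
Note that the cell above only prints the directory name; nothing in it reads DO_DELETE. A minimal sketch of how the flag could be honored for a local directory follows. The helper name `prepare_output_dir` is hypothetical, and the sketch uses only the Python standard library, so it assumes a local path rather than a GCP bucket.

```python
import os
import shutil

def prepare_output_dir(output_dir, do_delete=False):
    """Create output_dir, optionally wiping an existing one first.

    Hypothetical helper for illustration; handles local paths only.
    """
    if do_delete and os.path.isdir(output_dir):
        # Remove old checkpoints so training starts from scratch.
        shutil.rmtree(output_dir)
    # exist_ok=True leaves existing checkpoints in place when do_delete is False.
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

In this notebook the call would be `prepare_output_dir(OUTPUT_DIR, DO_DELETE)`.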


#Data

In [0]:
from tensorflow import keras
import os
import re
import csv

PATH_EVAL_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv"
PATH_SENT_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv"
PATH_RAND_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv"
#PATH_EVAL_DATA = "glue_data/StoryCloze/cloze_test_val_spring2016.csv"
#PATH_RAND_NN_DATA = "glue_data/StoryCloze/train_stories_rand_combined.csv"
#PATH_SENT_NN_DATA = "glue_data/StoryCloze/train_stories_nearest_story_sent2vec_combined.csv"

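# Load one Story Cloze CSV file into a DataFrame with two rows per story:
# the context paired with the right ending (label 1) and with the wrong ending (label 0).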
def load_data(path):
  data_1 = {}
  data_1["label"] = []
  data_1["id_1"] = []
  data_1["id_2"] = []
  data_1["context"] = []
  data_1["ending"] = []
  
  data_2 = {}
  data_2["label"] = []
  data_2["id_1"] = []
  data_2["id_2"] = []
  data_2["context"] = []
  data_2["ending"] = []
  
  print(path)
  with open(path) as f:
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
      if line_count == 0:
        #print("Columns = " + str(row))
        line_count += 1
      else:
        line_count += 1
        
        # Create two examples from each row so the data has the same
        # (sentence pair, label) layout as the MRPC task
        separator = ' '
        data_1["id_1"].append(row[0])
        data_1["id_2"].append(row[0] + "_end_bli")
        data_1["context"].append(str(separator.join(row[1:5])))

        data_2["id_1"].append(row[0])
        data_2["id_2"].append(row[0] + "_end_bla")
        data_2["context"].append(str(separator.join(row[1:5])))
        
        if row[7] == "1": # First ending is the correct one
          data_1["ending"].append(row[5])
          data_1["label"].append(1)
          data_2["ending"].append(row[6])
          data_2["label"].append(0)
        else: # Second ending is the correct one
          data_1["ending"].append(row[6])
          data_1["label"].append(1)
          data_2["ending"].append(row[5])
          data_2["label"].append(0) 
          
    data_df_1 = pd.DataFrame.from_dict(data_1)
    data_df_2 = pd.DataFrame.from_dict(data_2)
    data = pd.concat([data_df_1, data_df_2])      
    return data     

# Shuffle the validation data, then use 30% of it as the test set and the rest for training.
def load_validation_only(eval_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_split = 0.3
    eval_num = int(total_eval * eval_split)
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    test_df = eval_data_df.iloc[:eval_num, :]
    train_df = eval_data_df.iloc[eval_num:, :]
    return train_df, test_df

def load_augmented(eval_file, random_nn_file, sent_nn_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_split = 0
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    # Eval split defines the ratio of data going into the training set
    #train_df = eval_data_df.iloc[:int(total_eval * eval_split), :]
    # The rest of the validation data is used as test set
    test_df = eval_data_df.iloc[int(total_eval * eval_split):, :]   
    
    random_nn_df = load_data(random_nn_file)
    random_nn_df = random_nn_df.sample(frac=1).reset_index(drop=True)
    total_random_nn = random_nn_df.shape[0]
    train_df = pd.DataFrame()
    random_nn_split = 1
    ext_df = random_nn_df.iloc[:int(total_random_nn * random_nn_split), :]
    train_df = train_df.append(ext_df, ignore_index=True)
    
    #sent_nn_split = 7/10
    #sent_nn_df = load_data(sent_nn_file)
    #sent_nn_df = sent_nn_df.sample(frac=1).reset_index(drop=True)
    #total_sent_nn = sent_nn_df.shape[0]
    #sent_nn_df.reset_index(drop=True)
    #ext_df = sent_nn_df.iloc[:int(total_sent_nn * sent_nn_split), :]
    #train_df = train_df.append(ext_df, ignore_index=True)
    
    return train_df, test_df

# Download and process the dataset files.
def download_and_load_eval_datasets(force_download=False):
  validation = tf.keras.utils.get_file(
      fname="validation", 
      origin=PATH_EVAL_DATA)
  random_nn = tf.keras.utils.get_file(
    fname="rand_nn", 
    origin=PATH_RAND_NN_DATA)
  sent_nn = tf.keras.utils.get_file(
    fname="sent_nn", 
    origin=PATH_SENT_NN_DATA)

  #train_df, test_df = load_validation_only(validation)
  train_df, test_df = load_augmented(validation, random_nn, sent_nn)
  
  return train_df, test_df
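
To make the row-doubling in `load_data` concrete, here is a minimal sketch that applies the same rule to a single in-memory row. The function name `expand_row` and the sample story are made up for illustration; the column layout (ID, four context sentences, two endings, index of the right ending) is inferred from how `load_data` indexes each row.

```python
def expand_row(row):
    """Turn one Story Cloze row into a (positive, negative) pair of examples."""
    context = " ".join(row[1:5])
    first, second = row[5], row[6]
    # As in load_data: "1" means the first ending is right,
    # anything else is treated as "the second ending is right".
    right, wrong = (first, second) if row[7] == "1" else (second, first)
    positive = {"id": row[0], "context": context, "ending": right, "label": 1}
    negative = {"id": row[0], "context": context, "ending": wrong, "label": 0}
    return positive, negative

# Made-up sample row, not taken from the dataset.
row = ["story-0", "Tom was hungry.", "He opened the fridge.", "It was empty.",
       "He drove to the store.", "He bought groceries.", "He flew to Mars.", "1"]
pos, neg = expand_row(row)
print(pos["label"], pos["ending"])
print(neg["label"], neg["ending"])
```

Every story therefore contributes exactly one positive and one negative example, which keeps the two classes balanced.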


In [8]:
train, test = download_and_load_eval_datasets()

print("\nTrain data")
print(train.shape)
for i in range(5):
  print(train.iloc[i]['label'])
  print(train.iloc[i]['context'])
  print(train.iloc[i]['ending'])

print("\nTest data")
print(test.shape)
for i in range(5):
  print(test.iloc[i]['label'])
  print(test.iloc[i]['context'])
  print(test.iloc[i]['ending'])

Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv
/root/.keras/datasets/validation
/root/.keras/datasets/rand_nn

Train data
(176322, 5)
0
She knew she was a failure. Her parents always told her so. She tried hard sometimes, but it did no good. They continued to berate and judge her.
They began to perform together.
1
Dex ordered a new computer. When he got to the store to pick it up, there was a huge line. Apparently, a lot of other people ordered the same one. He had to wait to pay.
When it was his turn, he was told it was sold out.
1
Kelly loved music festivals. Every year she would find a new festival to attend. At work, she discovered that one of her co-workers also loved music. Kelly invited her to go to a f

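Quick check of whether the datasets are disjoint: we compare the endings of the first 10 training examples against every test ending. (Checking all training examples this way would take a very long time.)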


In [0]:
train.shape, test.shape
for j in range(10):
    query = train.iloc[j]['ending']
    for i in range(test.shape[0]):
      tmp = test.iloc[i]['ending']
      if tmp == query:
        print("Found something equal")
        print(tmp)
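
The nested loop above is quadratic, which is why it only looks at 10 training rows. A set-based lookup makes the full comparison feasible. The sketch below assumes plain iterables of strings; the function name `find_shared_endings` and the sample sentences are made up for illustration.

```python
def find_shared_endings(train_endings, test_endings):
    """Return the endings that occur in both collections, sorted."""
    # Building a set makes each membership test O(1) on average,
    # so the whole check is linear instead of quadratic.
    test_set = set(test_endings)
    return sorted({ending for ending in train_endings if ending in test_set})

shared = find_shared_endings(
    ["He went home.", "She smiled."],
    ["She smiled.", "They left."],
)
print(shared)
```

With the DataFrames from this notebook, the inputs would be `train['ending']` and `test['ending']`.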

Our input data are the 'context' and 'ending' columns, and our label is the 'label' column (0 for negative, 1 for positive).

In [0]:
CONTEXT_COLUMN = 'context'
ENDING_COLUMN = 'ending'
LABEL_COLUMN = 'label'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create  `InputExample`'s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case, is the `Request` field in our Dataframe. For us, this is the context of the story.
- `text_b` is used if we're training a model to understand the relationship between sentences (i.e. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This is the ending in our case
- `label` is the label for our example, i.e. True, False

In [11]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(train_InputExamples.shape)
test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(test_InputExamples.shape)

(176322,)
(3742,)


Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Pad each input to a fixed length and build its `input_mask` and `segment_ids` (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
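
Still, it can help to see the steps in one place. Below is a toy sketch, not the library's implementation: it splits on whitespace only (the real tokenizer also separates punctuation), it uses a nine-entry made-up vocabulary `TOY_VOCAB`, and the helpers `wordpiece` and `encode_pair` are names invented for this illustration.

```python
# Made-up miniature vocabulary; index 0 doubles as the padding id.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
             "sally", "says", "hi", "call", "##ing"]
TOKEN_TO_ID = {token: i for i, token in enumerate(TOY_VOCAB)}

def wordpiece(word):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in TOKEN_TO_ID:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def encode_pair(text_a, text_b, max_len):
    """Lowercase, tokenize, add special tokens, map to ids, and pad."""
    tokens_a = [p for w in text_a.lower().split() for p in wordpiece(w)]
    tokens_b = [p for w in text_b.lower().split() for p in wordpiece(w)]
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], text_a and the first [SEP]; segment 1 the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [TOKEN_TO_ID[t] for t in tokens]
    input_mask = [1] * len(input_ids)  # 1 = real token, 0 = padding
    padding = max_len - len(input_ids)
    if padding < 0:
        raise ValueError("pair is longer than max_len")
    return (input_ids + [0] * padding,
            input_mask + [0] * padding,
            segment_ids + [0] * padding)

input_ids, input_mask, segment_ids = encode_pair("Sally says hi", "calling", 10)
print(input_ids)
print(input_mask)
print(segment_ids)
```

The three lists have the same roles as the `input_ids`, `input_mask` and `segment_ids` rows that the library logs for each converted example.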




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [12]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0531 22:11:31.814733 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [13]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
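
One detail worth knowing before running the conversion is what happens when a context plus ending is longer than the sequence limit. The sketch below is modeled on the pair-truncation heuristic in BERT's `run_classifier.py`, which trims the longer of the two token lists one token at a time. It is a simplified re-implementation written for this walkthrough, not the library's code: the name `truncate_seq_pair` and the toy inputs are assumptions, and this version returns copies instead of modifying the lists in place.

```python
def truncate_seq_pair(tokens_a, tokens_b, max_length):
    """Trim the longer list one token at a time until the pair fits."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    while len(tokens_a) + len(tokens_b) > max_length:
        # Dropping from the longer side keeps more of the shorter text.
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()
    return tokens_a, tokens_b

TOY_MAX_LEN = 10
# Three positions are reserved for [CLS] and the two [SEP] tokens.
budget = TOY_MAX_LEN - 3
a, b = truncate_seq_pair(list("abcdefgh"), list("xyz"), budget)
print(a, b)
```

With short stories like ours and a limit of 128 tokens, truncation should rarely kick in, but it explains why very long inputs lose the tail of the longer text first.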

In [14]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 176322

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] she knew she was a failure . her parents always told her so . she tried hard sometimes , but it did no good . they continued to be ##rate and judge her . [SEP] they began to perform together . [SEP]


INFO:tensorflow:input_ids: 101 2016 2354 2016 2001 1037 4945 1012 2014 3008 2467 2409 2014 2061 1012 2016 2699 2524 2823 1010 2021 2009 2106 2053 2204 1012 2027 2506 2000 2022 11657 1998 3648 2014 1012 102 2027 2211 2000 4685 2362 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] dex ordered a new computer . when he got to the store to pick it up , there was a huge line . apparently , a lot of other people ordered the same one . he had to wait to pay . [SEP] when it was his turn , he was told it was sold out . [SEP]


INFO:tensorflow:input_ids: 101 20647 3641 1037 2047 3274 1012 2043 2002 2288 2000 1996 3573 2000 4060 2009 2039 1010 2045 2001 1037 4121 2240 1012 4593 1010 1037 2843 1997 2060 2111 3641 1996 2168 2028 1012 2002 2018 2000 3524 2000 3477 1012 102 2043 2009 2001 2010 2735 1010 2002 2001 2409 2009 2001 2853 2041 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] kelly loved music festivals . every year she would find a new festival to attend . at work , she discovered that one of her co - workers also loved music . kelly invited her to go to a festival with her . [SEP] they went to the festival and had a good time . [SEP]


INFO:tensorflow:input_ids: 101 5163 3866 2189 7519 1012 2296 2095 2016 2052 2424 1037 2047 2782 2000 5463 1012 2012 2147 1010 2016 3603 2008 2028 1997 2014 2522 1011 3667 2036 3866 2189 1012 5163 4778 2014 2000 2175 2000 1037 2782 2007 2014 1012 102 2027 2253 2000 1996 2782 1998 2018 1037 2204 2051 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] terry loved games . he just had to have the best gaming pc money could buy . so he goes on e ##bay to look for parts . terry buys all the parts for his computer . [SEP] terry builds the ultimate gaming pc . [SEP]


INFO:tensorflow:input_ids: 101 6609 3866 2399 1012 2002 2074 2018 2000 2031 1996 2190 10355 7473 2769 2071 4965 1012 2061 2002 3632 2006 1041 15907 2000 2298 2005 3033 1012 6609 23311 2035 1996 3033 2005 2010 3274 1012 102 6609 16473 1996 7209 10355 7473 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] my friend joe and i send each other books . i sent him a 266 ##6 , a modern classic . today he sent me two books . one was a book i had sent him . [SEP] after one whole bottle of coconut oil her hair is now det ##ang ##led . [SEP]


INFO:tensorflow:input_ids: 101 2026 2767 3533 1998 1045 4604 2169 2060 2808 1012 1045 2741 2032 1037 25162 2575 1010 1037 2715 4438 1012 2651 2002 2741 2033 2048 2808 1012 2028 2001 1037 2338 1045 2018 2741 2032 1012 102 2044 2028 2878 5835 1997 16027 3514 2014 2606 2003 2085 20010 5654 3709 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)

INFO:tensorflow:Writing example 10000 of 176322

INFO:tensorflow:Writing example 20000 of 176322

INFO:tensorflow:Writing example 30000 of 176322

INFO:tensorflow:Writing example 40000 of 176322

INFO:tensorflow:Writing example 50000 of 176322

INFO:tensorflow:Writing example 60000 of 176322

INFO:tensorflow:Writing example 70000 of 176322

INFO:tensorflow:Writing example 80000 of 176322

INFO:tensorflow:Writing example 90000 of 176322

INFO:tensorflow:Writing example 100000 of 176322

INFO:tensorflow:Writing example 110000 of 176322

INFO:tensorflow:Writing example 120000 of 176322

INFO:tensorflow:Writing example 130000 of 176322

INFO:tensorflow:Writing example 140000 of 176322

INFO:tensorflow:Writing example 150000 of 176322

INFO:tensorflow:Writing example 160000 of 176322

INFO:tensorflow:Writing example 170000 of 176322

INFO:tensorflow:Writing example 0 of 3742

INFO:tensorflow:*** Example ***

INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] james loved to challenge himself physically . he decided to begin training for a marathon . james began running every day before work . finally james ran the marathon . [SEP] james was ashamed of what he had done . [SEP]


I0531 22:14:12.089352 139625747429248 run_classifier.py:464] tokens: [CLS] james loved to challenge himself physically . he decided to begin training for a marathon . james began running every day before work . finally james ran the marathon . [SEP] james was ashamed of what he had done . [SEP]


INFO:tensorflow:input_ids: 101 2508 3866 2000 4119 2370 8186 1012 2002 2787 2000 4088 2731 2005 1037 8589 1012 2508 2211 2770 2296 2154 2077 2147 1012 2633 2508 2743 1996 8589 1012 102 2508 2001 14984 1997 2054 2002 2018 2589 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.091284 139625747429248 run_classifier.py:465] input_ids: 101 2508 3866 2000 4119 2370 8186 1012 2002 2787 2000 4088 2731 2005 1037 8589 1012 2508 2211 2770 2296 2154 2077 2147 1012 2633 2508 2743 1996 8589 1012 102 2508 2001 14984 1997 2054 2002 2018 2589 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.094085 139625747429248 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.096086 139625747429248 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 22:14:12.097649 139625747429248 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 22:14:12.100388 139625747429248 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 22:14:12.102004 139625747429248 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] my children and i visited the local animal shelter . we looked at pup ##pies and kitten ##s in their cages . my daughters picked out a friendly puppy . they played with her and rubbed her belly . [SEP] i decided to buy her a candy bar instead . [SEP]


I0531 22:14:12.103585 139625747429248 run_classifier.py:464] tokens: [CLS] my children and i visited the local animal shelter . we looked at pup ##pies and kitten ##s in their cages . my daughters picked out a friendly puppy . they played with her and rubbed her belly . [SEP] i decided to buy her a candy bar instead . [SEP]


INFO:tensorflow:input_ids: 101 2026 2336 1998 1045 4716 1996 2334 4111 7713 1012 2057 2246 2012 26781 13046 1998 18401 2015 1999 2037 27157 1012 2026 5727 3856 2041 1037 5379 17022 1012 2027 2209 2007 2014 1998 7503 2014 7579 1012 102 1045 2787 2000 4965 2014 1037 9485 3347 2612 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.105171 139625747429248 run_classifier.py:465] input_ids: 101 2026 2336 1998 1045 4716 1996 2334 4111 7713 1012 2057 2246 2012 26781 13046 1998 18401 2015 1999 2037 27157 1012 2026 5727 3856 2041 1037 5379 17022 1012 2027 2209 2007 2014 1998 7503 2014 7579 1012 102 1045 2787 2000 4965 2014 1037 9485 3347 2612 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.106849 139625747429248 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.108433 139625747429248 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 22:14:12.109978 139625747429248 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 22:14:12.112762 139625747429248 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 22:14:12.114363 139625747429248 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] there were six pup ##pies running around my foyer . it was going to be hard to choose the one we wanted to buy . each of my daughters picked out a different pup . the owner said we could get a discount on two dogs . [SEP] we did not like the dogs . [SEP]


I0531 22:14:12.115979 139625747429248 run_classifier.py:464] tokens: [CLS] there were six pup ##pies running around my foyer . it was going to be hard to choose the one we wanted to buy . each of my daughters picked out a different pup . the owner said we could get a discount on two dogs . [SEP] we did not like the dogs . [SEP]


INFO:tensorflow:input_ids: 101 2045 2020 2416 26781 13046 2770 2105 2026 16683 1012 2009 2001 2183 2000 2022 2524 2000 5454 1996 2028 2057 2359 2000 4965 1012 2169 1997 2026 5727 3856 2041 1037 2367 26781 1012 1996 3954 2056 2057 2071 2131 1037 19575 2006 2048 6077 1012 102 2057 2106 2025 2066 1996 6077 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.117600 139625747429248 run_classifier.py:465] input_ids: 101 2045 2020 2416 26781 13046 2770 2105 2026 16683 1012 2009 2001 2183 2000 2022 2524 2000 5454 1996 2028 2057 2359 2000 4965 1012 2169 1997 2026 5727 3856 2041 1037 2367 26781 1012 1996 3954 2056 2057 2071 2131 1037 19575 2006 2048 6077 1012 102 2057 2106 2025 2066 1996 6077 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.119223 139625747429248 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.120838 139625747429248 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0531 22:14:12.122357 139625747429248 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0531 22:14:12.125406 139625747429248 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 22:14:12.127167 139625747429248 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] jerry had just learned to ride a bike and was excited to go for a ride . he started down the street near his house when he hit a pot ##hole . his bike stopped immediately and flip him over it . he lay there screaming for help with no one in sight . [SEP] jerry ' s mother took twenty minutes to find him . [SEP]


I0531 22:14:12.128683 139625747429248 run_classifier.py:464] tokens: [CLS] jerry had just learned to ride a bike and was excited to go for a ride . he started down the street near his house when he hit a pot ##hole . his bike stopped immediately and flip him over it . he lay there screaming for help with no one in sight . [SEP] jerry ' s mother took twenty minutes to find him . [SEP]


INFO:tensorflow:input_ids: 101 6128 2018 2074 4342 2000 4536 1037 7997 1998 2001 7568 2000 2175 2005 1037 4536 1012 2002 2318 2091 1996 2395 2379 2010 2160 2043 2002 2718 1037 8962 11484 1012 2010 7997 3030 3202 1998 11238 2032 2058 2009 1012 2002 3913 2045 7491 2005 2393 2007 2053 2028 1999 4356 1012 102 6128 1005 1055 2388 2165 3174 2781 2000 2424 2032 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.130283 139625747429248 run_classifier.py:465] input_ids: 101 6128 2018 2074 4342 2000 4536 1037 7997 1998 2001 7568 2000 2175 2005 1037 4536 1012 2002 2318 2091 1996 2395 2379 2010 2160 2043 2002 2718 1037 8962 11484 1012 2010 7997 3030 3202 1998 11238 2032 2058 2009 1012 2002 3913 2045 7491 2005 2393 2007 2053 2028 1999 4356 1012 102 6128 1005 1055 2388 2165 3174 2781 2000 2424 2032 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.131987 139625747429248 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.133553 139625747429248 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0531 22:14:12.135110 139625747429248 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0531 22:14:12.138108 139625747429248 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0531 22:14:12.139599 139625747429248 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] in the autumn samantha always gets a craving for apple cid ##er . she de ##dicate ##s a saturday to drive to an orchard . samantha walks the trails to enjoy the colorful leaves . she buys a few gallons of cid ##er before she leaves . [SEP] samantha loves the apple cid ##er she gets at the orchard . [SEP]


I0531 22:14:12.141164 139625747429248 run_classifier.py:464] tokens: [CLS] in the autumn samantha always gets a craving for apple cid ##er . she de ##dicate ##s a saturday to drive to an orchard . samantha walks the trails to enjoy the colorful leaves . she buys a few gallons of cid ##er before she leaves . [SEP] samantha loves the apple cid ##er she gets at the orchard . [SEP]


INFO:tensorflow:input_ids: 101 1999 1996 7114 11415 2467 4152 1037 26369 2005 6207 28744 2121 1012 2016 2139 16467 2015 1037 5095 2000 3298 2000 2019 15623 1012 11415 7365 1996 9612 2000 5959 1996 14231 3727 1012 2016 23311 1037 2261 18501 1997 28744 2121 2077 2016 3727 1012 102 11415 7459 1996 6207 28744 2121 2016 4152 2012 1996 15623 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.142725 139625747429248 run_classifier.py:465] input_ids: 101 1999 1996 7114 11415 2467 4152 1037 26369 2005 6207 28744 2121 1012 2016 2139 16467 2015 1037 5095 2000 3298 2000 2019 15623 1012 11415 7365 1996 9612 2000 5959 1996 14231 3727 1012 2016 23311 1037 2261 18501 1997 28744 2121 2077 2016 3727 1012 102 11415 7459 1996 6207 28744 2121 2016 4152 2012 1996 15623 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.144322 139625747429248 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0531 22:14:12.145841 139625747429248 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0531 22:14:12.147420 139625747429248 run_classifier.py:468] label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our Story Cloze task (i.e. classifying whether a candidate ending is the right one for a story). This strategy of starting from a pretrained model and training it further on the target task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on the whole input sequence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for Story Cloze data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
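
To make the arithmetic in `create_model` concrete, here is a minimal pure-Python sketch of the same head for a single example: a linear layer on the pooled vector, a log-softmax, and the one-hot cross-entropy. The toy vector, the weights, and the helper names (`log_softmax`, `classification_head`) are illustrative assumptions, not part of the notebook, and dropout is left out.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def classification_head(pooled, weights, bias, label):
    """Mirror the head in create_model for one example (no dropout).

    pooled:  hidden_size floats (stand-in for BERT's pooled_output)
    weights: num_labels rows of hidden_size floats
    bias:    num_labels floats
    """
    # logits = pooled @ weights^T + bias
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    # With a one-hot label, the cross-entropy is minus the true label's log-prob.
    loss = -log_probs[label]
    return loss, predicted, log_probs

# Toy values: hidden_size=3, num_labels=2 (hypothetical numbers).
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classification_head(pooled, weights, bias, label=1)
print(loss, predicted, [math.exp(lp) for lp in log_probs])
```

Note that the values returned as `log_probs` are log probabilities; exponentiate them, as in the `print` call, to get probabilities that sum to 1.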


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
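
To see how the evaluation metrics in `metric_fn` relate to each other, here is a small pure-Python sketch that derives accuracy, precision, recall, and F1 from the four confusion counts. It is only an illustration: the function name `binary_metrics` and the toy labels are assumptions, and it works on full lists, whereas the TensorFlow metrics accumulate over batches. AUC is left out because it is defined over ranked scores rather than hard 0/1 predictions.

```python
def binary_metrics(labels, predictions):
    """Compute confusion counts and derived metrics for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    # Guard against empty denominators when a class is never predicted/seen.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "eval_accuracy": (tp + tn) / len(pairs),
        "f1_score": f1,
        "precision": precision,
        "recall": recall,
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }

# Toy data: 1 = right ending, 0 = wrong ending (hypothetical values).
labels = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(labels, predictions)
for name, value in metrics.items():
    print(name, value)
```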


In [0]:
# Training hyperparameters.
# These are copied from this Colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
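
To show what `WARMUP_PROPORTION` does to the learning rate, here is a rough sketch of a schedule with linear warmup followed by linear decay to zero. This is our reading of the schedule behind `bert.optimization.create_optimizer`, so treat it as an assumption; the function name `learning_rate_at` and the 1000-step run length are made up for illustration.

```python
def learning_rate_at(step, base_lr, num_train_steps, num_warmup_steps):
    """Linear warmup to base_lr, then linear decay to zero (a sketch)."""
    if num_warmup_steps and step < num_warmup_steps:
        # During warmup the rate grows in proportion to the step count.
        return base_lr * step / num_warmup_steps
    # Afterwards the rate decays linearly, measured from step 0.
    return base_lr * max(0.0, 1.0 - step / num_train_steps)

# Hypothetical run of 1000 steps with a warmup proportion of 0.1.
for step in (0, 50, 100, 500, 1000):
    print(step, learning_rate_at(step, 2e-5, 1000, 100))
```

Because the decay is measured from step 0 in this sketch, the rate right after warmup is already slightly below `LEARNING_RATE`.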

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
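
As a worked example of this arithmetic, the sketch below plugs in the hyperparameters above together with the 176322 examples reported in the feature-conversion log. Treating that count as `len(train_features)` is an assumption on our part, and `training_steps` is a hypothetical helper name.

```python
def training_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    """Return (num_train_steps, num_warmup_steps) as computed in the notebook."""
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps

# 176322 comes from the "Writing example ... of 176322" log lines;
# assuming it equals len(train_features).
steps, warmup = training_steps(176322, 32, 3.0, 0.1)
print(steps, warmup)
```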

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)
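
To get a feel for what `SAVE_CHECKPOINTS_STEPS` means in practice, here is a rough sketch that lists the steps at which checkpoints would be written and which of them survive. It assumes a checkpoint at step 0, one every `save_every` steps, a final one when training ends, and that only the newest five are kept (the `_keep_checkpoint_max` value in the config log). The helper name `checkpoint_steps` and the run length of 16530 steps are illustrative assumptions.

```python
def checkpoint_steps(num_train_steps, save_every, keep_max):
    """Return (all_saved, retained) checkpoint steps under a simple model."""
    saved = list(range(0, num_train_steps + 1, save_every))
    if saved[-1] != num_train_steps:
        saved.append(num_train_steps)  # assumed final save at end of training
    # Only the most recent keep_max checkpoints stay on disk.
    return saved, saved[-keep_max:]

# Hypothetical run length; 500 and 5 match the configuration shown in the logs.
saved, retained = checkpoint_steps(16530, 500, 5)
print(len(saved), saved[:3], retained)
```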

In [20]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert_story_cloze_aug', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7efcaf4a73c8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0531 22:14:32.234635 139625747429248 estimator.py:201] Using config: {'_model_dir': 'bert_story_cloze_aug', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7efcaf4a73c8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input function builder that takes our training feature set (`train_features`) and returns an input function, which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
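
To illustrate what the `is_training` and `drop_remainder` options control, here is a simplified pure-Python stand-in for one epoch of batching. It is not the real `input_fn_builder` (which, as far as we know, builds a `tf.data.Dataset` and repeats the training data indefinitely); the function name `batches` and the integer "features" are assumptions for the sake of the example.

```python
import random

def batches(features, batch_size, is_training, drop_remainder, seed=0):
    """Yield one epoch of batches, mimicking the input function's options."""
    items = list(features)
    if is_training:
        random.Random(seed).shuffle(items)  # training data is shuffled
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            break  # a short final batch is discarded
        yield batch

toy_features = list(range(10))  # stand-ins for InputFeatures objects
kept = list(batches(toy_features, 4, is_training=False, drop_remainder=False))
dropped = list(batches(toy_features, 4, is_training=False, drop_remainder=True))
print([len(b) for b in kept], [len(b) for b in dropped])
```

With `drop_remainder=False`, as in the cell above, the last partial batch is kept, so every example is seen once per epoch.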

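Now we train our model! In the run logged below, a Colab GPU processes about 1.16 steps per second (roughly 86 seconds per 100 steps), so with about 16,500 training steps (176,322 examples, batch size 32, 3 epochs) expect training to take around four hours.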

In [22]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0531 22:16:03.625751 139625747429248 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0531 22:16:06.625663 139625747429248 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0531 22:16:06.741134 139625747429248 deprecation.py:506] From <ipython-input-15-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0531 22:16:06.784562 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0531 22:16:06.860460 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0531 22:16:15.911350 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


I0531 22:16:18.265988 139625747429248 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0531 22:16:18.273620 139625747429248 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0531 22:16:24.880449 139625747429248 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0531 22:16:29.457029 139625747429248 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0531 22:16:29.656388 139625747429248 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert_story_cloze_aug/model.ckpt.


I0531 22:17:41.623758 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 0 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:loss = 0.75546336, step = 0


I0531 22:18:04.849996 139625747429248 basic_session_run_hooks.py:249] loss = 0.75546336, step = 0


INFO:tensorflow:global_step/sec: 1.03855


I0531 22:19:41.137838 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03855


INFO:tensorflow:loss = 0.43812627, step = 100 (96.290 sec)


I0531 22:19:41.140304 139625747429248 basic_session_run_hooks.py:247] loss = 0.43812627, step = 100 (96.290 sec)


INFO:tensorflow:global_step/sec: 1.16375


I0531 22:21:07.066852 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16375


INFO:tensorflow:loss = 0.2260359, step = 200 (85.929 sec)


I0531 22:21:07.069329 139625747429248 basic_session_run_hooks.py:247] loss = 0.2260359, step = 200 (85.929 sec)


INFO:tensorflow:global_step/sec: 1.16552


I0531 22:22:32.865730 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16552


INFO:tensorflow:loss = 0.22896296, step = 300 (85.801 sec)


I0531 22:22:32.870143 139625747429248 basic_session_run_hooks.py:247] loss = 0.22896296, step = 300 (85.801 sec)


INFO:tensorflow:global_step/sec: 1.16407


I0531 22:23:58.771270 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16407


INFO:tensorflow:loss = 0.07476142, step = 400 (85.907 sec)


I0531 22:23:58.777189 139625747429248 basic_session_run_hooks.py:247] loss = 0.07476142, step = 400 (85.907 sec)


INFO:tensorflow:Saving checkpoints for 500 into bert_story_cloze_aug/model.ckpt.


I0531 22:25:23.710928 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03476


I0531 22:25:35.411990 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03476


INFO:tensorflow:loss = 0.08480102, step = 500 (96.639 sec)


I0531 22:25:35.416293 139625747429248 basic_session_run_hooks.py:247] loss = 0.08480102, step = 500 (96.639 sec)


INFO:tensorflow:global_step/sec: 1.15942


I0531 22:27:01.662253 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15942


INFO:tensorflow:loss = 0.19235316, step = 600 (86.248 sec)


I0531 22:27:01.664590 139625747429248 basic_session_run_hooks.py:247] loss = 0.19235316, step = 600 (86.248 sec)


INFO:tensorflow:global_step/sec: 1.16383


I0531 22:28:27.585772 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16383


INFO:tensorflow:loss = 0.13575065, step = 700 (85.924 sec)


I0531 22:28:27.588941 139625747429248 basic_session_run_hooks.py:247] loss = 0.13575065, step = 700 (85.924 sec)


INFO:tensorflow:global_step/sec: 1.16604


I0531 22:29:53.345916 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16604


INFO:tensorflow:loss = 0.1325691, step = 800 (85.759 sec)


I0531 22:29:53.348366 139625747429248 basic_session_run_hooks.py:247] loss = 0.1325691, step = 800 (85.759 sec)


INFO:tensorflow:global_step/sec: 1.16517


I0531 22:31:19.170040 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16517


INFO:tensorflow:loss = 0.041710038, step = 900 (85.827 sec)


I0531 22:31:19.175634 139625747429248 basic_session_run_hooks.py:247] loss = 0.041710038, step = 900 (85.827 sec)


INFO:tensorflow:Saving checkpoints for 1000 into bert_story_cloze_aug/model.ckpt.


I0531 22:32:44.202732 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 1000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02986


I0531 22:32:56.270283 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02986


INFO:tensorflow:loss = 0.13043782, step = 1000 (97.099 sec)


I0531 22:32:56.274697 139625747429248 basic_session_run_hooks.py:247] loss = 0.13043782, step = 1000 (97.099 sec)


INFO:tensorflow:global_step/sec: 1.15968


I0531 22:34:22.500662 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15968


INFO:tensorflow:loss = 0.108095735, step = 1100 (86.228 sec)


I0531 22:34:22.502676 139625747429248 basic_session_run_hooks.py:247] loss = 0.108095735, step = 1100 (86.228 sec)


INFO:tensorflow:global_step/sec: 1.16422


I0531 22:35:48.394786 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16422


INFO:tensorflow:loss = 0.0853968, step = 1200 (85.896 sec)


I0531 22:35:48.400314 139625747429248 basic_session_run_hooks.py:247] loss = 0.0853968, step = 1200 (85.896 sec)


INFO:tensorflow:global_step/sec: 1.16403


I0531 22:37:14.303343 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16403


INFO:tensorflow:loss = 0.025130477, step = 1300 (85.908 sec)


I0531 22:37:14.306335 139625747429248 basic_session_run_hooks.py:247] loss = 0.025130477, step = 1300 (85.908 sec)


INFO:tensorflow:global_step/sec: 1.16751


I0531 22:38:39.955768 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16751


INFO:tensorflow:loss = 0.09948308, step = 1400 (85.652 sec)


I0531 22:38:39.958279 139625747429248 basic_session_run_hooks.py:247] loss = 0.09948308, step = 1400 (85.652 sec)


INFO:tensorflow:Saving checkpoints for 1500 into bert_story_cloze_aug/model.ckpt.


I0531 22:40:04.806283 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 1500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0322


I0531 22:40:16.836301 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.0322


INFO:tensorflow:loss = 0.02064028, step = 1500 (96.885 sec)


I0531 22:40:16.843744 139625747429248 basic_session_run_hooks.py:247] loss = 0.02064028, step = 1500 (96.885 sec)


INFO:tensorflow:global_step/sec: 1.1576


I0531 22:41:43.221712 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1576


INFO:tensorflow:loss = 0.16512342, step = 1600 (86.381 sec)


I0531 22:41:43.224952 139625747429248 basic_session_run_hooks.py:247] loss = 0.16512342, step = 1600 (86.381 sec)


INFO:tensorflow:global_step/sec: 1.16284


I0531 22:43:09.218312 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16284


INFO:tensorflow:loss = 0.020904899, step = 1700 (85.997 sec)


I0531 22:43:09.221988 139625747429248 basic_session_run_hooks.py:247] loss = 0.020904899, step = 1700 (85.997 sec)


INFO:tensorflow:global_step/sec: 1.16626


I0531 22:44:34.962277 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16626


INFO:tensorflow:loss = 0.019316789, step = 1800 (85.743 sec)


I0531 22:44:34.964681 139625747429248 basic_session_run_hooks.py:247] loss = 0.019316789, step = 1800 (85.743 sec)


INFO:tensorflow:global_step/sec: 1.16294


I0531 22:46:00.951333 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16294


INFO:tensorflow:loss = 0.18930817, step = 1900 (85.989 sec)


I0531 22:46:00.953522 139625747429248 basic_session_run_hooks.py:247] loss = 0.18930817, step = 1900 (85.989 sec)


INFO:tensorflow:Saving checkpoints for 2000 into bert_story_cloze_aug/model.ckpt.


I0531 22:47:25.948014 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 2000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02969


I0531 22:47:38.068028 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02969


INFO:tensorflow:loss = 0.021943705, step = 2000 (97.118 sec)


I0531 22:47:38.071610 139625747429248 basic_session_run_hooks.py:247] loss = 0.021943705, step = 2000 (97.118 sec)


INFO:tensorflow:global_step/sec: 1.15768


I0531 22:49:04.447987 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15768


INFO:tensorflow:loss = 0.38491663, step = 2100 (86.379 sec)


I0531 22:49:04.450397 139625747429248 basic_session_run_hooks.py:247] loss = 0.38491663, step = 2100 (86.379 sec)


INFO:tensorflow:global_step/sec: 1.16375


I0531 22:50:30.377266 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16375


INFO:tensorflow:loss = 0.21293963, step = 2200 (85.929 sec)


I0531 22:50:30.379641 139625747429248 basic_session_run_hooks.py:247] loss = 0.21293963, step = 2200 (85.929 sec)


INFO:tensorflow:global_step/sec: 1.16518


I0531 22:51:56.201162 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16518


INFO:tensorflow:loss = 0.03200271, step = 2300 (85.826 sec)


I0531 22:51:56.205215 139625747429248 basic_session_run_hooks.py:247] loss = 0.03200271, step = 2300 (85.826 sec)


INFO:tensorflow:global_step/sec: 1.16468


I0531 22:53:22.061939 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16468


INFO:tensorflow:loss = 0.29439563, step = 2400 (85.859 sec)


I0531 22:53:22.064307 139625747429248 basic_session_run_hooks.py:247] loss = 0.29439563, step = 2400 (85.859 sec)


INFO:tensorflow:Saving checkpoints for 2500 into bert_story_cloze_aug/model.ckpt.


I0531 22:54:47.031068 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 2500 into bert_story_cloze_aug/model.ckpt.


W0531 22:54:55.037250 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.


INFO:tensorflow:global_step/sec: 1.02973


I0531 22:54:59.174721 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02973


INFO:tensorflow:loss = 0.0774978, step = 2500 (97.116 sec)


I0531 22:54:59.180584 139625747429248 basic_session_run_hooks.py:247] loss = 0.0774978, step = 2500 (97.116 sec)


INFO:tensorflow:global_step/sec: 1.15823


I0531 22:56:25.513313 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15823


INFO:tensorflow:loss = 0.003976153, step = 2600 (86.336 sec)


I0531 22:56:25.516783 139625747429248 basic_session_run_hooks.py:247] loss = 0.003976153, step = 2600 (86.336 sec)


INFO:tensorflow:global_step/sec: 1.16399


I0531 22:57:51.424580 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16399


INFO:tensorflow:loss = 0.15488619, step = 2700 (85.910 sec)


I0531 22:57:51.427036 139625747429248 basic_session_run_hooks.py:247] loss = 0.15488619, step = 2700 (85.910 sec)


INFO:tensorflow:global_step/sec: 1.16641


I0531 22:59:17.157540 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16641


INFO:tensorflow:loss = 0.05312115, step = 2800 (85.734 sec)


I0531 22:59:17.161352 139625747429248 basic_session_run_hooks.py:247] loss = 0.05312115, step = 2800 (85.734 sec)


INFO:tensorflow:global_step/sec: 1.16361


I0531 23:00:43.096606 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16361


INFO:tensorflow:loss = 0.01949767, step = 2900 (85.938 sec)


I0531 23:00:43.098935 139625747429248 basic_session_run_hooks.py:247] loss = 0.01949767, step = 2900 (85.938 sec)


INFO:tensorflow:Saving checkpoints for 3000 into bert_story_cloze_aug/model.ckpt.


I0531 23:02:08.234306 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 3000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02018


I0531 23:02:21.118935 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02018


INFO:tensorflow:loss = 0.0057142163, step = 3000 (98.025 sec)


I0531 23:02:21.123811 139625747429248 basic_session_run_hooks.py:247] loss = 0.0057142163, step = 3000 (98.025 sec)


INFO:tensorflow:global_step/sec: 1.15651


I0531 23:03:47.585760 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15651


INFO:tensorflow:loss = 0.17775002, step = 3100 (86.467 sec)


I0531 23:03:47.590782 139625747429248 basic_session_run_hooks.py:247] loss = 0.17775002, step = 3100 (86.467 sec)


INFO:tensorflow:global_step/sec: 1.16434


I0531 23:05:13.471151 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16434


INFO:tensorflow:loss = 0.00884926, step = 3200 (85.883 sec)


I0531 23:05:13.473682 139625747429248 basic_session_run_hooks.py:247] loss = 0.00884926, step = 3200 (85.883 sec)


INFO:tensorflow:global_step/sec: 1.16216


I0531 23:06:39.517680 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16216


INFO:tensorflow:loss = 0.17707478, step = 3300 (86.048 sec)


I0531 23:06:39.521556 139625747429248 basic_session_run_hooks.py:247] loss = 0.17707478, step = 3300 (86.048 sec)


INFO:tensorflow:global_step/sec: 1.16497


I0531 23:08:05.356996 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16497


INFO:tensorflow:loss = 0.0076362137, step = 3400 (85.839 sec)


I0531 23:08:05.360952 139625747429248 basic_session_run_hooks.py:247] loss = 0.0076362137, step = 3400 (85.839 sec)


INFO:tensorflow:Saving checkpoints for 3500 into bert_story_cloze_aug/model.ckpt.


I0531 23:09:30.514712 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 3500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03


I0531 23:09:42.444291 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03


INFO:tensorflow:loss = 0.19685884, step = 3500 (97.085 sec)


I0531 23:09:42.446393 139625747429248 basic_session_run_hooks.py:247] loss = 0.19685884, step = 3500 (97.085 sec)


INFO:tensorflow:global_step/sec: 1.15877


I0531 23:11:08.742407 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15877


INFO:tensorflow:loss = 0.21494885, step = 3600 (86.299 sec)


I0531 23:11:08.745618 139625747429248 basic_session_run_hooks.py:247] loss = 0.21494885, step = 3600 (86.299 sec)


INFO:tensorflow:global_step/sec: 1.16427


I0531 23:12:34.633390 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16427


INFO:tensorflow:loss = 0.14903294, step = 3700 (85.892 sec)


I0531 23:12:34.638039 139625747429248 basic_session_run_hooks.py:247] loss = 0.14903294, step = 3700 (85.892 sec)


INFO:tensorflow:global_step/sec: 1.16357


I0531 23:14:00.575735 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16357


INFO:tensorflow:loss = 0.04305018, step = 3800 (85.944 sec)


I0531 23:14:00.581822 139625747429248 basic_session_run_hooks.py:247] loss = 0.04305018, step = 3800 (85.944 sec)


INFO:tensorflow:global_step/sec: 1.16717


I0531 23:15:26.253401 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16717


INFO:tensorflow:loss = 0.03992913, step = 3900 (85.674 sec)


I0531 23:15:26.256162 139625747429248 basic_session_run_hooks.py:247] loss = 0.03992913, step = 3900 (85.674 sec)


INFO:tensorflow:Saving checkpoints for 4000 into bert_story_cloze_aug/model.ckpt.


I0531 23:16:51.340627 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 4000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02499


I0531 23:17:03.815134 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02499


INFO:tensorflow:loss = 0.025661051, step = 4001 (97.563 sec)


I0531 23:17:03.818684 139625747429248 basic_session_run_hooks.py:247] loss = 0.025661051, step = 4001 (97.563 sec)


INFO:tensorflow:global_step/sec: 1.15873


I0531 23:18:30.116710 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15873


INFO:tensorflow:loss = 0.12724167, step = 4100 (86.301 sec)


I0531 23:18:30.119952 139625747429248 basic_session_run_hooks.py:247] loss = 0.12724167, step = 4100 (86.301 sec)


INFO:tensorflow:global_step/sec: 1.16409


I0531 23:19:56.020412 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16409


INFO:tensorflow:loss = 0.0048502157, step = 4200 (85.904 sec)


I0531 23:19:56.024247 139625747429248 basic_session_run_hooks.py:247] loss = 0.0048502157, step = 4200 (85.904 sec)


INFO:tensorflow:global_step/sec: 1.16667


I0531 23:21:21.734146 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16667


INFO:tensorflow:loss = 0.06929777, step = 4300 (85.712 sec)


I0531 23:21:21.736675 139625747429248 basic_session_run_hooks.py:247] loss = 0.06929777, step = 4300 (85.712 sec)


INFO:tensorflow:global_step/sec: 1.16335


I0531 23:22:47.692763 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16335


INFO:tensorflow:loss = 0.06828561, step = 4400 (85.960 sec)


I0531 23:22:47.697102 139625747429248 basic_session_run_hooks.py:247] loss = 0.06828561, step = 4400 (85.960 sec)


INFO:tensorflow:Saving checkpoints for 4500 into bert_story_cloze_aug/model.ckpt.


I0531 23:24:12.697601 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 4500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0288


I0531 23:24:24.893306 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.0288


INFO:tensorflow:loss = 0.09257445, step = 4500 (97.198 sec)


I0531 23:24:24.895505 139625747429248 basic_session_run_hooks.py:247] loss = 0.09257445, step = 4500 (97.198 sec)


INFO:tensorflow:global_step/sec: 1.15803


I0531 23:25:51.246930 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15803


INFO:tensorflow:loss = 0.18821883, step = 4600 (86.354 sec)


I0531 23:25:51.249040 139625747429248 basic_session_run_hooks.py:247] loss = 0.18821883, step = 4600 (86.354 sec)


INFO:tensorflow:global_step/sec: 1.16352


I0531 23:27:17.193263 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16352


INFO:tensorflow:loss = 0.02067092, step = 4700 (85.949 sec)


I0531 23:27:17.198425 139625747429248 basic_session_run_hooks.py:247] loss = 0.02067092, step = 4700 (85.949 sec)


INFO:tensorflow:global_step/sec: 1.16579


I0531 23:28:42.972220 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16579


INFO:tensorflow:loss = 0.011965966, step = 4800 (85.781 sec)


I0531 23:28:42.978354 139625747429248 basic_session_run_hooks.py:247] loss = 0.011965966, step = 4800 (85.781 sec)


INFO:tensorflow:global_step/sec: 1.16395


I0531 23:30:08.886228 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16395


INFO:tensorflow:loss = 0.16028792, step = 4900 (85.911 sec)


I0531 23:30:08.889720 139625747429248 basic_session_run_hooks.py:247] loss = 0.16028792, step = 4900 (85.911 sec)


INFO:tensorflow:Saving checkpoints for 5000 into bert_story_cloze_aug/model.ckpt.


I0531 23:31:33.801334 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 5000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03069


I0531 23:31:45.908266 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03069


INFO:tensorflow:loss = 0.023115838, step = 5000 (97.023 sec)


I0531 23:31:45.912628 139625747429248 basic_session_run_hooks.py:247] loss = 0.023115838, step = 5000 (97.023 sec)


INFO:tensorflow:global_step/sec: 1.16002


I0531 23:33:12.113775 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16002


INFO:tensorflow:loss = 0.01479484, step = 5100 (86.203 sec)


I0531 23:33:12.116098 139625747429248 basic_session_run_hooks.py:247] loss = 0.01479484, step = 5100 (86.203 sec)


INFO:tensorflow:global_step/sec: 1.16498


I0531 23:34:37.951895 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16498


INFO:tensorflow:loss = 0.14488243, step = 5200 (85.840 sec)


I0531 23:34:37.956300 139625747429248 basic_session_run_hooks.py:247] loss = 0.14488243, step = 5200 (85.840 sec)


INFO:tensorflow:global_step/sec: 1.16699


I0531 23:36:03.642655 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16699


INFO:tensorflow:loss = 0.015365346, step = 5300 (85.691 sec)


I0531 23:36:03.647686 139625747429248 basic_session_run_hooks.py:247] loss = 0.015365346, step = 5300 (85.691 sec)


INFO:tensorflow:global_step/sec: 1.16268


I0531 23:37:29.650916 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16268


INFO:tensorflow:loss = 0.014334155, step = 5400 (86.006 sec)


I0531 23:37:29.653479 139625747429248 basic_session_run_hooks.py:247] loss = 0.014334155, step = 5400 (86.006 sec)


INFO:tensorflow:Saving checkpoints for 5500 into bert_story_cloze_aug/model.ckpt.


I0531 23:38:54.632858 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 5500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02896


I0531 23:39:06.836771 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02896


INFO:tensorflow:loss = 0.09699513, step = 5500 (97.187 sec)


I0531 23:39:06.840212 139625747429248 basic_session_run_hooks.py:247] loss = 0.09699513, step = 5500 (97.187 sec)


INFO:tensorflow:global_step/sec: 1.1588


I0531 23:40:33.132412 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1588


INFO:tensorflow:loss = 0.047619957, step = 5600 (86.298 sec)


I0531 23:40:33.137825 139625747429248 basic_session_run_hooks.py:247] loss = 0.047619957, step = 5600 (86.298 sec)


INFO:tensorflow:global_step/sec: 1.16494


I0531 23:41:58.973955 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16494


INFO:tensorflow:loss = 0.17360961, step = 5700 (85.840 sec)


I0531 23:41:58.978119 139625747429248 basic_session_run_hooks.py:247] loss = 0.17360961, step = 5700 (85.840 sec)


INFO:tensorflow:global_step/sec: 1.16563


I0531 23:43:24.764336 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16563


INFO:tensorflow:loss = 0.00044569888, step = 5800 (85.789 sec)


I0531 23:43:24.767100 139625747429248 basic_session_run_hooks.py:247] loss = 0.00044569888, step = 5800 (85.789 sec)


INFO:tensorflow:global_step/sec: 1.16437


I0531 23:44:50.647465 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16437


INFO:tensorflow:loss = 0.1079307, step = 5900 (85.883 sec)


I0531 23:44:50.649958 139625747429248 basic_session_run_hooks.py:247] loss = 0.1079307, step = 5900 (85.883 sec)


INFO:tensorflow:Saving checkpoints for 6000 into bert_story_cloze_aug/model.ckpt.


I0531 23:46:15.504576 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 6000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03006


I0531 23:46:27.729254 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03006


INFO:tensorflow:loss = 0.007961424, step = 6000 (97.086 sec)


I0531 23:46:27.735547 139625747429248 basic_session_run_hooks.py:247] loss = 0.007961424, step = 6000 (97.086 sec)


INFO:tensorflow:global_step/sec: 1.15765


I0531 23:47:54.110949 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15765


INFO:tensorflow:loss = 0.0005635427, step = 6100 (86.379 sec)


I0531 23:47:54.114513 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005635427, step = 6100 (86.379 sec)


INFO:tensorflow:global_step/sec: 1.16338


I0531 23:49:20.067325 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16338


INFO:tensorflow:loss = 0.045905437, step = 6200 (85.957 sec)


I0531 23:49:20.071302 139625747429248 basic_session_run_hooks.py:247] loss = 0.045905437, step = 6200 (85.957 sec)


INFO:tensorflow:global_step/sec: 1.16671


I0531 23:50:45.778589 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16671


INFO:tensorflow:loss = 0.13262156, step = 6300 (85.712 sec)


I0531 23:50:45.782979 139625747429248 basic_session_run_hooks.py:247] loss = 0.13262156, step = 6300 (85.712 sec)


INFO:tensorflow:global_step/sec: 1.16516


I0531 23:52:11.603981 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16516


INFO:tensorflow:loss = 0.0018283504, step = 6400 (85.831 sec)


I0531 23:52:11.614249 139625747429248 basic_session_run_hooks.py:247] loss = 0.0018283504, step = 6400 (85.831 sec)


INFO:tensorflow:Saving checkpoints for 6500 into bert_story_cloze_aug/model.ckpt.


I0531 23:53:36.586616 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 6500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03636


I0531 23:53:48.095158 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03636


INFO:tensorflow:loss = 0.255202, step = 6500 (96.485 sec)


I0531 23:53:48.098887 139625747429248 basic_session_run_hooks.py:247] loss = 0.255202, step = 6500 (96.485 sec)


INFO:tensorflow:global_step/sec: 1.15879


I0531 23:55:14.392104 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15879


INFO:tensorflow:loss = 0.00047108508, step = 6600 (86.298 sec)


I0531 23:55:14.397216 139625747429248 basic_session_run_hooks.py:247] loss = 0.00047108508, step = 6600 (86.298 sec)


INFO:tensorflow:global_step/sec: 1.16398


I0531 23:56:40.303929 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16398


INFO:tensorflow:loss = 0.0005688128, step = 6700 (85.911 sec)


I0531 23:56:40.308021 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005688128, step = 6700 (85.911 sec)


INFO:tensorflow:global_step/sec: 1.16633


I0531 23:58:06.043249 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16633


INFO:tensorflow:loss = 0.00043201674, step = 6800 (85.740 sec)


I0531 23:58:06.048124 139625747429248 basic_session_run_hooks.py:247] loss = 0.00043201674, step = 6800 (85.740 sec)


INFO:tensorflow:global_step/sec: 1.16459


I0531 23:59:31.910560 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16459


INFO:tensorflow:loss = 0.0009985084, step = 6900 (85.867 sec)


I0531 23:59:31.914892 139625747429248 basic_session_run_hooks.py:247] loss = 0.0009985084, step = 6900 (85.867 sec)


INFO:tensorflow:Saving checkpoints for 7000 into bert_story_cloze_aug/model.ckpt.


I0601 00:00:56.775284 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 7000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0295


I0601 00:01:09.045301 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.0295


INFO:tensorflow:loss = 0.00076464855, step = 7000 (97.137 sec)


I0601 00:01:09.052258 139625747429248 basic_session_run_hooks.py:247] loss = 0.00076464855, step = 7000 (97.137 sec)


INFO:tensorflow:global_step/sec: 1.15876


I0601 00:02:35.344234 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15876


INFO:tensorflow:loss = 0.0005119164, step = 7100 (86.297 sec)


I0601 00:02:35.349452 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005119164, step = 7100 (86.297 sec)


INFO:tensorflow:global_step/sec: 1.16348


I0601 00:04:01.293004 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16348


INFO:tensorflow:loss = 0.0005708363, step = 7200 (85.946 sec)


I0601 00:04:01.295193 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005708363, step = 7200 (85.946 sec)


INFO:tensorflow:global_step/sec: 1.16544


I0601 00:05:27.097715 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16544


INFO:tensorflow:loss = 0.009109936, step = 7300 (85.807 sec)


I0601 00:05:27.102001 139625747429248 basic_session_run_hooks.py:247] loss = 0.009109936, step = 7300 (85.807 sec)


INFO:tensorflow:global_step/sec: 1.16535


I0601 00:06:52.909224 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16535


INFO:tensorflow:loss = 0.0009305724, step = 7400 (85.812 sec)


I0601 00:06:52.914315 139625747429248 basic_session_run_hooks.py:247] loss = 0.0009305724, step = 7400 (85.812 sec)


INFO:tensorflow:Saving checkpoints for 7500 into bert_story_cloze_aug/model.ckpt.


I0601 00:08:18.094075 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 7500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02869


I0601 00:08:30.119928 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02869


INFO:tensorflow:loss = 0.0010238977, step = 7500 (97.210 sec)


I0601 00:08:30.124447 139625747429248 basic_session_run_hooks.py:247] loss = 0.0010238977, step = 7500 (97.210 sec)


INFO:tensorflow:global_step/sec: 1.15826


I0601 00:09:56.456479 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15826


INFO:tensorflow:loss = 0.00044547184, step = 7600 (86.334 sec)


I0601 00:09:56.458762 139625747429248 basic_session_run_hooks.py:247] loss = 0.00044547184, step = 7600 (86.334 sec)


INFO:tensorflow:global_step/sec: 1.16391


I0601 00:11:22.373916 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16391


INFO:tensorflow:loss = 0.22799708, step = 7700 (85.918 sec)


I0601 00:11:22.376282 139625747429248 basic_session_run_hooks.py:247] loss = 0.22799708, step = 7700 (85.918 sec)


INFO:tensorflow:global_step/sec: 1.16667


I0601 00:12:48.088010 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16667


INFO:tensorflow:loss = 0.27394623, step = 7800 (85.714 sec)


I0601 00:12:48.090598 139625747429248 basic_session_run_hooks.py:247] loss = 0.27394623, step = 7800 (85.714 sec)


INFO:tensorflow:global_step/sec: 1.16599


I0601 00:14:13.851795 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16599


INFO:tensorflow:loss = 0.0011344467, step = 7900 (85.766 sec)


I0601 00:14:13.856798 139625747429248 basic_session_run_hooks.py:247] loss = 0.0011344467, step = 7900 (85.766 sec)


INFO:tensorflow:Saving checkpoints for 8000 into bert_story_cloze_aug/model.ckpt.


I0601 00:15:38.807130 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 8000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02887


I0601 00:15:51.045874 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02887


INFO:tensorflow:loss = 0.0019265693, step = 8001 (97.193 sec)


I0601 00:15:51.050087 139625747429248 basic_session_run_hooks.py:247] loss = 0.0019265693, step = 8001 (97.193 sec)


INFO:tensorflow:global_step/sec: 1.15802


I0601 00:17:17.400252 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15802


INFO:tensorflow:loss = 0.002042123, step = 8100 (86.355 sec)


I0601 00:17:17.404987 139625747429248 basic_session_run_hooks.py:247] loss = 0.002042123, step = 8100 (86.355 sec)


INFO:tensorflow:global_step/sec: 1.16359


I0601 00:18:43.340850 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16359


INFO:tensorflow:loss = 0.23570883, step = 8200 (85.939 sec)


I0601 00:18:43.343808 139625747429248 basic_session_run_hooks.py:247] loss = 0.23570883, step = 8200 (85.939 sec)


INFO:tensorflow:global_step/sec: 1.16634


I0601 00:20:09.079251 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16634


INFO:tensorflow:loss = 0.0024024723, step = 8300 (85.741 sec)


I0601 00:20:09.084802 139625747429248 basic_session_run_hooks.py:247] loss = 0.0024024723, step = 8300 (85.741 sec)


INFO:tensorflow:global_step/sec: 1.1648


I0601 00:21:34.930625 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1648


INFO:tensorflow:loss = 0.0011530174, step = 8400 (85.850 sec)


I0601 00:21:34.935189 139625747429248 basic_session_run_hooks.py:247] loss = 0.0011530174, step = 8400 (85.850 sec)


INFO:tensorflow:Saving checkpoints for 8500 into bert_story_cloze_aug/model.ckpt.


I0601 00:22:59.934350 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 8500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02903


I0601 00:23:12.109385 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02903


INFO:tensorflow:loss = 0.09211139, step = 8500 (97.178 sec)


I0601 00:23:12.113288 139625747429248 basic_session_run_hooks.py:247] loss = 0.09211139, step = 8500 (97.178 sec)


INFO:tensorflow:global_step/sec: 1.1578


I0601 00:24:38.479934 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1578


INFO:tensorflow:loss = 0.00021805405, step = 8600 (86.372 sec)


I0601 00:24:38.484945 139625747429248 basic_session_run_hooks.py:247] loss = 0.00021805405, step = 8600 (86.372 sec)


INFO:tensorflow:global_step/sec: 1.16373


I0601 00:26:04.410681 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16373


INFO:tensorflow:loss = 0.022098655, step = 8700 (85.928 sec)


I0601 00:26:04.412990 139625747429248 basic_session_run_hooks.py:247] loss = 0.022098655, step = 8700 (85.928 sec)


INFO:tensorflow:global_step/sec: 1.1654


I0601 00:27:30.217843 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1654


INFO:tensorflow:loss = 0.0006242828, step = 8800 (85.807 sec)


I0601 00:27:30.220430 139625747429248 basic_session_run_hooks.py:247] loss = 0.0006242828, step = 8800 (85.807 sec)


INFO:tensorflow:global_step/sec: 1.16235


I0601 00:28:56.250402 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16235


INFO:tensorflow:loss = 0.0027821234, step = 8900 (86.033 sec)


I0601 00:28:56.253152 139625747429248 basic_session_run_hooks.py:247] loss = 0.0027821234, step = 8900 (86.033 sec)


INFO:tensorflow:Saving checkpoints for 9000 into bert_story_cloze_aug/model.ckpt.


I0601 00:30:21.311103 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 9000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02338


I0601 00:30:33.965835 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02338


INFO:tensorflow:loss = 0.00020007197, step = 9000 (97.715 sec)


I0601 00:30:33.968258 139625747429248 basic_session_run_hooks.py:247] loss = 0.00020007197, step = 9000 (97.715 sec)


INFO:tensorflow:global_step/sec: 1.15792


I0601 00:32:00.327355 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15792


INFO:tensorflow:loss = 0.030998027, step = 9100 (86.364 sec)


I0601 00:32:00.331944 139625747429248 basic_session_run_hooks.py:247] loss = 0.030998027, step = 9100 (86.364 sec)


INFO:tensorflow:global_step/sec: 1.16322


I0601 00:33:26.295299 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16322


INFO:tensorflow:loss = 0.000564471, step = 9200 (85.966 sec)


I0601 00:33:26.298276 139625747429248 basic_session_run_hooks.py:247] loss = 0.000564471, step = 9200 (85.966 sec)


INFO:tensorflow:global_step/sec: 1.16373


I0601 00:34:52.226016 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16373


INFO:tensorflow:loss = 0.0137337055, step = 9300 (85.932 sec)


I0601 00:34:52.230138 139625747429248 basic_session_run_hooks.py:247] loss = 0.0137337055, step = 9300 (85.932 sec)


INFO:tensorflow:global_step/sec: 1.16489


I0601 00:36:18.071216 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16489


INFO:tensorflow:loss = 0.0004198182, step = 9400 (85.846 sec)


I0601 00:36:18.076276 139625747429248 basic_session_run_hooks.py:247] loss = 0.0004198182, step = 9400 (85.846 sec)


INFO:tensorflow:Saving checkpoints for 9500 into bert_story_cloze_aug/model.ckpt.


I0601 00:37:43.040017 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 9500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03084


I0601 00:37:55.079776 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03084


INFO:tensorflow:loss = 0.0014263121, step = 9500 (97.006 sec)


I0601 00:37:55.082246 139625747429248 basic_session_run_hooks.py:247] loss = 0.0014263121, step = 9500 (97.006 sec)


INFO:tensorflow:global_step/sec: 1.15724


I0601 00:39:21.492170 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15724


INFO:tensorflow:loss = 0.0006656403, step = 9600 (86.415 sec)


I0601 00:39:21.497268 139625747429248 basic_session_run_hooks.py:247] loss = 0.0006656403, step = 9600 (86.415 sec)


INFO:tensorflow:global_step/sec: 1.1637


I0601 00:40:47.425090 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1637


INFO:tensorflow:loss = 0.00056274305, step = 9700 (85.934 sec)


I0601 00:40:47.430784 139625747429248 basic_session_run_hooks.py:247] loss = 0.00056274305, step = 9700 (85.934 sec)


INFO:tensorflow:global_step/sec: 1.16453


I0601 00:42:13.296306 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16453


INFO:tensorflow:loss = 0.0009353013, step = 9800 (85.868 sec)


I0601 00:42:13.298433 139625747429248 basic_session_run_hooks.py:247] loss = 0.0009353013, step = 9800 (85.868 sec)


INFO:tensorflow:global_step/sec: 1.16423


I0601 00:43:39.190078 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16423


INFO:tensorflow:loss = 0.0005471114, step = 9900 (85.895 sec)


I0601 00:43:39.192924 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005471114, step = 9900 (85.895 sec)


INFO:tensorflow:Saving checkpoints for 10000 into bert_story_cloze_aug/model.ckpt.


I0601 00:45:04.316368 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 10000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02019


I0601 00:45:17.210633 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02019


INFO:tensorflow:loss = 0.00452405, step = 10000 (98.024 sec)


I0601 00:45:17.216650 139625747429248 basic_session_run_hooks.py:247] loss = 0.00452405, step = 10000 (98.024 sec)


INFO:tensorflow:global_step/sec: 1.15687


I0601 00:46:43.650528 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15687


INFO:tensorflow:loss = 0.00028632054, step = 10100 (86.436 sec)


I0601 00:46:43.652735 139625747429248 basic_session_run_hooks.py:247] loss = 0.00028632054, step = 10100 (86.436 sec)


INFO:tensorflow:global_step/sec: 1.16228


I0601 00:48:09.688472 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16228


INFO:tensorflow:loss = 0.00026613427, step = 10200 (86.038 sec)


I0601 00:48:09.690496 139625747429248 basic_session_run_hooks.py:247] loss = 0.00026613427, step = 10200 (86.038 sec)


INFO:tensorflow:global_step/sec: 1.16512


I0601 00:49:35.516671 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16512


INFO:tensorflow:loss = 0.0006803302, step = 10300 (85.832 sec)


I0601 00:49:35.522116 139625747429248 basic_session_run_hooks.py:247] loss = 0.0006803302, step = 10300 (85.832 sec)


INFO:tensorflow:global_step/sec: 1.16162


I0601 00:51:01.603574 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16162


INFO:tensorflow:loss = 0.0004186069, step = 10400 (86.084 sec)


I0601 00:51:01.605827 139625747429248 basic_session_run_hooks.py:247] loss = 0.0004186069, step = 10400 (86.084 sec)


INFO:tensorflow:Saving checkpoints for 10500 into bert_story_cloze_aug/model.ckpt.


I0601 00:52:26.737916 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 10500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0257


I0601 00:52:39.098298 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.0257


INFO:tensorflow:loss = 0.0005057511, step = 10500 (97.497 sec)


I0601 00:52:39.102468 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005057511, step = 10500 (97.497 sec)


INFO:tensorflow:global_step/sec: 1.15871


I0601 00:54:05.400878 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15871


INFO:tensorflow:loss = 0.003249991, step = 10600 (86.301 sec)


I0601 00:54:05.403010 139625747429248 basic_session_run_hooks.py:247] loss = 0.003249991, step = 10600 (86.301 sec)


INFO:tensorflow:global_step/sec: 1.16456


I0601 00:55:31.270499 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16456


INFO:tensorflow:loss = 0.20803478, step = 10700 (85.870 sec)


I0601 00:55:31.272632 139625747429248 basic_session_run_hooks.py:247] loss = 0.20803478, step = 10700 (85.870 sec)


INFO:tensorflow:global_step/sec: 1.16165


I0601 00:56:57.354665 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16165


INFO:tensorflow:loss = 0.0010367584, step = 10800 (86.084 sec)


I0601 00:56:57.356809 139625747429248 basic_session_run_hooks.py:247] loss = 0.0010367584, step = 10800 (86.084 sec)


INFO:tensorflow:global_step/sec: 1.16509


I0601 00:58:23.185056 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16509


INFO:tensorflow:loss = 0.00015115595, step = 10900 (85.830 sec)


I0601 00:58:23.187221 139625747429248 basic_session_run_hooks.py:247] loss = 0.00015115595, step = 10900 (85.830 sec)


INFO:tensorflow:Saving checkpoints for 11000 into bert_story_cloze_aug/model.ckpt.


I0601 00:59:48.450288 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 11000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02629


I0601 01:00:00.623833 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02629


INFO:tensorflow:loss = 0.00017572558, step = 11000 (97.440 sec)


I0601 01:00:00.627665 139625747429248 basic_session_run_hooks.py:247] loss = 0.00017572558, step = 11000 (97.440 sec)


INFO:tensorflow:global_step/sec: 1.1579


I0601 01:01:26.986832 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1579


INFO:tensorflow:loss = 0.000638427, step = 11100 (86.363 sec)


I0601 01:01:26.990644 139625747429248 basic_session_run_hooks.py:247] loss = 0.000638427, step = 11100 (86.363 sec)


INFO:tensorflow:global_step/sec: 1.16344


I0601 01:02:52.938906 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16344


INFO:tensorflow:loss = 0.00021542757, step = 11200 (85.953 sec)


I0601 01:02:52.943463 139625747429248 basic_session_run_hooks.py:247] loss = 0.00021542757, step = 11200 (85.953 sec)


INFO:tensorflow:global_step/sec: 1.1652


I0601 01:04:18.760784 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1652


INFO:tensorflow:loss = 0.0103524355, step = 11300 (85.820 sec)


I0601 01:04:18.763238 139625747429248 basic_session_run_hooks.py:247] loss = 0.0103524355, step = 11300 (85.820 sec)


INFO:tensorflow:global_step/sec: 1.16396


I0601 01:05:44.674211 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16396


INFO:tensorflow:loss = 0.00019838879, step = 11400 (85.922 sec)


I0601 01:05:44.685274 139625747429248 basic_session_run_hooks.py:247] loss = 0.00019838879, step = 11400 (85.922 sec)


INFO:tensorflow:Saving checkpoints for 11500 into bert_story_cloze_aug/model.ckpt.


I0601 01:07:09.708907 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 11500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03039


I0601 01:07:21.724902 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03039


INFO:tensorflow:loss = 0.00017972081, step = 11500 (97.042 sec)


I0601 01:07:21.726953 139625747429248 basic_session_run_hooks.py:247] loss = 0.00017972081, step = 11500 (97.042 sec)


INFO:tensorflow:global_step/sec: 1.15738


I0601 01:08:48.127028 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15738


INFO:tensorflow:loss = 0.00031250302, step = 11600 (86.405 sec)


I0601 01:08:48.131815 139625747429248 basic_session_run_hooks.py:247] loss = 0.00031250302, step = 11600 (86.405 sec)


INFO:tensorflow:global_step/sec: 1.16287


I0601 01:10:14.120825 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16287


INFO:tensorflow:loss = 0.00033081637, step = 11700 (85.991 sec)


I0601 01:10:14.122917 139625747429248 basic_session_run_hooks.py:247] loss = 0.00033081637, step = 11700 (85.991 sec)


INFO:tensorflow:global_step/sec: 1.16131


I0601 01:11:40.230535 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16131


INFO:tensorflow:loss = 0.00013511257, step = 11800 (86.110 sec)


I0601 01:11:40.232874 139625747429248 basic_session_run_hooks.py:247] loss = 0.00013511257, step = 11800 (86.110 sec)


INFO:tensorflow:global_step/sec: 1.16165


I0601 01:13:06.314802 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16165


INFO:tensorflow:loss = 0.000113771246, step = 11900 (86.084 sec)


I0601 01:13:06.317318 139625747429248 basic_session_run_hooks.py:247] loss = 0.000113771246, step = 11900 (86.084 sec)


INFO:tensorflow:Saving checkpoints for 12000 into bert_story_cloze_aug/model.ckpt.


I0601 01:14:31.418860 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 12000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02889


I0601 01:14:43.506721 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02889


INFO:tensorflow:loss = 0.00017691567, step = 12000 (97.193 sec)


I0601 01:14:43.509922 139625747429248 basic_session_run_hooks.py:247] loss = 0.00017691567, step = 12000 (97.193 sec)


INFO:tensorflow:global_step/sec: 1.15731


I0601 01:16:09.914216 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15731


INFO:tensorflow:loss = 9.7374636e-05, step = 12100 (86.407 sec)


I0601 01:16:09.916596 139625747429248 basic_session_run_hooks.py:247] loss = 9.7374636e-05, step = 12100 (86.407 sec)


INFO:tensorflow:global_step/sec: 1.16311


I0601 01:17:35.890589 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16311


INFO:tensorflow:loss = 0.00012429537, step = 12200 (85.977 sec)


I0601 01:17:35.893177 139625747429248 basic_session_run_hooks.py:247] loss = 0.00012429537, step = 12200 (85.977 sec)


INFO:tensorflow:global_step/sec: 1.16622


I0601 01:19:01.637664 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16622


INFO:tensorflow:loss = 6.178569e-05, step = 12300 (85.747 sec)


I0601 01:19:01.640023 139625747429248 basic_session_run_hooks.py:247] loss = 6.178569e-05, step = 12300 (85.747 sec)


INFO:tensorflow:global_step/sec: 1.16209


I0601 01:20:27.689918 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16209


INFO:tensorflow:loss = 5.6155674e-05, step = 12400 (86.052 sec)


I0601 01:20:27.692462 139625747429248 basic_session_run_hooks.py:247] loss = 5.6155674e-05, step = 12400 (86.052 sec)


INFO:tensorflow:Saving checkpoints for 12500 into bert_story_cloze_aug/model.ckpt.


I0601 01:21:52.765498 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 12500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02856


I0601 01:22:04.912846 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02856


INFO:tensorflow:loss = 0.00013043992, step = 12500 (97.225 sec)


I0601 01:22:04.917531 139625747429248 basic_session_run_hooks.py:247] loss = 0.00013043992, step = 12500 (97.225 sec)


INFO:tensorflow:global_step/sec: 1.15795


I0601 01:23:31.272414 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15795


INFO:tensorflow:loss = 0.00014808116, step = 12600 (86.357 sec)


I0601 01:23:31.274833 139625747429248 basic_session_run_hooks.py:247] loss = 0.00014808116, step = 12600 (86.357 sec)


INFO:tensorflow:global_step/sec: 1.16335


I0601 01:24:57.231306 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16335


INFO:tensorflow:loss = 6.969944e-05, step = 12700 (85.961 sec)


I0601 01:24:57.235414 139625747429248 basic_session_run_hooks.py:247] loss = 6.969944e-05, step = 12700 (85.961 sec)


INFO:tensorflow:global_step/sec: 1.16559


I0601 01:26:23.024823 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16559


INFO:tensorflow:loss = 0.00046988888, step = 12800 (85.793 sec)


I0601 01:26:23.028906 139625747429248 basic_session_run_hooks.py:247] loss = 0.00046988888, step = 12800 (85.793 sec)


INFO:tensorflow:global_step/sec: 1.1649


I0601 01:27:48.869249 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1649


INFO:tensorflow:loss = 7.109124e-05, step = 12900 (85.843 sec)


I0601 01:27:48.871539 139625747429248 basic_session_run_hooks.py:247] loss = 7.109124e-05, step = 12900 (85.843 sec)


INFO:tensorflow:Saving checkpoints for 13000 into bert_story_cloze_aug/model.ckpt.


I0601 01:29:13.886108 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 13000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02278


I0601 01:29:26.642340 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02278


INFO:tensorflow:loss = 4.8352238e-05, step = 13001 (97.773 sec)


I0601 01:29:26.644451 139625747429248 basic_session_run_hooks.py:247] loss = 4.8352238e-05, step = 13001 (97.773 sec)


INFO:tensorflow:global_step/sec: 1.15772


I0601 01:30:53.019110 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15772


INFO:tensorflow:loss = 4.6023415e-05, step = 13100 (86.378 sec)


I0601 01:30:53.022016 139625747429248 basic_session_run_hooks.py:247] loss = 4.6023415e-05, step = 13100 (86.378 sec)


INFO:tensorflow:global_step/sec: 1.1623


I0601 01:32:19.055241 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1623


INFO:tensorflow:loss = 4.5044177e-05, step = 13200 (86.037 sec)


I0601 01:32:19.059250 139625747429248 basic_session_run_hooks.py:247] loss = 4.5044177e-05, step = 13200 (86.037 sec)


INFO:tensorflow:global_step/sec: 1.16464


I0601 01:33:44.918613 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16464


INFO:tensorflow:loss = 0.00010848805, step = 13300 (85.863 sec)


I0601 01:33:44.921940 139625747429248 basic_session_run_hooks.py:247] loss = 0.00010848805, step = 13300 (85.863 sec)


INFO:tensorflow:global_step/sec: 1.16257


I0601 01:35:10.934898 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16257


INFO:tensorflow:loss = 5.0278664e-05, step = 13400 (86.019 sec)


I0601 01:35:10.941152 139625747429248 basic_session_run_hooks.py:247] loss = 5.0278664e-05, step = 13400 (86.019 sec)


INFO:tensorflow:Saving checkpoints for 13500 into bert_story_cloze_aug/model.ckpt.


I0601 01:36:36.037127 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 13500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03211


I0601 01:36:47.823631 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03211


INFO:tensorflow:loss = 0.0012030866, step = 13500 (96.885 sec)


I0601 01:36:47.825812 139625747429248 basic_session_run_hooks.py:247] loss = 0.0012030866, step = 13500 (96.885 sec)


INFO:tensorflow:global_step/sec: 1.15812


I0601 01:38:14.170352 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15812


INFO:tensorflow:loss = 3.3016604e-05, step = 13600 (86.350 sec)


I0601 01:38:14.176259 139625747429248 basic_session_run_hooks.py:247] loss = 3.3016604e-05, step = 13600 (86.350 sec)


INFO:tensorflow:global_step/sec: 1.16248


I0601 01:39:40.193671 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16248


INFO:tensorflow:loss = 0.00013043145, step = 13700 (86.022 sec)


I0601 01:39:40.198178 139625747429248 basic_session_run_hooks.py:247] loss = 0.00013043145, step = 13700 (86.022 sec)


INFO:tensorflow:global_step/sec: 1.16567


I0601 01:41:05.981384 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16567


INFO:tensorflow:loss = 4.8235845e-05, step = 13800 (85.790 sec)


I0601 01:41:05.987857 139625747429248 basic_session_run_hooks.py:247] loss = 4.8235845e-05, step = 13800 (85.790 sec)


INFO:tensorflow:global_step/sec: 1.16408


I0601 01:42:31.885915 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16408


INFO:tensorflow:loss = 3.947424e-05, step = 13900 (85.900 sec)


I0601 01:42:31.888202 139625747429248 basic_session_run_hooks.py:247] loss = 3.947424e-05, step = 13900 (85.900 sec)


INFO:tensorflow:Saving checkpoints for 14000 into bert_story_cloze_aug/model.ckpt.


I0601 01:43:56.972306 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 14000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02393


I0601 01:44:09.549120 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02393


INFO:tensorflow:loss = 3.7180493e-05, step = 14000 (97.666 sec)


I0601 01:44:09.553983 139625747429248 basic_session_run_hooks.py:247] loss = 3.7180493e-05, step = 14000 (97.666 sec)


INFO:tensorflow:global_step/sec: 1.15754


I0601 01:45:35.939355 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15754


INFO:tensorflow:loss = 4.392342e-05, step = 14100 (86.388 sec)


I0601 01:45:35.942263 139625747429248 basic_session_run_hooks.py:247] loss = 4.392342e-05, step = 14100 (86.388 sec)


INFO:tensorflow:global_step/sec: 1.16209


I0601 01:47:01.991297 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16209


INFO:tensorflow:loss = 3.9152572e-05, step = 14200 (86.053 sec)


I0601 01:47:01.995468 139625747429248 basic_session_run_hooks.py:247] loss = 3.9152572e-05, step = 14200 (86.053 sec)


INFO:tensorflow:global_step/sec: 1.16465


I0601 01:48:27.853933 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16465


INFO:tensorflow:loss = 0.00029983564, step = 14300 (85.863 sec)


I0601 01:48:27.858510 139625747429248 basic_session_run_hooks.py:247] loss = 0.00029983564, step = 14300 (85.863 sec)


INFO:tensorflow:global_step/sec: 1.16492


I0601 01:49:53.696438 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16492


INFO:tensorflow:loss = 2.7719314e-05, step = 14400 (85.840 sec)


I0601 01:49:53.698821 139625747429248 basic_session_run_hooks.py:247] loss = 2.7719314e-05, step = 14400 (85.840 sec)


INFO:tensorflow:Saving checkpoints for 14500 into bert_story_cloze_aug/model.ckpt.


I0601 01:51:18.681831 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 14500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0266


I0601 01:51:31.104936 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.0266


INFO:tensorflow:loss = 0.0001777325, step = 14500 (97.411 sec)


I0601 01:51:31.109423 139625747429248 basic_session_run_hooks.py:247] loss = 0.0001777325, step = 14500 (97.411 sec)


INFO:tensorflow:global_step/sec: 1.15753


I0601 01:52:57.495542 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15753


INFO:tensorflow:loss = 0.00023536064, step = 14600 (86.391 sec)


I0601 01:52:57.500013 139625747429248 basic_session_run_hooks.py:247] loss = 0.00023536064, step = 14600 (86.391 sec)


INFO:tensorflow:global_step/sec: 1.16229


I0601 01:54:23.532652 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16229


INFO:tensorflow:loss = 3.024099e-05, step = 14700 (86.038 sec)


I0601 01:54:23.538361 139625747429248 basic_session_run_hooks.py:247] loss = 3.024099e-05, step = 14700 (86.038 sec)


INFO:tensorflow:global_step/sec: 1.16603


I0601 01:55:49.293795 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16603


INFO:tensorflow:loss = 4.4202483e-05, step = 14800 (85.761 sec)


I0601 01:55:49.299396 139625747429248 basic_session_run_hooks.py:247] loss = 4.4202483e-05, step = 14800 (85.761 sec)


INFO:tensorflow:global_step/sec: 1.16435


I0601 01:57:15.178703 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16435


INFO:tensorflow:loss = 4.7991976e-05, step = 14900 (85.887 sec)


I0601 01:57:15.186259 139625747429248 basic_session_run_hooks.py:247] loss = 4.7991976e-05, step = 14900 (85.887 sec)


INFO:tensorflow:Saving checkpoints for 15000 into bert_story_cloze_aug/model.ckpt.


I0601 01:58:40.217013 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 15000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02124


I0601 01:58:53.098891 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02124


INFO:tensorflow:loss = 0.00021848569, step = 15000 (97.917 sec)


I0601 01:58:53.103321 139625747429248 basic_session_run_hooks.py:247] loss = 0.00021848569, step = 15000 (97.917 sec)


INFO:tensorflow:global_step/sec: 1.15671


I0601 02:00:19.550812 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15671


INFO:tensorflow:loss = 5.5867196e-05, step = 15100 (86.451 sec)


I0601 02:00:19.554090 139625747429248 basic_session_run_hooks.py:247] loss = 5.5867196e-05, step = 15100 (86.451 sec)


INFO:tensorflow:global_step/sec: 1.16332


I0601 02:01:45.511836 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16332


INFO:tensorflow:loss = 7.32705e-05, step = 15200 (85.965 sec)


I0601 02:01:45.519277 139625747429248 basic_session_run_hooks.py:247] loss = 7.32705e-05, step = 15200 (85.965 sec)


INFO:tensorflow:global_step/sec: 1.16397


I0601 02:03:11.424926 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16397


INFO:tensorflow:loss = 3.6919922e-05, step = 15300 (85.910 sec)


I0601 02:03:11.428928 139625747429248 basic_session_run_hooks.py:247] loss = 3.6919922e-05, step = 15300 (85.910 sec)


INFO:tensorflow:global_step/sec: 1.16143


I0601 02:04:37.525792 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16143


INFO:tensorflow:loss = 7.68858e-05, step = 15400 (86.099 sec)


I0601 02:04:37.528251 139625747429248 basic_session_run_hooks.py:247] loss = 7.68858e-05, step = 15400 (86.099 sec)


INFO:tensorflow:Saving checkpoints for 15500 into bert_story_cloze_aug/model.ckpt.


I0601 02:06:02.566405 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 15500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02729


I0601 02:06:14.869228 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02729


INFO:tensorflow:loss = 5.6307254e-05, step = 15500 (97.344 sec)


I0601 02:06:14.872512 139625747429248 basic_session_run_hooks.py:247] loss = 5.6307254e-05, step = 15500 (97.344 sec)


INFO:tensorflow:global_step/sec: 1.15833


I0601 02:07:41.200157 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15833


INFO:tensorflow:loss = 0.0005727761, step = 15600 (86.330 sec)


I0601 02:07:41.202676 139625747429248 basic_session_run_hooks.py:247] loss = 0.0005727761, step = 15600 (86.330 sec)


INFO:tensorflow:global_step/sec: 1.1638


I0601 02:09:07.125356 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1638


INFO:tensorflow:loss = 2.766326e-05, step = 15700 (85.925 sec)


I0601 02:09:07.127967 139625747429248 basic_session_run_hooks.py:247] loss = 2.766326e-05, step = 15700 (85.925 sec)


INFO:tensorflow:global_step/sec: 1.16567


I0601 02:10:32.913167 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16567


INFO:tensorflow:loss = 5.6650548e-05, step = 15800 (85.788 sec)


I0601 02:10:32.915701 139625747429248 basic_session_run_hooks.py:247] loss = 5.6650548e-05, step = 15800 (85.788 sec)


INFO:tensorflow:global_step/sec: 1.1614


I0601 02:11:59.016444 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.1614


INFO:tensorflow:loss = 4.2499436e-05, step = 15900 (86.103 sec)


I0601 02:11:59.018620 139625747429248 basic_session_run_hooks.py:247] loss = 4.2499436e-05, step = 15900 (86.103 sec)


INFO:tensorflow:Saving checkpoints for 16000 into bert_story_cloze_aug/model.ckpt.


I0601 02:13:24.181426 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 16000 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.02762


I0601 02:13:36.328490 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.02762


INFO:tensorflow:loss = 2.358822e-05, step = 16000 (97.317 sec)


I0601 02:13:36.335331 139625747429248 basic_session_run_hooks.py:247] loss = 2.358822e-05, step = 16000 (97.317 sec)


INFO:tensorflow:global_step/sec: 1.15817


I0601 02:15:02.671618 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.15817


INFO:tensorflow:loss = 3.5220597e-05, step = 16100 (86.341 sec)


I0601 02:15:02.676743 139625747429248 basic_session_run_hooks.py:247] loss = 3.5220597e-05, step = 16100 (86.341 sec)


INFO:tensorflow:global_step/sec: 1.16356


I0601 02:16:28.614982 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16356


INFO:tensorflow:loss = 4.579941e-05, step = 16200 (85.940 sec)


I0601 02:16:28.617196 139625747429248 basic_session_run_hooks.py:247] loss = 4.579941e-05, step = 16200 (85.940 sec)


INFO:tensorflow:global_step/sec: 1.16466


I0601 02:17:54.476972 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16466


INFO:tensorflow:loss = 2.9000796e-05, step = 16300 (85.862 sec)


I0601 02:17:54.479205 139625747429248 basic_session_run_hooks.py:247] loss = 2.9000796e-05, step = 16300 (85.862 sec)


INFO:tensorflow:global_step/sec: 1.16427


I0601 02:19:20.367886 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.16427


INFO:tensorflow:loss = 6.55273e-05, step = 16400 (85.891 sec)


I0601 02:19:20.370480 139625747429248 basic_session_run_hooks.py:247] loss = 6.55273e-05, step = 16400 (85.891 sec)


INFO:tensorflow:Saving checkpoints for 16500 into bert_story_cloze_aug/model.ckpt.


I0601 02:20:45.426778 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 16500 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:global_step/sec: 1.03046


I0601 02:20:57.412248 139625747429248 basic_session_run_hooks.py:680] global_step/sec: 1.03046


INFO:tensorflow:loss = 3.1988264e-05, step = 16500 (97.046 sec)


I0601 02:20:57.416252 139625747429248 basic_session_run_hooks.py:247] loss = 3.1988264e-05, step = 16500 (97.046 sec)


INFO:tensorflow:Saving checkpoints for 16530 into bert_story_cloze_aug/model.ckpt.


I0601 02:21:22.403669 139625747429248 basic_session_run_hooks.py:594] Saving checkpoints for 16530 into bert_story_cloze_aug/model.ckpt.


INFO:tensorflow:Loss for final step: 2.8251932e-05.


I0601 02:21:35.267403 139625747429248 estimator.py:359] Loss for final step: 2.8251932e-05.


Training took time  4:07:00.112676


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = bert.run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [24]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


I0601 02:21:37.431790 139625747429248 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0601 02:21:39.830175 139625747429248 saver.py:1483] Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


I0601 02:21:50.299166 139625747429248 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-06-01T02:21:50Z


I0601 02:21:50.323488 139625747429248 evaluation.py:257] Starting evaluation at 2019-06-01T02:21:50Z


INFO:tensorflow:Graph was finalized.


I0601 02:21:51.744436 139625747429248 monitored_session.py:222] Graph was finalized.


W0601 02:21:51.751335 139625747429248 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.


INFO:tensorflow:Restoring parameters from bert_story_cloze_aug/model.ckpt-16530


I0601 02:21:51.756262 139625747429248 saver.py:1270] Restoring parameters from bert_story_cloze_aug/model.ckpt-16530


INFO:tensorflow:Running local_init_op.


I0601 02:21:54.026093 139625747429248 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0601 02:21:54.287571 139625747429248 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2019-06-01-02:22:31


I0601 02:22:31.126819 139625747429248 evaluation.py:277] Finished evaluation at 2019-06-01-02:22:31


INFO:tensorflow:Saving dict for global step 16530: auc = 0.53393906, eval_accuracy = 0.53393906, f1_score = 0.67775303, false_negatives = 37.0, false_positives = 1707.0, global_step = 16530, loss = 4.6483426, precision = 0.5179328, recall = 0.9802245, true_negatives = 164.0, true_positives = 1834.0


I0601 02:22:31.128983 139625747429248 estimator.py:1979] Saving dict for global step 16530: auc = 0.53393906, eval_accuracy = 0.53393906, f1_score = 0.67775303, false_negatives = 37.0, false_positives = 1707.0, global_step = 16530, loss = 4.6483426, precision = 0.5179328, recall = 0.9802245, true_negatives = 164.0, true_positives = 1834.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 16530: bert_story_cloze_aug/model.ckpt-16530


I0601 02:22:33.501134 139625747429248 estimator.py:2039] Saving 'checkpoint_path' summary for global step 16530: bert_story_cloze_aug/model.ckpt-16530


{'auc': 0.53393906,
 'eval_accuracy': 0.53393906,
 'f1_score': 0.67775303,
 'false_negatives': 37.0,
 'false_positives': 1707.0,
 'global_step': 16530,
 'loss': 4.6483426,
 'precision': 0.5179328,
 'recall': 0.9802245,
 'true_negatives': 164.0,
 'true_positives': 1834.0}
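
The four confusion-matrix counts in the dictionary above are enough to recompute the headline metrics by hand, which is a quick sanity check on the evaluation. The following is a minimal sketch, assuming the standard definitions of accuracy, precision, recall and F1. The helper name `metrics_from_counts` and the extra `predicted_positive_rate` field are illustrative and not part of the notebook.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Recompute common binary-classification metrics from raw counts.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Fraction of all examples that the model labelled as positive.
    predicted_positive_rate = (tp + fp) / total
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "predicted_positive_rate": predicted_positive_rate,
    }


# Counts copied from the evaluation output above.
m = metrics_from_counts(tp=1834, fp=1707, fn=37, tn=164)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions the recomputed accuracy, precision, recall and F1 agree with the values reported by `estimator.evaluate`. The counts also show that the model labels roughly 95% of the test examples as positive, which is consistent with a recall near 0.98 alongside an accuracy close to chance.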