<a href="https://colab.research.google.com/github/graulef/bert/blob/master/Predicting_Story_Cloze_with_BERT_usc_nn_only.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Story Cloze task with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

In [1]:
!pip list | grep tensorflow
!python --version

mesh-tensorflow          0.0.5                
tensorflow               1.13.1               
tensorflow-estimator     1.13.0               
tensorflow-hub           0.4.0                
tensorflow-metadata      0.13.0               
tensorflow-probability   0.6.0                
Python 3.6.7


In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

import os
cwd = os.getcwd()
print(cwd)

W0601 08:11:06.210944 139940245366656 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


/content


In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [3]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
     |████████████████████████████████| 71kB 8.4MB/s 
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

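Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in a BUCKET field. Note that the cell below defines only OUTPUT_DIR and DO_DELETE, so a BUCKET field would have to be added.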

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [6]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_story_cloze_usc'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_story_cloze_usc *****
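
The cell above defines `DO_DELETE` but does not act on it. As a minimal sketch of how the flag could be honored for a local directory, the snippet below uses only the standard library. `prepare_output_dir` is a hypothetical helper, not part of this notebook or of the BERT package, and a GCP bucket path would need TensorFlow's file API rather than `shutil`.

```python
import os
import shutil
import tempfile


def prepare_output_dir(path, do_delete=False):
    """Create `path`; if `do_delete` is set, remove any existing copy first.

    Hypothetical helper for local directories only.
    """
    if do_delete and os.path.isdir(path):
        # Discard old checkpoints so training starts from scratch.
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)
    return path


# Demonstrate inside a throwaway directory so nothing real is deleted.
demo_root = tempfile.mkdtemp()
demo_dir = os.path.join(demo_root, "bert_story_cloze_usc")

prepare_output_dir(demo_dir)
with open(os.path.join(demo_dir, "checkpoint"), "w") as f:
    f.write("stale")

# Without the flag, the existing file survives.
prepare_output_dir(demo_dir, do_delete=False)
kept = os.path.exists(os.path.join(demo_dir, "checkpoint"))

# With the flag, the directory is recreated empty.
prepare_output_dir(demo_dir, do_delete=True)
removed = not os.path.exists(os.path.join(demo_dir, "checkpoint"))

print(kept, removed)
```

Keeping the deletion behind an explicit flag means an accidental re-run resumes from existing checkpoints instead of silently discarding them.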


#Data

In [0]:
from tensorflow import keras
import os
import re
import csv

PATH_EVAL_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv"
PATH_SENT_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_sent2vec_combined.csv"
PATH_RAND_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv"
PATH_USC_NN_DATA = "http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_usc_combined.csv"
#PATH_EVAL_DATA = "glue_data/StoryCloze/cloze_test_val_spring2016.csv"
#PATH_RAND_NN_DATA = "glue_data/StoryCloze/train_stories_rand_combined.csv"
#PATH_SENT_NN_DATA = "glue_data/StoryCloze/train_stories_nearest_story_sent2vec_combined.csv"

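# Load one Story Cloze CSV file into a DataFrame with two rows per story:
# one for the correct ending (label 1) and one for the wrong ending (label 0).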
def load_data(path):
  data_1 = {}
  data_1["label"] = []
  data_1["id_1"] = []
  data_1["id_2"] = []
  data_1["context"] = []
  data_1["ending"] = []
  
  data_2 = {}
  data_2["label"] = []
  data_2["id_1"] = []
  data_2["id_2"] = []
  data_2["context"] = []
  data_2["ending"] = []
  
  print(path)
  with open(path) as f:
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
      if line_count == 0:
        #print("Columns = " + str(row))
        line_count += 1
      else:
        line_count += 1
        
        # Create two lines from one in order to have the same label layout as
        # the MRPC task
        separator = ' '
        data_1["id_1"].append(row[0])
        data_1["id_2"].append(row[0] + "_end_bli")
        data_1["context"].append(separator.join(row[1:5]))
        
        data_2["id_1"].append(row[0])
        data_2["id_2"].append(row[0] + "_end_bla")
        data_2["context"].append(separator.join(row[1:5]))
        
        if row[7] == "1": # First ending is the correct one
          data_1["ending"].append(row[5])
          data_1["label"].append(1)
          data_2["ending"].append(row[6])
          data_2["label"].append(0)
        else: # Second ending is the correct one
          data_1["ending"].append(row[6])
          data_1["label"].append(1)
          data_2["ending"].append(row[5])
          data_2["label"].append(0) 
          
    data_df_1 = pd.DataFrame.from_dict(data_1)
    data_df_2 = pd.DataFrame.from_dict(data_2)
    data = pd.concat([data_df_1, data_df_2])      
    return data     

# Shuffle the validation data and split it: the first 30% becomes the test set,
# the remaining 70% the training set.
def load_validation_only(eval_file):
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    eval_split = 0.3
    eval_num = int(total_eval * eval_split)
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    test_df = eval_data_df.iloc[:eval_num, :]
    train_df = eval_data_df.iloc[eval_num:, :]
    return train_df, test_df

def load_augmented(eval_file, random_nn_file, usc_nn_file):
    # random_nn_file is accepted to match the caller but is not used here.
    eval_data_df = load_data(eval_file)
    total_eval = eval_data_df.shape[0]
    # eval_split is the fraction of the validation data moved into training.
    # With 0, the whole (shuffled) validation set serves as the test set.
    eval_split = 0
    eval_data_df = eval_data_df.sample(frac=1).reset_index(drop=True)
    test_df = eval_data_df.iloc[int(total_eval * eval_split):, :]

    # Train only on the shuffled USC nearest-neighbour stories.
    usc_nn_df = load_data(usc_nn_file)
    usc_nn_df = usc_nn_df.sample(frac=1).reset_index(drop=True)
    total_usc_nn = usc_nn_df.shape[0]
    # usc_nn_split is the fraction of that file to keep; 1 keeps all of it.
    usc_nn_split = 1
    train_df = usc_nn_df.iloc[:int(total_usc_nn * usc_nn_split), :].reset_index(drop=True)

    return train_df, test_df

# Download and process the dataset files.
def download_and_load_eval_datasets(force_download=False):
  validation = tf.keras.utils.get_file(
      fname="validation", 
      origin=PATH_EVAL_DATA)
  random_nn = tf.keras.utils.get_file(
    fname="rand_nn", 
    origin=PATH_RAND_NN_DATA)
  sent_nn = tf.keras.utils.get_file(
    fname="sent_nn", 
    origin=PATH_USC_NN_DATA)

  #train_df, test_df = load_validation_only(validation)
  train_df, test_df = load_augmented(validation, random_nn, sent_nn)
  
  return train_df, test_df
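
To make the row-doubling in `load_data` concrete, here is a small sketch that applies the same idea to an in-memory CSV. The two stories are made up, the header names are illustrative assumptions, and `row_to_pairs` is a hypothetical helper; only the column positions (`row[0]`, `row[1:5]`, `row[5]`, `row[6]`, `row[7]`) follow the code above.

```python
import csv
import io

# Made-up rows in the column order load_data indexes:
# id, four context sentences, two candidate endings, number of the right ending.
SAMPLE_CSV = (
    "id,s1,s2,s3,s4,ending1,ending2,answer\n"
    "a1,Ann was hungry.,She opened the fridge.,It was empty.,"
    "She went shopping.,Ann bought food.,Ann painted the fridge.,1\n"
    "b2,Tom missed the bus.,He started walking.,It began to rain.,"
    "He had no umbrella.,Tom stayed dry.,Tom got soaked.,2\n"
)


def row_to_pairs(row):
    """Turn one story row into a correct (label 1) and a wrong (label 0) example."""
    context = " ".join(row[1:5])
    first, second = row[5], row[6]
    right, wrong = (first, second) if row[7] == "1" else (second, first)
    return [
        {"context": context, "ending": right, "label": 1},
        {"context": context, "ending": wrong, "label": 0},
    ]


reader = csv.reader(io.StringIO(SAMPLE_CSV))
next(reader)  # skip the header row
examples = [pair for row in reader for pair in row_to_pairs(row)]

for example in examples:
    print(example["label"], "|", example["ending"])
```

Unlike `load_data`, which concatenates all positive rows before all negative rows, this sketch interleaves them; after shuffling, the difference should not matter.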


In [8]:
train, test = download_and_load_eval_datasets()

print("\nTrain data")
print(train.shape)
for i in range(5):
  print(train.iloc[i]['label'])
  print(train.iloc[i]['context'])
  print(train.iloc[i]['ending'])

print("\nTest data")
print(test.shape)
for i in range(5):
  print(test.iloc[i]['label'])
  print(test.iloc[i]['context'])
  print(test.iloc[i]['ending'])

Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/cloze_test_val_spring2016.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_rand_combined.csv
Downloading data from http://felix.graule.ch/wp-content/uploads/2019/05/train_stories_nearest_story_usc_combined.csv
/root/.keras/datasets/validation
/root/.keras/datasets/sent_nn

Train data
(176322, 5)
0
Suzy went to the local ice cream shop. She wanted some good flavors. But they had none. So she left empty handed.
When she tasted it it was heaven.
0
Max hated doing homework. One day, he decided he was going to say that his dog ate his work. Max told his teacher the story. Max's teacher did not believe him.
John was furious and spent the morning doing his homework again.
1
Carlton was making a sandwich. He decided to cut the sandwich in half. Carlton accidently cut his finger. He bandaged his wound.
Carlton finally ate his sandwich.
0
Denise always got her eyebrows waxed. However,

Quick check of whether the datasets are fully disjoint. Comparing every pair this way would take a very long time, so the loop only checks the first 10 training endings.


In [0]:
# Compare the first 10 training endings against every test ending.
for j in range(10):
    query = train.iloc[j]['ending']
    for i in range(test.shape[0]):
      tmp = test.iloc[i]['ending']
      if tmp == query:
        print("Found something equal")
        print(tmp)

For us, the input data are the 'context' and 'ending' columns, and the label is the 'label' column (0 for negative, 1 for positive).

In [0]:
CONTEXT_COLUMN = 'context'
ENDING_COLUMN = 'ending'
LABEL_COLUMN = 'label'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify. For us, this is the context of the story (the `context` column in our DataFrame).
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). In our case, this is the ending.
- `label` is the label for our example, i.e. 0 or 1.

In [11]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(train_InputExamples.shape)
test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[CONTEXT_COLUMN], 
                                                                   text_b = x[ENDING_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
print(test_InputExamples.shape)

(176322,)
(3742,)


Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (i.e. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (i.e. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
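
For intuition, steps 1 to 4 can be sketched in a few lines of plain Python. This is a simplified illustration, not the BERT library's tokenizer: the vocabulary is a toy assumption (the real vocab file is far larger), the word splitting is a basic regular expression, and the function names are hypothetical. The word-piece step uses greedy longest-match-first matching.

```python
import re

# Toy vocabulary (an assumption for illustration only).
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
             "sally", "says", "hi", "call", "##ing", "."]
TOKEN_TO_ID = {token: idx for idx, token in enumerate(TOY_VOCAB)}


def basic_tokenize(text, do_lower_case=True):
    """Steps 1-2: optionally lowercase, then split into words and punctuation."""
    if do_lower_case:
        text = text.lower()
    return re.findall(r"\w+|[^\w\s]", text)


def wordpiece(word, vocab):
    """Step 3: greedy longest-match-first split of one word into pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def encode(text):
    """Steps 1-4: text -> word pieces -> vocabulary indexes."""
    tokens = [piece for word in basic_tokenize(text)
              for piece in wordpiece(word, TOKEN_TO_ID)]
    return tokens, [TOKEN_TO_ID[token] for token in tokens]


tokens, ids = encode("Sally says hi. Calling Bob.")
print(tokens)
print(ids)
```

Here "calling" splits into `call` and `##ing` because both pieces are in the toy vocabulary, while "bob" has no matching pieces and falls back to `[UNK]`.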




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [12]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0601 08:12:04.844397 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]), and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [13]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
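
What that conversion produces for a context/ending pair can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: `pack_pair` is a hypothetical name, the vocabulary is a toy one with `[PAD]` at index 0, and the inputs are already tokenized. It shows the three parallel lists that appear in the log output of the next cell: `input_ids`, `input_mask` and `segment_ids`.

```python
def pack_pair(tokens_a, tokens_b, token_to_id, max_seq_length):
    """Build padded input_ids, input_mask and segment_ids for a text pair."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    # Leave room for [CLS] and two [SEP] tokens; trim the longer side first.
    while len(tokens_a) + len(tokens_b) > max_seq_length - 3:
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], text_a and the first [SEP]; segment 1 the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [token_to_id[token] for token in tokens]
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(input_ids)

    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Toy vocabulary (an assumption for illustration only).
TOY_IDS = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2,
           "she": 3, "left": 4, "smiled": 5, ".": 6}

input_ids, input_mask, segment_ids = pack_pair(
    ["she", "left", "."], ["she", "smiled", "."], TOY_IDS, max_seq_length=10)
print(input_ids)
print(input_mask)
print(segment_ids)
```

Padding every example to the same length lets examples be stacked into fixed-size batches, and the mask tells the model which positions hold real tokens.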

In [14]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 176322


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] suzy went to the local ice cream shop . she wanted some good flavors . but they had none . so she left empty handed . [SEP] when she tasted it it was heaven . [SEP]


INFO:tensorflow:input_ids: 101 28722 2253 2000 1996 2334 3256 6949 4497 1012 2016 2359 2070 2204 26389 1012 2021 2027 2018 3904 1012 2061 2016 2187 4064 4375 1012 102 2043 2016 12595 2009 2009 2001 6014 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] max hated doing homework . one day , he decided he was going to say that his dog ate his work . max told his teacher the story . max ' s teacher did not believe him . [SEP] john was furious and spent the morning doing his homework again . [SEP]


INFO:tensorflow:input_ids: 101 4098 6283 2725 19453 1012 2028 2154 1010 2002 2787 2002 2001 2183 2000 2360 2008 2010 3899 8823 2010 2147 1012 4098 2409 2010 3836 1996 2466 1012 4098 1005 1055 3836 2106 2025 2903 2032 1012 102 2198 2001 9943 1998 2985 1996 2851 2725 2010 19453 2153 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] carlton was making a sandwich . he decided to cut the sandwich in half . carlton accident ##ly cut his finger . he bandage ##d his wound . [SEP] carlton finally ate his sandwich . [SEP]


INFO:tensorflow:input_ids: 101 12989 2001 2437 1037 11642 1012 2002 2787 2000 3013 1996 11642 1999 2431 1012 12989 4926 2135 3013 2010 4344 1012 2002 24446 2094 2010 6357 1012 102 12989 2633 8823 2010 11642 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] denise always got her eyebrows wax ##ed . however , denise recently learned about eyebrow thread ##ing . denise decided to give eyebrow thread ##ing a try . after locating a specialist , denise got the thread ##ing procedure . [SEP] denise was thrilled with the outcome of her eyebrow thread ##ing . [SEP]


INFO:tensorflow:input_ids: 101 15339 2467 2288 2014 8407 13844 2098 1012 2174 1010 15339 3728 4342 2055 9522 11689 2075 1012 15339 2787 2000 2507 9522 11689 2075 1037 3046 1012 2044 26339 1037 8325 1010 15339 2288 1996 11689 2075 7709 1012 102 15339 2001 16082 2007 1996 9560 1997 2014 9522 11689 2075 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] christina ' s aunt had a baby . christina was really excited to meet him . her aunt arrived at her home with the baby . christina got to hold him on the couch . [SEP] she cu ##ddled the baby and promised to be a good cousin . [SEP]


INFO:tensorflow:input_ids: 101 12657 1005 1055 5916 2018 1037 3336 1012 12657 2001 2428 7568 2000 3113 2032 1012 2014 5916 3369 2012 2014 2188 2007 1996 3336 1012 12657 2288 2000 2907 2032 2006 1996 6411 1012 102 2016 12731 28090 1996 3336 1998 5763 2000 2022 1037 2204 5542 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 176322
INFO:tensorflow:Writing example 20000 of 176322
INFO:tensorflow:Writing example 30000 of 176322
INFO:tensorflow:Writing example 40000 of 176322
INFO:tensorflow:Writing example 50000 of 176322
INFO:tensorflow:Writing example 60000 of 176322
INFO:tensorflow:Writing example 70000 of 176322
INFO:tensorflow:Writing example 80000 of 176322
INFO:tensorflow:Writing example 90000 of 176322
INFO:tensorflow:Writing example 100000 of 176322
INFO:tensorflow:Writing example 110000 of 176322
INFO:tensorflow:Writing example 120000 of 176322
INFO:tensorflow:Writing example 130000 of 176322
INFO:tensorflow:Writing example 140000 of 176322
INFO:tensorflow:Writing example 150000 of 176322
INFO:tensorflow:Writing example 160000 of 176322
INFO:tensorflow:Writing example 170000 of 176322


INFO:tensorflow:Writing example 0 of 3742


I0601 08:14:49.606702 139940245366656 run_classifier.py:774] Writing example 0 of 3742


INFO:tensorflow:*** Example ***


I0601 08:14:49.609277 139940245366656 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0601 08:14:49.611447 139940245366656 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] evan complained about everything . his girlfriend started making him dinner , but he said it smelled bad . she was so offended ! she left mid - cooking , leaving the meal unfinished . [SEP] evan was pleased . [SEP]


I0601 08:14:49.613325 139940245366656 run_classifier.py:464] tokens: [CLS] evan complained about everything . his girlfriend started making him dinner , but he said it smelled bad . she was so offended ! she left mid - cooking , leaving the meal unfinished . [SEP] evan was pleased . [SEP]


INFO:tensorflow:input_ids: 101 9340 10865 2055 2673 1012 2010 6513 2318 2437 2032 4596 1010 2021 2002 2056 2009 9557 2919 1012 2016 2001 2061 15807 999 2016 2187 3054 1011 8434 1010 2975 1996 7954 14342 1012 102 9340 2001 7537 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.615489 139940245366656 run_classifier.py:465] input_ids: 101 9340 10865 2055 2673 1012 2010 6513 2318 2437 2032 4596 1010 2021 2002 2056 2009 9557 2919 1012 2016 2001 2061 15807 999 2016 2187 3054 1011 8434 1010 2975 1996 7954 14342 1012 102 9340 2001 7537 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.618442 139940245366656 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.622142 139940245366656 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0601 08:14:49.625305 139940245366656 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0601 08:14:49.633429 139940245366656 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0601 08:14:49.635244 139940245366656 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] gina ' s local library had no books on dolphins . she needed to find another library . the closest one was downtown . her mother refused to take her . [SEP] gina and her mother went immediately . [SEP]


I0601 08:14:49.636967 139940245366656 run_classifier.py:464] tokens: [CLS] gina ' s local library had no books on dolphins . she needed to find another library . the closest one was downtown . her mother refused to take her . [SEP] gina and her mother went immediately . [SEP]


INFO:tensorflow:input_ids: 101 17508 1005 1055 2334 3075 2018 2053 2808 2006 13600 1012 2016 2734 2000 2424 2178 3075 1012 1996 7541 2028 2001 5116 1012 2014 2388 4188 2000 2202 2014 1012 102 17508 1998 2014 2388 2253 3202 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.638842 139940245366656 run_classifier.py:465] input_ids: 101 17508 1005 1055 2334 3075 2018 2053 2808 2006 13600 1012 2016 2734 2000 2424 2178 3075 1012 1996 7541 2028 2001 5116 1012 2014 2388 4188 2000 2202 2014 1012 102 17508 1998 2014 2388 2253 3202 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.641345 139940245366656 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.643419 139940245366656 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0601 08:14:49.646639 139940245366656 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0601 08:14:49.651936 139940245366656 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0601 08:14:49.654263 139940245366656 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] last week jared was driving to work . he had been party ##ing hard the night before . he didn ' t get any sleep . before he knew it his car was facing on ##coming traffic . [SEP] jared decided to accelerate . [SEP]


I0601 08:14:49.656228 139940245366656 run_classifier.py:464] tokens: [CLS] last week jared was driving to work . he had been party ##ing hard the night before . he didn ' t get any sleep . before he knew it his car was facing on ##coming traffic . [SEP] jared decided to accelerate . [SEP]


INFO:tensorflow:input_ids: 101 2197 2733 8334 2001 4439 2000 2147 1012 2002 2018 2042 2283 2075 2524 1996 2305 2077 1012 2002 2134 1005 1056 2131 2151 3637 1012 2077 2002 2354 2009 2010 2482 2001 5307 2006 18935 4026 1012 102 8334 2787 2000 23306 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.659828 139940245366656 run_classifier.py:465] input_ids: 101 2197 2733 8334 2001 4439 2000 2147 1012 2002 2018 2042 2283 2075 2524 1996 2305 2077 1012 2002 2134 1005 1056 2131 2151 3637 1012 2077 2002 2354 2009 2010 2482 2001 5307 2006 18935 4026 1012 102 8334 2787 2000 23306 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.662288 139940245366656 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.666015 139940245366656 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0601 08:14:49.668714 139940245366656 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0601 08:14:49.674420 139940245366656 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0601 08:14:49.677419 139940245366656 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] bill had just moved to a big city for work . he felt lonely because he did not know anybody in the town . after work one day bill decided to join some cow ##or ##kers for drinks . he had a lot of fun hanging out with them . [SEP] bill was upset at his cow ##or ##kers . [SEP]


I0601 08:14:49.679993 139940245366656 run_classifier.py:464] tokens: [CLS] bill had just moved to a big city for work . he felt lonely because he did not know anybody in the town . after work one day bill decided to join some cow ##or ##kers for drinks . he had a lot of fun hanging out with them . [SEP] bill was upset at his cow ##or ##kers . [SEP]


INFO:tensorflow:input_ids: 101 3021 2018 2074 2333 2000 1037 2502 2103 2005 2147 1012 2002 2371 9479 2138 2002 2106 2025 2113 10334 1999 1996 2237 1012 2044 2147 2028 2154 3021 2787 2000 3693 2070 11190 2953 11451 2005 8974 1012 2002 2018 1037 2843 1997 4569 5689 2041 2007 2068 1012 102 3021 2001 6314 2012 2010 11190 2953 11451 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.682549 139940245366656 run_classifier.py:465] input_ids: 101 3021 2018 2074 2333 2000 1037 2502 2103 2005 2147 1012 2002 2371 9479 2138 2002 2106 2025 2113 10334 1999 1996 2237 1012 2044 2147 2028 2154 3021 2787 2000 3693 2070 11190 2953 11451 2005 8974 1012 2002 2018 1037 2843 1997 4569 5689 2041 2007 2068 1012 102 3021 2001 6314 2012 2010 11190 2953 11451 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.685225 139940245366656 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.687863 139940245366656 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0601 08:14:49.690436 139940245366656 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0601 08:14:49.694683 139940245366656 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0601 08:14:49.697342 139940245366656 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] donovan asked tri ##na to go dancing with him . tri ##na liked donovan so she said yes but she was really worried . after school she ran home to ask her dad for help . when her dad came home she explained the situation and dad smiled . [SEP] tri ##na ' s dad taught her how to dance . [SEP]


I0601 08:14:49.699956 139940245366656 run_classifier.py:464] tokens: [CLS] donovan asked tri ##na to go dancing with him . tri ##na liked donovan so she said yes but she was really worried . after school she ran home to ask her dad for help . when her dad came home she explained the situation and dad smiled . [SEP] tri ##na ' s dad taught her how to dance . [SEP]


INFO:tensorflow:input_ids: 101 12729 2356 13012 2532 2000 2175 5613 2007 2032 1012 13012 2532 4669 12729 2061 2016 2056 2748 2021 2016 2001 2428 5191 1012 2044 2082 2016 2743 2188 2000 3198 2014 3611 2005 2393 1012 2043 2014 3611 2234 2188 2016 4541 1996 3663 1998 3611 3281 1012 102 13012 2532 1005 1055 3611 4036 2014 2129 2000 3153 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.702548 139940245366656 run_classifier.py:465] input_ids: 101 12729 2356 13012 2532 2000 2175 5613 2007 2032 1012 13012 2532 4669 12729 2061 2016 2056 2748 2021 2016 2001 2428 5191 1012 2044 2082 2016 2743 2188 2000 3198 2014 3611 2005 2393 1012 2043 2014 3611 2234 2188 2016 4541 1996 3663 1998 3611 3281 1012 102 13012 2532 1005 1055 3611 4036 2014 2129 2000 3153 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.705132 139940245366656 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0601 08:14:49.707709 139940245366656 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0601 08:14:49.710278 139940245366656 run_classifier.py:468] label: 1 (id = 1)
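
The logged examples above all share one layout: `[CLS]`, the story, `[SEP]`, the candidate ending, `[SEP]`, then zero padding up to the sequence length. Here is a minimal sketch of how `input_mask` and `segment_ids` follow from that layout. `build_inputs` is a hypothetical helper written for illustration, not the function the BERT library uses, and it raises on over-long pairs where the real pipeline would truncate.

```python
def build_inputs(tokens_a, tokens_b, max_seq_length):
    """Lay out a (story, ending) pair as in the log above (hypothetical helper)."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    if len(tokens) > max_seq_length:
        raise ValueError("pair is too long for this sketch")

    # Segment 0 covers [CLS], the story, and the first [SEP];
    # segment 1 covers the candidate ending and the final [SEP].
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(tokens)

    # Pad both lists with zeros up to the fixed sequence length.
    padding = max_seq_length - len(tokens)
    segment_ids += [0] * padding
    input_mask += [0] * padding
    return tokens, input_mask, segment_ids


story = "gina ' s local library had no books on dolphins .".split()
ending = "gina and her mother went immediately .".split()
tokens, input_mask, segment_ids = build_inputs(story, ending, max_seq_length=32)
print(sum(input_mask), "real tokens out of", len(input_mask))
```

Note that padding positions get segment id 0, just like the story tokens, so only `input_mask` tells the model which positions are real.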


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our Story Cloze task (i.e. classifying whether a candidate ending correctly completes a story). This strategy of starting from a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for Story Cloze data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. Note that, as written, it is also
    # applied during evaluation and prediction, not just training.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
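
To make the loss in `create_model` concrete, here is a small pure-Python sketch of the same arithmetic: log-softmax over the logits, a one-hot label, and the negative sum of their product. The logits and labels are made-up values for illustration, and the helper names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one example's logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def per_example_loss(logits, label, num_labels):
    """Cross-entropy in the same form as create_model: -sum(one_hot * log_probs)."""
    log_probs = log_softmax(logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(num_labels)]
    return -sum(o * lp for o, lp in zip(one_hot, log_probs))


# Made-up logits for two (story, ending) pairs: one confident, one undecided.
batch_logits = [[2.0, -1.0], [0.0, 0.0]]
batch_labels = [0, 1]

losses = [per_example_loss(logits, label, 2)
          for logits, label in zip(batch_logits, batch_labels)]
loss = sum(losses) / len(losses)  # mean over the batch, like tf.reduce_mean
print(losses, loss)
```

The undecided example has a loss of ln 2 (about 0.693), which is what a two-class head produces before it has learned anything. The step-0 training loss of 0.6756 in the log further down is close to that value.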


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for `tf.estimator.Estimator`."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for `tf.estimator.Estimator`."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            loss=loss,
            train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          # Note: these are log probabilities (log-softmax outputs).
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
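
The evaluation metrics in `metric_fn` all derive from four confusion-matrix counts. As a rough sketch, here are the textbook definitions in plain Python on made-up labels. The TensorFlow versions accumulate these counts across batches and may differ in detail (for example in how `f1_score` picks a threshold), so treat this as an illustration rather than a reimplementation.

```python
def confusion_counts(labels, predictions):
    """Count true/false positives/negatives for binary 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(labels, predictions):
    """Textbook accuracy, precision, recall, and F1 from the four counts."""
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Made-up labels: 1 means the candidate ending is the correct one.
labels = [1, 0, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 1, 0]
metrics = summarize(labels, predictions)
print(metrics)
```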


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
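
To illustrate what warmup means, here is a sketch of a learning-rate schedule with linear warmup followed by linear decay to zero. I believe `bert.optimization.create_optimizer` follows roughly this shape, but the exact formula here (including measuring the decay from step 0) is an assumption, and `learning_rate_at` is a hypothetical helper rather than a library function.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup, then linear decay to zero (a sketch, not library code)."""
    if step < num_warmup_steps:
        # Ramp up from 0 towards init_lr over the warmup period.
        return init_lr * step / num_warmup_steps
    # Afterwards, decay linearly so the rate reaches 0 at the last step.
    return init_lr * max(0.0, 1.0 - step / num_train_steps)


# Example numbers chosen for readability, not taken from this notebook's run.
for step in (0, 500, 1000, 5000, 10000):
    print(step, learning_rate_at(step, 2e-5, 10000, 1000))
```

With `WARMUP_PROPORTION = 0.1`, the first 10% of training steps are spent ramping up, which tends to keep the early updates to the pretrained weights small.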

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
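
As a worked example of this arithmetic, the sketch below plugs in the hyperparameters above. It assumes that the 176,322 examples written in the conversion log earlier are the training set. That count is an assumption on my part, so the variable names are prefixed with `example_` to avoid clobbering the real values.

```python
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

# Assumption: the size of train_features, taken from the conversion log.
example_num_train_examples = 176322

# Same formula as the cell above: examples / batch size * epochs, truncated.
example_train_steps = int(
    example_num_train_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
example_warmup_steps = int(example_train_steps * WARMUP_PROPORTION)

print(example_train_steps, example_warmup_steps)  # 16530 1653
```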

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [20]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert_story_cloze_usc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f45e8e0fe80>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0601 08:14:52.954023 139940245366656 estimator.py:201] Using config: {'_model_dir': 'bert_story_cloze_usc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f45e8e0fe80>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input function builder that takes our training feature set (`train_features`) and returns an input function, which supplies batches of features to the model during training. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True only when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
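
The `drop_remainder` flag controls what happens to a final batch that is smaller than the batch size. Here is a plain-Python sketch of the idea, using a hypothetical `batches` helper on toy data. The real input function builds a TensorFlow dataset instead, so this only illustrates the batching behaviour.

```python
def batches(examples, batch_size, drop_remainder):
    """Split examples into batches, optionally dropping a short final batch."""
    out = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            break  # TPUs need fixed shapes, so the short batch is discarded
        out.append(batch)
    return out


examples = list(range(10))
kept = batches(examples, 4, drop_remainder=False)
dropped = batches(examples, 4, drop_remainder=True)
print(len(kept), len(dropped))
```

On a GPU, as here, `drop_remainder=False` means no example is thrown away just because the dataset size is not a multiple of the batch size.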

Now we train our model! On a Colab GPU, the run logged below advances at roughly 1.1 steps per second, so the first 1,500 steps alone take about 25 minutes.

In [22]:
print("Beginning Training!")
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0601 08:16:22.070276 139940245366656 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0601 08:16:25.114854 139940245366656 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0601 08:16:25.230838 139940245366656 deprecation.py:506] From <ipython-input-15-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0601 08:16:25.273984 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0601 08:16:25.348211 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0601 08:16:34.378244 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


I0601 08:16:36.553196 139940245366656 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0601 08:16:36.557875 139940245366656 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0601 08:16:43.492038 139940245366656 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0601 08:16:48.569535 139940245366656 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0601 08:16:48.778971 139940245366656 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert_story_cloze_usc/model.ckpt.


I0601 08:18:06.873062 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 0 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:loss = 0.6756137, step = 0


I0601 08:18:29.437026 139940245366656 basic_session_run_hooks.py:249] loss = 0.6756137, step = 0


INFO:tensorflow:global_step/sec: 1.00223


I0601 08:20:09.214457 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00223


INFO:tensorflow:loss = 0.632988, step = 100 (99.783 sec)


I0601 08:20:09.219679 139940245366656 basic_session_run_hooks.py:247] loss = 0.632988, step = 100 (99.783 sec)


INFO:tensorflow:global_step/sec: 1.13113


I0601 08:21:37.621594 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13113


INFO:tensorflow:loss = 0.5901122, step = 200 (88.405 sec)


I0601 08:21:37.624702 139940245366656 basic_session_run_hooks.py:247] loss = 0.5901122, step = 200 (88.405 sec)


INFO:tensorflow:global_step/sec: 1.13443


I0601 08:23:05.771948 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13443


INFO:tensorflow:loss = 0.4470895, step = 300 (88.152 sec)


I0601 08:23:05.776972 139940245366656 basic_session_run_hooks.py:247] loss = 0.4470895, step = 300 (88.152 sec)


INFO:tensorflow:global_step/sec: 1.13933


I0601 08:24:33.543066 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13933


INFO:tensorflow:loss = 0.4655684, step = 400 (87.771 sec)


I0601 08:24:33.547868 139940245366656 basic_session_run_hooks.py:247] loss = 0.4655684, step = 400 (87.771 sec)


INFO:tensorflow:Saving checkpoints for 500 into bert_story_cloze_usc/model.ckpt.


I0601 08:26:00.441878 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00691


I0601 08:26:12.856731 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00691


INFO:tensorflow:loss = 0.33426297, step = 500 (99.314 sec)


I0601 08:26:12.862117 139940245366656 basic_session_run_hooks.py:247] loss = 0.33426297, step = 500 (99.314 sec)


INFO:tensorflow:global_step/sec: 1.12909


I0601 08:27:41.423748 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12909


INFO:tensorflow:loss = 0.3481841, step = 600 (88.567 sec)


I0601 08:27:41.429075 139940245366656 basic_session_run_hooks.py:247] loss = 0.3481841, step = 600 (88.567 sec)


INFO:tensorflow:global_step/sec: 1.13425


I0601 08:29:09.587774 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13425


INFO:tensorflow:loss = 0.466493, step = 700 (88.163 sec)


I0601 08:29:09.591684 139940245366656 basic_session_run_hooks.py:247] loss = 0.466493, step = 700 (88.163 sec)


INFO:tensorflow:global_step/sec: 1.13875


I0601 08:30:37.403449 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13875


INFO:tensorflow:loss = 0.47444075, step = 800 (87.817 sec)


I0601 08:30:37.408694 139940245366656 basic_session_run_hooks.py:247] loss = 0.47444075, step = 800 (87.817 sec)


INFO:tensorflow:global_step/sec: 1.13717


I0601 08:32:05.341082 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13717


INFO:tensorflow:loss = 0.36750597, step = 900 (87.935 sec)


I0601 08:32:05.343488 139940245366656 basic_session_run_hooks.py:247] loss = 0.36750597, step = 900 (87.935 sec)


INFO:tensorflow:Saving checkpoints for 1000 into bert_story_cloze_usc/model.ckpt.


I0601 08:33:32.434577 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 1000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00851


I0601 08:33:44.497618 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00851


INFO:tensorflow:loss = 0.52443284, step = 1000 (99.158 sec)


I0601 08:33:44.501294 139940245366656 basic_session_run_hooks.py:247] loss = 0.52443284, step = 1000 (99.158 sec)


INFO:tensorflow:global_step/sec: 1.12965


I0601 08:35:13.020908 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12965


INFO:tensorflow:loss = 0.44653141, step = 1100 (88.525 sec)


I0601 08:35:13.026765 139940245366656 basic_session_run_hooks.py:247] loss = 0.44653141, step = 1100 (88.525 sec)


INFO:tensorflow:global_step/sec: 1.135


I0601 08:36:41.126250 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.135


INFO:tensorflow:loss = 0.7119365, step = 1200 (88.103 sec)


I0601 08:36:41.129659 139940245366656 basic_session_run_hooks.py:247] loss = 0.7119365, step = 1200 (88.103 sec)


INFO:tensorflow:global_step/sec: 1.13507


I0601 08:38:09.226634 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13507


INFO:tensorflow:loss = 0.39263627, step = 1300 (88.102 sec)


I0601 08:38:09.231218 139940245366656 basic_session_run_hooks.py:247] loss = 0.39263627, step = 1300 (88.102 sec)


INFO:tensorflow:global_step/sec: 1.13355


I0601 08:39:37.444818 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13355


INFO:tensorflow:loss = 0.32633144, step = 1400 (88.218 sec)


I0601 08:39:37.448925 139940245366656 basic_session_run_hooks.py:247] loss = 0.32633144, step = 1400 (88.218 sec)


INFO:tensorflow:Saving checkpoints for 1500 into bert_story_cloze_usc/model.ckpt.


I0601 08:41:04.698054 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 1500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00278


I0601 08:41:17.167839 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00278


INFO:tensorflow:loss = 0.4187672, step = 1500 (99.723 sec)


I0601 08:41:17.171917 139940245366656 basic_session_run_hooks.py:247] loss = 0.4187672, step = 1500 (99.723 sec)


INFO:tensorflow:global_step/sec: 1.12558


I0601 08:42:46.010766 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12558


INFO:tensorflow:loss = 0.34740704, step = 1600 (88.843 sec)


I0601 08:42:46.015115 139940245366656 basic_session_run_hooks.py:247] loss = 0.34740704, step = 1600 (88.843 sec)


INFO:tensorflow:global_step/sec: 1.13185


I0601 08:44:14.362033 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13185


INFO:tensorflow:loss = 0.43676728, step = 1700 (88.351 sec)


I0601 08:44:14.365630 139940245366656 basic_session_run_hooks.py:247] loss = 0.43676728, step = 1700 (88.351 sec)


INFO:tensorflow:global_step/sec: 1.13332


I0601 08:45:42.598281 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13332


INFO:tensorflow:loss = 0.4277192, step = 1800 (88.239 sec)


I0601 08:45:42.604624 139940245366656 basic_session_run_hooks.py:247] loss = 0.4277192, step = 1800 (88.239 sec)


INFO:tensorflow:global_step/sec: 1.13458


I0601 08:47:10.736611 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13458


INFO:tensorflow:loss = 0.33416843, step = 1900 (88.134 sec)


I0601 08:47:10.738886 139940245366656 basic_session_run_hooks.py:247] loss = 0.33416843, step = 1900 (88.134 sec)


INFO:tensorflow:Saving checkpoints for 2000 into bert_story_cloze_usc/model.ckpt.


I0601 08:48:38.009855 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 2000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00543


I0601 08:48:50.197000 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00543


INFO:tensorflow:loss = 0.44717535, step = 2000 (99.462 sec)


I0601 08:48:50.200947 139940245366656 basic_session_run_hooks.py:247] loss = 0.44717535, step = 2000 (99.462 sec)


INFO:tensorflow:global_step/sec: 1.1278


I0601 08:50:18.865532 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1278


INFO:tensorflow:loss = 0.5275962, step = 2100 (88.672 sec)


I0601 08:50:18.872536 139940245366656 basic_session_run_hooks.py:247] loss = 0.5275962, step = 2100 (88.672 sec)


INFO:tensorflow:global_step/sec: 1.13129


I0601 08:51:47.260219 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13129


INFO:tensorflow:loss = 0.6741233, step = 2200 (88.390 sec)


I0601 08:51:47.262715 139940245366656 basic_session_run_hooks.py:247] loss = 0.6741233, step = 2200 (88.390 sec)


INFO:tensorflow:global_step/sec: 1.13239


I0601 08:53:15.568989 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13239


INFO:tensorflow:loss = 0.2926872, step = 2300 (88.312 sec)


I0601 08:53:15.575224 139940245366656 basic_session_run_hooks.py:247] loss = 0.2926872, step = 2300 (88.312 sec)


INFO:tensorflow:global_step/sec: 1.13392


I0601 08:54:43.758966 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13392


INFO:tensorflow:loss = 0.5156459, step = 2400 (88.188 sec)


I0601 08:54:43.763658 139940245366656 basic_session_run_hooks.py:247] loss = 0.5156459, step = 2400 (88.188 sec)


INFO:tensorflow:Saving checkpoints for 2500 into bert_story_cloze_usc/model.ckpt.


I0601 08:56:11.171825 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 2500 into bert_story_cloze_usc/model.ckpt.


W0601 08:56:19.203486 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.


INFO:tensorflow:global_step/sec: 1.00008


I0601 08:56:23.750699 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00008


INFO:tensorflow:loss = 0.43178302, step = 2500 (99.993 sec)


I0601 08:56:23.756346 139940245366656 basic_session_run_hooks.py:247] loss = 0.43178302, step = 2500 (99.993 sec)


INFO:tensorflow:global_step/sec: 1.12922


I0601 08:57:52.307501 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12922


INFO:tensorflow:loss = 0.36364433, step = 2600 (88.554 sec)


I0601 08:57:52.310117 139940245366656 basic_session_run_hooks.py:247] loss = 0.36364433, step = 2600 (88.554 sec)


INFO:tensorflow:global_step/sec: 1.13495


I0601 08:59:20.417153 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13495


INFO:tensorflow:loss = 0.4373616, step = 2700 (88.109 sec)


I0601 08:59:20.419522 139940245366656 basic_session_run_hooks.py:247] loss = 0.4373616, step = 2700 (88.109 sec)


INFO:tensorflow:global_step/sec: 1.13444


I0601 09:00:48.566468 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13444


INFO:tensorflow:loss = 0.52648854, step = 2800 (88.152 sec)


I0601 09:00:48.571433 139940245366656 basic_session_run_hooks.py:247] loss = 0.52648854, step = 2800 (88.152 sec)


INFO:tensorflow:global_step/sec: 1.13685


I0601 09:02:16.528589 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13685


INFO:tensorflow:loss = 0.3208322, step = 2900 (87.962 sec)


I0601 09:02:16.533668 139940245366656 basic_session_run_hooks.py:247] loss = 0.3208322, step = 2900 (87.962 sec)


INFO:tensorflow:Saving checkpoints for 3000 into bert_story_cloze_usc/model.ckpt.


I0601 09:03:43.774582 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 3000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00409


I0601 09:03:56.121121 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00409


INFO:tensorflow:loss = 0.3794538, step = 3000 (99.592 sec)


I0601 09:03:56.125643 139940245366656 basic_session_run_hooks.py:247] loss = 0.3794538, step = 3000 (99.592 sec)


INFO:tensorflow:global_step/sec: 1.12741


I0601 09:05:24.820590 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12741


INFO:tensorflow:loss = 0.3355801, step = 3100 (88.702 sec)


I0601 09:05:24.827327 139940245366656 basic_session_run_hooks.py:247] loss = 0.3355801, step = 3100 (88.702 sec)


INFO:tensorflow:global_step/sec: 1.13413


I0601 09:06:52.993559 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13413


INFO:tensorflow:loss = 0.46160346, step = 3200 (88.169 sec)


I0601 09:06:52.995939 139940245366656 basic_session_run_hooks.py:247] loss = 0.46160346, step = 3200 (88.169 sec)


INFO:tensorflow:global_step/sec: 1.13497


I0601 09:08:21.101633 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13497


INFO:tensorflow:loss = 0.38912216, step = 3300 (88.108 sec)


I0601 09:08:21.103838 139940245366656 basic_session_run_hooks.py:247] loss = 0.38912216, step = 3300 (88.108 sec)


INFO:tensorflow:global_step/sec: 1.1325


I0601 09:09:49.402054 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1325


INFO:tensorflow:loss = 0.43936306, step = 3400 (88.301 sec)


I0601 09:09:49.404404 139940245366656 basic_session_run_hooks.py:247] loss = 0.43936306, step = 3400 (88.301 sec)


INFO:tensorflow:Saving checkpoints for 3500 into bert_story_cloze_usc/model.ckpt.


I0601 09:11:16.813041 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 3500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.999757


I0601 09:11:29.426347 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.999757


INFO:tensorflow:loss = 0.4541797, step = 3500 (100.025 sec)


I0601 09:11:29.428971 139940245366656 basic_session_run_hooks.py:247] loss = 0.4541797, step = 3500 (100.025 sec)


INFO:tensorflow:global_step/sec: 1.12839


I0601 09:12:58.048188 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12839


INFO:tensorflow:loss = 0.56471616, step = 3600 (88.623 sec)


I0601 09:12:58.051601 139940245366656 basic_session_run_hooks.py:247] loss = 0.56471616, step = 3600 (88.623 sec)


INFO:tensorflow:global_step/sec: 1.13021


I0601 09:14:26.527653 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13021


INFO:tensorflow:loss = 0.3957388, step = 3700 (88.482 sec)


I0601 09:14:26.533359 139940245366656 basic_session_run_hooks.py:247] loss = 0.3957388, step = 3700 (88.482 sec)


INFO:tensorflow:global_step/sec: 1.13128


I0601 09:15:54.922997 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13128


INFO:tensorflow:loss = 0.3332871, step = 3800 (88.392 sec)


I0601 09:15:54.925167 139940245366656 basic_session_run_hooks.py:247] loss = 0.3332871, step = 3800 (88.392 sec)


INFO:tensorflow:global_step/sec: 1.13423


I0601 09:17:23.088629 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13423


INFO:tensorflow:loss = 0.50094855, step = 3900 (88.168 sec)


I0601 09:17:23.092904 139940245366656 basic_session_run_hooks.py:247] loss = 0.50094855, step = 3900 (88.168 sec)


INFO:tensorflow:Saving checkpoints for 4000 into bert_story_cloze_usc/model.ckpt.


I0601 09:18:50.465529 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 4000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00162


I0601 09:19:02.927061 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00162


INFO:tensorflow:loss = 0.40507075, step = 4000 (99.837 sec)


I0601 09:19:02.929573 139940245366656 basic_session_run_hooks.py:247] loss = 0.40507075, step = 4000 (99.837 sec)


INFO:tensorflow:global_step/sec: 1.12893


I0601 09:20:31.506284 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12893


INFO:tensorflow:loss = 0.3865024, step = 4100 (88.581 sec)


I0601 09:20:31.511059 139940245366656 basic_session_run_hooks.py:247] loss = 0.3865024, step = 4100 (88.581 sec)


INFO:tensorflow:global_step/sec: 1.13296


I0601 09:21:59.770405 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13296


INFO:tensorflow:loss = 0.34759477, step = 4200 (88.262 sec)


I0601 09:21:59.772612 139940245366656 basic_session_run_hooks.py:247] loss = 0.34759477, step = 4200 (88.262 sec)


INFO:tensorflow:global_step/sec: 1.13377


I0601 09:23:27.971952 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13377


INFO:tensorflow:loss = 0.47700846, step = 4300 (88.211 sec)


I0601 09:23:27.983793 139940245366656 basic_session_run_hooks.py:247] loss = 0.47700846, step = 4300 (88.211 sec)


INFO:tensorflow:global_step/sec: 1.13532


I0601 09:24:56.053138 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13532


INFO:tensorflow:loss = 0.4378821, step = 4400 (88.076 sec)


I0601 09:24:56.059314 139940245366656 basic_session_run_hooks.py:247] loss = 0.4378821, step = 4400 (88.076 sec)


INFO:tensorflow:Saving checkpoints for 4500 into bert_story_cloze_usc/model.ckpt.


I0601 09:26:23.378437 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 4500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00109


I0601 09:26:35.943860 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00109


INFO:tensorflow:loss = 0.2710629, step = 4500 (99.889 sec)


I0601 09:26:35.948140 139940245366656 basic_session_run_hooks.py:247] loss = 0.2710629, step = 4500 (99.889 sec)


INFO:tensorflow:global_step/sec: 1.12784


I0601 09:28:04.609253 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12784


INFO:tensorflow:loss = 0.44726634, step = 4600 (88.665 sec)


I0601 09:28:04.613263 139940245366656 basic_session_run_hooks.py:247] loss = 0.44726634, step = 4600 (88.665 sec)


INFO:tensorflow:global_step/sec: 1.13329


I0601 09:29:32.848146 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13329


INFO:tensorflow:loss = 0.4675079, step = 4700 (88.238 sec)


I0601 09:29:32.851637 139940245366656 basic_session_run_hooks.py:247] loss = 0.4675079, step = 4700 (88.238 sec)


INFO:tensorflow:global_step/sec: 1.13348


I0601 09:31:01.071953 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13348


INFO:tensorflow:loss = 0.6270321, step = 4800 (88.225 sec)


I0601 09:31:01.076680 139940245366656 basic_session_run_hooks.py:247] loss = 0.6270321, step = 4800 (88.225 sec)


INFO:tensorflow:global_step/sec: 1.13581


I0601 09:32:29.114867 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13581


INFO:tensorflow:loss = 0.34747392, step = 4900 (88.043 sec)


I0601 09:32:29.120096 139940245366656 basic_session_run_hooks.py:247] loss = 0.34747392, step = 4900 (88.043 sec)


INFO:tensorflow:Saving checkpoints for 5000 into bert_story_cloze_usc/model.ckpt.


I0601 09:33:56.173475 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 5000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00346


I0601 09:34:08.770462 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00346


INFO:tensorflow:loss = 0.36054248, step = 5000 (99.656 sec)


I0601 09:34:08.775770 139940245366656 basic_session_run_hooks.py:247] loss = 0.36054248, step = 5000 (99.656 sec)


INFO:tensorflow:global_step/sec: 1.12752


I0601 09:35:37.460593 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12752


INFO:tensorflow:loss = 0.29258066, step = 5100 (88.693 sec)


I0601 09:35:37.468568 139940245366656 basic_session_run_hooks.py:247] loss = 0.29258066, step = 5100 (88.693 sec)


INFO:tensorflow:global_step/sec: 1.13363


I0601 09:37:05.672212 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13363


INFO:tensorflow:loss = 0.40564686, step = 5200 (88.208 sec)


I0601 09:37:05.676174 139940245366656 basic_session_run_hooks.py:247] loss = 0.40564686, step = 5200 (88.208 sec)


INFO:tensorflow:global_step/sec: 1.13656


I0601 09:38:33.656799 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13656


INFO:tensorflow:loss = 0.5436486, step = 5300 (87.985 sec)


I0601 09:38:33.661307 139940245366656 basic_session_run_hooks.py:247] loss = 0.5436486, step = 5300 (87.985 sec)


INFO:tensorflow:global_step/sec: 1.13354


I0601 09:40:01.875951 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13354


INFO:tensorflow:loss = 0.30060935, step = 5400 (88.217 sec)


I0601 09:40:01.878230 139940245366656 basic_session_run_hooks.py:247] loss = 0.30060935, step = 5400 (88.217 sec)


INFO:tensorflow:Saving checkpoints for 5500 into bert_story_cloze_usc/model.ckpt.


I0601 09:41:29.072517 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 5500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00201


I0601 09:41:41.675109 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00201


INFO:tensorflow:loss = 0.47437295, step = 5500 (99.799 sec)


I0601 09:41:41.677102 139940245366656 basic_session_run_hooks.py:247] loss = 0.47437295, step = 5500 (99.799 sec)


INFO:tensorflow:global_step/sec: 1.12889


I0601 09:43:10.258129 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12889


INFO:tensorflow:loss = 0.41083828, step = 5600 (88.586 sec)


I0601 09:43:10.262959 139940245366656 basic_session_run_hooks.py:247] loss = 0.41083828, step = 5600 (88.586 sec)


INFO:tensorflow:global_step/sec: 1.13366


I0601 09:44:38.468288 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13366


INFO:tensorflow:loss = 0.23687021, step = 5700 (88.208 sec)


I0601 09:44:38.470444 139940245366656 basic_session_run_hooks.py:247] loss = 0.23687021, step = 5700 (88.208 sec)


INFO:tensorflow:global_step/sec: 1.13718


I0601 09:46:06.405481 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13718


INFO:tensorflow:loss = 0.35686308, step = 5800 (87.938 sec)


I0601 09:46:06.408620 139940245366656 basic_session_run_hooks.py:247] loss = 0.35686308, step = 5800 (87.938 sec)


INFO:tensorflow:global_step/sec: 1.13654


I0601 09:47:34.391634 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13654


INFO:tensorflow:loss = 0.43126565, step = 5900 (87.985 sec)


I0601 09:47:34.393915 139940245366656 basic_session_run_hooks.py:247] loss = 0.43126565, step = 5900 (87.985 sec)


INFO:tensorflow:Saving checkpoints for 6000 into bert_story_cloze_usc/model.ckpt.


I0601 09:49:01.603432 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 6000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.999617


I0601 09:49:14.429904 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.999617


INFO:tensorflow:loss = 0.27974242, step = 6000 (100.038 sec)


I0601 09:49:14.432336 139940245366656 basic_session_run_hooks.py:247] loss = 0.27974242, step = 6000 (100.038 sec)


INFO:tensorflow:global_step/sec: 1.12832


I0601 09:50:43.057485 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12832


INFO:tensorflow:loss = 0.3152129, step = 6100 (88.628 sec)


I0601 09:50:43.060379 139940245366656 basic_session_run_hooks.py:247] loss = 0.3152129, step = 6100 (88.628 sec)


INFO:tensorflow:global_step/sec: 1.13424


I0601 09:52:11.222133 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13424


INFO:tensorflow:loss = 0.40401343, step = 6200 (88.169 sec)


I0601 09:52:11.229342 139940245366656 basic_session_run_hooks.py:247] loss = 0.40401343, step = 6200 (88.169 sec)


INFO:tensorflow:global_step/sec: 1.13704


I0601 09:53:39.169503 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13704


INFO:tensorflow:loss = 0.3029573, step = 6300 (87.943 sec)


I0601 09:53:39.172029 139940245366656 basic_session_run_hooks.py:247] loss = 0.3029573, step = 6300 (87.943 sec)


INFO:tensorflow:global_step/sec: 1.13462


I0601 09:55:07.304636 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13462


INFO:tensorflow:loss = 0.5002052, step = 6400 (88.139 sec)


I0601 09:55:07.310674 139940245366656 basic_session_run_hooks.py:247] loss = 0.5002052, step = 6400 (88.139 sec)


INFO:tensorflow:Saving checkpoints for 6500 into bert_story_cloze_usc/model.ckpt.


I0601 09:56:34.480550 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 6500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00193


I0601 09:56:47.112389 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00193


INFO:tensorflow:loss = 0.37610075, step = 6500 (99.806 sec)


I0601 09:56:47.117088 139940245366656 basic_session_run_hooks.py:247] loss = 0.37610075, step = 6500 (99.806 sec)


INFO:tensorflow:global_step/sec: 1.12668


I0601 09:58:15.869163 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12668


INFO:tensorflow:loss = 0.4665627, step = 6600 (88.757 sec)


I0601 09:58:15.873934 139940245366656 basic_session_run_hooks.py:247] loss = 0.4665627, step = 6600 (88.757 sec)


INFO:tensorflow:global_step/sec: 1.13338


I0601 09:59:44.101209 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13338


INFO:tensorflow:loss = 0.31337404, step = 6700 (88.232 sec)


I0601 09:59:44.106163 139940245366656 basic_session_run_hooks.py:247] loss = 0.31337404, step = 6700 (88.232 sec)


INFO:tensorflow:global_step/sec: 1.13309


I0601 10:01:12.355630 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13309


INFO:tensorflow:loss = 0.28793126, step = 6800 (88.255 sec)


I0601 10:01:12.361513 139940245366656 basic_session_run_hooks.py:247] loss = 0.28793126, step = 6800 (88.255 sec)


INFO:tensorflow:global_step/sec: 1.1323


I0601 10:02:40.671438 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1323


INFO:tensorflow:loss = 0.17237204, step = 6900 (88.318 sec)


I0601 10:02:40.679773 139940245366656 basic_session_run_hooks.py:247] loss = 0.17237204, step = 6900 (88.318 sec)


INFO:tensorflow:Saving checkpoints for 7000 into bert_story_cloze_usc/model.ckpt.


I0601 10:04:08.003457 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 7000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00037


I0601 10:04:20.633975 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00037


INFO:tensorflow:loss = 0.3051492, step = 7000 (99.958 sec)


I0601 10:04:20.637526 139940245366656 basic_session_run_hooks.py:247] loss = 0.3051492, step = 7000 (99.958 sec)


INFO:tensorflow:global_step/sec: 1.1275


I0601 10:05:49.325469 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1275


INFO:tensorflow:loss = 0.19397856, step = 7100 (88.690 sec)


I0601 10:05:49.327594 139940245366656 basic_session_run_hooks.py:247] loss = 0.19397856, step = 7100 (88.690 sec)


INFO:tensorflow:global_step/sec: 1.13332


I0601 10:07:17.561525 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13332


INFO:tensorflow:loss = 0.26287577, step = 7200 (88.240 sec)


I0601 10:07:17.567654 139940245366656 basic_session_run_hooks.py:247] loss = 0.26287577, step = 7200 (88.240 sec)


INFO:tensorflow:global_step/sec: 1.13503


I0601 10:08:45.664675 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13503


INFO:tensorflow:loss = 0.30078268, step = 7300 (88.103 sec)


I0601 10:08:45.670222 139940245366656 basic_session_run_hooks.py:247] loss = 0.30078268, step = 7300 (88.103 sec)


INFO:tensorflow:global_step/sec: 1.13534


I0601 10:10:13.744088 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13534


INFO:tensorflow:loss = 0.10658298, step = 7400 (88.078 sec)


I0601 10:10:13.748397 139940245366656 basic_session_run_hooks.py:247] loss = 0.10658298, step = 7400 (88.078 sec)


INFO:tensorflow:Saving checkpoints for 7500 into bert_story_cloze_usc/model.ckpt.


I0601 10:11:41.103529 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 7500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.0027


I0601 10:11:53.474529 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.0027


INFO:tensorflow:loss = 0.30001718, step = 7500 (99.732 sec)


I0601 10:11:53.480414 139940245366656 basic_session_run_hooks.py:247] loss = 0.30001718, step = 7500 (99.732 sec)


INFO:tensorflow:global_step/sec: 1.12946


I0601 10:13:22.012571 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12946


INFO:tensorflow:loss = 0.39251298, step = 7600 (88.535 sec)


I0601 10:13:22.015064 139940245366656 basic_session_run_hooks.py:247] loss = 0.39251298, step = 7600 (88.535 sec)


INFO:tensorflow:global_step/sec: 1.13217


I0601 10:14:50.338360 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13217


INFO:tensorflow:loss = 0.170849, step = 7700 (88.331 sec)


I0601 10:14:50.346115 139940245366656 basic_session_run_hooks.py:247] loss = 0.170849, step = 7700 (88.331 sec)


INFO:tensorflow:global_step/sec: 1.13353


I0601 10:16:18.558341 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13353


INFO:tensorflow:loss = 0.13963507, step = 7800 (88.216 sec)


I0601 10:16:18.562585 139940245366656 basic_session_run_hooks.py:247] loss = 0.13963507, step = 7800 (88.216 sec)


INFO:tensorflow:global_step/sec: 1.13611


I0601 10:17:46.577959 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13611


INFO:tensorflow:loss = 0.27359766, step = 7900 (88.020 sec)


I0601 10:17:46.582415 139940245366656 basic_session_run_hooks.py:247] loss = 0.27359766, step = 7900 (88.020 sec)


INFO:tensorflow:Saving checkpoints for 8000 into bert_story_cloze_usc/model.ckpt.


I0601 10:19:13.676105 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 8000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00323


I0601 10:19:26.255537 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00323


INFO:tensorflow:loss = 0.3617409, step = 8000 (99.679 sec)


I0601 10:19:26.261570 139940245366656 basic_session_run_hooks.py:247] loss = 0.3617409, step = 8000 (99.679 sec)


INFO:tensorflow:global_step/sec: 1.12902


I0601 10:20:54.828024 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12902


INFO:tensorflow:loss = 0.32557335, step = 8100 (88.573 sec)


I0601 10:20:54.834285 139940245366656 basic_session_run_hooks.py:247] loss = 0.32557335, step = 8100 (88.573 sec)


INFO:tensorflow:global_step/sec: 1.13379


I0601 10:22:23.027870 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13379


INFO:tensorflow:loss = 0.29277432, step = 8200 (88.197 sec)


I0601 10:22:23.031714 139940245366656 basic_session_run_hooks.py:247] loss = 0.29277432, step = 8200 (88.197 sec)


INFO:tensorflow:global_step/sec: 1.1324


I0601 10:23:51.335854 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1324


INFO:tensorflow:loss = 0.29680425, step = 8300 (88.311 sec)


I0601 10:23:51.342646 139940245366656 basic_session_run_hooks.py:247] loss = 0.29680425, step = 8300 (88.311 sec)


INFO:tensorflow:global_step/sec: 1.136


I0601 10:25:19.364232 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.136


INFO:tensorflow:loss = 0.2686693, step = 8400 (88.026 sec)


I0601 10:25:19.368960 139940245366656 basic_session_run_hooks.py:247] loss = 0.2686693, step = 8400 (88.026 sec)


INFO:tensorflow:Saving checkpoints for 8500 into bert_story_cloze_usc/model.ckpt.


I0601 10:26:46.657106 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 8500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.99959


I0601 10:26:59.405340 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.99959


INFO:tensorflow:loss = 0.39872694, step = 8500 (100.044 sec)


I0601 10:26:59.413115 139940245366656 basic_session_run_hooks.py:247] loss = 0.39872694, step = 8500 (100.044 sec)


INFO:tensorflow:global_step/sec: 1.12813


I0601 10:28:28.047392 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12813


INFO:tensorflow:loss = 0.2944889, step = 8600 (88.639 sec)


I0601 10:28:28.052418 139940245366656 basic_session_run_hooks.py:247] loss = 0.2944889, step = 8600 (88.639 sec)


INFO:tensorflow:global_step/sec: 1.13418


I0601 10:29:56.217018 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13418


INFO:tensorflow:loss = 0.17361563, step = 8700 (88.169 sec)


I0601 10:29:56.220866 139940245366656 basic_session_run_hooks.py:247] loss = 0.17361563, step = 8700 (88.169 sec)


INFO:tensorflow:global_step/sec: 1.13399


I0601 10:31:24.401074 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13399


INFO:tensorflow:loss = 0.17466132, step = 8800 (88.183 sec)


I0601 10:31:24.403468 139940245366656 basic_session_run_hooks.py:247] loss = 0.17466132, step = 8800 (88.183 sec)


INFO:tensorflow:global_step/sec: 1.13435


I0601 10:32:52.557412 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13435


INFO:tensorflow:loss = 0.3181656, step = 8900 (88.156 sec)


I0601 10:32:52.559628 139940245366656 basic_session_run_hooks.py:247] loss = 0.3181656, step = 8900 (88.156 sec)


INFO:tensorflow:Saving checkpoints for 9000 into bert_story_cloze_usc/model.ckpt.


I0601 10:34:19.780453 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 9000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00112


I0601 10:34:32.445492 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00112


INFO:tensorflow:loss = 0.13036309, step = 9000 (99.895 sec)


I0601 10:34:32.454297 139940245366656 basic_session_run_hooks.py:247] loss = 0.13036309, step = 9000 (99.895 sec)


INFO:tensorflow:global_step/sec: 1.12856


I0601 10:36:01.054171 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12856


INFO:tensorflow:loss = 0.39478803, step = 9100 (88.606 sec)


I0601 10:36:01.060244 139940245366656 basic_session_run_hooks.py:247] loss = 0.39478803, step = 9100 (88.606 sec)


INFO:tensorflow:global_step/sec: 1.1344


I0601 10:37:29.206567 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1344


INFO:tensorflow:loss = 0.4199794, step = 9200 (88.151 sec)


I0601 10:37:29.210952 139940245366656 basic_session_run_hooks.py:247] loss = 0.4199794, step = 9200 (88.151 sec)


INFO:tensorflow:global_step/sec: 1.13682


I0601 10:38:57.171121 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13682


INFO:tensorflow:loss = 0.3875946, step = 9300 (87.965 sec)


I0601 10:38:57.175671 139940245366656 basic_session_run_hooks.py:247] loss = 0.3875946, step = 9300 (87.965 sec)


INFO:tensorflow:global_step/sec: 1.13384


I0601 10:40:25.367080 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13384


INFO:tensorflow:loss = 0.3340182, step = 9400 (88.197 sec)


I0601 10:40:25.373058 139940245366656 basic_session_run_hooks.py:247] loss = 0.3340182, step = 9400 (88.197 sec)


INFO:tensorflow:Saving checkpoints for 9500 into bert_story_cloze_usc/model.ckpt.


I0601 10:41:52.894972 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 9500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.998482


I0601 10:42:05.519134 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.998482


INFO:tensorflow:loss = 0.27870774, step = 9500 (100.150 sec)


I0601 10:42:05.523445 139940245366656 basic_session_run_hooks.py:247] loss = 0.27870774, step = 9500 (100.150 sec)


INFO:tensorflow:global_step/sec: 1.14345


I0601 10:43:32.973670 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.14345


INFO:tensorflow:loss = 0.5667405, step = 9600 (87.454 sec)


I0601 10:43:32.977463 139940245366656 basic_session_run_hooks.py:247] loss = 0.5667405, step = 9600 (87.454 sec)


INFO:tensorflow:global_step/sec: 1.14402


I0601 10:45:00.384633 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.14402


INFO:tensorflow:loss = 0.42810518, step = 9700 (87.411 sec)


I0601 10:45:00.388040 139940245366656 basic_session_run_hooks.py:247] loss = 0.42810518, step = 9700 (87.411 sec)


INFO:tensorflow:global_step/sec: 1.14842


I0601 10:46:27.460900 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.14842


INFO:tensorflow:loss = 0.28382826, step = 9800 (87.079 sec)


I0601 10:46:27.466807 139940245366656 basic_session_run_hooks.py:247] loss = 0.28382826, step = 9800 (87.079 sec)


INFO:tensorflow:global_step/sec: 1.13278


I0601 10:47:55.739189 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13278


INFO:tensorflow:loss = 0.32114255, step = 9900 (88.278 sec)


I0601 10:47:55.744651 139940245366656 basic_session_run_hooks.py:247] loss = 0.32114255, step = 9900 (88.278 sec)


INFO:tensorflow:Saving checkpoints for 10000 into bert_story_cloze_usc/model.ckpt.


I0601 10:49:23.061892 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 10000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00002


I0601 10:49:35.736840 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00002


INFO:tensorflow:loss = 0.3828555, step = 10000 (99.996 sec)


I0601 10:49:35.741058 139940245366656 basic_session_run_hooks.py:247] loss = 0.3828555, step = 10000 (99.996 sec)


INFO:tensorflow:global_step/sec: 1.12737


I0601 10:51:04.438951 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12737


INFO:tensorflow:loss = 0.42766485, step = 10100 (88.701 sec)


I0601 10:51:04.442377 139940245366656 basic_session_run_hooks.py:247] loss = 0.42766485, step = 10100 (88.701 sec)


INFO:tensorflow:global_step/sec: 1.13289


I0601 10:52:32.708778 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13289


INFO:tensorflow:loss = 0.21888511, step = 10200 (88.270 sec)


I0601 10:52:32.712399 139940245366656 basic_session_run_hooks.py:247] loss = 0.21888511, step = 10200 (88.270 sec)


INFO:tensorflow:global_step/sec: 1.1325


I0601 10:54:01.008838 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1325


INFO:tensorflow:loss = 0.2783426, step = 10300 (88.301 sec)


I0601 10:54:01.012796 139940245366656 basic_session_run_hooks.py:247] loss = 0.2783426, step = 10300 (88.301 sec)


INFO:tensorflow:global_step/sec: 1.13195


I0601 10:55:29.351764 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13195


INFO:tensorflow:loss = 0.21943986, step = 10400 (88.348 sec)


I0601 10:55:29.360916 139940245366656 basic_session_run_hooks.py:247] loss = 0.21943986, step = 10400 (88.348 sec)


INFO:tensorflow:Saving checkpoints for 10500 into bert_story_cloze_usc/model.ckpt.


I0601 10:56:56.853022 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 10500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.997534


I0601 10:57:09.598925 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.997534


INFO:tensorflow:loss = 0.32716256, step = 10500 (100.241 sec)


I0601 10:57:09.602097 139940245366656 basic_session_run_hooks.py:247] loss = 0.32716256, step = 10500 (100.241 sec)


INFO:tensorflow:global_step/sec: 1.12777


I0601 10:58:38.269764 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12777


INFO:tensorflow:loss = 0.33191863, step = 10600 (88.670 sec)


I0601 10:58:38.272197 139940245366656 basic_session_run_hooks.py:247] loss = 0.33191863, step = 10600 (88.670 sec)


INFO:tensorflow:global_step/sec: 1.13351


I0601 11:00:06.491698 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13351


INFO:tensorflow:loss = 0.31869453, step = 10700 (88.222 sec)


I0601 11:00:06.493893 139940245366656 basic_session_run_hooks.py:247] loss = 0.31869453, step = 10700 (88.222 sec)


INFO:tensorflow:global_step/sec: 1.13563


I0601 11:01:34.548589 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13563


INFO:tensorflow:loss = 0.27886963, step = 10800 (88.061 sec)


I0601 11:01:34.554467 139940245366656 basic_session_run_hooks.py:247] loss = 0.27886963, step = 10800 (88.061 sec)


INFO:tensorflow:global_step/sec: 1.13584


I0601 11:03:02.589282 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13584


INFO:tensorflow:loss = 0.3134951, step = 10900 (88.038 sec)


I0601 11:03:02.592860 139940245366656 basic_session_run_hooks.py:247] loss = 0.3134951, step = 10900 (88.038 sec)


INFO:tensorflow:Saving checkpoints for 11000 into bert_story_cloze_usc/model.ckpt.


I0601 11:04:30.026006 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 11000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00028


I0601 11:04:42.561212 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00028


INFO:tensorflow:loss = 0.42159447, step = 11000 (99.973 sec)


I0601 11:04:42.565630 139940245366656 basic_session_run_hooks.py:247] loss = 0.42159447, step = 11000 (99.973 sec)


INFO:tensorflow:global_step/sec: 1.1294


I0601 11:06:11.104087 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1294


INFO:tensorflow:loss = 0.37511587, step = 11100 (88.541 sec)


I0601 11:06:11.106325 139940245366656 basic_session_run_hooks.py:247] loss = 0.37511587, step = 11100 (88.541 sec)


INFO:tensorflow:global_step/sec: 1.13335


I0601 11:07:39.337945 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13335


INFO:tensorflow:loss = 0.3082881, step = 11200 (88.234 sec)


I0601 11:07:39.340317 139940245366656 basic_session_run_hooks.py:247] loss = 0.3082881, step = 11200 (88.234 sec)


INFO:tensorflow:global_step/sec: 1.13744


I0601 11:09:07.254643 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13744


INFO:tensorflow:loss = 0.1079683, step = 11300 (87.920 sec)


I0601 11:09:07.259974 139940245366656 basic_session_run_hooks.py:247] loss = 0.1079683, step = 11300 (87.920 sec)


INFO:tensorflow:global_step/sec: 1.13567


I0601 11:10:35.308530 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13567


INFO:tensorflow:loss = 0.26682624, step = 11400 (88.053 sec)


I0601 11:10:35.312949 139940245366656 basic_session_run_hooks.py:247] loss = 0.26682624, step = 11400 (88.053 sec)


INFO:tensorflow:Saving checkpoints for 11500 into bert_story_cloze_usc/model.ckpt.


I0601 11:12:02.626929 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 11500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00155


I0601 11:12:15.154109 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00155


INFO:tensorflow:loss = 0.1309683, step = 11500 (99.847 sec)


I0601 11:12:15.159708 139940245366656 basic_session_run_hooks.py:247] loss = 0.1309683, step = 11500 (99.847 sec)


INFO:tensorflow:global_step/sec: 1.12761


I0601 11:13:43.837116 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12761


INFO:tensorflow:loss = 0.20792305, step = 11600 (88.682 sec)


I0601 11:13:43.841551 139940245366656 basic_session_run_hooks.py:247] loss = 0.20792305, step = 11600 (88.682 sec)


INFO:tensorflow:global_step/sec: 1.13415


I0601 11:15:12.008682 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13415


INFO:tensorflow:loss = 0.33503306, step = 11700 (88.174 sec)


I0601 11:15:12.015355 139940245366656 basic_session_run_hooks.py:247] loss = 0.33503306, step = 11700 (88.174 sec)


INFO:tensorflow:global_step/sec: 1.13418


I0601 11:16:40.178149 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13418


INFO:tensorflow:loss = 0.21410964, step = 11800 (88.167 sec)


I0601 11:16:40.181935 139940245366656 basic_session_run_hooks.py:247] loss = 0.21410964, step = 11800 (88.167 sec)


INFO:tensorflow:global_step/sec: 1.13206


I0601 11:18:08.512995 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13206


INFO:tensorflow:loss = 0.30793026, step = 11900 (88.337 sec)


I0601 11:18:08.518766 139940245366656 basic_session_run_hooks.py:247] loss = 0.30793026, step = 11900 (88.337 sec)


INFO:tensorflow:Saving checkpoints for 12000 into bert_story_cloze_usc/model.ckpt.


I0601 11:19:35.940714 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 12000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00209


I0601 11:19:48.304617 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00209


INFO:tensorflow:loss = 0.36752054, step = 12000 (99.788 sec)


I0601 11:19:48.306682 139940245366656 basic_session_run_hooks.py:247] loss = 0.36752054, step = 12000 (99.788 sec)


INFO:tensorflow:global_step/sec: 1.12853


I0601 11:21:16.915819 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12853


INFO:tensorflow:loss = 0.098972976, step = 12100 (88.613 sec)


I0601 11:21:16.919991 139940245366656 basic_session_run_hooks.py:247] loss = 0.098972976, step = 12100 (88.613 sec)


INFO:tensorflow:global_step/sec: 1.13416


I0601 11:22:45.086955 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13416


INFO:tensorflow:loss = 0.32491577, step = 12200 (88.174 sec)


I0601 11:22:45.093763 139940245366656 basic_session_run_hooks.py:247] loss = 0.32491577, step = 12200 (88.174 sec)


INFO:tensorflow:global_step/sec: 1.13618


I0601 11:24:13.101253 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13618


INFO:tensorflow:loss = 0.09933954, step = 12300 (88.011 sec)


I0601 11:24:13.105212 139940245366656 basic_session_run_hooks.py:247] loss = 0.09933954, step = 12300 (88.011 sec)


INFO:tensorflow:global_step/sec: 1.13384


I0601 11:25:41.297239 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13384


INFO:tensorflow:loss = 0.23884627, step = 12400 (88.194 sec)


I0601 11:25:41.299515 139940245366656 basic_session_run_hooks.py:247] loss = 0.23884627, step = 12400 (88.194 sec)


INFO:tensorflow:Saving checkpoints for 12500 into bert_story_cloze_usc/model.ckpt.


I0601 11:27:08.633297 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 12500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00075


I0601 11:27:21.221801 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00075


INFO:tensorflow:loss = 0.09919145, step = 12500 (99.927 sec)


I0601 11:27:21.226543 139940245366656 basic_session_run_hooks.py:247] loss = 0.09919145, step = 12500 (99.927 sec)


INFO:tensorflow:global_step/sec: 1.12866


I0601 11:28:49.822702 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12866


INFO:tensorflow:loss = 0.17009851, step = 12600 (88.603 sec)


I0601 11:28:49.829456 139940245366656 basic_session_run_hooks.py:247] loss = 0.17009851, step = 12600 (88.603 sec)


INFO:tensorflow:global_step/sec: 1.13284


I0601 11:30:18.096443 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13284


INFO:tensorflow:loss = 0.17374843, step = 12700 (88.273 sec)


I0601 11:30:18.102360 139940245366656 basic_session_run_hooks.py:247] loss = 0.17374843, step = 12700 (88.273 sec)


INFO:tensorflow:global_step/sec: 1.13682


I0601 11:31:46.061394 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13682


INFO:tensorflow:loss = 0.1398445, step = 12800 (87.963 sec)


I0601 11:31:46.065716 139940245366656 basic_session_run_hooks.py:247] loss = 0.1398445, step = 12800 (87.963 sec)


INFO:tensorflow:global_step/sec: 1.13179


I0601 11:33:14.416963 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13179


INFO:tensorflow:loss = 0.16550194, step = 12900 (88.356 sec)


I0601 11:33:14.422125 139940245366656 basic_session_run_hooks.py:247] loss = 0.16550194, step = 12900 (88.356 sec)


INFO:tensorflow:Saving checkpoints for 13000 into bert_story_cloze_usc/model.ckpt.


I0601 11:34:41.822677 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 13000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00054


I0601 11:34:54.362452 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00054


INFO:tensorflow:loss = 0.39585102, step = 13000 (99.944 sec)


I0601 11:34:54.366150 139940245366656 basic_session_run_hooks.py:247] loss = 0.39585102, step = 13000 (99.944 sec)


INFO:tensorflow:global_step/sec: 1.12793


I0601 11:36:23.020398 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12793


INFO:tensorflow:loss = 0.49606445, step = 13100 (88.659 sec)


I0601 11:36:23.025469 139940245366656 basic_session_run_hooks.py:247] loss = 0.49606445, step = 13100 (88.659 sec)


INFO:tensorflow:global_step/sec: 1.13418


I0601 11:37:51.189826 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13418


INFO:tensorflow:loss = 0.5749501, step = 13200 (88.167 sec)


I0601 11:37:51.192378 139940245366656 basic_session_run_hooks.py:247] loss = 0.5749501, step = 13200 (88.167 sec)


INFO:tensorflow:global_step/sec: 1.13302


I0601 11:39:19.449369 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13302


INFO:tensorflow:loss = 0.10749458, step = 13300 (88.267 sec)


I0601 11:39:19.459173 139940245366656 basic_session_run_hooks.py:247] loss = 0.10749458, step = 13300 (88.267 sec)


INFO:tensorflow:global_step/sec: 1.13215


I0601 11:40:47.776815 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13215


INFO:tensorflow:loss = 0.32188097, step = 13400 (88.322 sec)


I0601 11:40:47.781214 139940245366656 basic_session_run_hooks.py:247] loss = 0.32188097, step = 13400 (88.322 sec)


INFO:tensorflow:Saving checkpoints for 13500 into bert_story_cloze_usc/model.ckpt.


I0601 11:42:15.113152 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 13500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00122


I0601 11:42:27.654600 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00122


INFO:tensorflow:loss = 0.15604255, step = 13500 (99.877 sec)


I0601 11:42:27.658572 139940245366656 basic_session_run_hooks.py:247] loss = 0.15604255, step = 13500 (99.877 sec)


INFO:tensorflow:global_step/sec: 1.13022


I0601 11:43:56.132719 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13022


INFO:tensorflow:loss = 0.15065081, step = 13600 (88.477 sec)


I0601 11:43:56.136067 139940245366656 basic_session_run_hooks.py:247] loss = 0.15065081, step = 13600 (88.477 sec)


INFO:tensorflow:global_step/sec: 1.13167


I0601 11:45:24.497503 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13167


INFO:tensorflow:loss = 0.20444652, step = 13700 (88.366 sec)


I0601 11:45:24.502285 139940245366656 basic_session_run_hooks.py:247] loss = 0.20444652, step = 13700 (88.366 sec)


INFO:tensorflow:global_step/sec: 1.1326


I0601 11:46:52.789789 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1326


INFO:tensorflow:loss = 0.17815866, step = 13800 (88.295 sec)


I0601 11:46:52.797577 139940245366656 basic_session_run_hooks.py:247] loss = 0.17815866, step = 13800 (88.295 sec)


INFO:tensorflow:global_step/sec: 1.13698


I0601 11:48:20.741977 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13698


INFO:tensorflow:loss = 0.053899497, step = 13900 (87.948 sec)


I0601 11:48:20.745809 139940245366656 basic_session_run_hooks.py:247] loss = 0.053899497, step = 13900 (87.948 sec)


INFO:tensorflow:Saving checkpoints for 14000 into bert_story_cloze_usc/model.ckpt.


I0601 11:49:47.945991 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 14000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.999826


I0601 11:50:00.759306 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.999826


INFO:tensorflow:loss = 0.4264536, step = 14000 (100.018 sec)


I0601 11:50:00.764057 139940245366656 basic_session_run_hooks.py:247] loss = 0.4264536, step = 14000 (100.018 sec)


INFO:tensorflow:global_step/sec: 1.12847


I0601 11:51:29.374700 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12847


INFO:tensorflow:loss = 0.19755119, step = 14100 (88.613 sec)


I0601 11:51:29.376940 139940245366656 basic_session_run_hooks.py:247] loss = 0.19755119, step = 14100 (88.613 sec)


INFO:tensorflow:global_step/sec: 1.13265


I0601 11:52:57.663179 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13265


INFO:tensorflow:loss = 0.1873793, step = 14200 (88.290 sec)


I0601 11:52:57.666896 139940245366656 basic_session_run_hooks.py:247] loss = 0.1873793, step = 14200 (88.290 sec)


INFO:tensorflow:global_step/sec: 1.13352


I0601 11:54:25.884157 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13352


INFO:tensorflow:loss = 0.26261503, step = 14300 (88.222 sec)


I0601 11:54:25.888983 139940245366656 basic_session_run_hooks.py:247] loss = 0.26261503, step = 14300 (88.222 sec)


INFO:tensorflow:global_step/sec: 1.13657


I0601 11:55:53.868380 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13657


INFO:tensorflow:loss = 0.07499649, step = 14400 (87.985 sec)


I0601 11:55:53.874410 139940245366656 basic_session_run_hooks.py:247] loss = 0.07499649, step = 14400 (87.985 sec)


INFO:tensorflow:Saving checkpoints for 14500 into bert_story_cloze_usc/model.ckpt.


I0601 11:57:21.307123 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 14500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00353


I0601 11:57:33.516792 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00353


INFO:tensorflow:loss = 0.18678202, step = 14500 (99.647 sec)


I0601 11:57:33.520940 139940245366656 basic_session_run_hooks.py:247] loss = 0.18678202, step = 14500 (99.647 sec)


INFO:tensorflow:global_step/sec: 1.13073


I0601 11:59:01.955144 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13073


INFO:tensorflow:loss = 0.2801425, step = 14600 (88.439 sec)


I0601 11:59:01.959825 139940245366656 basic_session_run_hooks.py:247] loss = 0.2801425, step = 14600 (88.439 sec)


INFO:tensorflow:global_step/sec: 1.13703


I0601 12:00:29.903806 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13703


INFO:tensorflow:loss = 0.23717669, step = 14700 (87.948 sec)


I0601 12:00:29.908080 139940245366656 basic_session_run_hooks.py:247] loss = 0.23717669, step = 14700 (87.948 sec)


INFO:tensorflow:global_step/sec: 1.13311


I0601 12:01:58.156458 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13311


INFO:tensorflow:loss = 0.24450843, step = 14800 (88.252 sec)


I0601 12:01:58.160112 139940245366656 basic_session_run_hooks.py:247] loss = 0.24450843, step = 14800 (88.252 sec)


INFO:tensorflow:global_step/sec: 1.13378


I0601 12:03:26.357079 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13378


INFO:tensorflow:loss = 0.17562264, step = 14900 (88.204 sec)


I0601 12:03:26.364499 139940245366656 basic_session_run_hooks.py:247] loss = 0.17562264, step = 14900 (88.204 sec)


INFO:tensorflow:Saving checkpoints for 15000 into bert_story_cloze_usc/model.ckpt.


I0601 12:04:53.631674 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 15000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.999949


I0601 12:05:06.362111 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.999949


INFO:tensorflow:loss = 0.3652302, step = 15000 (100.002 sec)


I0601 12:05:06.366190 139940245366656 basic_session_run_hooks.py:247] loss = 0.3652302, step = 15000 (100.002 sec)


INFO:tensorflow:global_step/sec: 1.12757


I0601 12:06:35.048649 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12757


INFO:tensorflow:loss = 0.15795486, step = 15100 (88.692 sec)


I0601 12:06:35.058396 139940245366656 basic_session_run_hooks.py:247] loss = 0.15795486, step = 15100 (88.692 sec)


INFO:tensorflow:global_step/sec: 1.13415


I0601 12:08:03.220482 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13415


INFO:tensorflow:loss = 0.42644852, step = 15200 (88.169 sec)


I0601 12:08:03.227399 139940245366656 basic_session_run_hooks.py:247] loss = 0.42644852, step = 15200 (88.169 sec)


INFO:tensorflow:global_step/sec: 1.1358


I0601 12:09:31.263832 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.1358


INFO:tensorflow:loss = 0.22022413, step = 15300 (88.044 sec)


I0601 12:09:31.270912 139940245366656 basic_session_run_hooks.py:247] loss = 0.22022413, step = 15300 (88.044 sec)


INFO:tensorflow:global_step/sec: 1.13476


I0601 12:10:59.387923 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13476


INFO:tensorflow:loss = 0.4267581, step = 15400 (88.121 sec)


I0601 12:10:59.393315 139940245366656 basic_session_run_hooks.py:247] loss = 0.4267581, step = 15400 (88.121 sec)


INFO:tensorflow:Saving checkpoints for 15500 into bert_story_cloze_usc/model.ckpt.


I0601 12:12:26.493487 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 15500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00367


I0601 12:12:39.022506 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00367


INFO:tensorflow:loss = 0.2257185, step = 15500 (99.636 sec)


I0601 12:12:39.027375 139940245366656 basic_session_run_hooks.py:247] loss = 0.2257185, step = 15500 (99.636 sec)


INFO:tensorflow:global_step/sec: 1.12684


I0601 12:14:07.766505 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12684


INFO:tensorflow:loss = 0.3182452, step = 15600 (88.742 sec)


I0601 12:14:07.769019 139940245366656 basic_session_run_hooks.py:247] loss = 0.3182452, step = 15600 (88.742 sec)


INFO:tensorflow:global_step/sec: 1.13253


I0601 12:15:36.064701 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13253


INFO:tensorflow:loss = 0.33870167, step = 15700 (88.305 sec)


I0601 12:15:36.073543 139940245366656 basic_session_run_hooks.py:247] loss = 0.33870167, step = 15700 (88.305 sec)


INFO:tensorflow:global_step/sec: 1.13701


I0601 12:17:04.014936 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13701


INFO:tensorflow:loss = 0.19820558, step = 15800 (87.945 sec)


I0601 12:17:04.018941 139940245366656 basic_session_run_hooks.py:247] loss = 0.19820558, step = 15800 (87.945 sec)


INFO:tensorflow:global_step/sec: 1.13283


I0601 12:18:32.289180 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13283


INFO:tensorflow:loss = 0.27545875, step = 15900 (88.274 sec)


I0601 12:18:32.293178 139940245366656 basic_session_run_hooks.py:247] loss = 0.27545875, step = 15900 (88.274 sec)


INFO:tensorflow:Saving checkpoints for 16000 into bert_story_cloze_usc/model.ckpt.


I0601 12:19:59.715979 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 16000 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 0.991525


I0601 12:20:13.143917 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 0.991525


INFO:tensorflow:loss = 0.25943372, step = 16000 (100.859 sec)


I0601 12:20:13.151901 139940245366656 basic_session_run_hooks.py:247] loss = 0.25943372, step = 16000 (100.859 sec)


INFO:tensorflow:global_step/sec: 1.12875


I0601 12:21:41.737493 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.12875


INFO:tensorflow:loss = 0.26169872, step = 16100 (88.588 sec)


I0601 12:21:41.739808 139940245366656 basic_session_run_hooks.py:247] loss = 0.26169872, step = 16100 (88.588 sec)


INFO:tensorflow:global_step/sec: 1.13354


I0601 12:23:09.957012 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13354


INFO:tensorflow:loss = 0.14821358, step = 16200 (88.225 sec)


I0601 12:23:09.964429 139940245366656 basic_session_run_hooks.py:247] loss = 0.14821358, step = 16200 (88.225 sec)


INFO:tensorflow:global_step/sec: 1.13627


I0601 12:24:37.964620 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13627


INFO:tensorflow:loss = 0.19547492, step = 16300 (88.004 sec)


I0601 12:24:37.968775 139940245366656 basic_session_run_hooks.py:247] loss = 0.19547492, step = 16300 (88.004 sec)


INFO:tensorflow:global_step/sec: 1.13286


I0601 12:26:06.236688 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.13286


INFO:tensorflow:loss = 0.061319925, step = 16400 (88.271 sec)


I0601 12:26:06.239339 139940245366656 basic_session_run_hooks.py:247] loss = 0.061319925, step = 16400 (88.271 sec)


INFO:tensorflow:Saving checkpoints for 16500 into bert_story_cloze_usc/model.ckpt.


I0601 12:27:33.477437 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 16500 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:global_step/sec: 1.00182


I0601 12:27:46.055378 139940245366656 basic_session_run_hooks.py:680] global_step/sec: 1.00182


INFO:tensorflow:loss = 0.29688954, step = 16500 (99.823 sec)


I0601 12:27:46.062633 139940245366656 basic_session_run_hooks.py:247] loss = 0.29688954, step = 16500 (99.823 sec)


INFO:tensorflow:Saving checkpoints for 16530 into bert_story_cloze_usc/model.ckpt.


I0601 12:28:11.654163 139940245366656 basic_session_run_hooks.py:594] Saving checkpoints for 16530 into bert_story_cloze_usc/model.ckpt.


INFO:tensorflow:Loss for final step: 0.12569565.


I0601 12:28:23.938557 139940245366656 estimator.py:359] Loss for final step: 0.12569565.


Training took time  4:13:30.860008
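
The raw log is hard to scan, so here is a minimal sketch for pulling the `(step, loss, seconds)` triples out of it and for estimating the average training speed. The sample lines are copied from the output above; `parse_losses` and `LOSS_RE` are helper names made up for this example, and the throughput figure assumes the 4:13:30 run started at step 0 and ended at step 16530.

```python
import re
from datetime import timedelta

# A few lines copied verbatim from the training log above.
sample_log = """\
INFO:tensorflow:loss = 0.4199794, step = 9200 (88.151 sec)
INFO:tensorflow:global_step/sec: 1.13682
INFO:tensorflow:loss = 0.3875946, step = 9300 (87.965 sec)
INFO:tensorflow:Saving checkpoints for 9500 into bert_story_cloze_usc/model.ckpt.
INFO:tensorflow:loss = 0.27870774, step = 9500 (100.150 sec)
"""

# Matches only the "loss = ..., step = ... (... sec)" lines (hypothetical helper).
LOSS_RE = re.compile(
    r"^INFO:tensorflow:loss = ([0-9.eE+-]+), step = (\d+) \(([0-9.]+) sec\)$"
)

def parse_losses(text):
    """Return a list of (step, loss, seconds_for_last_100_steps) tuples."""
    rows = []
    for line in text.splitlines():
        match = LOSS_RE.match(line.strip())
        if match:
            loss, step, secs = match.groups()
            rows.append((int(step), float(loss), float(secs)))
    return rows

rows = parse_losses(sample_log)
for step, loss, secs in rows:
    print(f"step {step:>6}  loss {loss:.4f}  {secs:.1f} s per 100 steps")

# Average speed over the whole run, assuming it covered steps 0..16530.
total_time = timedelta(hours=4, minutes=13, seconds=30.860008)
avg_steps_per_sec = 16530 / total_time.total_seconds()
print(f"average speed: {avg_steps_per_sec:.3f} steps/sec")
```

Note that the logged loss is computed on a single batch, so it jumps around a lot from one report to the next (for example 0.054 at step 13900 and 0.426 at step 14000). A moving average over `rows` gives a clearer picture of the trend than any individual value.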


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = bert.run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [24]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


I0601 12:28:26.262933 139940245366656 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0601 12:28:28.667941 139940245366656 saver.py:1483] Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


I0601 12:28:39.113522 139940245366656 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-06-01T12:28:39Z


I0601 12:28:39.139018 139940245366656 evaluation.py:257] Starting evaluation at 2019-06-01T12:28:39Z


INFO:tensorflow:Graph was finalized.


I0601 12:28:40.553277 139940245366656 monitored_session.py:222] Graph was finalized.


W0601 12:28:40.560324 139940245366656 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.


INFO:tensorflow:Restoring parameters from bert_story_cloze_usc/model.ckpt-16530


I0601 12:28:40.570446 139940245366656 saver.py:1270] Restoring parameters from bert_story_cloze_usc/model.ckpt-16530


INFO:tensorflow:Running local_init_op.


I0601 12:28:42.882720 139940245366656 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0601 12:28:43.136390 139940245366656 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2019-06-01-12:29:22


I0601 12:29:22.087505 139940245366656 evaluation.py:277] Finished evaluation at 2019-06-01-12:29:22


INFO:tensorflow:Saving dict for global step 16530: auc = 0.5865847, eval_accuracy = 0.5865847, f1_score = 0.67710286, false_negatives = 249.0, false_positives = 1298.0, global_step = 16530, loss = 0.93236387, precision = 0.55547947, recall = 0.86691606, true_negatives = 573.0, true_positives = 1622.0


I0601 12:29:22.090049 139940245366656 estimator.py:1979] Saving dict for global step 16530: auc = 0.5865847, eval_accuracy = 0.5865847, f1_score = 0.67710286, false_negatives = 249.0, false_positives = 1298.0, global_step = 16530, loss = 0.93236387, precision = 0.55547947, recall = 0.86691606, true_negatives = 573.0, true_positives = 1622.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 16530: bert_story_cloze_usc/model.ckpt-16530


I0601 12:29:24.463346 139940245366656 estimator.py:2039] Saving 'checkpoint_path' summary for global step 16530: bert_story_cloze_usc/model.ckpt-16530


{'auc': 0.5865847,
 'eval_accuracy': 0.5865847,
 'f1_score': 0.67710286,
 'false_negatives': 249.0,
 'false_positives': 1298.0,
 'global_step': 16530,
 'loss': 0.93236387,
 'precision': 0.55547947,
 'recall': 0.86691606,
 'true_negatives': 573.0,
 'true_positives': 1622.0}
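
As a sanity check, the headline metrics can be recomputed from the four confusion-matrix counts in the dictionary above. This is a small sketch in plain Python, not part of the original pipeline; the variable names are chosen for this example only.

```python
# Confusion-matrix counts copied from the evaluation output above.
tp, fp, tn, fn = 1622, 1298, 573, 249

total = tp + fp + tn + fn                 # number of test examples
accuracy = (tp + tn) / total
precision = tp / (tp + fp)                # share of predicted positives that are correct
recall = tp / (tp + fn)                   # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

# True-negative rate and the mean of the two per-class recalls.
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2

print(f"examples:          {total}")
print(f"accuracy:          {accuracy:.4f}")
print(f"precision:         {precision:.4f}")
print(f"recall:            {recall:.4f}")
print(f"f1:                {f1:.4f}")
print(f"specificity:       {specificity:.4f}")
print(f"balanced accuracy: {balanced_accuracy:.4f}")
```

The recomputed values agree with the reported `eval_accuracy`, `precision`, `recall` and `f1_score`. Two things stand out. First, the test set is balanced (1871 positives and 1871 negatives), and the model predicts the positive class for most examples: recall is about 0.87, but only about 31% of the negatives are recognized as such. Second, the reported `auc` is identical to the accuracy. That is what you would get if AUC were computed from hard 0/1 predictions on a balanced set, where it reduces to the balanced accuracy shown above; this is an inference from the numbers, not something the output states.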