# Text Classification with TF2/Keras and BERT

In this notebook, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with [TF-Hub](https://www.tensorflow.org/hub).

BERT can replace text embedding layers like ELMo and GloVe. Alternatively, fine-tuning BERT can improve accuracy and shorten training time in many cases.

[TF-Hub](https://www.tensorflow.org/hub) is a platform to share machine learning expertise packaged in reusable resources, notably pre-trained modules. In this notebook, we will use a TF-Hub text embedding module to train a simple sentiment classifier with a reasonable baseline accuracy. We will then submit the predictions to Kaggle.

For the most part, the code in this notebook is copied from [this Google Research notebook](https://github.com/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb).

### Setup

In [1]:
import os
import re
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import tensorflow_hub as hub
from datetime import datetime
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization


In [15]:
DATASETS_DIRECTORY = '/Users/garyb/Develop/TF2/tensorflow_datasets'
OUTPUT_DIRECTORY = '/Users/garyb/Develop/TF2/tensorflow_models'

### Get the Data

In [3]:
# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
    data = {}
    data["sentence"] = []
    data["sentiment"] = []
    for file_path in os.listdir(directory):
        with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
            data["sentence"].append(f.read())
            # Raw string so that \d and \. are passed to the regex engine unchanged.
            data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
    return pd.DataFrame.from_dict(data)
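
The regular expression in `load_directory_data` relies on file names of the form `<id>_<rating>.txt`. As a small standalone sketch (the helper name and the file names below are made up for illustration), this is what ends up in the `sentiment` column:

```python
import re

# Same pattern as in load_directory_data: "<review id>_<rating>.txt".
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")


def rating_from_filename(file_name):
    """Return the rating part of a file name as a string, or None if it does not match."""
    match = FILENAME_PATTERN.match(file_name)
    return match.group(1) if match else None


# Hypothetical file names, for illustration only.
print(rating_from_filename("0_9.txt"))      # 9
print(rating_from_filename("1234_10.txt"))  # 10
print(rating_from_filename("README"))       # None
```

The rating is kept as a string, not an integer. Also note that `load_directory_data` calls `.group(1)` directly, so a file whose name does not match the pattern would raise an `AttributeError` there.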

In [4]:
# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
    pos_df = load_directory_data(os.path.join(directory, "pos"))
    neg_df = load_directory_data(os.path.join(directory, "neg"))
    pos_df["polarity"] = 1
    neg_df["polarity"] = 0
    return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

If it has not already been downloaded, download the data to the cache. The ```tf.keras.utils.get_file()``` method will first check the cache and won't download the data again if it already exists.

In [7]:
# Download and process the dataset files.
# By default the file at the url origin is downloaded to the cache_dir ~/.keras, 
# placed in the cache_subdir datasets, and given the filename fname. The final location 
# of a file example.txt would therefore be ~/.keras/datasets/example.txt
# cache_subdir: Subdirectory under the Keras cache dir where the file is saved. 
# If an absolute path /path/to/folder is specified the file will be saved at that location.
def download_and_load_datasets(force_download=False):
    dataset = tf.keras.utils.get_file(
        fname="aclImdb.tar.gz", 
        origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz",
        cache_subdir=DATASETS_DIRECTORY,  # absolute path, so the file is saved here instead of ~/.keras/datasets
        extract=True)
    
    train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                         "aclImdb", "train"))
    test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                        "aclImdb", "test"))
  
    return train_df, test_df
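
The cache-location rule described in the comments above can be illustrated with plain `os.path` calls. This is only a sketch of the documented behaviour, not the Keras implementation; `resolve_cache_path` and the example directories are hypothetical names introduced here.

```python
import os


def resolve_cache_path(fname, cache_subdir="datasets", cache_dir=None):
    """Sketch of the documented rule: <cache_dir>/<cache_subdir>/<fname>."""
    if cache_dir is None:
        cache_dir = os.path.join(os.path.expanduser("~"), ".keras")
    # os.path.join discards earlier components when a later one is absolute,
    # so an absolute cache_subdir replaces the default cache directory.
    return os.path.join(cache_dir, cache_subdir, fname)


# On a POSIX system:
print(resolve_cache_path("example.txt", cache_dir="/home/user/.keras"))
# /home/user/.keras/datasets/example.txt
print(resolve_cache_path("aclImdb.tar.gz", cache_subdir="/data/tf_datasets"))
# /data/tf_datasets/aclImdb.tar.gz
```

This is why `os.path.dirname(dataset)` in `download_and_load_datasets` points at `DATASETS_DIRECTORY`: the archive is stored directly in that directory, and the extracted `aclImdb` folder is expected next to it.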

In [8]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


In [11]:
print(train.shape, test.shape)

(25000, 3) (25000, 3)


To keep training fast, we'll take a random sample of 5,000 examples from each of the train and test sets.

In [12]:
train = train.sample(5000)
test = test.sample(5000)

### Data Preprocessing

We need to transform our data into a format BERT understands. This involves two steps. First, we create InputExamples using the constructor provided in the BERT library.

- ```text_a``` is the text we want to classify, which in this case is the ```sentence``` column of our DataFrame.
- ```text_b``` is used if we're training a model to understand the relationship between sentences (e.g. is text_b a translation of text_a? Is text_b an answer to the question asked by text_a?). This doesn't apply to our task, so we can leave text_b blank.
- ```label``` is the label for our example, i.e. True or False (0 or 1 in our case).
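
To make the three fields concrete, here is a minimal sketch that uses a stand-in dataclass. `ExampleSketch` is a hypothetical name and is not the BERT library's class; the rows are toy data shaped like this notebook's DataFrame, with a ```sentence``` text and a 0/1 ```polarity``` label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleSketch:
    """Hypothetical stand-in for an InputExample; not the BERT library class."""
    guid: Optional[str]
    text_a: str
    text_b: Optional[str] = None   # unused for single-sentence classification
    label: Optional[int] = None


# Toy rows, made up for illustration.
rows = [
    {"sentence": "A wonderful film.", "polarity": 1},
    {"sentence": "Dull and far too long.", "polarity": 0},
]

examples = [
    ExampleSketch(guid=None, text_a=row["sentence"], text_b=None, label=row["polarity"])
    for row in rows
]

for example in examples:
    print(example.label, example.text_a)
```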

In [13]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

In [16]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'

label_list = [0, 1]

Our input data is the 'sentence' column and our label is the 'polarity' column (0 and 1 for negative and positive, respectively).

In [17]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(
    lambda x: bert.run_classifier.InputExample(guid=None, 
                                               text_a = x[DATA_COLUMN], 
                                               text_b = None, 
                                               label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(
    lambda x: bert.run_classifier.InputExample(guid=None, 
                                               text_a = x[DATA_COLUMN], 
                                               text_b = None, 
                                               label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):

- Lowercase our text (if we're using a BERT lowercase model)
- Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
- Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
- Map our words to indexes using a vocab file that BERT provides
- Add special "CLS" and "SEP" tokens (see the readme)
- Append "index" and "segment" tokens to each input (see the BERT paper)

Happily, we don't have to worry about most of these details.
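
To give a feel for the first three steps, here is a simplified sketch of lowercasing, basic tokenization and greedy longest-match-first WordPiece splitting. It is not the library's tokenizer (which also handles Unicode normalization and other details), and `TOY_VOCAB` is a made-up vocabulary that is far smaller than the real one.

```python
import re

# Toy vocabulary, made up for this sketch.
TOY_VOCAB = {"[UNK]", "sally", "says", "hi", "is", "call", "##ing", "!"}


def basic_tokenize(text, do_lower_case=True):
    """Optionally lowercase, then split into words and punctuation marks."""
    if do_lower_case:
        text = text.lower()
    return re.findall(r"\w+|[^\w\s]", text)


def wordpiece(word, vocab, unk_token="[UNK]"):
    """Split one word into pieces, always taking the longest piece found in vocab."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab):
    tokens = []
    for word in basic_tokenize(text):
        tokens.extend(wordpiece(word, vocab))
    return tokens


print(tokenize("Sally says hi", TOY_VOCAB))      # ['sally', 'says', 'hi']
print(tokenize("Sally is calling!", TOY_VOCAB))  # ['sally', 'is', 'call', '##ing', '!']
print(tokenize("hello", TOY_VOCAB))              # ['[UNK]']
```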

To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF-Hub module:

In [18]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

In [19]:
def create_tokenizer_from_hub_module():
    """Get the vocab file and casing info from the Hub module."""
    with tf.Graph().as_default():
        bert_module = hub.Module(BERT_MODEL_HUB)
        tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
        with tf.Session() as sess:
            vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
    return bert.tokenization.FullTokenizer(
        vocab_file=vocab_file, do_lower_case=do_lower_case)

In [20]:
# Create a tokenizer
tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore



In [21]:
# Check an example of the tokenizer in action:
tokenizer.tokenize("Now is the time for all good men to come to the aid of their country.")

['now',
 'is',
 'the',
 'time',
 'for',
 'all',
 'good',
 'men',
 'to',
 'come',
 'to',
 'the',
 'aid',
 'of',
 'their',
 'country',
 '.']

In [22]:
# Set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
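
The fixed sequence length determines how each tokenized review is truncated and padded. The sketch below is a simplified, single-sentence illustration of that step, not the library's conversion code. `build_features` is a hypothetical helper, and the small id table only contains ids that appear in this notebook's log output for these tokens.

```python
# Token ids as they appear in this notebook's log output (0 is used for padding).
VOCAB_IDS = {"[CLS]": 101, "[SEP]": 102, "an": 2019, "exquisite": 19401,
             "film": 2143, ".": 1012}


def build_features(tokens, label, label_list, vocab_ids, max_seq_length):
    """Truncate, add [CLS]/[SEP], map to ids, then zero-pad to max_seq_length."""
    label_map = {value: index for index, value in enumerate(label_list)}

    # Reserve two positions for the [CLS] and [SEP] tokens.
    tokens = ["[CLS]"] + tokens[: max_seq_length - 2] + ["[SEP]"]

    input_ids = [vocab_ids[token] for token in tokens]
    input_mask = [1] * len(input_ids)    # 1 marks real tokens, 0 marks padding
    segment_ids = [0] * len(input_ids)   # single sentence: everything is segment 0

    pad = max_seq_length - len(input_ids)
    input_ids += [0] * pad
    input_mask += [0] * pad
    segment_ids += [0] * pad

    return {"input_ids": input_ids, "input_mask": input_mask,
            "segment_ids": segment_ids, "label_id": label_map[label]}


features = build_features(["an", "exquisite", "film", "."], 1, [0, 1], VOCAB_IDS, 8)
print(features["input_ids"])   # [101, 2019, 19401, 2143, 1012, 102, 0, 0]
print(features["input_mask"])  # [1, 1, 1, 1, 1, 1, 0, 0]
print(features["label_id"])    # 1
```

With `MAX_SEQ_LENGTH = 128`, any review longer than 126 word pieces is cut off, so only the beginning of a long review reaches the model.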

In [23]:
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(
    train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(
    test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] < br / > < br / > have you ever felt like your being watched , like someone keeps tab ##s on every move you make ? well , just remember before you decide to break the law , the fbi will always be there . at least that ' s the feeling you get after watching the gripping but slightly mel ##low crime drama , the fbi story . it traces the roots of the organization from a small bureau to one of the most modern facilities in the world ( in 1959 ) , by telling the story through they eyes of one of its agents , chip hardest ##y ( james stewart ) . < br / > < br / > [SEP]


INFO:tensorflow:input_ids: 101 1026 7987 1013 1028 1026 7987 1013 1028 2031 2017 2412 2371 2066 2115 2108 3427 1010 2066 2619 7906 21628 2015 2006 2296 2693 2017 2191 1029 2092 1010 2074 3342 2077 2017 5630 2000 3338 1996 2375 1010 1996 8495 2097 2467 2022 2045 1012 2012 2560 2008 1005 1055 1996 3110 2017 2131 2044 3666 1996 13940 2021 3621 11463 8261 4126 3689 1010 1996 8495 2466 1012 2009 10279 1996 6147 1997 1996 3029 2013 1037 2235 4879 2000 2028 1997 1996 2087 2715 4128 1999 1996 2088 1006 1999 3851 1007 1010 2011 4129 1996 2466 2083 2027 2159 1997 2028 1997 2049 6074 1010 9090 18263 2100 1006 2508 5954 1007 1012 1026 7987 1013 1028 1026 7987 1013 1028 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] hidden frontier is a fan made show , in the world of star trek . the story takes place after voyager has returned from the delta - quadrant . it has some characters from the official star trek shows , but most of them are original to the show . the show takes place on the star base deep space 12 and on several space ships , which gives it opportunities the official shows don ' t have . the characters have the opportunity of a rising in the hierarchy , which characters in shows with only one ship doesn ' t have . the show has good computer animation of spaceship ##s , but the acting takes place in front of at green - screen [SEP]


INFO:tensorflow:input_ids: 101 5023 8880 2003 1037 5470 2081 2265 1010 1999 1996 2088 1997 2732 10313 1012 1996 2466 3138 2173 2044 23493 2038 2513 2013 1996 7160 1011 29371 1012 2009 2038 2070 3494 2013 1996 2880 2732 10313 3065 1010 2021 2087 1997 2068 2024 2434 2000 1996 2265 1012 1996 2265 3138 2173 2006 1996 2732 2918 2784 2686 2260 1998 2006 2195 2686 3719 1010 2029 3957 2009 6695 1996 2880 3065 2123 1005 1056 2031 1012 1996 3494 2031 1996 4495 1997 1037 4803 1999 1996 12571 1010 2029 3494 1999 3065 2007 2069 2028 2911 2987 1005 1056 2031 1012 1996 2265 2038 2204 3274 7284 1997 25516 2015 1010 2021 1996 3772 3138 2173 1999 2392 1997 2012 2665 1011 3898 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a student filmmaker enlist ##s a b - grade actress ( a del ##ect ##ably diva - is ##h molly ring ##wald ! ) to complete the horror film that her mother ( a dreadful ##ly dull kylie minogue ! ) tried to make 12 years ago . it ' s a curious plot choice to say the least , as any aus ##sie horror fan knows that the genre is sadly lacking in women directors . the film has a curse on it , because molly had to kill some psycho murderer on the original set . but she ' s back , because she needs the exposure . unfortunately , the curse is still there and people start dying on the " set . [SEP]


INFO:tensorflow:input_ids: 101 1037 3076 12127 28845 2015 1037 1038 1011 3694 3883 1006 1037 3972 22471 8231 25992 1011 2003 2232 9618 3614 11191 999 1007 2000 3143 1996 5469 2143 2008 2014 2388 1006 1037 21794 2135 10634 9008 27736 999 1007 2699 2000 2191 2260 2086 3283 1012 2009 1005 1055 1037 8025 5436 3601 2000 2360 1996 2560 1010 2004 2151 17151 11741 5469 5470 4282 2008 1996 6907 2003 13718 11158 1999 2308 5501 1012 1996 2143 2038 1037 8364 2006 2009 1010 2138 9618 2018 2000 3102 2070 18224 13422 2006 1996 2434 2275 1012 2021 2016 1005 1055 2067 1010 2138 2016 3791 1996 7524 1012 6854 1010 1996 8364 2003 2145 2045 1998 2111 2707 5996 2006 1996 1000 2275 1012 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] long before " broke ##back mountain " ( about 23 years before ) , " death ##tra ##p " was the first time i ever saw two men passionate ##ly kissing on screen , and frankly , i was shocked . i understood it in terms of the plot , and it didn ' t really upset my sen ##si ##bilities ( not much ) , but it was the first time i ever saw it , at least , in a " mainstream " movie . i thought it was a guts ##y move for its time , and took courage for them to try it , especially christopher reeve , in the midst of his time as pg - rated superman . male bisexual [SEP]


INFO:tensorflow:input_ids: 101 2146 2077 1000 3631 5963 3137 1000 1006 2055 2603 2086 2077 1007 1010 1000 2331 6494 2361 1000 2001 1996 2034 2051 1045 2412 2387 2048 2273 13459 2135 7618 2006 3898 1010 1998 19597 1010 1045 2001 7135 1012 1045 5319 2009 1999 3408 1997 1996 5436 1010 1998 2009 2134 1005 1056 2428 6314 2026 12411 5332 14680 1006 2025 2172 1007 1010 2021 2009 2001 1996 2034 2051 1045 2412 2387 2009 1010 2012 2560 1010 1999 1037 1000 7731 1000 3185 1012 1045 2245 2009 2001 1037 18453 2100 2693 2005 2049 2051 1010 1998 2165 8424 2005 2068 2000 3046 2009 1010 2926 5696 20726 1010 1999 1996 12930 1997 2010 2051 2004 18720 1011 6758 10646 1012 3287 22437 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] how do comments like the one that was the headline by high school girls even make it on this site , this was the stupid ##est movie i have ever seen , it was ridiculous , how can any mor ##on sit there and say that just because a movie makes you jump it is a good movie , that might be the most idiot ##ic thing i have ever heard , i could sneak up behind you and go " boo " and it would make you jump , but that does not mean i am qualified to write or direct a movie , not to mention " they tied everything together at the end " is not a good reason for a movie to [SEP]


INFO:tensorflow:input_ids: 101 2129 2079 7928 2066 1996 2028 2008 2001 1996 17653 2011 2152 2082 3057 2130 2191 2009 2006 2023 2609 1010 2023 2001 1996 5236 4355 3185 1045 2031 2412 2464 1010 2009 2001 9951 1010 2129 2064 2151 22822 2239 4133 2045 1998 2360 2008 2074 2138 1037 3185 3084 2017 5376 2009 2003 1037 2204 3185 1010 2008 2453 2022 1996 2087 10041 2594 2518 1045 2031 2412 2657 1010 1045 2071 13583 2039 2369 2017 1998 2175 1000 22017 1000 1998 2009 2052 2191 2017 5376 1010 2021 2008 2515 2025 2812 1045 2572 4591 2000 4339 2030 3622 1037 3185 1010 2025 2000 5254 1000 2027 5079 2673 2362 2012 1996 2203 1000 2003 2025 1037 2204 3114 2005 1037 3185 2000 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] why does everyone feel they have to constantly put this movie down ? it is cute and funny ( exactly what it is meant to be ) . madonna wasn ' t out to prove herself as an oscar call ##ibe ##r artist with this movie any ##how ! she was just doing what the character called for , and she did it well . i loved her in this movie ; it is my second favorite madonna movie after ev ##ita . the soundtrack is excellent too . it is no better or no worse than any che ##es ##y 80 ' s flick . to all the critics , just don ' t take it so seriously and you might have fun watching it [SEP]


INFO:tensorflow:input_ids: 101 2339 2515 3071 2514 2027 2031 2000 7887 2404 2023 3185 2091 1029 2009 2003 10140 1998 6057 1006 3599 2054 2009 2003 3214 2000 2022 1007 1012 11284 2347 1005 1056 2041 2000 6011 2841 2004 2019 7436 2655 20755 2099 3063 2007 2023 3185 2151 14406 999 2016 2001 2074 2725 2054 1996 2839 2170 2005 1010 1998 2016 2106 2009 2092 1012 1045 3866 2014 1999 2023 3185 1025 2009 2003 2026 2117 5440 11284 3185 2044 23408 6590 1012 1996 6050 2003 6581 2205 1012 2009 2003 2053 2488 2030 2053 4788 2084 2151 18178 2229 2100 3770 1005 1055 17312 1012 2000 2035 1996 4401 1010 2074 2123 1005 1056 2202 2009 2061 5667 1998 2017 2453 2031 4569 3666 2009 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] an exquisite film . they just don ' t make them like this any more ! we ea ##ves ##drop on an upper middle class family in dublin in the early part of the 20th century . they are hosting an after christmas dinner for their friends and relatives . their table talk is just idle chatter but it is so well written that one is eng ##ross ##ed . away from the dinner table some fine piano playing helps to create an intimate atmosphere as if one were there as one of the guests . perhaps a bit too perfect for an amateur player , the odd mistake here and there would have added to the magic of this film . no real story but [SEP]


INFO:tensorflow:input_ids: 101 2019 19401 2143 1012 2027 2074 2123 1005 1056 2191 2068 2066 2023 2151 2062 999 2057 19413 6961 25711 2006 2019 3356 2690 2465 2155 1999 5772 1999 1996 2220 2112 1997 1996 3983 2301 1012 2027 2024 9936 2019 2044 4234 4596 2005 2037 2814 1998 9064 1012 2037 2795 2831 2003 2074 18373 24691 2021 2009 2003 2061 2092 2517 2008 2028 2003 25540 25725 2098 1012 2185 2013 1996 4596 2795 2070 2986 3682 2652 7126 2000 3443 2019 10305 7224 2004 2065 2028 2020 2045 2004 2028 1997 1996 6368 1012 3383 1037 2978 2205 3819 2005 2019 5515 2447 1010 1996 5976 6707 2182 1998 2045 2052 2031 2794 2000 1996 3894 1997 2023 2143 1012 2053 2613 2466 2021 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] we see thomas edison , with a glowing smile on his face , trying to electro ##cute a 5 ton living being . eventually he was successful , and so the first animal s ##nu ##ff film is born , clever ##ly disguised as an amazing achievement in technology . this is scientific arrogance at it ' s worst , folks . it ranks up there with the doctor who dec ##ap ##itated a monkey just to prove that he could keep its severed head alive for 22 minutes . < br / > < br / > oh yes , there ' s the absurd excuse that the elephant had been convicted of " murder " and sentenced to death , and that this was [SEP]


INFO:tensorflow:input_ids: 101 2057 2156 2726 17046 1010 2007 1037 10156 2868 2006 2010 2227 1010 2667 2000 16175 26869 1037 1019 10228 2542 2108 1012 2776 2002 2001 3144 1010 1998 2061 1996 2034 4111 1055 11231 4246 2143 2003 2141 1010 12266 2135 17330 2004 2019 6429 6344 1999 2974 1012 2023 2003 4045 24416 2012 2009 1005 1055 5409 1010 12455 1012 2009 6938 2039 2045 2007 1996 3460 2040 11703 9331 15198 1037 10608 2074 2000 6011 2008 2002 2071 2562 2049 16574 2132 4142 2005 2570 2781 1012 1026 7987 1013 1028 1026 7987 1013 1028 2821 2748 1010 2045 1005 1055 1996 18691 8016 2008 1996 10777 2018 2042 7979 1997 1000 4028 1000 1998 7331 2000 2331 1010 1998 2008 2023 2001 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i watched this movie with my mother . she is 81 y ##rs . old and was raised to be a big ##ot . she even acknowledges this . i don ' t think she really understood what was happening , she had already made up her mind that the kid was guilty . scary . i felt for this child and his family . what torture they went through and remained faithful . that is true faith . back to the movie . i was disgusted by the police force and their in ##ept ##itude . i am so glad that this public defender was chosen to work this case . it was very fortunate for this family that they had a person that cared [SEP]


INFO:tensorflow:input_ids: 101 1045 3427 2023 3185 2007 2026 2388 1012 2016 2003 6282 1061 2869 1012 2214 1998 2001 2992 2000 2022 1037 2502 4140 1012 2016 2130 28049 2023 1012 1045 2123 1005 1056 2228 2016 2428 5319 2054 2001 6230 1010 2016 2018 2525 2081 2039 2014 2568 2008 1996 4845 2001 5905 1012 12459 1012 1045 2371 2005 2023 2775 1998 2010 2155 1012 2054 8639 2027 2253 2083 1998 2815 11633 1012 2008 2003 2995 4752 1012 2067 2000 1996 3185 1012 1045 2001 17733 2011 1996 2610 2486 1998 2037 1999 23606 18679 1012 1045 2572 2061 5580 2008 2023 2270 8291 2001 4217 2000 2147 2023 2553 1012 2009 2001 2200 19590 2005 2023 2155 2008 2027 2018 1037 2711 2008 8725 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i was watching this movie and getting increasingly bored with the silly plot that was going nowhere , when suddenly , the story takes a surreal turn for the worse and has an actor playing herself . oh how i gu ##ffa ##wed . because it ' s soo ##oo ##o funny , isn ' t it ? we know julia roberts is playing the character of tess , and here they are , in the film , cracking the joke that the character of tess looks a bit like julia roberts . so julia plays someone imp ##erson ##ating julia . how well she does this , we ' ll never know , because 99 . 999 % of the audience don ' t actually [SEP]


INFO:tensorflow:input_ids: 101 1045 2001 3666 2023 3185 1998 2893 6233 11471 2007 1996 10021 5436 2008 2001 2183 7880 1010 2043 3402 1010 1996 2466 3138 1037 16524 2735 2005 1996 4788 1998 2038 2019 3364 2652 2841 1012 2821 2129 1045 19739 20961 15557 1012 2138 2009 1005 1055 17111 9541 2080 6057 1010 3475 1005 1056 2009 1029 2057 2113 6423 7031 2003 2652 1996 2839 1997 15540 1010 1998 2182 2027 2024 1010 1999 1996 2143 1010 15729 1996 8257 2008 1996 2839 1997 15540 3504 1037 2978 2066 6423 7031 1012 2061 6423 3248 2619 17727 18617 5844 6423 1012 2129 2092 2016 2515 2023 1010 2057 1005 2222 2196 2113 1010 2138 5585 1012 25897 1003 1997 1996 4378 2123 1005 1056 2941 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


### Create a Model
Now that we've prepared our data, let's build the model. ```create_model()``` below does this. First, it loads the BERT TF-Hub module again (this time to extract the computation graph). Next, it adds a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of starting from a pre-trained model and training it further on a specific task is called fine-tuning.

In [31]:
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0

# Warmup is an initial period during which the learning rate starts small
# and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1

# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
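
To make the warmup comment concrete, here is a minimal sketch of a schedule with linear warmup followed by linear decay to zero. It is a plain-Python illustration, not the code that `bert.optimization` runs; the function name `lr_at_step` and the step counts in the loop are assumptions made for this sketch.

```python
def lr_at_step(step, base_lr, num_train_steps, num_warmup_steps):
    """Illustrative schedule: linear warmup, then linear decay to zero."""
    if num_warmup_steps > 0 and step < num_warmup_steps:
        # Warmup: ramp from 0 up towards base_lr over the first steps.
        return base_lr * step / num_warmup_steps
    # Decay: a straight line from base_lr at step 0 down to 0 at the last step.
    remaining = max(num_train_steps - step, 0)
    return base_lr * remaining / num_train_steps


# Hypothetical step counts, chosen only for illustration.
for step in (0, 5, 10, 50, 100):
    print(step, lr_at_step(step, base_lr=2e-5,
                           num_train_steps=100, num_warmup_steps=10))
```

With `WARMUP_PROPORTION = 0.1`, the first tenth of the training steps would fall in the warmup branch of a schedule like this one.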

In [32]:
def create_model(is_predicting, input_ids, input_mask, segment_ids,
                 labels, num_labels):
    """Creates a classification model."""
    bert_module = hub.Module(BERT_MODEL_HUB, trainable=True)
    bert_inputs = dict(input_ids=input_ids, input_mask=input_mask,
                       segment_ids=segment_ids)
    bert_outputs = bert_module(inputs=bert_inputs, signature="tokens", as_dict=True)

    # Use "pooled_output" for classification tasks on an entire sentence.
    # Use "sequence_output" for token-level output.
    output_layer = bert_outputs["pooled_output"]
    hidden_size = output_layer.shape[-1].value

    # Create our own layer to tune for the sentiment data.
    output_weights = tf.get_variable("output_weights",
                                     [num_labels, hidden_size],
                                     initializer=tf.truncated_normal_initializer(stddev=0.02))

    output_bias = tf.get_variable("output_bias", [num_labels], initializer=tf.zeros_initializer())

    with tf.variable_scope("loss"):
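        # Dropout helps prevent overfitting. Newer TF 1.x releases prefer
        # `rate=0.1` (rate = 1 - keep_prob) over the deprecated `keep_prob`.
        output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)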
        logits = tf.matmul(output_layer, output_weights, transpose_b=True)
        logits = tf.nn.bias_add(logits, output_bias)
        log_probs = tf.nn.log_softmax(logits, axis=-1)

        # Convert labels into one-hot encoding.
        one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

        predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))

        # If we're predicting, we want predicted labels and the probabilities.
        if is_predicting:
            return (predicted_labels, log_probs)

        # For train/eval, compute the loss between predicted and actual labels.
        per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
        loss = tf.reduce_mean(per_example_loss)
        return (loss, predicted_labels, log_probs)
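
The arithmetic inside the `"loss"` scope can be checked outside TensorFlow. The following NumPy sketch reproduces the same steps (linear layer, log-softmax, one-hot cross-entropy) on toy inputs. It leaves out dropout, and the function name `classification_head` and the random inputs are assumptions made for illustration only.

```python
import numpy as np


def classification_head(pooled_output, output_weights, output_bias, labels):
    """NumPy version of the head: logits, log-softmax, cross-entropy loss."""
    # logits: [batch, num_labels]; weights are stored as [num_labels, hidden].
    logits = pooled_output @ output_weights.T + output_bias
    # Numerically stable log-softmax over the label axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # One-hot encode labels, then keep the log-probability of the true class.
    one_hot = np.eye(output_weights.shape[0])[labels]
    per_example_loss = -(one_hot * log_probs).sum(axis=-1)
    predicted_labels = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted_labels, log_probs


# Toy inputs (hypothetical): batch of 2, hidden size 4, 2 labels.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(2, 4))
weights = rng.normal(scale=0.02, size=(2, 4))
bias = np.zeros(2)
loss, preds, log_probs = classification_head(
    pooled, weights, bias, np.array([1, 0]))
print(loss, preds, np.exp(log_probs).sum(axis=-1))
```

Because the head is initialised with small weights and a zero bias, the initial loss of a two-class model should sit close to ln 2 ≈ 0.693, which is in line with the first loss value the training log reports.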

In [33]:
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
    """Returns a `model_fn` closure for `tf.estimator.Estimator`.

    model_fn_builder() creates the model function using the passed
    parameters for num_labels, learning_rate, etc.
    """

    def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
        """The `model_fn` for `tf.estimator.Estimator`."""

        input_ids = features["input_ids"]
        input_mask = features["input_mask"]
        segment_ids = features["segment_ids"]
        label_ids = features["label_ids"]

        is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)

        # TRAIN and EVAL
        if not is_predicting:

            (loss, predicted_labels, log_probs) = create_model(
                is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

            train_op = bert.optimization.create_optimizer(
                loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

            # Calculate evaluation metrics.
            def metric_fn(label_ids, predicted_labels):
                accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
                f1_score = tf.contrib.metrics.f1_score(
                    label_ids,
                    predicted_labels)
                auc = tf.metrics.auc(
                    label_ids,
                    predicted_labels)
                recall = tf.metrics.recall(
                    label_ids,
                    predicted_labels)
                precision = tf.metrics.precision(
                    label_ids,
                    predicted_labels)
                true_pos = tf.metrics.true_positives(
                    label_ids,
                    predicted_labels)
                true_neg = tf.metrics.true_negatives(
                    label_ids,
                    predicted_labels)
                false_pos = tf.metrics.false_positives(
                    label_ids,
                    predicted_labels)
                false_neg = tf.metrics.false_negatives(
                    label_ids,
                    predicted_labels)
                return {
                    "eval_accuracy": accuracy,
                    "f1_score": f1_score,
                    "auc": auc,
                    "precision": precision,
                    "recall": recall,
                    "true_positives": true_pos,
                    "true_negatives": true_neg,
                    "false_positives": false_pos,
                    "false_negatives": false_neg
                }

            eval_metrics = metric_fn(label_ids, predicted_labels)

            if mode == tf.estimator.ModeKeys.TRAIN:
                return tf.estimator.EstimatorSpec(mode=mode,
                                                  loss=loss,
                                                  train_op=train_op)
            else:
                return tf.estimator.EstimatorSpec(mode=mode,
                                                  loss=loss,
                                                  eval_metric_ops=eval_metrics)
        else:
            (predicted_labels, log_probs) = create_model(
                is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

            predictions = {
                # Note: these are log-probabilities (log-softmax outputs);
                # apply exp() to recover probabilities.
                'probabilities': log_probs,
                'labels': predicted_labels
            }
            return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    # Return the actual model function in the closure
    return model_fn
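
As a reference for reading the evaluation metrics, the sketch below computes the same quantities from plain Python lists using their textbook definitions. It is not the streaming implementation behind `tf.metrics` or `tf.contrib.metrics`, and the helper name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(labels, predictions):
    """Confusion counts and derived metrics for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "eval_accuracy": accuracy, "f1_score": f1,
        "precision": precision, "recall": recall,
        "true_positives": tp, "true_negatives": tn,
        "false_positives": fp, "false_negatives": fn,
    }


# Toy data (hypothetical): six reviews, two of them misclassified.
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```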

In [34]:
# Compute train and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
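
As a worked example of this arithmetic, the sketch below plugs in the constants defined earlier. The dataset size of 5,000 examples is an assumption: it is not stated in this section, and was picked because it reproduces the 468 steps at which the training log saves its final checkpoint.

```python
# Assumed dataset size; not taken from the notebook itself.
num_train_examples = 5000
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

# 5000 / 32 = 156.25 batches per epoch; three epochs give 468.75,
# which int() truncates to 468.
num_train_steps = int(num_train_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
# Ten percent of 468 is 46.8, truncated to 46 warmup steps.
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
print(num_train_steps, num_warmup_steps)
```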

In [36]:
# Specify the output directory and how often to save checkpoints and summaries
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIRECTORY,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [37]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': '/Users/garyb/Develop/TF2/tensorflow_models', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1418514e0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [38]:
# Create an input function for training.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False) # True for using TPUs.
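
The `drop_remainder` flag decides what happens to a final batch that is smaller than the batch size. TPUs need fixed batch shapes, so the partial batch is dropped there; on CPU or GPU it can be kept. The sketch below only counts batches to show the difference; the helper name `num_batches` and the example size of 5,000 are assumptions for illustration.

```python
def num_batches(num_examples, batch_size, drop_remainder):
    """Number of batches in one pass over the data."""
    full, leftover = divmod(num_examples, batch_size)
    if leftover and not drop_remainder:
        # Keep the final, smaller batch.
        return full + 1
    return full


# Hypothetical dataset size: 5000 = 156 * 32 + 8.
print(num_batches(5000, 32, drop_remainder=False))
print(num_batches(5000, 32, drop_remainder=True))
```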

### Train the Model

In [39]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into /Users/garyb/Develop/TF2/tensorflow_models/model.ckpt.


INFO:tensorflow:loss = 0.68282914, step = 1


INFO:tensorflow:global_step/sec: 0.0299379


INFO:tensorflow:loss = 0.37389114, step = 101 (3340.275 sec)


INFO:tensorflow:global_step/sec: 0.0297746


INFO:tensorflow:loss = 0.021475317, step = 201 (3358.560 sec)


INFO:tensorflow:global_step/sec: 0.030929


INFO:tensorflow:loss = 0.14221631, step = 301 (3233.209 sec)


INFO:tensorflow:global_step/sec: 0.0309465


INFO:tensorflow:loss = 0.0047838865, step = 401 (3231.381 sec)


INFO:tensorflow:Saving checkpoints for 468 into /Users/garyb/Develop/TF2/tensorflow_models/model.ckpt.


INFO:tensorflow:Loss for final step: 0.004142043.


Training took time  4:17:05.069818
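
The `global_step/sec` readings can be cross-checked against the reported wall-clock time. The short calculation below is only a sanity check that uses the four throughput values and the final step count from the log above; averaging the four readings is a simplification made for this sketch.

```python
from datetime import timedelta

# Values copied from the training log above.
steps_per_sec = [0.0299379, 0.0297746, 0.030929, 0.0309465]
total_steps = 468

# Average throughput, inverted to get seconds per training step.
mean_rate = sum(steps_per_sec) / len(steps_per_sec)
seconds_per_step = 1.0 / mean_rate
estimated_total = timedelta(seconds=round(total_steps * seconds_per_step))
print(f"{seconds_per_step:.1f} s/step, about {estimated_total} in total")
```

At roughly 33 seconds per step, the estimate lands within a few minutes of the measured 4 hours 17 minutes, which suggests the run was bound by per-step compute on this machine rather than by checkpointing or input loading.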
