<a href="https://colab.research.google.com/github/victor-roris/mediumseries/blob/master/NLP/BERT_Predicting_Movie_Reviews_with_BERT_on_TF_Hub.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

This notebook uses the methods and classes defined in the script https://github.com/google-research/bert/blob/master/run_classifier.py

That script provides common, general-purpose helpers that make BERT easier to use.

In [1]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [2]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [3]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [4]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'session_output_dir'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = True #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: session_output_dir *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).
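Each review in the archive is a text file whose name encodes an ID and the reviewer's rating (for example `0_9.txt`), and the loader reads the rating from that name. As a minimal sketch of that one step, assuming the same file-name pattern the loader uses, the helper `rating_from_filename` and the file names below are hypothetical and only for illustration:

```python
import re

# Same pattern the loader uses: "<id>_<rating>.txt".
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")


def rating_from_filename(file_name):
    """Return the rating encoded in a review file name, as a string."""
    match = FILENAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError("unexpected file name: {}".format(file_name))
    return match.group(1)


# Hypothetical file names, for illustration only.
example_ratings = [rating_from_filename(name) for name in ["0_9.txt", "123_1.txt"]]
print(example_ratings)
```

The rating stays a string, which is how it ends up in the `sentiment` column; the binary `polarity` label comes from the `pos`/`neg` directory instead.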

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so the regex escapes (\d, \.) are not treated as string escapes.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df


In [6]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll sample 5,000 examples each from the train and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [19]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

In [17]:
train.head()

| | sentence | sentiment | polarity |
|---|---|---|---|
| 0 | It was the Sixties, and anyone with long hair ... | 1 | 0 |
| 1 | THE LADY FROM SHANGHAI is proof that the great... | 9 | 1 |
| 2 | The funny sound that you may hear when you eye... | 1 | 0 |
| 3 | \*I mark where there are spoilers! Overall comm... | 10 | 1 |
| 4 | As other reviewers have noted, this is an unju... | 8 | 1 |


Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative, 1 for positive).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of possible labels, e.g. [True, False], [0, 1] or ['dog', 'cat']
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample` objects using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column of our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 or 1.

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

`InputExample` is a simple data-holder class defined in the BERT code: it only stores the fields of a single input example and has no behavior of its own. The class lives in the `bert.run_classifier` script.

See class `InputExample` in https://github.com/google-research/bert/blob/master/run_classifier.py
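To make the shape of that class concrete, here is a minimal sketch of a data holder with the same four fields. `SimpleInputExample` is a hypothetical stand-in written for illustration, not the library class itself, and the example texts are made up:

```python
class SimpleInputExample(object):
    """Hypothetical stand-in with the same fields as bert.run_classifier.InputExample."""

    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid      # optional unique ID, used only for bookkeeping
        self.text_a = text_a  # the text to classify (or the first sentence of a pair)
        self.text_b = text_b  # the second sentence, for sentence-pair tasks only
        self.label = label    # the target label; may be None at prediction time


# Single-sentence classification, as in this notebook: text_b stays None.
single = SimpleInputExample(guid=None, text_a="A gripping, well-acted film.", label=1)

# Sentence-pair task (not used here): both texts are filled in.
pair = SimpleInputExample(guid="pair-0",
                          text_a="Who directed the film?",
                          text_b="It was directed by Orson Welles.",
                          label=0)

print(single.text_a, single.text_b, single.label)
```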

In [33]:
print('The InputExample class for the first training example:')
firstExample = train_InputExamples.values[0]
print(firstExample)
print(firstExample.guid)
print(firstExample.text_a)
print(firstExample.text_b)
print(firstExample.label)

The InputExample class for the first training example:
<bert.run_classifier.InputExample object at 0x7fbffe5ecd68>
None
Don't bother. A little prosciutto could go a long way, but all we get is pure ham, particularly from Dunaway. The plot is one of those bumper car episodes... the vehicle bounces into another and everything changes direction again, until we are merely scratching our heads wondering if there were ever a plot. Gina Phillips is actually good, but it's hard playing across from a mystified Dunaway playing Lady Macbeth lost in the Marx's Brother's Duck Soup. Ah, the Raven...now there's an actor. And there is the relative who just lies and bed and looks ghostly. Or Dr. Dread who's filled with lots of gloom and no working remedies. I'm one of those suckers who just has to see a movie to the end. Quoth the Raven, "Nevermore."
None
0


Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
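To make steps 1 to 5 concrete, here is a toy sketch of lowercasing, whitespace tokenization, greedy longest-match WordPiece splitting, ID lookup, and adding the special tokens. It is an illustration under stated assumptions, not the library's implementation: `TOY_VOCAB`, `wordpiece`, and `encode` are hypothetical names, and the nine-entry vocabulary is made up (the real vocab file is far larger).

```python
# Tiny made-up vocabulary, for illustration only.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "sally", "says", "hi", "call", "##ing"]
TOKEN_TO_ID = {token: index for index, token in enumerate(TOY_VOCAB)}


def wordpiece(word):
    """Split one word into vocabulary pieces, greedily taking the longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a "##" prefix
            if candidate in TOKEN_TO_ID:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched: the whole word becomes unknown
        pieces.append(piece)
        start = end
    return pieces


def encode(text):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS]/[SEP], map to IDs."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    return tokens, [TOKEN_TO_ID[token] for token in tokens]


tokens, ids = encode("Sally says hi calling")
print(tokens)
print(ids)
```

The real tokenizer also splits off punctuation and handles Unicode cleanup, which this sketch leaves out.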




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [34]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]), and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [35]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

See method `run_classifier.convert_examples_to_features` in https://github.com/google-research/bert/blob/master/run_classifier.py
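As a rough sketch of what each converted example contains for a single-sentence task: the token list is truncated to leave room for `[CLS]` and `[SEP]`, mapped to IDs, and zero-padded to the maximum length, alongside an `input_mask` (1 for real tokens, 0 for padding) and `segment_ids` (all 0 when there is no `text_b`). The function name, toy vocabulary, and short length below are assumptions for illustration, not the library's code:

```python
def build_single_sentence_features(wordpieces, token_to_id, max_seq_length):
    """Return (input_ids, input_mask, segment_ids), each of length max_seq_length."""
    # Reserve two positions for [CLS] and [SEP], truncating the text if needed.
    wordpieces = wordpieces[: max_seq_length - 2]
    tokens = ["[CLS]"] + wordpieces + ["[SEP]"]

    input_ids = [token_to_id[token] for token in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token
    segment_ids = [0] * len(input_ids)  # single sentence: everything is segment 0

    # Zero-pad all three lists up to the fixed sequence length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Made-up vocabulary and a short max length so the padding is visible.
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "great": 4, "movie": 5}
input_ids, input_mask, segment_ids = build_single_sentence_features(
    ["great", "movie"], toy_vocab, max_seq_length=6)
print(input_ids)
print(input_mask)
print(segment_ids)
```

Reviews longer than the limit fill every position, so their `input_mask` is all ones.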

In [36]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] don ' t bother . a little pro ##sc ##iu ##tto could go a long way , but all we get is pure ham , particularly from dun ##away . the plot is one of those bumper car episodes . . . the vehicle bounce ##s into another and everything changes direction again , until we are merely scratching our heads wondering if there were ever a plot . gina phillips is actually good , but it ' s hard playing across from a my ##sti ##fied dun ##away playing lady macbeth lost in the marx ' s brother ' s duck soup . ah , the raven . . . now there ' s an actor . and there is the relative who just lies [SEP]


INFO:tensorflow:input_ids: 101 2123 1005 1056 8572 1012 1037 2210 4013 11020 17922 9284 2071 2175 1037 2146 2126 1010 2021 2035 2057 2131 2003 5760 10654 1010 3391 2013 24654 9497 1012 1996 5436 2003 2028 1997 2216 21519 2482 4178 1012 1012 1012 1996 4316 17523 2015 2046 2178 1998 2673 3431 3257 2153 1010 2127 2057 2024 6414 20291 2256 4641 6603 2065 2045 2020 2412 1037 5436 1012 17508 8109 2003 2941 2204 1010 2021 2009 1005 1055 2524 2652 2408 2013 1037 2026 16643 10451 24654 9497 2652 3203 25182 2439 1999 1996 13518 1005 1055 2567 1005 1055 9457 11350 1012 6289 1010 1996 10000 1012 1012 1012 2085 2045 1005 1055 2019 3364 1012 1998 2045 2003 1996 5816 2040 2074 3658 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] flat , soul ##less computer images on less than astonishing backgrounds an ##imate ##s a horribly predictable story in this film . absolutely nothing takes you by surprise , you can even tell when the bryan adams vocals are going to come in , which are always at the wrong time . < br / > < br / > the main character , spirit the horse , is given an annoying voice when he na ##rra ##tes what is happening . the narration is not needed , though , as everything happening is really obvious . you can even tell what the horses are saying , although all they do is ne ##igh . which would be good , but all the horses make exactly [SEP]


INFO:tensorflow:input_ids: 101 4257 1010 3969 3238 3274 4871 2006 2625 2084 26137 15406 2019 21499 2015 1037 27762 21425 2466 1999 2023 2143 1012 7078 2498 3138 2017 2011 4474 1010 2017 2064 2130 2425 2043 1996 8527 5922 2955 2024 2183 2000 2272 1999 1010 2029 2024 2467 2012 1996 3308 2051 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 2364 2839 1010 4382 1996 3586 1010 2003 2445 2019 15703 2376 2043 2002 6583 11335 4570 2054 2003 6230 1012 1996 21283 2003 2025 2734 1010 2295 1010 2004 2673 6230 2003 2428 5793 1012 2017 2064 2130 2425 2054 1996 5194 2024 3038 1010 2348 2035 2027 2079 2003 11265 18377 1012 2029 2052 2022 2204 1010 2021 2035 1996 5194 2191 3599 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] when this film was originally released it was promoted with the notably un ##ima ##gina ##tive tag ##line " dirty harry is at it again " . whatever this pitch lacks in original ##ity is more than compensated for by it ' s complete and total accuracy . " sudden impact " retains all the aspects that made the previous three dirty harry movies so successful - tight pacing , a compelling plot , strong supporting characters , endless gun ##play , and bone - dry humor . some of these elements are not only retained but amplified - this is easily the darkest , blood ##iest , and most over ##tly right - wing installment of the franchise . < br / > < br [SEP]


INFO:tensorflow:input_ids: 101 2043 2023 2143 2001 2761 2207 2009 2001 3755 2007 1996 5546 4895 9581 20876 6024 6415 4179 1000 6530 4302 2003 2012 2009 2153 1000 1012 3649 2023 6510 14087 1999 2434 3012 2003 2062 2084 29258 2005 2011 2009 1005 1055 3143 1998 2561 10640 1012 1000 5573 4254 1000 14567 2035 1996 5919 2008 2081 1996 3025 2093 6530 4302 5691 2061 3144 1011 4389 15732 1010 1037 17075 5436 1010 2844 4637 3494 1010 10866 3282 13068 1010 1998 5923 1011 4318 8562 1012 2070 1997 2122 3787 2024 2025 2069 6025 2021 26986 1011 2023 2003 4089 1996 23036 1010 2668 10458 1010 1998 2087 2058 14626 2157 1011 3358 18932 1997 1996 6329 1012 1026 7987 1013 1028 1026 7987 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] " what would you do ? " is a question that will stick in your mind for weeks after watching the emotional broke ##down palace . you will also be left wondering if alice ( danes ) was telling the truth or not - a issue that is left un ##res ##olved , and right ##ly so . this is a particularly well acted and beautifully shot film . although it is slow at times , its pace is reflective of the story line - but a lot of the film will have you on the edge of your seat ; wanting to know what happens next . the ending will also leave you imagining yourself in the shoes of the lead characters , which are [SEP]


INFO:tensorflow:input_ids: 101 1000 2054 2052 2017 2079 1029 1000 2003 1037 3160 2008 2097 6293 1999 2115 2568 2005 3134 2044 3666 1996 6832 3631 7698 4186 1012 2017 2097 2036 2022 2187 6603 2065 5650 1006 27476 1007 2001 4129 1996 3606 2030 2025 1011 1037 3277 2008 2003 2187 4895 6072 16116 1010 1998 2157 2135 2061 1012 2023 2003 1037 3391 2092 6051 1998 17950 2915 2143 1012 2348 2009 2003 4030 2012 2335 1010 2049 6393 2003 21346 1997 1996 2466 2240 1011 2021 1037 2843 1997 1996 2143 2097 2031 2017 2006 1996 3341 1997 2115 2835 1025 5782 2000 2113 2054 6433 2279 1012 1996 4566 2097 2036 2681 2017 16603 4426 1999 1996 6007 1997 1996 2599 3494 1010 2029 2024 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is the best dub i ' ve ever heard by disney , as well as the best adaptation since the biggest abuse ever on soundtrack , themes , characters , dialogues in ki ##ki delivery service . ur ##rr ##gh ##hh < br / > < br / > this one has different atmosphere , especially the deviation from the common heroine . this one has both hero and heroine ( although i don ' t really end ##ors ##e the use of hero & heroine here , since mi ##ya ##zaki is out from the stereo ##type & common theme ) . as usual , after being introduced by spirited away , amazed by mono ##no ##ke , troubled by grave of fire ##flies [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 1996 2190 12931 1045 1005 2310 2412 2657 2011 6373 1010 2004 2092 2004 1996 2190 6789 2144 1996 5221 6905 2412 2006 6050 1010 6991 1010 3494 1010 22580 1999 11382 3211 6959 2326 1012 24471 12171 5603 23644 1026 7987 1013 1028 1026 7987 1013 1028 2023 2028 2038 2367 7224 1010 2926 1996 24353 2013 1996 2691 18869 1012 2023 2028 2038 2119 5394 1998 18869 1006 2348 1045 2123 1005 1056 2428 2203 5668 2063 1996 2224 1997 5394 1004 18869 2182 1010 2144 2771 3148 18637 2003 2041 2013 1996 12991 13874 1004 2691 4323 1007 1012 2004 5156 1010 2044 2108 3107 2011 24462 2185 1010 15261 2011 18847 3630 3489 1010 11587 2011 6542 1997 2543 24019 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] first of all , i think the casting and acting were excellent . the problem is the story . there is basically no story here worth telling and thus basically no movie here . larry mc ##mur ##try has done lone ##some dove and i can ' t fault the original , though it probably didn ' t need sequels . he did hu ##d with paul newman , which is one of my favorite movies . mel ##len ##camp is supposed to be a country singer , but the only song i hear him sing is an old buck owens song . the movie makes a big deal out of chicken farming . mel ##len ##camp ' s character has a good wife , and [SEP]


INFO:tensorflow:input_ids: 101 2034 1997 2035 1010 1045 2228 1996 9179 1998 3772 2020 6581 1012 1996 3291 2003 1996 2466 1012 2045 2003 10468 2053 2466 2182 4276 4129 1998 2947 10468 2053 3185 2182 1012 6554 11338 20136 11129 2038 2589 10459 14045 10855 1998 1045 2064 1005 1056 6346 1996 2434 1010 2295 2009 2763 2134 1005 1056 2342 25815 1012 2002 2106 15876 2094 2007 2703 10625 1010 2029 2003 2028 1997 2026 5440 5691 1012 11463 7770 26468 2003 4011 2000 2022 1037 2406 3220 1010 2021 1996 2069 2299 1045 2963 2032 6170 2003 2019 2214 10131 14824 2299 1012 1996 3185 3084 1037 2502 3066 2041 1997 7975 7876 1012 11463 7770 26468 1005 1055 2839 2038 1037 2204 2564 1010 1998 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] reading the book i felt once again drawn into castle rock ( need ##ful things being the final part of the rock trilogy ) , and the plot was a variant on the " demon comes to small red ##neck village " type story king likes to tell . the characters were all described in loving detail , and it made both a good psychological and go ##ry horror . the film on the other hand is awful . gone are the character interactions and clever plot , and replaced by a story that tries to be exciting but misses by a mile . if you haven ' t read the book then you might enjoy this , else avoid at all costs , as with [SEP]


INFO:tensorflow:input_ids: 101 3752 1996 2338 1045 2371 2320 2153 4567 2046 3317 2600 1006 2342 3993 2477 2108 1996 2345 2112 1997 1996 2600 11544 1007 1010 1998 1996 5436 2001 1037 8349 2006 1996 1000 5698 3310 2000 2235 2417 18278 2352 1000 2828 2466 2332 7777 2000 2425 1012 1996 3494 2020 2035 2649 1999 8295 6987 1010 1998 2009 2081 2119 1037 2204 8317 1998 2175 2854 5469 1012 1996 2143 2006 1996 2060 2192 2003 9643 1012 2908 2024 1996 2839 10266 1998 12266 5436 1010 1998 2999 2011 1037 2466 2008 5363 2000 2022 10990 2021 22182 2011 1037 3542 1012 2065 2017 4033 1005 1056 3191 1996 2338 2059 2017 2453 5959 2023 1010 2842 4468 2012 2035 5366 1010 2004 2007 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this has to be the ultimate chick flick ever . we taped it off the t . v . years ago and i ' ve watched it about 30 times over the years . i hadn ' t seen it for about 12 years and just recently watched this movie . i ' m not lying , i cried from the opening credits to the ending credits . this movie truly tears your heart out , even if you don ' t have children . [SEP]


INFO:tensorflow:input_ids: 101 2023 2038 2000 2022 1996 7209 14556 17312 2412 1012 2057 19374 2009 2125 1996 1056 1012 1058 1012 2086 3283 1998 1045 1005 2310 3427 2009 2055 2382 2335 2058 1996 2086 1012 1045 2910 1005 1056 2464 2009 2005 2055 2260 2086 1998 2074 3728 3427 2023 3185 1012 1045 1005 1049 2025 4688 1010 1045 6639 2013 1996 3098 6495 2000 1996 4566 6495 1012 2023 3185 5621 4000 2115 2540 2041 1010 2130 2065 2017 2123 1005 1056 2031 2336 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this movie is engaging from start to finish with excellent performances , a great soundtrack with original music by douglas brown , and a well paced script that ' s full of surprises . < br / > < br / > full of new and not so new faces , this movie showcases promising talent especially in the case of craig morris who plays the main character eddie monroe . morris , who also co - wrote the script , displays a quiet strength combined with a strong emotional performance as he creates a bel ##ie ##vable character on screen . also a po ##ignant delivery by paul var ##io who plays uncle benny with a genuine warmth , was so convincing that he made [SEP]


INFO:tensorflow:input_ids: 101 2023 3185 2003 11973 2013 2707 2000 3926 2007 6581 4616 1010 1037 2307 6050 2007 2434 2189 2011 5203 2829 1010 1998 1037 2092 13823 5896 2008 1005 1055 2440 1997 20096 1012 1026 7987 1013 1028 1026 7987 1013 1028 2440 1997 2047 1998 2025 2061 2047 5344 1010 2023 3185 27397 10015 5848 2926 1999 1996 2553 1997 7010 6384 2040 3248 1996 2364 2839 5752 9747 1012 6384 1010 2040 2036 2522 1011 2626 1996 5896 1010 8834 1037 4251 3997 4117 2007 1037 2844 6832 2836 2004 2002 9005 1037 19337 2666 12423 2839 2006 3898 1012 2036 1037 13433 25593 6959 2011 2703 13075 3695 2040 3248 4470 11945 2007 1037 10218 8251 1010 2001 2061 13359 2008 2002 2081 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] having just seen walt disney ' s the skeleton dance on the saturday morning blog as linked from youtube , i used those same sources to watch a remake done in tech ##nic ##olo ##r for the columbia cartoon unit and animated by the same man - u ##b i ##werk ##s . the colors , compared to the earlier black and white , are really used imaginative ##ly here and many of the new gag ##s - like when one of the skeletal band players hits a wrong note constantly or when one loses his head and takes another one ' s off or when one dances with the other with part of that other gone - are just as funny as the previous short [SEP]


INFO:tensorflow:input_ids: 101 2383 2074 2464 10598 6373 1005 1055 1996 13526 3153 2006 1996 5095 2851 9927 2004 5799 2013 7858 1010 1045 2109 2216 2168 4216 2000 3422 1037 12661 2589 1999 6627 8713 12898 2099 2005 1996 3996 9476 3131 1998 6579 2011 1996 2168 2158 1011 1057 2497 1045 29548 2015 1012 1996 6087 1010 4102 2000 1996 3041 2304 1998 2317 1010 2024 2428 2109 28575 2135 2182 1998 2116 1997 1996 2047 18201 2015 1011 2066 2043 2028 1997 1996 20415 2316 2867 4978 1037 3308 3602 7887 2030 2043 2028 12386 2010 2132 1998 3138 2178 2028 1005 1055 2125 2030 2043 2028 11278 2007 1996 2060 2007 2112 1997 2008 2060 2908 1011 2024 2074 2004 6057 2004 1996 3025 2460 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)
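
To make the log above easier to read: each example is logged as three parallel arrays of the same length. The sketch below is a rough illustration (not the code from `run_classifier.py`) of how a short review ends up with trailing zeros in `input_ids` and `input_mask`, as in the "ultimate chick flick" example. The helper name `pad_features` is hypothetical, and the truncation of long reviews is left out.

```python
def pad_features(token_ids, max_seq_length):
    """Pad token ids with zeros and build the matching mask and segment ids.

    Hypothetical helper for illustration; the notebook's real conversion
    also tokenizes the text and truncates reviews that are too long.
    """
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence is longer than max_seq_length")
    padding = [0] * (max_seq_length - len(token_ids))
    input_ids = token_ids + padding                # real ids, then zeros
    input_mask = [1] * len(token_ids) + padding    # 1 = real token, 0 = padding
    segment_ids = [0] * max_seq_length             # single-sentence input: all zeros
    return input_ids, input_mask, segment_ids


# Ids copied from the log: 101=[CLS], 2023="this", 3185="movie", 102=[SEP].
# max_seq_length=8 is a toy value chosen to keep the output short.
ids, mask, segments = pad_features([101, 2023, 3185, 102], max_seq_length=8)
print(ids)
print(mask)
print(segments)
```

The mask tells BERT which positions to attend to, so the padding zeros do not influence the prediction.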


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). Because the module is loaded with `trainable=True`, BERT's own weights are updated along with the new layer. This strategy of starting from a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
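
To make the arithmetic of the new layer concrete, here is a minimal pure-Python sketch of what the head in `create_model` computes for a single example: a linear projection, a log-softmax, and a cross-entropy loss. The function names and toy numbers are assumptions made for illustration; dropout and batching are left out.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def classify(pooled, weights, bias, label=None):
    """Illustrative stand-in for the classification head, one example at a time.

    pooled:  hidden_size floats (stands in for BERT's pooled_output)
    weights: num_labels rows of hidden_size floats
    bias:    num_labels floats
    """
    # logits = pooled @ weights^T + bias
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    if label is None:
        return predicted, log_probs
    # With a one-hot target, -sum(one_hot * log_probs) is just -log_probs[label].
    loss = -log_probs[label]
    return loss, predicted, log_probs


# Toy numbers, purely illustrative (hidden_size=3, num_labels=2).
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classify(pooled, weights, bias, label=1)
print(predicted, round(loss, 4))
```

Note that the returned values are log probabilities; exponentiate them if you need probabilities that sum to one.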


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
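
For reference, the evaluation metrics requested in `metric_fn` relate to each other as in the sketch below. It uses the textbook definitions on plain Python lists; it is not a re-implementation of the TensorFlow ops, which are streaming metrics and may differ in detail. The function name and sample data are made up for illustration.

```python
def binary_metrics(labels, predictions):
    """Textbook binary classification metrics from 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "eval_accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Made-up labels and predictions for six reviews.
m = binary_metrics(labels=[1, 0, 0, 1, 1, 1], predictions=[1, 0, 1, 1, 0, 1])
print(m)
```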


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
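
As a worked example of these two numbers, the sketch below assumes 5,000 training examples (the figure that appears in the "Writing example 0 of 5000" log line) and shows a linear warmup followed by a linear decay to zero. That schedule reflects my understanding of what `bert.optimization.create_optimizer` does; treat the function `learning_rate_at` as a hypothetical approximation rather than the library's code.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup to init_lr, then linear decay to zero (approximation)."""
    if step < num_warmup_steps:
        return init_lr * step / num_warmup_steps
    return init_lr * max(0.0, 1.0 - step / num_train_steps)


num_examples = 5000  # assumption taken from the log output
batch_size, epochs, warmup_proportion = 32, 3.0, 0.1

num_train_steps = int(num_examples / batch_size * epochs)
num_warmup_steps = int(num_train_steps * warmup_proportion)
print(num_train_steps, num_warmup_steps)

# Learning rate at a few points of the run.
for step in (0, 23, num_warmup_steps, 250, num_train_steps):
    print(step, learning_rate_at(step, 2e-5, num_train_steps, num_warmup_steps))
```

Under these assumptions the run has 468 optimizer steps, of which the first 46 are warmup.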

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [42]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'session_output_dir', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fbfa45f44a8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which feeds the Estimator batches of features. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
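
As a rough mental model of what such an input function does, the sketch below shuffles the features when training and cuts them into batches, optionally dropping a final short batch. It is a simplified stand-in written with plain Python lists: the generator `batches` is hypothetical, it makes a single pass over the data, and it does not reproduce the `tf.data` pipeline used by the real builder.

```python
import random


def batches(features, batch_size, is_training, drop_remainder, seed=0):
    """Yield lists of features, batch_size at a time (single pass)."""
    items = list(features)
    if is_training:
        # Shuffle so each batch mixes examples; seeded here for repeatability.
        random.Random(seed).shuffle(items)
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            break  # discard the final short batch so all batches share a shape
        yield batch


toy_features = list(range(10))  # stand-ins for InputFeatures objects
keep = [len(b) for b in batches(toy_features, 4, is_training=True, drop_remainder=False)]
drop = [len(b) for b in batches(toy_features, 4, is_training=True, drop_remainder=True)]
print(keep, drop)
```

Dropping the remainder matters on TPUs because they need fixed batch shapes; on GPU or CPU, keeping the short batch uses every example.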

Now we train our model! Using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.




Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into session_output_dir/model.ckpt.


INFO:tensorflow:Saving checkpoints for 0 into session_output_dir/model.ckpt.


INFO:tensorflow:loss = 0.70910347, step = 0


INFO:tensorflow:loss = 0.70910347, step = 0


INFO:tensorflow:global_step/sec: 0.543051


INFO:tensorflow:global_step/sec: 0.543051


INFO:tensorflow:loss = 0.16484924, step = 100 (184.146 sec)


INFO:tensorflow:loss = 0.16484924, step = 100 (184.146 sec)


INFO:tensorflow:global_step/sec: 0.611644


INFO:tensorflow:global_step/sec: 0.611644


INFO:tensorflow:loss = 0.058268245, step = 200 (163.499 sec)


INFO:tensorflow:loss = 0.058268245, step = 200 (163.499 sec)


INFO:tensorflow:global_step/sec: 0.611088


INFO:tensorflow:global_step/sec: 0.611088


INFO:tensorflow:loss = 0.039988644, step = 300 (163.642 sec)


INFO:tensorflow:loss = 0.039988644, step = 300 (163.642 sec)


INFO:tensorflow:global_step/sec: 0.61143


INFO:tensorflow:global_step/sec: 0.61143


INFO:tensorflow:loss = 0.08440806, step = 400 (163.547 sec)


INFO:tensorflow:loss = 0.08440806, step = 400 (163.547 sec)


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-12T21:04:20Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://bert-tfhub/aclImdb_v1/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-12-21:06:05
INFO:tensorflow:Saving dict for global step 468: auc = 0.86659324, eval_accuracy = 0.8664, f1_score = 0.8659711, false_negatives = 375.0, false_positives = 293.0, global_step = 468, loss = 0.51870537, precision = 0.880457, recall = 0.8519542, true_negatives = 2174.0, true_positives = 2158.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://bert-tfhub/aclImdb_v1/model.ckpt-468


{'auc': 0.86659324,
 'eval_accuracy': 0.8664,
 'f1_score': 0.8659711,
 'false_negatives': 375.0,
 'false_positives': 293.0,
 'global_step': 468,
 'loss': 0.51870537,
 'precision': 0.880457,
 'recall': 0.8519542,
 'true_negatives': 2174.0,
 'true_positives': 2158.0}
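
As a sanity check, the summary metrics above can be recomputed from the four confusion-matrix counts in the same dictionary. The snippet below is a minimal sketch in plain Python; the variable names are illustrative and not part of the notebook, and the counts are copied from the evaluation output.

```python
# Confusion-matrix counts copied from the evaluation output above.
tp, fp, fn, tn = 2158, 293, 375, 2174

# Precision: share of predicted positives that are truly positive.
precision = tp / (tp + fp)
# Recall: share of true positives that the model found.
recall = tp / (tp + fn)
# Accuracy: share of all test examples classified correctly.
accuracy = (tp + tn) / (tp + fp + fn + tn)
# F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print("precision=%.6f recall=%.6f accuracy=%.4f f1=%.6f"
      % (precision, recall, accuracy, f1))
```

The recomputed values agree with the reported `precision`, `recall`, `eval_accuracy`, and `f1_score` up to float32 rounding.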

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  # Map the predicted class index (0 or 1) to a readable label.
  labels = ["Negative", "Positive"]
  # guid="" is a dummy id and label=0 is a placeholder (it must be a value in label_list); predictions do not depend on either.
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # Each result is (sentence, per-class log-probabilities, predicted label).
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
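
The logged example shows what feature conversion produces for a single sentence: the token ids are padded with zeros up to the maximum sequence length, `input_mask` marks the real tokens with 1, and `segment_ids` stays all zeros because there is no second sentence. The sketch below reproduces that layout in plain Python. `pad_features` is a hypothetical helper written for illustration, not a function from `run_classifier`, and the value 128 for `MAX_SEQ_LENGTH` is an assumption.

```python
MAX_SEQ_LENGTH = 128  # assumed value; use the notebook's own setting


def pad_features(token_ids, max_seq_length=MAX_SEQ_LENGTH):
    """Hypothetical helper: zero-pad ids and build the mask and segment ids."""
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence is longer than max_seq_length")
    padding = [0] * (max_seq_length - len(token_ids))
    input_ids = token_ids + padding
    # 1 for real tokens (including [CLS] and [SEP]), 0 for padding.
    input_mask = [1] * len(token_ids) + padding
    # A single-sentence input uses segment 0 everywhere.
    segment_ids = [0] * max_seq_length
    return input_ids, input_mask, segment_ids


# Ids logged above for "[CLS] that movie was absolutely awful [SEP]".
ids = [101, 2008, 3185, 2001, 7078, 9643, 102]
input_ids, input_mask, segment_ids = pad_features(ids)
print(len(input_ids), sum(input_mask))
```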

Voila! We have a sentiment classifier!

In [0]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]
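
The arrays in the output are all negative, and exponentiating each pair gives values that sum to about 1, which suggests they are log-probabilities rather than plain probabilities. The following sketch converts them with the standard library; the values are copied from the output above, and the variable names are illustrative.

```python
import math

# (negative, positive) log-probabilities copied from the output above.
log_probs = {
    "That movie was absolutely awful": (-4.9142293e-03, -5.3180690e+00),
    "The acting was a bit lacking": (-0.03325794, -3.4200459),
    "The film was creative and surprising": (-5.3589125e+00, -4.7171740e-03),
    "Absolutely fantastic!": (-5.0434084, -0.00647258),
}

# Exponentiate to recover ordinary probabilities for each class.
probs = {
    sentence: (math.exp(neg), math.exp(pos))
    for sentence, (neg, pos) in log_probs.items()
}

for sentence, (p_neg, p_pos) in probs.items():
    print("%-40s negative=%.4f positive=%.4f" % (sentence, p_neg, p_pos))
```

Read this way, the model is more confident about "That movie was absolutely awful" than about "The acting was a bit lacking", which matches the milder wording of the second sentence.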