<a href="https://colab.research.google.com/github/rajibmondal/BERT-Fine-Tuning-with-PyTorch-for-Sentence-classification/blob/master/Copy_of_Predicting_Movie_Reviews_with_BERT_on_TF_Hub.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [0]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [0]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if there are any).

In [0]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: gs://bert-tfhub/aclImdb_v1 *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so the regex escapes (\d, \.) are not treated as string escapes.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
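
The `sentiment` column above comes from the file name: the regular expression in `load_directory_data` captures the digits between the underscore and `.txt`. Here is a minimal sketch of that step in isolation. The file names are made up for illustration, and `rating_from_filename` is a hypothetical helper that is not part of the tutorial code. Unlike the original, it returns `None` instead of raising when a name doesn't match.

```python
import re

# Same pattern as in load_directory_data, written as a raw string.
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")


def rating_from_filename(file_name):
    """Return the rating embedded in a review file name, or None if it doesn't match."""
    match = FILENAME_PATTERN.match(file_name)
    return match.group(1) if match else None


# Hypothetical file names, used only for illustration.
for name in ["0_9.txt", "127_3.txt", "README"]:
    print(name, "->", rating_from_filename(name))
```

Note that the captured rating is a string, not an integer; the `polarity` column is set separately from the `pos` / `neg` directory name.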


In [0]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll sample 5,000 examples each from the training and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [0]:
train.columns
train.head()

| | sentence | sentiment | polarity |
|---|---|---|---|
| 14768 | This is an anti-Serb propaganda film made for ... | 1 | 0 |
| 14047 | Although the plot was a bit sappy at times, an... | 7 | 1 |
| 2433 | The BFG is one of Roald Dahl's most cherished ... | 4 | 0 |
| 15290 | This film is easily one of the worst ones I ha... | 1 | 0 |
| 6065 | Excellent movie, albeit slightly predictable. ... | 10 | 1 |


Our input data is the 'sentence' column and our label is the 'polarity' column (0 and 1 for negative and positive, respectively).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, e.g. True or False; here it is the `polarity` value, 0 or 1.

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)
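
To make the role of `text_b` concrete, here is a rough sketch of how BERT-style inputs lay out one or two text segments. `layout_segments` is a hypothetical helper written for this illustration, not a function from the BERT library, and the tokens are hand-picked rather than produced by a real tokenizer.

```python
def layout_segments(tokens_a, tokens_b=None):
    """Return (tokens, segment_ids) for a single text or a pair of texts."""
    # The first segment is wrapped in [CLS] ... [SEP] and marked with 0s.
    tokens = ["[CLS]"] + list(tokens_a) + ["[SEP]"]
    segment_ids = [0] * len(tokens)
    if tokens_b:
        # The optional second segment gets its own trailing [SEP] and is marked with 1s.
        tokens += list(tokens_b) + ["[SEP]"]
        segment_ids += [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


# Single text, as in this notebook (text_b=None): every segment id is 0.
print(layout_segments(["great", "movie"]))

# Text pair, e.g. a question and a candidate answer.
print(layout_segments(["is", "it", "good", "?"], ["yes", "."]))
```

Because we only ever pass `text_a`, every example in this notebook ends up with segment ids that are all 0.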

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
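
For the curious, steps 1 to 3 can be sketched in a few lines. This is a simplified illustration of greedy longest-match-first WordPiece splitting, assuming a tiny made-up vocabulary (`TOY_VOCAB`) and hypothetical helper names. The real tokenizer in the BERT library also handles punctuation, accents, and other details that are omitted here.

```python
def wordpiece(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one lowercased word into word pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a "##" prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matched: the whole word becomes unknown
        pieces.append(piece)
        start = end
    return pieces


def toy_tokenize(text, vocab):
    """Lowercase, split on whitespace, then split each word into word pieces."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    return tokens


# Toy vocabulary for illustration only; BERT's real vocab file has roughly 30,000 entries.
TOY_VOCAB = {"sally", "says", "hi", "call", "##ing", "token", "##izer"}

print(toy_tokenize("Sally says hi", TOY_VOCAB))      # ['sally', 'says', 'hi']
print(toy_tokenize("calling tokenizer", TOY_VOCAB))  # ['call', '##ing', 'token', '##izer']
```

How a word gets split depends entirely on the vocabulary, so the real BERT vocab may split the same words differently than this toy one does.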




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [0]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore



Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]), and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [0]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
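
Conceptually, for a single-text example the conversion truncates the tokens, adds the special tokens, maps tokens to ids, and zero-pads everything to a fixed length. The sketch below is a simplified stand-in, not the library's implementation: `toy_features` and `TOY_IDS` are hypothetical names, and the tiny id table is for illustration only (real ids come from BERT's vocab file).

```python
def toy_features(tokens, token_to_id, max_seq_length):
    """Build (input_ids, input_mask, segment_ids) for one single-text example."""
    # Reserve two positions for [CLS] and [SEP], truncating longer inputs.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [token_to_id[t] for t in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token, 0 marks padding
    segment_ids = [0] * len(input_ids)  # a single text uses segment 0 throughout

    # Zero-pad all three lists up to max_seq_length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Small illustrative id table; treat the numbers as placeholders.
TOY_IDS = {"[CLS]": 101, "[SEP]": 102, "excellent": 6581, "movie": 3185}

ids, mask, segments = toy_features(["excellent", "movie"], TOY_IDS, max_seq_length=6)
print(ids)       # [101, 6581, 3185, 102, 0, 0]
print(mask)      # [1, 1, 1, 1, 0, 0]
print(segments)  # [0, 0, 0, 0, 0, 0]
```

With `MAX_SEQ_LENGTH = 128`, any review longer than 126 word pieces is cut off, which is why long reviews in the log output end abruptly before `[SEP]`.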

In [0]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is an anti - serb propaganda film made for tv . < br / > < br / > " the muslims are good ; the orthodox christian serbs are bad . " < br / > < br / > that ' s the message . < br / > < br / > using " entertainment " to get across a propaganda message is nothing new . < br / > < br / > this movie lays it on thick . < br / > < br / > and apparently many viewers and reviewer lap it up . < br / > < br / > i know better . < br / > < br / > the serbs , [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 2019 3424 1011 20180 10398 2143 2081 2005 2694 1012 1026 7987 1013 1028 1026 7987 1013 1028 1000 1996 7486 2024 2204 1025 1996 6244 3017 16757 2024 2919 1012 1000 1026 7987 1013 1028 1026 7987 1013 1028 2008 1005 1055 1996 4471 1012 1026 7987 1013 1028 1026 7987 1013 1028 2478 1000 4024 1000 2000 2131 2408 1037 10398 4471 2003 2498 2047 1012 1026 7987 1013 1028 1026 7987 1013 1028 2023 3185 19764 2009 2006 4317 1012 1026 7987 1013 1028 1026 7987 1013 1028 1998 4593 2116 7193 1998 12027 5001 2009 2039 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 2113 2488 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 16757 1010 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] although the plot was a bit sap ##py at times , and very rushed at the end , as if the director had run out of his all ##oted time and needed to hurry up and finish the story , overall it was pretty good for the made - for - back ##woods - cable - tv genre . < br / > < br / > however , the actress who played the baby ##sit ##ter , mariana k ##lav ##eno , was very good ! i hope to see more of her around in movie - land . the music was also well done , getting every possible chill out of the da ##h - du ##h - da ##h - du ##h ( [SEP]


INFO:tensorflow:input_ids: 101 2348 1996 5436 2001 1037 2978 20066 7685 2012 2335 1010 1998 2200 6760 2012 1996 2203 1010 2004 2065 1996 2472 2018 2448 2041 1997 2010 2035 27428 2051 1998 2734 2000 9241 2039 1998 3926 1996 2466 1010 3452 2009 2001 3492 2204 2005 1996 2081 1011 2005 1011 2067 25046 1011 5830 1011 2694 6907 1012 1026 7987 1013 1028 1026 7987 1013 1028 2174 1010 1996 3883 2040 2209 1996 3336 28032 3334 1010 22097 1047 14973 16515 1010 2001 2200 2204 999 1045 3246 2000 2156 2062 1997 2014 2105 1999 3185 1011 2455 1012 1996 2189 2001 2036 2092 2589 1010 2893 2296 2825 10720 2041 1997 1996 4830 2232 1011 4241 2232 1011 4830 2232 1011 4241 2232 1006 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] the bf ##g is one of ro ##ald dahl ' s most cher ##ished books , but in this animated adaptation the magic just isn ' t there . this version remains pretty faithful to dahl ' s original story so one can ' t lay the blame on john ham ##bley ' s script . if anything the fault lies with the colour ##less animation , the let ##har ##gic pace and the generally lack ##lus ##tre voice - overs . one would be right to expect this story to make for a happy , vibrant , fun - filled movie . . . . . instead , the film is a hopeless ##ly dull affair that becomes quite ted ##ious to watch . children [SEP]


INFO:tensorflow:input_ids: 101 1996 28939 2290 2003 2028 1997 20996 19058 27934 1005 1055 2087 24188 13295 2808 1010 2021 1999 2023 6579 6789 1996 3894 2074 3475 1005 1056 2045 1012 2023 2544 3464 3492 11633 2000 27934 1005 1055 2434 2466 2061 2028 2064 1005 1056 3913 1996 7499 2006 2198 10654 29538 1005 1055 5896 1012 2065 2505 1996 6346 3658 2007 1996 6120 3238 7284 1010 1996 2292 8167 12863 6393 1998 1996 3227 3768 7393 7913 2376 1011 15849 1012 2028 2052 2022 2157 2000 5987 2023 2466 2000 2191 2005 1037 3407 1010 17026 1010 4569 1011 3561 3185 1012 1012 1012 1012 1012 2612 1010 1996 2143 2003 1037 20625 2135 10634 6771 2008 4150 3243 6945 6313 2000 3422 1012 2336 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this film is easily one of the worst ones i have ever seen . and i don ' t mean that in a good way . we wanted to see a crap ##py horror / thriller , so we picked the one that seemed to be the lou ##sies ##t in the store . for once , the film was everything we ' d expected . and more ! ( or should i say less ? ) < br / > < br / > the actors look like they are reading their lines from posters behind the camera . the so - called special effects are created by putting red see - through plastic in front of the camera to give the impression that we [SEP]


INFO:tensorflow:input_ids: 101 2023 2143 2003 4089 2028 1997 1996 5409 3924 1045 2031 2412 2464 1012 1998 1045 2123 1005 1056 2812 2008 1999 1037 2204 2126 1012 2057 2359 2000 2156 1037 10231 7685 5469 1013 10874 1010 2061 2057 3856 1996 2028 2008 2790 2000 2022 1996 10223 14625 2102 1999 1996 3573 1012 2005 2320 1010 1996 2143 2001 2673 2057 1005 1040 3517 1012 1998 2062 999 1006 2030 2323 1045 2360 2625 1029 1007 1026 7987 1013 1028 1026 7987 1013 1028 1996 5889 2298 2066 2027 2024 3752 2037 3210 2013 14921 2369 1996 4950 1012 1996 2061 1011 2170 2569 3896 2024 2580 2011 5128 2417 2156 1011 2083 6081 1999 2392 1997 1996 4950 2000 2507 1996 8605 2008 2057 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] excellent movie , albeit slightly predictable . i have to comment on nicole kid ##mans acting in this movie . some of her other works haven ' t shown the amazing talent this woman has , but birthday girl doesn ' t suffer from this in the slightest . even without words kid ##mans acting shine ##s through . [SEP]


INFO:tensorflow:input_ids: 101 6581 3185 1010 12167 3621 21425 1012 1045 2031 2000 7615 2006 9851 4845 15154 3772 1999 2023 3185 1012 2070 1997 2014 2060 2573 4033 1005 1056 3491 1996 6429 5848 2023 2450 2038 1010 2021 5798 2611 2987 1005 1056 9015 2013 2023 1999 1996 15989 1012 2130 2302 2616 4845 15154 3772 12342 2015 2083 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i saw this movie as part of a billy graham program . the church i attend was part of a community wide outreach to present god and christianity to our community ( hartford , ct . usa ) . i was one of the counselor ##s who helped attendees ( who were invited to come forward and make whatever kind of religious profession they wanted . . . and to follow up on them after the movie . as such , it did what it was supposed to do , and i personally found it to be a medium to strengthen my faith in god . i also found it to be very helpful to those i counsel ##ed . i especially like the work of [SEP]


INFO:tensorflow:input_ids: 101 1045 2387 2023 3185 2004 2112 1997 1037 5006 5846 2565 1012 1996 2277 1045 5463 2001 2112 1997 1037 2451 2898 15641 2000 2556 2643 1998 7988 2000 2256 2451 1006 13381 1010 14931 1012 3915 1007 1012 1045 2001 2028 1997 1996 17220 2015 2040 3271 19973 1006 2040 2020 4778 2000 2272 2830 1998 2191 3649 2785 1997 3412 9518 2027 2359 1012 1012 1012 1998 2000 3582 2039 2006 2068 2044 1996 3185 1012 2004 2107 1010 2009 2106 2054 2009 2001 4011 2000 2079 1010 1998 1045 7714 2179 2009 2000 2022 1037 5396 2000 12919 2026 4752 1999 2643 1012 1045 2036 2179 2009 2000 2022 2200 14044 2000 2216 1045 9517 2098 1012 1045 2926 2066 1996 2147 1997 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i was in 6th grade and this movie aired on pbs during a series called ' wonder ##works . ' i distinctly remember sitting on a couch watching the movie with tears running down my face at the end . in the film jesse , the main character , forms an unusual friendship with a girl named leslie . due to a very simple , but careless mistake one of the pair dies . at the time , i found the story very powerful , because the fatal mistake is exactly the type of mistake a kid would make and so any kid watching the film will find it very easy to identify with and feel the emotional weight of the tragedy that en ##su ##es [SEP]


INFO:tensorflow:input_ids: 101 1045 2001 1999 5351 3694 1998 2023 3185 4836 2006 13683 2076 1037 2186 2170 1005 4687 9316 1012 1005 1045 19517 3342 3564 2006 1037 6411 3666 1996 3185 2007 4000 2770 2091 2026 2227 2012 1996 2203 1012 1999 1996 2143 7627 1010 1996 2364 2839 1010 3596 2019 5866 6860 2007 1037 2611 2315 8886 1012 2349 2000 1037 2200 3722 1010 2021 23358 6707 2028 1997 1996 3940 8289 1012 2012 1996 2051 1010 1045 2179 1996 2466 2200 3928 1010 2138 1996 10611 6707 2003 3599 1996 2828 1997 6707 1037 4845 2052 2191 1998 2061 2151 4845 3666 1996 2143 2097 2424 2009 2200 3733 2000 6709 2007 1998 2514 1996 6832 3635 1997 1996 10576 2008 4372 6342 2229 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] ' ray ' lives on < br / > < br / > ray dir - taylor hack ##ford cast - jamie fox ##x , kerry washington , regina king , clifton powell , curtis armstrong and sharon warren . written by - taylor hack ##ford and james l . white . rating - * * * < br / > < br / > " hit the road jack , and don ' t come back ##no more , no more , no more , no more ! " who would ' ve thought that this immortal line that has almost become a re ##media ##l mantra for broken relationships in popular culture was conceived over a lovers ' brawl ! ray charles was a [SEP]


INFO:tensorflow:input_ids: 101 1005 4097 1005 3268 2006 1026 7987 1013 1028 1026 7987 1013 1028 4097 16101 1011 4202 20578 3877 3459 1011 6175 4419 2595 1010 11260 2899 1010 12512 2332 1010 16271 8997 1010 9195 9143 1998 10666 6031 1012 2517 2011 1011 4202 20578 3877 1998 2508 1048 1012 2317 1012 5790 1011 1008 1008 1008 1026 7987 1013 1028 1026 7987 1013 1028 1000 2718 1996 2346 2990 1010 1998 2123 1005 1056 2272 2067 3630 2062 1010 2053 2062 1010 2053 2062 1010 2053 2062 999 1000 2040 2052 1005 2310 2245 2008 2023 12147 2240 2008 2038 2471 2468 1037 2128 16969 2140 25951 2005 3714 6550 1999 2759 3226 2001 10141 2058 1037 10205 1005 23244 999 4097 2798 2001 1037 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] there is absolutely no doubt that this version of tarzan is the closest to burroughs ' vision . while he gladly collected his royalties from the films produced during his lifetime , he frequently made it clear that they were little more than the bastard children of his tales . the film studios ' lu ##dic ##rous obsession with casting olympic swimmers as tarzan was beyond laugh ##able . i guess we should consider ourselves lucky that they did not set their sights on shot - put ##ters . < br / > < br / > prior to this film , the most faithful adaptations were in comic strips and comic books . as fine as some of these were , we had to wait [SEP]


INFO:tensorflow:input_ids: 101 2045 2003 7078 2053 4797 2008 2023 2544 1997 24566 2003 1996 7541 2000 25991 1005 4432 1012 2096 2002 24986 5067 2010 25335 2013 1996 3152 2550 2076 2010 6480 1010 2002 4703 2081 2009 3154 2008 2027 2020 2210 2062 2084 1996 8444 2336 1997 2010 7122 1012 1996 2143 4835 1005 11320 14808 13288 17418 2007 9179 4386 21669 2004 24566 2001 3458 4756 3085 1012 1045 3984 2057 2323 5136 9731 5341 2008 2027 2106 2025 2275 2037 15925 2006 2915 1011 2404 7747 1012 1026 7987 1013 1028 1026 7987 1013 1028 3188 2000 2023 2143 1010 1996 2087 11633 17241 2020 1999 5021 12970 1998 5021 2808 1012 2004 2986 2004 2070 1997 2122 2020 1010 2057 2018 2000 3524 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i read angels and demons about 3 years ago , and i can honestly say to is one of the few books that i couldn ' t put down while reading . < br / > < br / > the movie however was pretty much what i expected , a lot of action , with somewhat of a mystery storyline . tom hank ##s plays , in my opinion , a much better role , of professor langdon than in the da vinci code . < br / > < br / > you won ' t have to worry about this being as bad as the da vinci code , this is everything that it wasn ' t . much more interesting , more [SEP]


INFO:tensorflow:input_ids: 101 1045 3191 7048 1998 7942 2055 1017 2086 3283 1010 1998 1045 2064 9826 2360 2000 2003 2028 1997 1996 2261 2808 2008 1045 2481 1005 1056 2404 2091 2096 3752 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 3185 2174 2001 3492 2172 2054 1045 3517 1010 1037 2843 1997 2895 1010 2007 5399 1997 1037 6547 9994 1012 3419 9180 2015 3248 1010 1999 2026 5448 1010 1037 2172 2488 2535 1010 1997 2934 15232 2084 1999 1996 4830 23765 3642 1012 1026 7987 1013 1028 1026 7987 1013 1028 2017 2180 1005 1056 2031 2000 4737 2055 2023 2108 2004 2919 2004 1996 4830 23765 3642 1010 2023 2003 2673 2008 2009 2347 1005 1056 1012 2172 2062 5875 1010 2062 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of starting from a pretrained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. This call is unconditional, so
    # dropout is also active during evaluation and prediction.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
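
To make the new layer concrete, here is a minimal pure-Python sketch of the same arithmetic: a linear layer on the pooled output, a log-softmax, and a cross-entropy loss against a one-hot label. The function names and the toy numbers are made up for illustration, and dropout is left out, so treat it as a sketch of the math rather than a stand-in for `create_model`.

```python
import math

def log_softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def classify(pooled, weights, bias, label=None):
    """pooled: hidden_size floats; weights: num_labels x hidden_size; bias: num_labels."""
    # logits = pooled @ weights^T + bias
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    if label is None:
        return predicted, log_probs
    # Cross-entropy with a one-hot label picks out -log_probs[label].
    one_hot = [1.0 if i == label else 0.0 for i in range(len(log_probs))]
    loss = -sum(o * lp for o, lp in zip(one_hot, log_probs))
    return loss, predicted, log_probs

# Toy values, invented for illustration (hidden_size=3, num_labels=2).
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classify(pooled, weights, bias, label=1)
print(loss, predicted, log_probs)
```

Because the label is one-hot, the loss reduces to the negative log-probability of the true class, which is why a confident correct prediction gives a loss close to zero.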


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,  # log-probabilities from log_softmax
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
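
The builder above relies on a closure: the outer function captures the hyperparameters, and the inner function keeps the fixed signature the framework expects while still being able to read them. Here is a stripped-down sketch of that pattern with no TensorFlow involved. `make_step_fn`, `step_fn`, and the string modes are hypothetical names used only for illustration.

```python
def make_step_fn(learning_rate, num_labels):
    """Hypothetical builder: captures settings and returns a mode-aware function."""
    def step_fn(features, mode):
        # learning_rate and num_labels come from the enclosing scope,
        # even though they are not part of this function's signature.
        if mode == "predict":
            # Prediction returns outputs only, with no loss.
            return {"predictions": [0] * len(features)}
        result = {"loss": 0.0, "num_labels": num_labels}
        if mode == "train":
            # Only training needs the optimizer settings.
            result["learning_rate"] = learning_rate
        return result
    return step_fn

step_fn = make_step_fn(learning_rate=2e-5, num_labels=2)
train_out = step_fn(["a review"], mode="train")
eval_out = step_fn(["a review"], mode="eval")
predict_out = step_fn(["a review"], mode="predict")
print(train_out, eval_out, predict_out)
```

The same function serves all three modes, and each mode returns only what it needs, which mirrors how `model_fn` returns a different `EstimatorSpec` per mode.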


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
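
To picture what warmup does, here is a small sketch of a learning-rate schedule that ramps up linearly during warmup and then decays linearly to zero. As far as I know this is the general shape the BERT optimizer follows, but the exact formula here is an assumption, and `learning_rate_at` is a hypothetical helper, not part of the `bert` package.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup to init_lr, then linear decay to zero (assumed shape)."""
    if step < num_warmup_steps:
        # Warmup: scale the rate up from 0 toward init_lr.
        return init_lr * step / num_warmup_steps
    # After warmup: decay linearly so the rate reaches 0 at the last step.
    return init_lr * max(0.0, 1.0 - step / num_train_steps)

# Illustrative step counts; init_lr matches LEARNING_RATE above.
for step in (0, 23, 46, 234, 468):
    print(step, learning_rate_at(step, 2e-5, 468, 46))
```

Starting with tiny steps keeps the freshly initialized output layer from pushing large, noisy gradients into the pretrained weights.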

In [0]:
# Compute the number of training and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
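
Here is the same arithmetic worked through with concrete numbers. The 5,000 training examples are an assumption for illustration, although that figure is consistent with the global step of 468 that shows up in the evaluation log further down.

```python
# Assumed dataset size for illustration; the other values match the cell above.
num_train_examples = 5000
batch_size = 32
num_train_epochs = 3.0
warmup_proportion = 0.1

# 5000 / 32 = 156.25 steps per epoch; three epochs is 468.75, truncated by int().
num_train_steps = int(num_train_examples / batch_size * num_train_epochs)
# 10% of the training steps, again truncated.
num_warmup_steps = int(num_train_steps * warmup_proportion)

print(num_train_steps, num_warmup_steps)
```

Note that `int()` truncates rather than rounds, so the last fraction of an epoch is dropped.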

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [0]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://bert-tfhub/aclImdb_v1', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fcedb507be0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which the estimator calls to get batches of training data. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Use drop_remainder=True on TPUs;
# it stays False here because we are not training on a TPU.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
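
To show what `drop_remainder` controls, here is a pure-Python sketch of batching with and without it. The `batches` helper is hypothetical and only mimics the idea; the real input function builds a TensorFlow dataset instead of Python lists.

```python
def batches(items, batch_size, drop_remainder=False):
    """Split items into consecutive batches, optionally dropping a short last one."""
    items = list(items)
    out = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_remainder and out and len(out[-1]) < batch_size:
        # Discard the final partial batch so every batch has the same size.
        out.pop()
    return out

kept = batches(range(10), batch_size=4, drop_remainder=False)
dropped = batches(range(10), batch_size=4, drop_remainder=True)
print(kept)
print(dropped)
```

TPUs want every batch to have the same shape, which is why the partial batch gets dropped there. On a GPU we can keep it and use every example.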

Now we train our model! For me, using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Skipping training since max_steps has already saved.
Training took time  0:00:00.759709


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-12T21:04:20Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://bert-tfhub/aclImdb_v1/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-12-21:06:05
INFO:tensorflow:Saving dict for global step 468: auc = 0.86659324, eval_accuracy = 0.8664, f1_score = 0.8659711, false_negatives = 375.0, false_positives = 293.0, global_step = 468, loss = 0.51870537, precision = 0.880457, recall = 0.8519542, true_negatives = 2174.0, true_positives = 2158.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://bert-tfhub/aclImdb_v1/model.ckpt-468


{'auc': 0.86659324,
 'eval_accuracy': 0.8664,
 'f1_score': 0.8659711,
 'false_negatives': 375.0,
 'false_positives': 293.0,
 'global_step': 468,
 'loss': 0.51870537,
 'precision': 0.880457,
 'recall': 0.8519542,
 'true_negatives': 2174.0,
 'true_positives': 2158.0}
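
As a sanity check, the headline metrics can be recomputed by hand from the four confusion-matrix counts in the output above. This sketch uses the standard definitions of accuracy, precision, recall, and F1; the variable names are mine.

```python
# Counts copied from the evaluation output above.
tp, tn, fp, fn = 2158.0, 2174.0, 293.0, 375.0

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all reviews classified correctly
precision = tp / (tp + fp)                   # of predicted positives, how many were right
recall = tp / (tp + fn)                      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```

The four counts add up to 5,000, the number of test reviews evaluated, and the recomputed values agree with the reported `eval_accuracy`, `precision`, `recall`, and `f1_score`.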

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; the label does not affect the predictions.
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Voila! We have a sentiment classifier!

In [0]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]
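
The arrays above hold log-probabilities (the model returns the output of `log_softmax`), so they are easier to read after exponentiating. Here is a short sketch that converts the printed values to probabilities; the values are copied from the output above and the variable names are mine.

```python
import math

labels = ["Negative", "Positive"]
# Log-probabilities copied from the prediction output above.
results = [
    ("That movie was absolutely awful", [-4.9142293e-03, -5.3180690e+00]),
    ("The acting was a bit lacking", [-0.03325794, -3.4200459]),
    ("The film was creative and surprising", [-5.3589125e+00, -4.7171740e-03]),
    ("Absolutely fantastic!", [-5.0434084, -0.00647258]),
]

readable = []
for sentence, log_probs in results:
    probs = [math.exp(lp) for lp in log_probs]   # exp undoes the log
    best = probs.index(max(probs))               # index of the most likely class
    readable.append((sentence, labels[best], probs))
    print(f"{sentence!r}: {labels[best]} (p={probs[best]:.4f})")
```

Each pair of probabilities sums to roughly 1, and the most likely class matches the label the model reported.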