# Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether a movie review is positive or negative using BERT in TensorFlow with TF Hub.

In [1]:
!pip list

Package                            Version   
---------------------------------- ----------
absl-py                            0.9.0     
alabaster                          0.7.10    
anaconda-client                    1.6.14    
anaconda-project                   0.8.2     
asn1crypto                         0.24.0    
astor                              0.8.1     
astroid                            1.6.3     
astropy                            3.0.2     
attrs                              18.1.0    
Automat                            0.3.0     
autovizwidget                      0.15.0    
awscli                             1.18.20   
Babel                              2.5.3     
backcall                           0.1.0     
backports.shutil-get-terminal-size 1.0.0     
bcrypt                             3.1.7     
beautifulsoup4                     4.6.0     
bert-tensorflow                    1.0.1     
bitarray                           0.8.1     
bkcharts     



In [2]:
!pip install tensorflow-hub



In [3]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime




In [4]:
print(tf.__version__)

1.15.2


In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [5]:
!pip install bert-tensorflow



In [6]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [7]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = './output' #@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = True #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: ./output *****


# Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [8]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # The rating is encoded in the file name, e.g. <id>_<rating>.txt
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
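
The regular expression in `load_directory_data` relies on review files being named `<id>_<rating>.txt`, and it keeps the rating as a string in the `sentiment` column. A minimal sketch of that parsing step on its own, assuming made-up file names (`example_files` and `parse_rating` are illustrative and not part of the notebook):

```python
import re

# Hypothetical file names following the <id>_<rating>.txt pattern.
example_files = ["0_9.txt", "127_3.txt", "4999_10.txt"]

def parse_rating(file_name):
    """Return the star rating encoded in a review file name, as a string."""
    match = re.match(r"\d+_(\d+)\.txt", file_name)
    if match is None:
        # Fail loudly instead of hitting an AttributeError on None.
        raise ValueError("unexpected file name: {}".format(file_name))
    return match.group(1)

ratings = [parse_rating(name) for name in example_files]
print(ratings)  # ['9', '3', '10']
```

Note that the `polarity` label does not come from this rating: it is set from the `pos` or `neg` directory the file was read from.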


In [9]:
train, test = download_and_load_datasets()

To keep training fast, we'll take a random sample of 5000 examples from each of the train and test sets.

In [10]:
train = train.sample(5000)
test = test.sample(5000)

In [11]:
train.head(5)

Unnamed: 0,sentence,sentiment,polarity
8879,Well where do I begin my story?? I went to thi...,1,0
4386,Jake Speed is a film that lacks one thing  a ...,7,1
6071,This early film has its flaws-- a predictable ...,8,1
12488,"Enchanting, romantic, innovative, and funny. T...",9,1
21011,My interest in Dorothy Stratten caused me to p...,4,0


In [12]:
test.head(5)

Unnamed: 0,sentence,sentiment,polarity
20800,This almost perfect cinematic rendition of Edi...,8,1
20320,I really don't understand who this movie is ai...,1,0
10500,I was greatly disappointed by the quality of t...,2,0
9957,I must admit I wasn't expecting much on this m...,8,1
24085,So then... this is what passes as high art for...,1,0


In [13]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

For us, our input data is the 'sentence' column and our label is the 'polarity' column (0 and 1 for negative and positive, respectively).

In [14]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]
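
The order of `label_list` matters because, as far as I understand the BERT library's conversion code, each label is mapped to its position in the list. That is why the conversion log further down prints lines such as `label: 0 (id = 0)`. A minimal sketch of that mapping (the `animal_labels` list is a hypothetical alternative, shown only to illustrate non-integer labels):

```python
label_list = [0, 1]

# Each label's id is simply its index in label_list.
label_map = {label: i for i, label in enumerate(label_list)}
print(label_map)  # {0: 0, 1: 1}

# Hypothetical string labels map to ids the same way.
animal_labels = ["dog", "cat"]
animal_map = {label: i for i, label in enumerate(animal_labels)}
print(animal_map)  # {'dog': 0, 'cat': 1}
```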

# Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, which in this case is the `polarity` value (0 or 1).

In [15]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
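
To build some intuition for steps 1 to 5, here is a toy sketch in plain Python. It is not the library's tokenizer: the vocabulary is a tiny made-up list (BERT's real vocab file has roughly 30,000 entries), `wordpiece` and `encode` are hypothetical helpers, and punctuation splitting is skipped entirely. The splitting strategy shown is greedy longest-match-first, which is how WordPiece lookup is commonly described.

```python
# Toy vocabulary; a token's id is its position in this list.
toy_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
             "sally", "says", "hi", "call", "##ing"]
token_to_id = {token: i for i, token in enumerate(toy_vocab)}

def wordpiece(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one lowercase word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word continuation
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def encode(text, vocab):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS] and [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens

tokens = encode("Sally says hi calling", token_to_id)
ids = [token_to_id[token] for token in tokens]
print(tokens)  # ['[CLS]', 'sally', 'says', 'hi', 'call', '##ing', '[SEP]']
print(ids)     # [2, 4, 5, 6, 7, 8, 3]
```

A word with no matching pieces, such as "xyz" here, collapses to a single `[UNK]` token.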




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [16]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore



Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [17]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [18]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
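
To make the effect of `MAX_SEQ_LENGTH` concrete, here is a simplified sketch of how a single-sentence example can be truncated and padded into the three fixed-length lists that appear in the conversion log (`input_ids`, `input_mask`, `segment_ids`). It is an illustration under stated assumptions, not the library's code: `build_features` is a hypothetical helper, the ids 101 for `[CLS]`, 102 for `[SEP]`, and 0 for padding are read off the log output, and a short length of 8 is used so the result is easy to inspect.

```python
def build_features(token_ids, max_seq_length, cls_id=101, sep_id=102, pad_id=0):
    """Truncate and pad one single-sentence example to max_seq_length."""
    # Reserve two positions for [CLS] and [SEP], then truncate.
    token_ids = token_ids[: max_seq_length - 2]
    input_ids = [cls_id] + token_ids + [sep_id]
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(input_ids)
    padding = max_seq_length - len(input_ids)
    input_ids = input_ids + [pad_id] * padding
    input_mask = input_mask + [0] * padding
    # With no text_b, every position belongs to segment 0.
    segment_ids = [0] * max_seq_length
    return input_ids, input_mask, segment_ids

# "well where do" as ids, taken from the first logged example.
short_ids, short_mask, short_segments = build_features([2092, 2073, 2079], 8)
print(short_ids)   # [101, 2092, 2073, 2079, 102, 0, 0, 0]
print(short_mask)  # [1, 1, 1, 1, 1, 0, 0, 0]

# A sequence that is too long is cut so that [SEP] still fits.
long_ids, long_mask, _ = build_features(list(range(1000, 1010)), 8)
print(long_ids)    # [101, 1000, 1001, 1002, 1003, 1004, 1005, 102]
```

This matches what the log shows: reviews longer than 128 tokens end abruptly with `[SEP]`, while shorter ones are followed by zeros in both `input_ids` and `input_mask`.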


In [19]:
# Convert our train and test examples to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] well where do i begin my story ? ? i went to this movie tonight with a few friends not knowing more than the actors that were in it , and that it was supposed to be a horror movie . < br / > < br / > well i figured out within the first 20 minutes , what a poor decision i had made going out seeing this movie . the plot was crap , and so was the script . the lines were horrible to the point that people in the audience were laughing hysterical ##ly . < br / > < br / > the cast couldn ' t have been more plastic looking . even some of the scenes seemed like [SEP]


INFO:tensorflow:input_ids: 101 2092 2073 2079 1045 4088 2026 2466 1029 1029 1045 2253 2000 2023 3185 3892 2007 1037 2261 2814 2025 4209 2062 2084 1996 5889 2008 2020 1999 2009 1010 1998 2008 2009 2001 4011 2000 2022 1037 5469 3185 1012 1026 7987 1013 1028 1026 7987 1013 1028 2092 1045 6618 2041 2306 1996 2034 2322 2781 1010 2054 1037 3532 3247 1045 2018 2081 2183 2041 3773 2023 3185 1012 1996 5436 2001 10231 1010 1998 2061 2001 1996 5896 1012 1996 3210 2020 9202 2000 1996 2391 2008 2111 1999 1996 4378 2020 5870 25614 2135 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 3459 2481 1005 1056 2031 2042 2062 6081 2559 1012 2130 2070 1997 1996 5019 2790 2066 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] jake speed is a film that lacks one thing a charismatic lead . unfortunately that ' s something that really tai ##nts the entire movie and it ' s a shame because at heart it is an enjoyable action movie with a witty enough script and an interesting , if derivative , premise . although it ' s genesis probably can be traced back to the success of the indiana jones trilogy the film actually plays a little more like ' roman ##cing the stone ' albeit in reverse . it ' s not an author of romantic adventure fiction being led on an adventure by a character very much like one of her creations it is an adventure fiction character ( who happens to chronicle [SEP]


INFO:tensorflow:input_ids: 101 5180 3177 2003 1037 2143 2008 14087 2028 2518 1037 23916 2599 1012 6854 2008 1005 1055 2242 2008 2428 13843 7666 1996 2972 3185 1998 2009 1005 1055 1037 9467 2138 2012 2540 2009 2003 2019 22249 2895 3185 2007 1037 25591 2438 5896 1998 2019 5875 1010 2065 13819 1010 18458 1012 2348 2009 1005 1055 11046 2763 2064 2022 9551 2067 2000 1996 3112 1997 1996 5242 3557 11544 1996 2143 2941 3248 1037 2210 2062 2066 1005 3142 6129 1996 2962 1005 12167 1999 7901 1012 2009 1005 1055 2025 2019 3166 1997 6298 6172 4349 2108 2419 2006 2019 6172 2011 1037 2839 2200 2172 2066 2028 1997 2014 20677 2009 2003 2019 6172 4349 2839 1006 2040 6433 2000 9519 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this early film has its flaws - - a predictable plot and some over ##long scenes of dubious relevance - - but it already clearly demonstrates hitchcock ' s mastery of editing and the use of powerful images . it ' s also among the most expression ##ist of his films stylistic ##ally ; note , for examples , the weird distortion ##s he uses during the party sequence and the frequent echoes of both title and plot in the imagery . < br / > < br / > its core , though , remains the final match , which is still among the more exciting examples of cinematic boxing . even though you know that the hero has to win , it becomes quite [SEP]


INFO:tensorflow:input_ids: 101 2023 2220 2143 2038 2049 21407 1011 1011 1037 21425 5436 1998 2070 2058 10052 5019 1997 22917 21923 1011 1011 2021 2009 2525 4415 16691 19625 1005 1055 26364 1997 9260 1998 1996 2224 1997 3928 4871 1012 2009 1005 1055 2036 2426 1996 2087 3670 2923 1997 2010 3152 24828 3973 1025 3602 1010 2005 4973 1010 1996 6881 20870 2015 2002 3594 2076 1996 2283 5537 1998 1996 6976 17659 1997 2119 2516 1998 5436 1999 1996 13425 1012 1026 7987 1013 1028 1026 7987 1013 1028 2049 4563 1010 2295 1010 3464 1996 2345 2674 1010 2029 2003 2145 2426 1996 2062 10990 4973 1997 21014 8362 1012 2130 2295 2017 2113 2008 1996 5394 2038 2000 2663 1010 2009 4150 3243 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] en ##chan ##ting , romantic , innovative , and funny . the vision of this extraordinary film is almost un ##para ##lle ##led , exceeding better known " death romance ##s " such as ghost . while we know intuitive ##ly that peter and june will find ultimate happiness at the end of that long - long stairway , the joy is in the journey . the moral of the tale , of course , is timeless : love conquer ##s all . but the struggle to achieve that victory is played in a celestial arena of sweeping vision and gripping grande ##ur . with more than 500 suit ##ably clad extras portraying various ages and cultures , the directors ' vision of heaven remains memorable [SEP]


INFO:tensorflow:input_ids: 101 4372 14856 3436 1010 6298 1010 9525 1010 1998 6057 1012 1996 4432 1997 2023 9313 2143 2003 2471 4895 28689 6216 3709 1010 17003 2488 2124 1000 2331 7472 2015 1000 2107 2004 5745 1012 2096 2057 2113 29202 2135 2008 2848 1998 2238 2097 2424 7209 8404 2012 1996 2203 1997 2008 2146 1011 2146 21952 1010 1996 6569 2003 1999 1996 4990 1012 1996 7191 1997 1996 6925 1010 1997 2607 1010 2003 27768 1024 2293 16152 2015 2035 1012 2021 1996 5998 2000 6162 2008 3377 2003 2209 1999 1037 17617 5196 1997 12720 4432 1998 13940 9026 3126 1012 2007 2062 2084 3156 4848 8231 13681 26279 17274 2536 5535 1998 8578 1010 1996 5501 1005 4432 1997 6014 3464 13432 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] my interest in dorothy st ##rat ##ten caused me to purchase this video . although it had great actors / actresses , there were just too many sub ##pl ##ots going on to retain interest . plus it just wasn ' t that interesting . dialogue was stiff and confusing and the story just flipped around too much to be bel ##ie ##vable . i was pretty disappointed in what i believe was one of audrey hepburn ' s last movies . i ' ll always love john ritter best in slap ##stick . he was just too pathetic here . [SEP]


INFO:tensorflow:input_ids: 101 2026 3037 1999 9984 2358 8609 6528 3303 2033 2000 5309 2023 2678 1012 2348 2009 2018 2307 5889 1013 19910 1010 2045 2020 2074 2205 2116 4942 24759 12868 2183 2006 2000 9279 3037 1012 4606 2009 2074 2347 1005 1056 2008 5875 1012 7982 2001 10551 1998 16801 1998 1996 2466 2074 9357 2105 2205 2172 2000 2022 19337 2666 12423 1012 1045 2001 3492 9364 1999 2054 1045 2903 2001 2028 1997 14166 22004 1005 1055 2197 5691 1012 1045 1005 2222 2467 2293 2198 23168 2190 1999 14308 21354 1012 2002 2001 2074 2205 17203 2182 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this almost perfect cinematic rendition of edith nes ##bit ' s popular children ' s novel follows the lives of roberta ( bobbie ) , phyllis , and peter , and their mother , after their father is unfair ##ly accused of treason and sent to prison . they go to live in an almost un ##in ##hab ##ita ##ble house in the country which stands near a railway line mum writes stories to make enough money for food and candles , while the children spend much of their time around the railway station and , specifically , waving to one particular train to ' send their love to father ' . < br / > < br / > always an involving and clever novel [SEP]


INFO:tensorflow:input_ids: 101 2023 2471 3819 21014 19187 1997 13257 24524 16313 1005 1055 2759 2336 1005 1055 3117 4076 1996 3268 1997 23455 1006 27731 1007 1010 20328 1010 1998 2848 1010 1998 2037 2388 1010 2044 2037 2269 2003 15571 2135 5496 1997 14712 1998 2741 2000 3827 1012 2027 2175 2000 2444 1999 2019 2471 4895 2378 25459 6590 3468 2160 1999 1996 2406 2029 4832 2379 1037 2737 2240 12954 7009 3441 2000 2191 2438 2769 2005 2833 1998 14006 1010 2096 1996 2336 5247 2172 1997 2037 2051 2105 1996 2737 2276 1998 1010 4919 1010 12015 2000 2028 3327 3345 2000 1005 4604 2037 2293 2000 2269 1005 1012 1026 7987 1013 1028 1026 7987 1013 1028 2467 2019 5994 1998 12266 3117 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i really don ' t understand who this movie is aimed at . from just the absurd ##ity of it , not to mention the ridiculous ##ly bad acting , che ##es ##y dialogue , and the fact that the villain is a child , i ' d assume this was meant to be a children ' s movie . . . but i think there may be more swear words than pulp fiction , not to mention constant references to drugs and general mayhem and killing - so which demographic is it trying to please ? this movie is too sc ##hi ##zo ##ph ##ren ##ic , like trying to combine country music with heavy metal , in the end no one is going to [SEP]


INFO:tensorflow:input_ids: 101 1045 2428 2123 1005 1056 3305 2040 2023 3185 2003 6461 2012 1012 2013 2074 1996 18691 3012 1997 2009 1010 2025 2000 5254 1996 9951 2135 2919 3772 1010 18178 2229 2100 7982 1010 1998 1996 2755 2008 1996 12700 2003 1037 2775 1010 1045 1005 1040 7868 2023 2001 3214 2000 2022 1037 2336 1005 1055 3185 1012 1012 1012 2021 1045 2228 2045 2089 2022 2062 8415 2616 2084 16016 4349 1010 2025 2000 5254 5377 7604 2000 5850 1998 2236 26865 1998 4288 1011 2061 2029 15982 2003 2009 2667 2000 3531 1029 2023 3185 2003 2205 8040 4048 6844 8458 7389 2594 1010 2066 2667 2000 11506 2406 2189 2007 3082 3384 1010 1999 1996 2203 2053 2028 2003 2183 2000 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i was greatly disappointed by the quality of this documentary . the content is poorly produced , very poor quality video and , especially awful audio . there ' s extremely little about how bruce ha ##ack produced his music and virtually no examples of direct connection to later and contemporary electronic music . the interviews of people who knew bruce ha ##ack are ad - hoc mostly ina ##rti ##cula ##te mum ##bo - ju ##mbo . too much ya ##k and not enough ha ##ack . although i have a serious personal interest in electronic music and have a higher than average attention span , even for slow and / or difficult subject matter , i fell asleep while watching this documentary and had [SEP]


INFO:tensorflow:input_ids: 101 1045 2001 6551 9364 2011 1996 3737 1997 2023 4516 1012 1996 4180 2003 9996 2550 1010 2200 3532 3737 2678 1998 1010 2926 9643 5746 1012 2045 1005 1055 5186 2210 2055 2129 5503 5292 8684 2550 2010 2189 1998 8990 2053 4973 1997 3622 4434 2000 2101 1998 3824 4816 2189 1012 1996 7636 1997 2111 2040 2354 5503 5292 8684 2024 4748 1011 21929 3262 27118 28228 19879 2618 12954 5092 1011 18414 13344 1012 2205 2172 8038 2243 1998 2025 2438 5292 8684 1012 2348 1045 2031 1037 3809 3167 3037 1999 4816 2189 1998 2031 1037 3020 2084 2779 3086 8487 1010 2130 2005 4030 1998 1013 2030 3697 3395 3043 1010 1045 3062 6680 2096 3666 2023 4516 1998 2018 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i must admit i wasn ' t expecting much on this movie . i was surprised i truly enjoyed it as much as i did . the script wasn ' t oscar material , but it wasn ' t horrible either . the acting was great by mark wah ##lberg . jennifer an ##isto ##n had a great supporting role , and looked lovely as ever . what made this movie for me was the music . if you do not like 80 ' s g ##lam metal or hair bands , then you probably won ##t like this movie . its all about being a rocks ##tar . some cl ##iche ' s were present , but didn ' t bring down the movie at [SEP]


INFO:tensorflow:input_ids: 101 1045 2442 6449 1045 2347 1005 1056 8074 2172 2006 2023 3185 1012 1045 2001 4527 1045 5621 5632 2009 2004 2172 2004 1045 2106 1012 1996 5896 2347 1005 1056 7436 3430 1010 2021 2009 2347 1005 1056 9202 2593 1012 1996 3772 2001 2307 2011 2928 22894 22927 1012 7673 2019 20483 2078 2018 1037 2307 4637 2535 1010 1998 2246 8403 2004 2412 1012 2054 2081 2023 3185 2005 2033 2001 1996 2189 1012 2065 2017 2079 2025 2066 3770 1005 1055 1043 10278 3384 2030 2606 4996 1010 2059 2017 2763 2180 2102 2066 2023 3185 1012 2049 2035 2055 2108 1037 5749 7559 1012 2070 18856 17322 1005 1055 2020 2556 1010 2021 2134 1005 1056 3288 2091 1996 3185 2012 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] so then . . . this is what passes as high art for the likes of s ##x ##sw film festival and sundance , eh ? well , i suppose i can relate as long as story , script , dial ##og , acting ( save for ms . as ##elt ##on ) , cinematography and editing are completely irrelevant . < br / > < br / > i remember telling other film - making friends some years ago that the biggest problem with digital video was that we were now going to have to wade through a future sea of crap to get to anything worth watching now that anyone and his brother ( or brothers in the case of the du ##pl ##ass [SEP]


INFO:tensorflow:input_ids: 101 2061 2059 1012 1012 1012 2023 2003 2054 5235 2004 2152 2396 2005 1996 7777 1997 1055 2595 26760 2143 2782 1998 20140 1010 15501 1029 2092 1010 1045 6814 1045 2064 14396 2004 2146 2004 2466 1010 5896 1010 13764 8649 1010 3772 1006 3828 2005 5796 1012 2004 20042 2239 1007 1010 16434 1998 9260 2024 3294 22537 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 3342 4129 2060 2143 1011 2437 2814 2070 2086 3283 2008 1996 5221 3291 2007 3617 2678 2001 2008 2057 2020 2085 2183 2000 2031 2000 10653 2083 1037 2925 2712 1997 10231 2000 2131 2000 2505 4276 3666 2085 2008 3087 1998 2010 2567 1006 2030 3428 1999 1996 2553 1997 1996 4241 24759 12054 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


# Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e., classifying whether a movie review is positive or negative). Because the module is loaded with `trainable=True`, BERT's own weights are updated during training as well. This strategy of adapting a pretrained model to a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [20]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
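
To make the new layer less abstract, here is a minimal pure-Python sketch of what it computes for a single example: a linear layer on the pooled vector, a log-softmax, and the cross-entropy loss. The names `log_softmax` and `classify` and all the numbers are made up for illustration; this is not the TensorFlow code above, and it leaves out dropout and batching.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def classify(pooled, weights, bias, label=None):
    """Linear layer + log-softmax on one pooled vector.

    `weights` has shape [num_labels][hidden_size], like `output_weights`.
    """
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    if label is None:
        return predicted, log_probs
    # With a one-hot label, cross-entropy reduces to -log_probs[label].
    loss = -log_probs[label]
    return loss, predicted, log_probs

# Toy values (hypothetical): hidden_size=3, num_labels=2.
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classify(pooled, weights, bias, label=1)
print(predicted, loss)
```

The loss is small when the log-probability of the true label is close to zero, which is what training pushes toward.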


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [21]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          # Note: these are log-probabilities (the log_softmax output).
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
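
For intuition about the evaluation metrics, here is a small sketch that computes accuracy, precision, recall, and F1 from the four confusion-matrix counts using their textbook definitions. The helper `binary_metrics` and the toy labels are hypothetical, and the streaming TensorFlow metrics used above may differ in detail (for example, in how F1 is thresholded).

```python
def binary_metrics(labels, predictions):
    """Textbook binary classification metrics from 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy, "precision": precision, "recall": recall,
        "f1": f1, "true_positives": tp, "true_negatives": tn,
        "false_positives": fp, "false_negatives": fn,
    }

# Toy data (hypothetical), 1 = positive review, 0 = negative review.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = binary_metrics(toy_labels, toy_predictions)
print(metrics)
```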


In [22]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 1.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 1
SAVE_SUMMARY_STEPS = 1

In [23]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
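
As a worked example of the arithmetic, the sketch below plugs in 5000 training examples (an assumption, based on the "Writing example 0 of 5000" log line) and then traces a linear-warmup, linear-decay learning rate over those steps. The function `learning_rate_at` is a hypothetical stand-in for the kind of schedule BERT's optimizer is understood to use, not the library's actual code.

```python
def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup from 0 to init_lr, then linear decay to 0."""
    if num_warmup_steps and step < num_warmup_steps:
        return init_lr * step / num_warmup_steps
    remaining = max(num_train_steps - step, 0)
    return init_lr * remaining / num_train_steps

num_examples = 5000       # assumption: size of the training set
batch_size = 32
num_train_epochs = 1.0
warmup_proportion = 0.1

num_train_steps = int(num_examples / batch_size * num_train_epochs)
num_warmup_steps = int(num_train_steps * warmup_proportion)
print(num_train_steps, num_warmup_steps)

for step in (0, 5, num_warmup_steps, 100, num_train_steps):
    print(step, learning_rate_at(step, 2e-5, num_train_steps, num_warmup_steps))
```

With these numbers, one epoch is 156 steps and the first 15 of them are warmup.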

In [24]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [25]:
model_fn = model_fn_builder(
    num_labels=len(label_list),
    learning_rate=LEARNING_RATE,
    num_train_steps=num_train_steps,
    num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    config=run_config,
    params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': './output', '_tf_random_seed': None, '_save_summary_steps': 1, '_save_checkpoints_steps': 1, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fef748916d8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function (`input_fn`), which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [26]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
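
The builder pattern itself is easy to see without TensorFlow. Here is a toy sketch, assuming a plain Python list of features: the builder captures its arguments and returns a closure that reads the batch size from `params` and yields batches. `toy_input_fn_builder` is a hypothetical stand-in; the real `input_fn_builder` is understood to return a function that builds a `tf.data.Dataset` rather than a Python generator.

```python
import random

def toy_input_fn_builder(features, is_training, drop_remainder, seed=0):
    """Returns an input_fn closure that yields batches of `features`."""
    def input_fn(params):
        batch_size = params["batch_size"]
        examples = list(features)
        if is_training:
            # Shuffle only for training, so evaluation order stays stable.
            random.Random(seed).shuffle(examples)
        for start in range(0, len(examples), batch_size):
            batch = examples[start:start + batch_size]
            if drop_remainder and len(batch) < batch_size:
                break  # skip the final, smaller batch
            yield batch
    return input_fn

toy_features = list(range(10))
eval_fn = toy_input_fn_builder(toy_features, is_training=False, drop_remainder=False)
eval_batches = list(eval_fn({"batch_size": 4}))
print(eval_batches)

fixed_fn = toy_input_fn_builder(toy_features, is_training=False, drop_remainder=True)
fixed_batches = list(fixed_fn({"batch_size": 4}))
print(fixed_batches)
```

Dropping the remainder keeps every batch the same shape, which is why it matters on TPUs; on CPU or GPU we can keep the last partial batch.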

Now we train our model! For me, using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

In [1]:
from datetime import datetime

print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!

Now let's use our test data to see how well our model did:

In [None]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [None]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

Now let's write code to make predictions on new sentences:

In [None]:
def getPrediction(in_sentences):
    labels = ["Negative", "Positive"]
    # guid="" and label=0 are dummy values; they are not used for prediction.
    input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
    input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
    predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
    predictions = estimator.predict(predict_input_fn)
    # 'probabilities' holds the log_softmax output, i.e. log-probabilities.
    return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [None]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [None]:
predictions = getPrediction(pred_sentences)

Voila! We have a sentiment classifier!

In [None]:
predictions
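
One thing to keep in mind when reading these results: the `'probabilities'` entry comes from `log_softmax`, so the values are log-probabilities (negative numbers). Here is a small sketch for turning one prediction into something more readable. The helper `to_readable` and the numbers are hypothetical, not real model output.

```python
import math

LABELS = ["Negative", "Positive"]

def to_readable(sentence, log_probs, label_id):
    """Converts log-probabilities to probabilities and names the label."""
    probs = [math.exp(lp) for lp in log_probs]
    return {
        "sentence": sentence,
        "label": LABELS[label_id],
        "confidence": probs[label_id],
    }

# Hypothetical log-probabilities for illustration only.
example = to_readable(
    "Absolutely fantastic!", [math.log(0.02), math.log(0.98)], 1)
print(example)
```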