
In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [0]:
#import tensorflow.compat.v1 as tf
#tf.disable_v2_behavior()

%tensorflow_version 1.x

TensorFlow 1.x selected.


In [0]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the libraries we imported above, we'll need to install BERT's Python package.

In [0]:
!pip install bert-tensorflow



In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if there are any).

In [0]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = True #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'mybuckettest001' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: gs://mybuckettest001/OUTPUT_DIR_NAME *****
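
For a purely local run (USE_BUCKET set to False), the delete-then-recreate behavior controlled by DO_DELETE can be sketched with the standard library alone. This is only an illustration, not the notebook's code: `reset_output_dir` is a hypothetical helper, and it uses `shutil` and `os` instead of `tf.gfile`, so it does not handle `gs://` paths.

```python
import os
import shutil
import tempfile


def reset_output_dir(output_dir, do_delete=True):
    """Hypothetical helper: optionally wipe a local directory, then ensure it exists."""
    if do_delete:
        # ignore_errors=True mirrors "doesn't matter if the directory didn't exist".
        shutil.rmtree(output_dir, ignore_errors=True)
    os.makedirs(output_dir, exist_ok=True)
    return output_dir


# Usage with a throwaway location so nothing real is deleted.
base = tempfile.mkdtemp()
demo_dir = os.path.join(base, "OUTPUT_DIR_NAME")
os.makedirs(demo_dir)
with open(os.path.join(demo_dir, "stale_checkpoint.txt"), "w") as f:
    f.write("old")

reset_output_dir(demo_dir, do_delete=True)
remaining_files = os.listdir(demo_dir)
print(remaining_files)
```

With `do_delete=False`, existing files are left in place, which corresponds to the case where checkpoints in the directory get reused.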


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so that \d and \. are passed to the regex engine unchanged.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
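
The `sentiment` column comes straight from the file names: review files in the extracted archive are named `<id>_<rating>.txt`, and the regular expression in `load_directory_data` captures the second number. A minimal sketch of that step on its own; `rating_from_filename` and the file name used below are made up for illustration.

```python
import re

# Same pattern as in load_directory_data, written as a raw string.
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")


def rating_from_filename(file_name):
    """Return the rating part of '<id>_<rating>.txt' as a string, or None if no match."""
    match = FILENAME_PATTERN.match(file_name)
    return match.group(1) if match else None


rating = rating_from_filename("123_8.txt")  # hypothetical file name
print(rating)
```

Note that the captured value is a string, so the `sentiment` column holds strings unless it is converted explicitly. The model below uses the integer `polarity` column instead.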


In [0]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll take a random sample of 5000 examples from each of the train and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [0]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 and 1 for negative and positive, respectively).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of possible labels, e.g. [True, False], [0, 1] or ['dog', 'cat']
label_list = [0, 1]
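
The order of `label_list` matters because, as far as we can tell from BERT's `run_classifier` code, each label is assigned an integer id by its position in the list. The sketch below shows that mapping in isolation; `build_label_map` is a hypothetical name, not a function from the library.

```python
def build_label_map(label_list):
    """Map each label to its index in label_list."""
    return {label: index for index, label in enumerate(label_list)}


label_map = build_label_map([0, 1])
animal_map = build_label_map(["dog", "cat"])
print(label_map)
print(animal_map)
```

For our 0/1 labels the id happens to equal the label itself, but with string labels such as 'dog' and 'cat' the ids would be 0 and 1 in list order.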

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column of our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 (negative) or 1 (positive).

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
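
To make steps 4 to 6 a little more concrete, here is a toy sketch of turning a token list into fixed-length `input_ids`, `input_mask`, and `segment_ids`. The vocabulary and its ids are invented for illustration (real ids come from BERT's vocab file), WordPiece splitting is skipped, and `build_features` is a hypothetical stand-in rather than the library function.

```python
# Toy vocabulary; the ids are made up and do not match BERT's vocab file.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "sally": 4, "says": 5, "hi": 6}


def build_features(tokens, max_seq_length, vocab=TOY_VOCAB):
    """Hypothetical single-sentence version of BERT-style feature building."""
    # Leave room for [CLS] and [SEP], truncating long inputs.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [vocab.get(token, vocab["[UNK]"]) for token in tokens]
    input_mask = [1] * len(input_ids)    # 1 marks a real token
    segment_ids = [0] * len(input_ids)   # single sentence: everything is segment 0

    # Pad all three lists with zeros up to max_seq_length.
    padding = [0] * (max_seq_length - len(input_ids))
    input_ids += padding
    input_mask += padding
    segment_ids += padding
    return input_ids, input_mask, segment_ids


text = "Sally says hi"
tokens = text.lower().split()            # steps 1 and 2, whitespace only
input_ids, input_mask, segment_ids = build_features(tokens, max_seq_length=8)
print(input_ids)
print(input_mask)
print(segment_ids)
```

Zeros in `input_mask` only appear for inputs shorter than the maximum sequence length; longer inputs are cut off so that `[SEP]` is always the last real token.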




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [0]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what's stored in `tokenization_info["do_lower_case"]`), and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [0]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']
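
The `##` pieces come from WordPiece's greedy longest-match-first lookup: for each word, take the longest vocabulary entry that matches from the current position, then continue with the remainder, marking continuation pieces with `##`. Below is a simplified sketch with an invented vocabulary. The real tokenizer also handles punctuation, casing, and a per-word length limit, none of which is modeled here.

```python
def wordpiece(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into word pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark continuation pieces
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            return [unk_token]  # no piece matched at this position
        pieces.append(current)
        start = end
    return pieces


toy_vocab = {"token", "##izer", "call", "##ing", "bert"}  # invented for the example
tokenizer_pieces = wordpiece("tokenizer", toy_vocab)
calling_pieces = wordpiece("calling", toy_vocab)
unknown_pieces = wordpiece("xyz", toy_vocab)
print(tokenizer_pieces, calling_pieces, unknown_pieces)
```

Because the match is greedy, a word that is itself in the vocabulary (like "bert" here) comes back as a single piece.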

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [0]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] the quote i used for my summary occurs about halfway through the good earth , as a captain of a chinese revolutionary army ( played by philip ah ##n ) apologize ##s to a mob for not having time to shoot more of the lo ##ote ##rs among them , as his unit has just been called back to the front lines . of course , the next lo ##ote ##r about to be found out and shot is the main character of the film , the former kitchen slave girl o - lan ( for whose portrayal luis ##e rainer , now 99 - years - old , won her second consecutive best actress oscar ) . < br / > < br / > [SEP]


INFO:tensorflow:input_ids: 101 1996 14686 1045 2109 2005 2026 12654 5158 2055 8576 2083 1996 2204 3011 1010 2004 1037 2952 1997 1037 2822 6208 2390 1006 2209 2011 5170 6289 2078 1007 12134 2015 2000 1037 11240 2005 2025 2383 2051 2000 5607 2062 1997 1996 8840 12184 2869 2426 2068 1010 2004 2010 3131 2038 2074 2042 2170 2067 2000 1996 2392 3210 1012 1997 2607 1010 1996 2279 8840 12184 2099 2055 2000 2022 2179 2041 1998 2915 2003 1996 2364 2839 1997 1996 2143 1010 1996 2280 3829 6658 2611 1051 1011 17595 1006 2005 3005 13954 6446 2063 28035 1010 2085 5585 1011 2086 1011 2214 1010 2180 2014 2117 5486 2190 3883 7436 1007 1012 1026 7987 1013 1028 1026 7987 1013 1028 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] franco rossi ' s 1985 six - hour italian mini - series of quo va ##dis is a very curious beast , creating an absolutely convincing ancient roman world shot in matter of fact fashion ( very few long shots , no big city ##sca ##pes ) , but playing the drama down so much in favour of all ##usions to classical literature and history that the story constantly gets lost in the background . < br / > < br / > the shifting structure ( much of episode one is played out via voice over letters ) and lack of narrative urgency makes the full six - hour version simultaneously demanding and und ##eman ##ding , and certainly far too often un ##in ##vo [SEP]


INFO:tensorflow:input_ids: 101 9341 18451 1005 1055 3106 2416 1011 3178 3059 7163 1011 2186 1997 22035 12436 10521 2003 1037 2200 8025 6841 1010 4526 2019 7078 13359 3418 3142 2088 2915 1999 3043 1997 2755 4827 1006 2200 2261 2146 7171 1010 2053 2502 2103 15782 10374 1007 1010 2021 2652 1996 3689 2091 2061 2172 1999 7927 1997 2035 22016 2000 4556 3906 1998 2381 2008 1996 2466 7887 4152 2439 1999 1996 4281 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 9564 3252 1006 2172 1997 2792 2028 2003 2209 2041 3081 2376 2058 4144 1007 1998 3768 1997 7984 19353 3084 1996 2440 2416 1011 3178 2544 7453 9694 1998 6151 16704 4667 1010 1998 5121 2521 2205 2411 4895 2378 6767 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this adaptation of pearl s . buck ' s film is certainly a classic . a true hollywood epic , it has all the things a great hollywood film has : birth , death , happiness , sadness , ex ##hila ##ration , despair , and so on . there is only one thing that ir ##ks me . i know it was a sign of the times , but neither of the two main characters are played by asian actors . paul mu ##ni was a great actor , and he does an ad ##mir ##able job as wang lung , the owner of a farm in china . luis ##e rain ##ier plays ol ##an . she does not even look chinese ! i [SEP]


INFO:tensorflow:input_ids: 101 2023 6789 1997 7247 1055 1012 10131 1005 1055 2143 2003 5121 1037 4438 1012 1037 2995 5365 8680 1010 2009 2038 2035 1996 2477 1037 2307 5365 2143 2038 1024 4182 1010 2331 1010 8404 1010 12039 1010 4654 26415 8156 1010 13905 1010 1998 2061 2006 1012 2045 2003 2069 2028 2518 2008 20868 5705 2033 1012 1045 2113 2009 2001 1037 3696 1997 1996 2335 1010 2021 4445 1997 1996 2048 2364 3494 2024 2209 2011 4004 5889 1012 2703 14163 3490 2001 1037 2307 3364 1010 1998 2002 2515 2019 4748 14503 3085 3105 2004 7418 11192 1010 1996 3954 1997 1037 3888 1999 2859 1012 6446 2063 4542 3771 3248 19330 2319 1012 2016 2515 2025 2130 2298 2822 999 1045 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] some critics found this film bleak , but for me there was enough good humour and optimism to overcome this impression . for example , the quietly positive and st ##oic character of the daughter is the still centre of the film , often counter ##bala ##nc ##ing the unhappy aspects of the setting and plot ##line . < br / > < br / > the film is full of original ideas and characters , and the final outcome is not predictable : i felt it could ' ve gone either way . < br / > < br / > by the way , many reviews i ' ve read mention the effective use of black and white , but the print i saw [SEP]


INFO:tensorflow:input_ids: 101 2070 4401 2179 2023 2143 21657 1010 2021 2005 2033 2045 2001 2438 2204 17211 1998 27451 2000 9462 2023 8605 1012 2005 2742 1010 1996 5168 3893 1998 2358 19419 2839 1997 1996 2684 2003 1996 2145 2803 1997 1996 2143 1010 2411 4675 25060 12273 2075 1996 12511 5919 1997 1996 4292 1998 5436 4179 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 2143 2003 2440 1997 2434 4784 1998 3494 1010 1998 1996 2345 9560 2003 2025 21425 1024 1045 2371 2009 2071 1005 2310 2908 2593 2126 1012 1026 7987 1013 1028 1026 7987 1013 1028 2011 1996 2126 1010 2116 4391 1045 1005 2310 3191 5254 1996 4621 2224 1997 2304 1998 2317 1010 2021 1996 6140 1045 2387 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] my sincere advice to all : don ' t watch the movie . < br / > < br / > don ' t even go near to the theater where this movie is being played ! ! even a glimpse of it is bad for health . serious . no jokes . it ' s 3 . 30 am in the morning . and i returned from this crap ##pies ##t movie on this universe . four hours damn ! ! ! i am proud that i survived after all of it ! if this is called survival . < br / > < br / > i am highly frustrated . annoyed . disappointed . it was sheer waste of time ! money went [SEP]


INFO:tensorflow:input_ids: 101 2026 18006 6040 2000 2035 1024 2123 1005 1056 3422 1996 3185 1012 1026 7987 1013 1028 1026 7987 1013 1028 2123 1005 1056 2130 2175 2379 2000 1996 4258 2073 2023 3185 2003 2108 2209 999 999 2130 1037 12185 1997 2009 2003 2919 2005 2740 1012 3809 1012 2053 13198 1012 2009 1005 1055 1017 1012 2382 2572 1999 1996 2851 1012 1998 1045 2513 2013 2023 10231 13046 2102 3185 2006 2023 5304 1012 2176 2847 4365 999 999 999 1045 2572 7098 2008 1045 5175 2044 2035 1997 2009 999 2065 2023 2003 2170 7691 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 2572 3811 10206 1012 11654 1012 9364 1012 2009 2001 11591 5949 1997 2051 999 2769 2253 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this one is a poor attempt at spinning the old " con ##s turn good " yarn , which we have seen so many times before . it actually reminded me of the american series ' the players ' , although nowhere near as good . omar ep ##ps is totally un ##con ##vin ##cing as the hard man of the bunch , as is rib ##isi , who ' s attempt at being the funny guy gets lost along the way . danes performance was decent though , and you can see from this performance , why she was cast in term ##inator 4 . < br / > < br / > the mod squad is a film which lies in a kind of [SEP]


INFO:tensorflow:input_ids: 101 2023 2028 2003 1037 3532 3535 2012 9419 1996 2214 1000 9530 2015 2735 2204 1000 27158 1010 2029 2057 2031 2464 2061 2116 2335 2077 1012 2009 2941 6966 2033 1997 1996 2137 2186 1005 1996 2867 1005 1010 2348 7880 2379 2004 2204 1012 13192 4958 4523 2003 6135 4895 8663 6371 6129 2004 1996 2524 2158 1997 1996 9129 1010 2004 2003 19395 17417 1010 2040 1005 1055 3535 2012 2108 1996 6057 3124 4152 2439 2247 1996 2126 1012 27476 2836 2001 11519 2295 1010 1998 2017 2064 2156 2013 2023 2836 1010 2339 2016 2001 3459 1999 2744 23207 1018 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 16913 4686 2003 1037 2143 2029 3658 1999 1037 2785 1997 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] and the worst part is that it could have been good . but something horribly wrong . first thing first , they should not have cast ami ##ta ##bh bach ##chan in this film at all . he is too much of an icon to tackle such a delicate and controversial topic let alone the role itself . < br / > < br / > secondly , ram go ##pal var ##ma ought to be ashamed of himself for taking the classic story of lo ##lita and turning it into a pathetic predictable sl ##ut - fest . his lo ##lita is named jia ( played by newcomer jia ##h khan ) and when we meet her , she is devoid of any ink ##ling [SEP]


INFO:tensorflow:input_ids: 101 1998 1996 5409 2112 2003 2008 2009 2071 2031 2042 2204 1012 2021 2242 27762 3308 1012 2034 2518 2034 1010 2027 2323 2025 2031 3459 26445 2696 23706 10384 14856 1999 2023 2143 2012 2035 1012 2002 2003 2205 2172 1997 2019 12696 2000 11147 2107 1037 10059 1998 6801 8476 2292 2894 1996 2535 2993 1012 1026 7987 1013 1028 1026 7987 1013 1028 16378 1010 8223 2175 12952 13075 2863 11276 2000 2022 14984 1997 2370 2005 2635 1996 4438 2466 1997 8840 27606 1998 3810 2009 2046 1037 17203 21425 22889 4904 1011 17037 1012 2010 8840 27606 2003 2315 25871 1006 2209 2011 16866 25871 2232 4967 1007 1998 2043 2057 3113 2014 1010 2016 2003 22808 1997 2151 10710 2989 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i heard what people were saying , but i ignored them . being rushed at blockbuster i grabbed copy of this movie and ran out . < br / > < br / > 45 minutes into i was fighting to stay awake . there is some attempt to keep the film interesting , but it was just bad . a chase of some sort takes place , but it was long and drawn out - the perfect time to make a snack . by the time this movie was over i didn ' t care how ended , i just wanted it to end . walking in and out of my room checking to see if it was over . < br / > < [SEP]


INFO:tensorflow:input_ids: 101 1045 2657 2054 2111 2020 3038 1010 2021 1045 6439 2068 1012 2108 6760 2012 27858 1045 4046 6100 1997 2023 3185 1998 2743 2041 1012 1026 7987 1013 1028 1026 7987 1013 1028 3429 2781 2046 1045 2001 3554 2000 2994 8300 1012 2045 2003 2070 3535 2000 2562 1996 2143 5875 1010 2021 2009 2001 2074 2919 1012 1037 5252 1997 2070 4066 3138 2173 1010 2021 2009 2001 2146 1998 4567 2041 1011 1996 3819 2051 2000 2191 1037 19782 1012 2011 1996 2051 2023 3185 2001 2058 1045 2134 1005 1056 2729 2129 3092 1010 1045 2074 2359 2009 2000 2203 1012 3788 1999 1998 2041 1997 2026 2282 9361 2000 2156 2065 2009 2001 2058 1012 1026 7987 1013 1028 1026 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] as americans , we have come to expect crap ##iness as " par for the course " when it comes to horror and unknown directors directing unknown actors acting for unknown writers . we truly expect this to suck & when they don ' t suck , it becomes an over night success . < br / > < br / > this is not an over night success , nor is it an over the weekend or over the month success . this blows from start to finish and my only recommendation is this : go into this knowing it sucks , enjoy it for what it is , for what it isn ' t & if you have something better to do , keep [SEP]


INFO:tensorflow:input_ids: 101 2004 4841 1010 2057 2031 2272 2000 5987 10231 9961 2004 1000 11968 2005 1996 2607 1000 2043 2009 3310 2000 5469 1998 4242 5501 9855 4242 5889 3772 2005 4242 4898 1012 2057 5621 5987 2023 2000 11891 1004 2043 2027 2123 1005 1056 11891 1010 2009 4150 2019 2058 2305 3112 1012 1026 7987 1013 1028 1026 7987 1013 1028 2023 2003 2025 2019 2058 2305 3112 1010 4496 2003 2009 2019 2058 1996 5353 2030 2058 1996 3204 3112 1012 2023 13783 2013 2707 2000 3926 1998 2026 2069 12832 2003 2023 1024 2175 2046 2023 4209 2009 19237 1010 5959 2009 2005 2054 2009 2003 1010 2005 2054 2009 3475 1005 1056 1004 2065 2017 2031 2242 2488 2000 2079 1010 2562 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i just returned from viewing this academy award - nominated doc , and i was thoroughly touched and interested in exploring the works of this fellow i ' d never heard of before . of course i ' m someone who ' s capt ##ivated with beautiful architecture , so i realize others won ' t care . < br / > < br / > we can only imagine if there had been a couple more vision ##aries in philadelphia back in the late 60 ' s when kahn ' s plans were a possibility , what a wonderful city center there would be . if you wonder whether you ' ll see more about the bangladesh building at the beginning of the movie , [SEP]


INFO:tensorflow:input_ids: 101 1045 2074 2513 2013 10523 2023 2914 2400 1011 4222 9986 1010 1998 1045 2001 12246 5028 1998 4699 1999 11131 1996 2573 1997 2023 3507 1045 1005 1040 2196 2657 1997 2077 1012 1997 2607 1045 1005 1049 2619 2040 1005 1055 14408 21967 2007 3376 4294 1010 2061 1045 5382 2500 2180 1005 1056 2729 1012 1026 7987 1013 1028 1026 7987 1013 1028 2057 2064 2069 5674 2065 2045 2018 2042 1037 3232 2062 4432 12086 1999 4407 2067 1999 1996 2397 3438 1005 1055 2043 19361 1005 1055 3488 2020 1037 6061 1010 2054 1037 6919 2103 2415 2045 2052 2022 1012 2065 2017 4687 3251 2017 1005 2222 2156 2062 2055 1996 7269 2311 2012 1996 2927 1997 1996 3185 1010 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)
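
Each logged example above has the same four parts: the word-piece tokens wrapped in `[CLS]` and `[SEP]`, their vocabulary ids, an input mask, and segment ids. As a rough illustration of how those parts relate, here is a minimal pure-Python sketch. The helper `build_features` and the tiny `toy_vocab` are hypothetical (the ids are copied from the log lines above, and padding id 0 is assumed); the notebook itself relies on `convert_examples_to_features` for this step.

```python
def build_features(tokens, vocab, max_seq_length):
    """Build input_ids, input_mask and segment_ids for a single sentence."""
    # Truncate so there is room for the [CLS] and [SEP] markers.
    tokens = ["[CLS]"] + tokens[:max_seq_length - 2] + ["[SEP]"]
    input_ids = [vocab[t] for t in tokens]
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(input_ids)
    # Single-sentence inputs use segment id 0 everywhere.
    segment_ids = [0] * max_seq_length
    # Pad ids and mask with zeros up to the fixed length.
    padding = [0] * (max_seq_length - len(input_ids))
    return {
        "input_ids": input_ids + padding,
        "input_mask": input_mask + padding,
        "segment_ids": segment_ids,
    }

# Toy vocabulary (hypothetical subset; ids taken from the log output above).
toy_vocab = {"[CLS]": 101, "[SEP]": 102, "this": 2023, "is": 2003,
             "a": 1037, "film": 2143}

features = build_features(["this", "is", "a", "film"], toy_vocab, max_seq_length=8)
truncated = build_features(["this", "is", "a", "film"] * 5, toy_vocab, max_seq_length=8)
```

In the log above every `input_mask` is all ones, which suggests each review was long enough to be truncated rather than padded.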


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` below does just this. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). Because the module is loaded with `trainable=True`, BERT's own weights are updated along with the new layer. This strategy of starting from a pretrained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. As written, it is applied in every
    # mode, including evaluation and prediction.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels.
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
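
To make the arithmetic in `create_model` concrete, here is a minimal pure-Python sketch of what the new layer computes for a single example: logits from a weight matrix and bias, a log-softmax, and the one-hot cross-entropy loss. The helper names (`log_softmax`, `classification_loss`) and the toy numbers are illustrative assumptions, not part of the notebook; TensorFlow performs the same steps on batched tensors, and dropout is left out here.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def classification_loss(pooled, weights, bias, label):
    """Single-example version of the classification head.

    pooled:  hidden vector (stand-in for BERT's pooled_output)
    weights: [num_labels][hidden_size] matrix
    bias:    [num_labels] vector
    label:   integer class id
    """
    # logits = pooled @ weights^T + bias
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    # One-hot labels select a single log-probability, so the loss is
    # the negative log-probability of the true class.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(bias))]
    loss = -sum(o * lp for o, lp in zip(one_hot, log_probs))
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    return loss, predicted, log_probs

# Toy values (hypothetical): hidden_size=3, num_labels=2.
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classification_loss(pooled, weights, bias, label=1)
```

Averaging this per-example loss over a batch gives the `loss` that `create_model` returns.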


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
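
Most of the metrics collected in `metric_fn` can be derived from four confusion counts. The sketch below shows that relationship in plain Python for 0/1 labels. `binary_metrics` is a hypothetical helper, not a TensorFlow API, and the TensorFlow ops may differ in detail (for instance in how they accumulate across batches or choose thresholds), so treat it as an illustration only.

```python
def binary_metrics(labels, predictions):
    """Compute confusion counts and derived metrics for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1_score": f1, "true_positives": tp, "true_negatives": tn,
            "false_positives": fp, "false_negatives": fn}

# Made-up labels and predictions, purely for illustration.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = binary_metrics(labels, predictions)
```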


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
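
To picture what warmup does, here is a rough sketch of a schedule of this kind: the learning rate ramps up linearly during warmup and then decays linearly to zero. This matches my understanding of the schedule that `bert.optimization.create_optimizer` builds, but treat the exact shape as an assumption. `learning_rate_at` is a hypothetical helper and the step counts are made up.

```python
def learning_rate_at(step, peak_lr, num_train_steps, num_warmup_steps):
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if step < num_warmup_steps:
        # Warmup: ramp linearly from 0 toward peak_lr.
        return peak_lr * step / num_warmup_steps
    # Decay: a straight line from peak_lr (at step 0) down to 0 (at the last step).
    remaining = max(num_train_steps - step, 0)
    return peak_lr * remaining / num_train_steps

# Hypothetical run: 100 training steps, 10 of them warmup, peak rate 2e-5.
schedule = [learning_rate_at(s, 2e-5, 100, 10) for s in range(101)]
```

With these numbers the rate starts at 0, climbs for the first 10 steps, and then shrinks steadily until it reaches 0 at step 100.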

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
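
As a worked example of this arithmetic, the snippet below plugs in the hyperparameters from above. The figure of 5000 training examples is an assumption taken from the "Writing example 0 of 5000" log line; substitute `len(train_features)` for your own data.

```python
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
num_examples = 5000  # assumption: read off the feature-conversion log

# 5000 / 32 = 156.25 steps per epoch; over 3 epochs that is 468.75,
# and int() truncates to 468.
num_train_steps = int(num_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
# 10% of 468 is 46.8, which truncates to 46 warmup steps.
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
print(num_train_steps, num_warmup_steps)  # prints: 468 46
```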

In [0]:
# Specify output directory and how often to save checkpoints and summaries
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [0]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://mybuckettest001/OUTPUT_DIR_NAME', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f3db007d470>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function (`input_fn`) that feeds batches to the model. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True only when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
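
The `drop_remainder` flag controls what happens to a final batch that is smaller than the batch size. Here is a small pure-Python sketch of the idea. `batches` is a hypothetical helper, not the notebook's input function, and as far as I understand the real input function also shuffles and repeats the data when `is_training=True`, which this sketch leaves out.

```python
def batches(items, batch_size, drop_remainder):
    """Yield consecutive batches; optionally drop a final short batch."""
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            return  # skip the incomplete last batch
        yield batch

examples = list(range(10))
kept = list(batches(examples, batch_size=4, drop_remainder=False))
dropped = list(batches(examples, batch_size=4, drop_remainder=True))
```

With ten examples and a batch size of four, `kept` ends with a short batch of two items, while `dropped` contains only the two full batches.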

Now we train our model! For me, using a Colab notebook running on Google's GPUs, training took about 14 minutes.

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.



INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.



INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into gs://mybuckettest001/OUTPUT_DIR_NAME/model.ckpt.


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values.
  input_examples = [
      run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0)
      for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(
      input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(
      features=input_features, seq_length=MAX_SEQ_LENGTH,
      is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # 'probabilities' holds log-probabilities, as returned by create_model.
  return [(sentence, prediction['probabilities'], labels[prediction['labels']])
          for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

Voila! We have a sentiment classifier!

In [0]:
predictions
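
One thing to keep in mind when reading the output: the values stored under `'probabilities'` are log-probabilities (they come from `tf.nn.log_softmax` in `create_model`), so exponentiating them gives ordinary probabilities. Here is a minimal sketch of that conversion. `summarize` is a hypothetical helper and the example numbers are made up, not real model output.

```python
import math

LABELS = ["Negative", "Positive"]

def summarize(sentence, log_probs):
    """Turn a sentence and its log-probabilities into a readable summary."""
    probs = [math.exp(lp) for lp in log_probs]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"sentence": sentence, "label": LABELS[best],
            "confidence": probs[best]}

# Hypothetical log-probabilities for one sentence (not real model output).
example = summarize("Absolutely fantastic!", [math.log(0.1), math.log(0.9)])
```

Applied to each tuple in `predictions`, something like this would report a label together with a confidence between 0 and 1.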