In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [1]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [2]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [3]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set USE_BUCKET, put a directory name in OUTPUT_DIR, and put the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load any existing model checkpoints from that directory.

In [8]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'imdb_movie_reviews'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'yk-first-08-2018' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: gs://yk-first-08-2018/imdb_movie_reviews *****
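
To make the DO_DELETE behavior concrete, here is a minimal local-filesystem sketch of the same delete-or-reuse logic. It uses only the Python standard library instead of `tf.gfile`, and `prepare_output_dir` and the `model.ckpt` file name are hypothetical names chosen for illustration, not part of the notebook.

```python
import os
import shutil
import tempfile


def prepare_output_dir(path, do_delete=False):
    """Create `path`, optionally wiping any existing contents first."""
    if do_delete:
        # ignore_errors=True: it doesn't matter if the directory didn't exist
        shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path, exist_ok=True)
    return path


# Usage: a (fake) checkpoint file survives unless do_delete is set.
base = tempfile.mkdtemp()
out_dir = prepare_output_dir(os.path.join(base, "imdb_movie_reviews"))
open(os.path.join(out_dir, "model.ckpt"), "w").close()

prepare_output_dir(out_dir, do_delete=False)
kept = os.listdir(out_dir)   # the checkpoint is still there

prepare_output_dir(out_dir, do_delete=True)
wiped = os.listdir(out_dir)  # the directory was recreated empty
```

With `do_delete=False`, whatever is already in the directory stays put, which is why a later training run would pick up old checkpoints.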


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
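
The loader above relies on each review file being named like `<id>_<rating>.txt`: the regular expression captures the second number, which ends up in the `sentiment` column. Here is a small standalone sketch of that parsing step. `parse_review_filename` is a hypothetical helper, and unlike the loader it returns integers and returns `None` for names that don't match, rather than raising.

```python
import re

# Same pattern as the loader, with the id captured as well.
FILENAME_PATTERN = re.compile(r"(\d+)_(\d+)\.txt")


def parse_review_filename(file_name):
    """Return (review_id, rating) for names like '123_8.txt', else None."""
    match = FILENAME_PATTERN.fullmatch(file_name)
    if match is None:
        return None
    review_id, rating = match.groups()
    return int(review_id), int(rating)


parsed = parse_review_filename("123_8.txt")   # made-up file name
skipped = parse_review_filename("README")     # not a review file
```

In the loader itself, a non-matching file name would make `re.match` return `None`, and the `.group(1)` call would then fail, so this assumes the `pos` and `neg` directories contain only review files.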


In [10]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll take a random sample of 5,000 examples from each of the train and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [12]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative, 1 for positive).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]
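
The order of `label_list` matters because each label is later turned into an integer id by its position in the list (the preprocessing logs print this as `label: 1 (id = 1)`). A minimal sketch of that mapping, assuming a plain position-based lookup; the string labels are made up for illustration:

```python
# Position in the list becomes the integer id the model is trained on.
label_list = [0, 1]
label_map = {label: index for index, label in enumerate(label_list)}

# The same idea works for non-numeric labels.
string_labels = ["negative", "positive"]
string_label_map = {label: index for index, label in enumerate(string_labels)}
```

For our 0/1 polarity labels, the id happens to equal the label itself.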

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column of our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, which in this case is the `polarity` value (0 or 1).
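
To see the shape of the data without installing anything, here is a rough stand-in for the fields listed above. `SketchInputExample` is a hypothetical dataclass, not the BERT library's own class, and the two rows are made-up reviews.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchInputExample:
    """Hypothetical stand-in mirroring the fields described above."""
    guid: Optional[str]
    text_a: str
    text_b: Optional[str] = None  # unused for single-sentence classification
    label: Optional[int] = None


# Made-up rows shaped like our DataFrame ('sentence' and 'polarity' columns).
rows = [
    {"sentence": "A delightful film.", "polarity": 1},
    {"sentence": "Two hours I will never get back.", "polarity": 0},
]

examples = [
    SketchInputExample(guid=None, text_a=row["sentence"], label=row["polarity"])
    for row in rows
]
```

Each row becomes one example, with `text_b` left as `None` because we're classifying single texts rather than sentence pairs.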

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
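
If you're curious what steps 1 through 5 look like in miniature, here is a toy sketch. It assumes a greedy longest-match-first WordPiece split and a tiny made-up vocabulary (`TOY_VOCAB`); the real tokenizer also handles punctuation, accents, and a vocabulary of roughly 30,000 entries, so treat this as an illustration rather than BERT's actual implementation.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into WordPieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word continuation
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def encode(text, vocab):
    """Lowercase, split on whitespace, WordPiece, add [CLS]/[SEP], map to ids."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    ids = [vocab[token] for token in tokens]
    return tokens, ids


# Made-up vocabulary: the index of each entry is its id.
TOY_VOCAB = {token: index for index, token in enumerate(
    ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "sally", "says", "hi", "call", "##ing"])}

tokens, ids = encode("Sally says calling", TOY_VOCAB)
```

Here "calling" isn't in the toy vocabulary as a whole word, so it is split into "call" and "##ing", just like the example in step 3.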




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [15]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what's stored in tokenization_info["do_lower_case"]), and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [16]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
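
As a rough picture of what each resulting feature contains for a single-sentence task, the sketch below truncates a token list, adds the special tokens, and pads everything to a fixed length. `build_features` and `TOY_VOCAB` are hypothetical, and the sketch assumes the padding id is 0; it is not the library's own code.

```python
def build_features(tokens, vocab, max_seq_length):
    """Return (input_ids, input_mask, segment_ids), each max_seq_length long."""
    # Reserve two positions for [CLS] and [SEP], truncating if needed.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [vocab.get(token, vocab["[UNK]"]) for token in tokens]
    input_mask = [1] * len(input_ids)   # 1 = real token, 0 = padding
    segment_ids = [0] * len(input_ids)  # all 0: there is no second sentence

    # Zero-pad all three lists up to the fixed sequence length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Made-up vocabulary for illustration.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "great": 4, "movie": 5}

input_ids, input_mask, segment_ids = build_features(["great", "movie"], TOY_VOCAB, 6)
long_ids, long_mask, _ = build_features(["great"] * 10, TOY_VOCAB, 6)
```

Short inputs end in zeros in both `input_ids` and `input_mask`, while inputs longer than the limit are cut off and have a mask of all ones.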

In [17]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] * * * * excellent < br / > < br / > * * * good < br / > < br / > * * fair < br / > < br / > * poor < br / > < br / > ` go ahead , make my day ! ' < br / > < br / > the fourth picture in the series is directed by eastwood himself ( who was rumored of directing most of magnum force ) and he brings back the violent society from the first two films . however , the film still lacks impact and bel ##ie ##va ##bility . this film was released in the early ` 80s , the time of regan and [SEP]


INFO:tensorflow:input_ids: 101 1008 1008 1008 1008 6581 1026 7987 1013 1028 1026 7987 1013 1028 1008 1008 1008 2204 1026 7987 1013 1028 1026 7987 1013 1028 1008 1008 4189 1026 7987 1013 1028 1026 7987 1013 1028 1008 3532 1026 7987 1013 1028 1026 7987 1013 1028 1036 2175 3805 1010 2191 2026 2154 999 1005 1026 7987 1013 1028 1026 7987 1013 1028 1996 2959 3861 1999 1996 2186 2003 2856 2011 24201 2370 1006 2040 2001 22710 1997 9855 2087 1997 19691 2486 1007 1998 2002 7545 2067 1996 6355 2554 2013 1996 2034 2048 3152 1012 2174 1010 1996 2143 2145 14087 4254 1998 19337 2666 3567 8553 1012 2023 2143 2001 2207 1999 1996 2220 1036 16002 1010 1996 2051 1997 16964 1998 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i finally got myself set up on mail order dvd rental so i could find movies not available to me in the stores . i chose the soul ##er opposite because i love christopher mel ##oni , and also like small , often ignored films . < br / > < br / > this one is such a treat ! mel ##oni has such charm in this part . it ' s easy to pigeon hole him is you only ever see him as his alter ego elliot stable ##r ( los ##vu ) . in this film , mel ##oni is an out of step una ##tta ##ched mid - life ##r who is hitting the ski ##ds in many ways , only to [SEP]


INFO:tensorflow:input_ids: 101 1045 2633 2288 2870 2275 2039 2006 5653 2344 4966 12635 2061 1045 2071 2424 5691 2025 2800 2000 2033 1999 1996 5324 1012 1045 4900 1996 3969 2121 4500 2138 1045 2293 5696 11463 10698 1010 1998 2036 2066 2235 1010 2411 6439 3152 1012 1026 7987 1013 1028 1026 7987 1013 1028 2023 2028 2003 2107 1037 7438 999 11463 10698 2038 2107 11084 1999 2023 2112 1012 2009 1005 1055 3733 2000 16516 4920 2032 2003 2017 2069 2412 2156 2032 2004 2010 11477 13059 11759 6540 2099 1006 3050 19722 1007 1012 1999 2023 2143 1010 11463 10698 2003 2019 2041 1997 3357 14477 5946 7690 3054 1011 2166 2099 2040 2003 7294 1996 8301 5104 1999 2116 3971 1010 2069 2000 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i ' m fan of art , i like anything about art , i like paintings , sculptures , etc . this movie shows it , so i like it a lot , it shows how a woman wants to paint anything about art , especially naked bodies , but she can ' t do it because of her strict family ( father ) , at the beginning of the movie she painted herself naked , but she wanted a man for her paintings , but her family didn ' t let her paint naked men because it ' s against the moral . even so artemis ##ia could paint her boyfriend and her art teacher completely naked . she falls in love with her art [SEP]


INFO:tensorflow:input_ids: 101 1045 1005 1049 5470 1997 2396 1010 1045 2066 2505 2055 2396 1010 1045 2066 5265 1010 10801 1010 4385 1012 2023 3185 3065 2009 1010 2061 1045 2066 2009 1037 2843 1010 2009 3065 2129 1037 2450 4122 2000 6773 2505 2055 2396 1010 2926 6248 4230 1010 2021 2016 2064 1005 1056 2079 2009 2138 1997 2014 9384 2155 1006 2269 1007 1010 2012 1996 2927 1997 1996 3185 2016 4993 2841 6248 1010 2021 2016 2359 1037 2158 2005 2014 5265 1010 2021 2014 2155 2134 1005 1056 2292 2014 6773 6248 2273 2138 2009 1005 1055 2114 1996 7191 1012 2130 2061 19063 2401 2071 6773 2014 6898 1998 2014 2396 3836 3294 6248 1012 2016 4212 1999 2293 2007 2014 2396 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is a very strange hk film in many ways . first , many of the action sequences really aren ' t that much fun . the very first gun battle the occurs in the film was just silly . not cool silly , or even funny silly , but just silly . that ' s not to say there aren ' t some great action scenes , but most simply don ' t come up to the level of some of the other films i have seen . the opposite side is that this film actually has characters , not just people . all of the main characters are interesting ( except for the head bad guy , who is flat as a bill ##iard [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 1037 2200 4326 22563 2143 1999 2116 3971 1012 2034 1010 2116 1997 1996 2895 10071 2428 4995 1005 1056 2008 2172 4569 1012 1996 2200 2034 3282 2645 1996 5158 1999 1996 2143 2001 2074 10021 1012 2025 4658 10021 1010 2030 2130 6057 10021 1010 2021 2074 10021 1012 2008 1005 1055 2025 2000 2360 2045 4995 1005 1056 2070 2307 2895 5019 1010 2021 2087 3432 2123 1005 1056 2272 2039 2000 1996 2504 1997 2070 1997 1996 2060 3152 1045 2031 2464 1012 1996 4500 2217 2003 2008 2023 2143 2941 2038 3494 1010 2025 2074 2111 1012 2035 1997 1996 2364 3494 2024 5875 1006 3272 2005 1996 2132 2919 3124 1010 2040 2003 4257 2004 1037 3021 14619 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] come on people . this movie is better than 4 . i can see this happening . . . wealthy people have done cr ##azi ##er things than this . and it was funny . < br / > < br / > i watch a comedy to be entertained , escape from the pressures of the world for a short while , and not to have to take anything too seriously . this movie fully suits that purpose . i judge a movie on its own merits and am not about to compare surviving christmas to blazing saddle ##s . i watched totally dysfunction ##al people grow into caring , li ##ka ##ble individuals who could easily live down the street from my home . [SEP]


INFO:tensorflow:input_ids: 101 2272 2006 2111 1012 2023 3185 2003 2488 2084 1018 1012 1045 2064 2156 2023 6230 1012 1012 1012 7272 2111 2031 2589 13675 16103 2121 2477 2084 2023 1012 1998 2009 2001 6057 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 3422 1037 4038 2000 2022 21474 1010 4019 2013 1996 15399 1997 1996 2088 2005 1037 2460 2096 1010 1998 2025 2000 2031 2000 2202 2505 2205 5667 1012 2023 3185 3929 11072 2008 3800 1012 1045 3648 1037 3185 2006 2049 2219 22617 1998 2572 2025 2055 2000 12826 6405 4234 2000 17162 12279 2015 1012 1045 3427 6135 28466 2389 2111 4982 2046 11922 1010 5622 2912 3468 3633 2040 2071 4089 2444 2091 1996 2395 2013 2026 2188 1012 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] ' bluff ' has been showing for a good few months at movie theatres in bogota , and today i finally got round to seeing it . i didn ' t really know what to expect at all , but was very happily surprised . it is a crime comedy of the same il ##k as snatch etc , and it manages to nicely balance elements of suspense with comedy . the style of the film is established early on with cheerful music and the argentine ##an narrator who makes aside ##s direct to camera - - odd the first time , but subsequently fitting naturally . < br / > < br / > with my less - than - perfect spanish , i still [SEP]


INFO:tensorflow:input_ids: 101 1005 14441 1005 2038 2042 4760 2005 1037 2204 2261 2706 2012 3185 13166 1999 21240 1010 1998 2651 1045 2633 2288 2461 2000 3773 2009 1012 1045 2134 1005 1056 2428 2113 2054 2000 5987 2012 2035 1010 2021 2001 2200 11361 4527 1012 2009 2003 1037 4126 4038 1997 1996 2168 6335 2243 2004 23365 4385 1010 1998 2009 9020 2000 19957 5703 3787 1997 23873 2007 4038 1012 1996 2806 1997 1996 2143 2003 2511 2220 2006 2007 18350 2189 1998 1996 8511 2319 11185 2040 3084 4998 2015 3622 2000 4950 1011 1011 5976 1996 2034 2051 1010 2021 3525 11414 8100 1012 1026 7987 1013 1028 1026 7987 1013 1028 2007 2026 2625 1011 2084 1011 3819 3009 1010 1045 2145 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] so , this starts with at least an interesting and promising basic idea , goes on and on with tension , carey in a good un ##typical role but in a less than you expected performance , weak direction from joel schumacher match with some plot holes , the " detective scenes " show us the luck of creativity . if you don ' t have great expectations ( because of the negative reviews ) maybe you will enjoy this . at the end they offer to us a lesson about morality ( for those who remember " falling down " ) and the " family joy and cure " that ruins every possibility to be kind and find the film watch ##able p . s [SEP]


INFO:tensorflow:input_ids: 101 2061 1010 2023 4627 2007 2012 2560 2019 5875 1998 10015 3937 2801 1010 3632 2006 1998 2006 2007 6980 1010 11782 1999 1037 2204 4895 27086 2535 2021 1999 1037 2625 2084 2017 3517 2836 1010 5410 3257 2013 8963 22253 2674 2007 2070 5436 8198 1010 1996 1000 6317 5019 1000 2265 2149 1996 6735 1997 14842 1012 2065 2017 2123 1005 1056 2031 2307 10908 1006 2138 1997 1996 4997 4391 1007 2672 2017 2097 5959 2023 1012 2012 1996 2203 2027 3749 2000 2149 1037 10800 2055 16561 1006 2005 2216 2040 3342 1000 4634 2091 1000 1007 1998 1996 1000 2155 6569 1998 9526 1000 2008 8435 2296 6061 2000 2022 2785 1998 2424 1996 2143 3422 3085 1052 1012 1055 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] ' ' meet she ##rri . . for an evening of pleasure and terror ! ' ' cheap special effects , che ##es ##y lines , yep its the original 1978 movie ' ' nurse she ##rri ' ' starting geoffrey land as peter desmond , and jill jacobs ##on as she ##rri martin and directed by al adams ##on . < br / > < br / > the movie is about an evil ancient spirit that possesses a nurse at a hospital , then she starts killing doctors one by one . the acting was okay but some of the acting was robotic . the storyline was good but the sex scenes were just thrown in there probably to get more views . the [SEP]


INFO:tensorflow:input_ids: 101 1005 1005 3113 2016 18752 1012 1012 2005 2019 3944 1997 5165 1998 7404 999 1005 1005 10036 2569 3896 1010 18178 2229 2100 3210 1010 15624 2049 1996 2434 3301 3185 1005 1005 6821 2016 18752 1005 1005 3225 11023 2455 2004 2848 16192 1010 1998 10454 12988 2239 2004 2016 18752 3235 1998 2856 2011 2632 5922 2239 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 3185 2003 2055 2019 4763 3418 4382 2008 14882 1037 6821 2012 1037 2902 1010 2059 2016 4627 4288 7435 2028 2011 2028 1012 1996 3772 2001 3100 2021 2070 1997 1996 3772 2001 20478 1012 1996 9994 2001 2204 2021 1996 3348 5019 2020 2074 6908 1999 2045 2763 2000 2131 2062 5328 1012 1996 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i was 12 when this film was released and adored it . the song ' s were inspiring and it made me feel good , watching it several time ' s at the cinema . i actually had the soundtrack album and played the song ' s over and over . < br / > < br / > 26 years later . . . i ' m ashamed . just sat and watched it with my 2 daughters who enjoyed it lot ' s but my cynical older grown up eyes hated it . it ' s very poorly directed in many places and considering it was lionel jeff ##ries directing i really wanted to enjoy it . the character animation was so rough yet [SEP]


INFO:tensorflow:input_ids: 101 1045 2001 2260 2043 2023 2143 2001 2207 1998 28456 2009 1012 1996 2299 1005 1055 2020 18988 1998 2009 2081 2033 2514 2204 1010 3666 2009 2195 2051 1005 1055 2012 1996 5988 1012 1045 2941 2018 1996 6050 2201 1998 2209 1996 2299 1005 1055 2058 1998 2058 1012 1026 7987 1013 1028 1026 7987 1013 1028 2656 2086 2101 1012 1012 1012 1045 1005 1049 14984 1012 2074 2938 1998 3427 2009 2007 2026 1016 5727 2040 5632 2009 2843 1005 1055 2021 2026 26881 3080 4961 2039 2159 6283 2009 1012 2009 1005 1055 2200 9996 2856 1999 2116 3182 1998 6195 2009 2001 14377 5076 5134 9855 1045 2428 2359 2000 5959 2009 1012 1996 2839 7284 2001 2061 5931 2664 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i think the problem with this show not getting the respect it truly deserves is that it comes after se ##in ##feld , after el ##r and after friends . those three sitcom ##s were the star shows of their time . < br / > < br / > ko ##q ##s came at the end of this special time in tv . < br / > < br / > but don ' t let that di ##ss ##ua ##de you . < br / > < br / > king of queens is as good if not better than two of the three mentioned . < br / > < br / > se ##in ##feld started it all and was and is [SEP]


INFO:tensorflow:input_ids: 101 1045 2228 1996 3291 2007 2023 2265 2025 2893 1996 4847 2009 5621 17210 2003 2008 2009 3310 2044 7367 2378 8151 1010 2044 3449 2099 1998 2044 2814 1012 2216 2093 13130 2015 2020 1996 2732 3065 1997 2037 2051 1012 1026 7987 1013 1028 1026 7987 1013 1028 12849 4160 2015 2234 2012 1996 2203 1997 2023 2569 2051 1999 2694 1012 1026 7987 1013 1028 1026 7987 1013 1028 2021 2123 1005 1056 2292 2008 4487 4757 6692 3207 2017 1012 1026 7987 1013 1028 1026 7987 1013 1028 2332 1997 8603 2003 2004 2204 2065 2025 2488 2084 2048 1997 1996 2093 3855 1012 1026 7987 1013 1028 1026 7987 1013 1028 7367 2378 8151 2318 2009 2035 1998 2001 1998 2003 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). Because the module is loaded with `trainable=True`, BERT's own weights are updated during training too. This strategy of starting from an already pre-trained model and adapting it to a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    # Project the pooled output to one logit per class, then take a log-softmax.
    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
    print('one_hot_labels : ', one_hot_labels)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
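
To make the loss in `create_model` concrete, here is a minimal pure-Python sketch of what the last few lines compute for a single example: a log-softmax over the logits, a one-hot label, and the negative sum of their product (the cross-entropy). The helper names `log_softmax` and `cross_entropy` are illustrative only and are not part of BERT or TensorFlow, and the logits are made up.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def cross_entropy(logits, label, num_labels):
    """Mirrors -sum(one_hot_labels * log_probs) for one example."""
    log_probs = log_softmax(logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(num_labels)]
    return -sum(o * lp for o, lp in zip(one_hot, log_probs))

# Hypothetical logits for one review: [negative, positive]
logits = [0.0, 0.0]
loss = cross_entropy(logits, label=1, num_labels=2)
print(loss)  # log(2), roughly 0.693, for an undecided model
```

With equal logits the loss is ln 2 ≈ 0.693, which is close to the loss of about 0.6985 logged at step 0 of the training run further down, as you'd expect from a freshly initialized two-class layer.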


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          # Note: these are log-probabilities (the log_softmax output).
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
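
As a quick check on the arithmetic, the sketch below plugs in 5,000 training examples (an assumption, though it is consistent with the final checkpoint at step 468 in the training log) and also illustrates the learning-rate schedule. The `learning_rate_at` helper is hypothetical: it mimics, in plain Python, the linear warmup followed by linear decay to zero that I understand `bert.optimization.create_optimizer` to apply, and it is not the library's own code.

```python
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
NUM_TRAIN_EXAMPLES = 5000  # assumed size of train_features

num_train_steps = int(NUM_TRAIN_EXAMPLES / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

def learning_rate_at(step):
    """Hypothetical schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    return LEARNING_RATE * max(0.0, 1.0 - step / num_train_steps)

print(num_train_steps, num_warmup_steps)  # 468 46
for step in (0, 23, 46, 234, 468):
    print(step, learning_rate_at(step))
```

So with these settings the learning rate climbs for the first 46 steps and then shrinks steadily until training stops at step 468.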

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [23]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://yk-first-08-2018/imdb_movie_reviews', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1e95d66048>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
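
The `drop_remainder` flag controls what happens to a final batch that is smaller than the batch size. The sketch below is a simplified stand-in for the batching step only (it ignores the shuffling and repeating that a training input function also does); `batch_examples` is a hypothetical helper, not part of `bert.run_classifier`.

```python
def batch_examples(examples, batch_size, drop_remainder):
    """Yield consecutive batches; optionally drop a short final batch."""
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            return
        yield batch

examples = list(range(5000))  # stand-ins for 5,000 feature records
kept = list(batch_examples(examples, 32, drop_remainder=False))
dropped = list(batch_examples(examples, 32, drop_remainder=True))
print(len(kept), len(kept[-1]))        # 157 8
print(len(dropped), len(dropped[-1]))  # 156 32
```

TPUs need fixed batch shapes, which is why the comment above mentions `drop_remainder = True` for TPUs. On a GPU or CPU it's fine to keep the short batch.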

Now we train our model! Using a Colab notebook running on Google's GPUs, my training time was about 14 minutes (the run logged below finished in just under 6 minutes).

In [26]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


one_hot_labels :  Tensor("loss/one_hot:0", shape=(?, 2), dtype=float32)


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into gs://yk-first-08-2018/imdb_movie_reviews/model.ckpt.


INFO:tensorflow:loss = 0.6985192, step = 0


INFO:tensorflow:global_step/sec: 1.6166


INFO:tensorflow:loss = 0.5094373, step = 100 (63.625 sec)


INFO:tensorflow:global_step/sec: 2.03212


INFO:tensorflow:loss = 0.03153464, step = 200 (47.444 sec)


INFO:tensorflow:global_step/sec: 2.10663


INFO:tensorflow:loss = 0.3555407, step = 300 (47.472 sec)


INFO:tensorflow:global_step/sec: 2.03836


INFO:tensorflow:loss = 0.17384459, step = 400 (49.056 sec)


INFO:tensorflow:Saving checkpoints for 468 into gs://yk-first-08-2018/imdb_movie_reviews/model.ckpt.


INFO:tensorflow:Loss for final step: 0.015067293.


Training took time  0:05:48.309615


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-12T21:04:20Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://bert-tfhub/aclImdb_v1/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-12-21:06:05
INFO:tensorflow:Saving dict for global step 468: auc = 0.86659324, eval_accuracy = 0.8664, f1_score = 0.8659711, false_negatives = 375.0, false_positives = 293.0, global_step = 468, loss = 0.51870537, precision = 0.880457, recall = 0.8519542, true_negatives = 2174.0, true_positives = 2158.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://bert-tfhub/aclImdb_v1/model.ckpt-468


{'auc': 0.86659324,
 'eval_accuracy': 0.8664,
 'f1_score': 0.8659711,
 'false_negatives': 375.0,
 'false_positives': 293.0,
 'global_step': 468,
 'loss': 0.51870537,
 'precision': 0.880457,
 'recall': 0.8519542,
 'true_negatives': 2174.0,
 'true_positives': 2158.0}
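
The accuracy, precision, recall and F1 values above can be re-derived from the four confusion-matrix counts in the same output. The following is a minimal sketch of that arithmetic; `classification_metrics` is a hypothetical helper written for illustration and is not part of the estimator or the `bert` package.

```python
# Confusion-matrix counts copied from the evaluation output above.
tp, tn, fp, fn = 2158.0, 2174.0, 293.0, 375.0

def classification_metrics(tp, tn, fp, fn):
  """Hypothetical helper: derive accuracy, precision, recall and F1 from raw counts."""
  total = tp + tn + fp + fn
  accuracy = (tp + tn) / total
  precision = tp / (tp + fp)  # share of predicted positives that are correct
  recall = tp / (tp + fn)     # share of actual positives that were found
  f1 = 2 * precision * recall / (precision + recall)
  return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

metrics = classification_metrics(tp, tn, fp, fn)
for name, value in metrics.items():
  print(f"{name}: {value:.4f}")
```

The four counts sum to 5000 test examples, and the derived values agree with the reported `eval_accuracy`, `precision`, `recall` and `f1_score` to four decimal places.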

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  """Return a (sentence, scores, label name) tuple for each input sentence."""
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are placeholders: new sentences have no id or known label.
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  # Tokenize and pad each example the same way as the training data.
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  # estimator.predict yields one result dict per example, in input order.
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
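
The log above shows the three arrays built for each sentence: token ids padded with zeros, a mask marking the real tokens, and segment ids that stay at 0 because there is no second sentence (`text_b=None`). The sketch below reproduces that layout in plain Python. It is an illustration only, not the `bert` package's implementation: `pad_features` is a hypothetical helper, and the short `max_seq_length=10` is chosen just to keep the output readable.

```python
def pad_features(token_ids, max_seq_length):
  """Hypothetical helper: pad token ids and build the matching mask and segment ids."""
  if len(token_ids) > max_seq_length:
    raise ValueError("token_ids is longer than max_seq_length")
  padding = [0] * (max_seq_length - len(token_ids))
  input_ids = token_ids + padding                # real ids, then zero padding
  input_mask = [1] * len(token_ids) + padding    # 1 for real tokens, 0 for padding
  segment_ids = [0] * max_seq_length             # single-sentence input: all zeros
  return input_ids, input_mask, segment_ids

# Ids for "[CLS] that movie was absolutely awful [SEP]", copied from the log above.
token_ids = [101, 2008, 3185, 2001, 7078, 9643, 102]
input_ids, input_mask, segment_ids = pad_features(token_ids, max_seq_length=10)
print(input_ids)
print(input_mask)
print(segment_ids)
```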

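Voilà! We have a sentiment classifier! Each result pairs the sentence with two scores (Negative first, Positive second) and the predicted label. The scores are non-positive and behave like log-probabilities, so the value closer to 0 marks the more likely class.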

In [0]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]
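
The score arrays above are easier to read once exponentiated. The sketch below assumes they are log-probabilities (each pair exponentiates to values that sum to roughly 1, which is consistent with that reading) and that index 0 is the Negative class, matching the `labels` list in `getPrediction`. `to_probabilities` is a hypothetical helper for illustration.

```python
import math

# Scores copied from the predictions output above (assumed order: Negative, Positive).
log_scores = {
  "That movie was absolutely awful": [-4.9142293e-03, -5.3180690e+00],
  "The acting was a bit lacking": [-0.03325794, -3.4200459],
  "The film was creative and surprising": [-5.3589125e+00, -4.7171740e-03],
  "Absolutely fantastic!": [-5.0434084, -0.00647258],
}

def to_probabilities(log_probs):
  """Hypothetical helper: convert log-probabilities to plain probabilities."""
  return [math.exp(value) for value in log_probs]

probabilities = {s: to_probabilities(v) for s, v in log_scores.items()}
for sentence, (negative, positive) in probabilities.items():
  print(f"{sentence!r}: negative={negative:.3f}, positive={positive:.3f}")
```

Read this way, the least confident of the four predictions is "The acting was a bit lacking", which fits its milder wording.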