
In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [1]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's python package.

In [2]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [3]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load any existing model checkpoints from that directory.

In [4]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'outputdir'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'translation310' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: gs://translation310/outputdir *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
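
The filename pattern used in `load_directory_data` can be checked on its own. This is a minimal sketch, assuming review files are named `<id>_<rating>.txt` as the regular expression above implies. The helper name `rating_from_filename` and the sample filenames are made up for illustration.

```python
import re

# Same pattern as in load_directory_data: "<id>_<rating>.txt"
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")

def rating_from_filename(file_name):
    """Return the rating part of a review filename as a string, or None if the name does not match."""
    match = FILENAME_PATTERN.match(file_name)
    return match.group(1) if match else None

# Hypothetical filenames, for illustration only.
print(rating_from_filename("127_9.txt"))  # 9
print(rating_from_filename("README"))     # None
```

Unlike the loader above, which would raise an error on a non-matching name, this helper returns `None`, which may be handy if the directory contains stray files.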


In [6]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll take a random sample of 5,000 training examples and 5,000 test examples.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [8]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative and 1 for positive).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]
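
Each entry of `label_list` ends up mapped to its position in the list, which is why the conversion logs further down print lines like `label: 1 (id = 1)`. Here is a sketch of that idea rather than the library's exact code; `build_label_map` is a hypothetical helper name.

```python
def build_label_map(label_list):
    """Map each label to its index in label_list."""
    return {label: index for index, label in enumerate(label_list)}

# With integer labels the mapping is the identity.
print(build_label_map([0, 1]))          # {0: 0, 1: 1}
# With string labels the order of label_list decides the ids.
print(build_label_map(["dog", "cat"]))  # {'dog': 0, 'cat': 1}
```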

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column (`DATA_COLUMN`) of our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example; here it is the `polarity` value, 0 or 1.

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
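
To make steps 1 to 5 concrete, here is a minimal sketch in plain Python. It assumes a tiny made-up vocabulary (`TOY_VOCAB`) and splits on whitespace only, whereas the real tokenizer also separates punctuation and uses BERT's full vocab file. The helper names `wordpiece` and `encode` are hypothetical and not part of the BERT library.

```python
# Toy vocabulary, invented for this illustration; BERT's real vocab file is far larger.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "sally", "says", "hi", "call", "##ing"]
TOKEN_TO_ID = {token: index for index, token in enumerate(TOY_VOCAB)}

def wordpiece(word, vocab):
    """Split one word into pieces by greedy longest-match-first lookup."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a "##" prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def encode(text, vocab=TOKEN_TO_ID, lower=True):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS]/[SEP], map to ids."""
    if lower:
        text = text.lower()
    tokens = ["[CLS]"]
    for word in text.split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    ids = [vocab[token] for token in tokens]
    return tokens, ids

tokens, ids = encode("Sally says calling")
print(tokens)  # ['[CLS]', 'sally', 'says', 'call', '##ing', '[SEP]']
print(ids)     # [2, 4, 5, 7, 8, 3]
```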




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [11]:
# This is a path to the multilingual cased version of BERT (capitalization is preserved)
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_multi_cased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned whether the BERT model we're using expects lowercase data (that's what's stored in tokenization_info["do_lower_case"]), and we also loaded BERT's vocab file. Because the module loaded here is a cased model, the tokenizer keeps capitalization, as the output below shows. We also created a tokenizer, which breaks words into word pieces:

In [12]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['This',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'BE',
 '##RT',
 'tok',
 '##eni',
 '##zer']
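
Pieces that start with `##` continue the previous piece, so the original words can be recovered by gluing them back together. This is a small sketch of that rule; `merge_wordpieces` is a hypothetical helper, not part of the BERT library.

```python
def merge_wordpieces(pieces):
    """Join WordPiece tokens back into words; '##' marks a continuation piece."""
    words = []
    for piece in pieces:
        if piece.startswith("##") and words:
            words[-1] += piece[2:]  # strip the marker and append to the previous word
        else:
            words.append(piece)
    return words

pieces = ['This', 'here', "'", 's', 'an', 'example', 'of', 'using', 'the',
          'BE', '##RT', 'tok', '##eni', '##zer']
print(merge_wordpieces(pieces))
# ['This', 'here', "'", 's', 'an', 'example', 'of', 'using', 'the', 'BERT', 'tokenizer']
```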

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
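
For a single-sentence input, the conversion boils down to truncating the token ids, adding the special tokens, and padding to a fixed length. This is a minimal sketch under those assumptions, not the library's code. `pad_and_mask` is a hypothetical helper, and the ids 101, 102 and 0 for `[CLS]`, `[SEP]` and padding are taken from the conversion logs further down.

```python
def pad_and_mask(token_ids, max_seq_length, cls_id=101, sep_id=102, pad_id=0):
    """Truncate to leave room for [CLS]/[SEP], then pad to max_seq_length.

    Returns (input_ids, input_mask, segment_ids), each of length max_seq_length.
    """
    body = list(token_ids)[: max_seq_length - 2]
    input_ids = [cls_id] + body + [sep_id]
    input_mask = [1] * len(input_ids)        # 1 marks real tokens
    padding = max_seq_length - len(input_ids)
    input_ids += [pad_id] * padding
    input_mask += [0] * padding              # 0 marks padding positions
    segment_ids = [0] * max_seq_length       # only one segment when text_b is unused
    return input_ids, input_mask, segment_ids

short_ids, short_mask, short_segments = pad_and_mask([7, 8, 9], max_seq_length=8)
print(short_ids)   # [101, 7, 8, 9, 102, 0, 0, 0]
print(short_mask)  # [1, 1, 1, 1, 1, 0, 0, 0]

long_ids, long_mask, _ = pad_and_mask(list(range(1, 11)), max_seq_length=6)
print(long_ids)    # [101, 1, 2, 3, 4, 102]
```

This matches what the logs show: reviews longer than the limit are cut off right before `[SEP]`, while shorter ones end in zeros in both `input_ids` and `input_mask`.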

In [13]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] I don ' t think I ' ll ever understand the hat ##e for Ren ##ny Har ##lin . ' Die Hard 2 ' was cool , and he gave the world ' Cliff ##hang ##er ' , one of the most aw ##eso ##me action movies ever . That ' s right , you little punk ##s , ' Cliff ##hang ##er ' rules , and we all know it . < br / > < br / > S ##ly plays Ga ##be Walker , a former rescue climb ##er who is ' just visiting ' his old town when he is asked to help a former friend , Hal Tucker ( Michael R ##ook ##er ) , assist in a rescue on a [SEP]


INFO:tensorflow:input_ids: 101 146 16938 112 188 27874 146 112 22469 17038 49151 10105 11250 10112 10142 52712 10756 55737 13020 119 112 10236 23946 123 112 10134 67420 117 10111 10261 15362 10105 11356 112 42593 30222 10165 112 117 10464 10108 10105 10992 56237 41939 10627 14204 39129 17038 119 13646 112 187 13448 117 13028 16745 23251 10107 117 112 42593 30222 10165 112 23123 117 10111 11951 10435 21852 10271 119 133 33989 120 135 133 33989 120 135 156 10454 17724 69699 11044 15432 117 169 11775 48022 93274 10165 10479 10124 112 12820 48780 112 10226 12898 12221 10841 10261 10124 22151 10114 15217 169 11775 20104 117 21699 40518 113 10631 155 46921 10165 114 117 40960 10106 169 48022 10135 169 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] I saw this movie with my family and it was great ! This film is more than just a documentary ( that offers not more than cold facts ) with long mono / - duo ##logue ' s and lots of charts . . . The complete " power " of this movie comes from the impressive pictures being filmed under water , in the air or the Arctic . With watching this movie you can learn more about our planet than with just reading a book , it shows that W ##E are em ##bedded in the circular flow of life . This movie is not only for " environmental fan ##atic ##s " although people that want to look a good movie with a [SEP]


INFO:tensorflow:input_ids: 101 146 17112 10531 18379 10169 15127 11365 10111 10271 10134 14772 106 10747 10458 10124 10798 11084 12820 169 27838 113 10189 23818 10472 10798 11084 41626 73367 114 10169 11695 70997 120 118 23000 40609 112 187 10111 87202 10108 32171 119 119 119 10117 17876 107 13183 107 10108 10531 18379 21405 10188 10105 80914 54156 11223 43729 10571 12286 117 10106 10105 12566 10345 10105 46910 119 12613 84532 10531 18379 13028 10944 42671 10798 10978 17446 15690 11084 10169 12820 32432 169 12748 117 10271 15573 10189 160 11259 10301 10266 85727 10106 10105 35961 30676 10108 12103 119 10747 18379 10124 10472 10893 10142 107 32704 10862 69252 10107 107 14779 11426 10189 21528 10114 25157 169 15198 18379 10169 169 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] A Christmas Story Is A Holiday Classic And My Favorite Movie . So Natural ##ly , I Was Ela ##ted When This Movie Came Out In 1994 . I Saw It Opening Day and Was Pre ##pare ##d To En ##joy Myself . I Came Away Rev ##olt ##ed And Di ##gust ##ed . The Anti ##cipation that Rang True In A Christmas Story Is Cu ##rious ##ly Missing from This mes ##s . A Red Ryder BB Gun Is Better to get than a chi ##nese top . And It Is Not Very Funny At all . Charles G ##rod ##in Is Good but the Buck Stop ##s There . Bottom Line : 1 Star . Don ' t Even Both ##er . [SEP]


INFO:tensorflow:input_ids: 101 138 17265 14656 12034 138 40205 20542 12689 11590 78511 14785 119 12882 13817 10454 117 146 22034 26271 11912 12242 10747 14785 73206 14504 10167 10444 119 146 74666 10377 76064 12360 10111 22034 35248 28927 10162 11469 10243 107073 88441 119 146 73206 24598 24774 27667 10336 12689 12944 104277 10336 119 10117 26267 101492 10189 28221 24079 10167 138 17265 14656 12034 34387 37789 10454 65182 10188 10747 17954 10107 119 138 11641 71379 49622 31328 12034 34961 10114 15329 11084 169 14325 33550 12364 119 12689 10377 12034 16040 37282 83852 11699 10435 119 10925 144 46114 10245 12034 13073 10473 10105 40477 29195 10107 11723 119 84358 14357 131 122 11836 119 11740 112 188 28140 20973 10165 119 102 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] Two years ago I watched " The Mata ##dor " in cinema and I loved everything about this movie . Ob ##vio ##usly , I was totally under impression of Pierce Bros ##an ' s mag ##nificent role . Yesterday , I caught this movie again on TV so I looked at it a bit deep ##er . Now , I can say with certain that this movie isn ' t that special but you just got ##ta ' love it because of one man . < br / > < br / > Bros ##nan lift ##s its grade up in my opinion with ama ##zing performance of Julian Noble , tire ##d hit - man who has no friends . Soon Julian meets Danny [SEP]


INFO:tensorflow:input_ids: 101 13214 10855 36390 146 92147 107 10117 38373 11849 107 10106 18458 10111 146 82321 42536 10978 10531 18379 119 43019 18574 61289 117 146 10134 110240 10571 59513 10108 38581 23844 10206 112 187 20722 97026 12971 119 86073 117 146 39797 10531 18379 13123 10135 10813 10380 146 59822 10160 10271 169 17684 26591 10165 119 17121 117 146 10944 23763 10169 16620 10189 10531 18379 98370 112 188 10189 14478 10473 13028 12820 19556 10213 112 16138 10271 12373 10108 10464 10817 119 133 33989 120 135 133 33989 120 135 23844 13470 63376 10107 10474 21958 10741 10106 15127 32282 10169 28149 19308 14432 10108 23154 43994 117 71841 10162 14946 118 10817 10479 10393 10192 21997 119 40456 23154 40427 20340 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] Boy , this was one lo ##us ##y movie ! While I haven ' t seen all of the Burton / Taylor collaboration ##s , I can say with confidence that this is the worst . This rich but ill woman ( Taylor , of course ) owns this beautiful island in the Medi ##tter ##anea ##n , ruling over a put - upon staff when she ' s suddenly visited by this traveling poet , who mouth ##s plat ##itude ##s . In fact , the whole film is just a talk fest , with much of the talk making no sense . Even in 1968 , no one could make heads or tail ##s of this pret ##enti ##ous non ##sens ##e , and [SEP]


INFO:tensorflow:input_ids: 101 15384 117 10531 10134 10464 10406 10251 10157 18379 106 14600 146 65000 112 188 15652 10435 10108 10105 15514 120 13399 23522 10107 117 146 10944 23763 10169 74187 10189 10531 10124 10105 62006 119 10747 33250 10473 32941 18299 113 13399 117 10108 15348 114 76282 10531 42235 17354 10106 10105 41512 18413 88734 10115 117 50788 10491 169 14499 118 15378 18927 10841 10833 112 187 80263 30270 10155 10531 70780 24633 117 10479 42213 10107 48740 59586 10107 119 10167 18638 117 10105 21047 10458 10124 12820 169 31311 34519 117 10169 13172 10108 10105 31311 14293 10192 15495 119 28140 10106 10698 117 10192 10464 12174 13086 42399 10345 48497 10107 10108 10531 49775 21688 13499 10446 59077 10112 117 10111 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] It starts out looking like it may be going some ##where , then quickly leads the main characters into a three - ring ci ##rc ##us of remarkable st ##upi ##dity which permanently destroy ##s any lika ##bility of the characters . I ' m a huge collector of stone ##r movies , but this is something I would not consider a valid addition . Bon ##g Water is tras ##h from the very deep ##est regions of the dum ##ps ##ter , and I would ##n ' t be caught dead with this on my she ##lf . I ' m actually convinced this movie was created by a Partnership for A Drug Free America . If you ' re a Jack Black fan then [SEP]


INFO:tensorflow:input_ids: 101 10377 33039 10950 34279 11850 10271 11387 10347 19090 11152 30935 117 11059 23590 34868 10105 12126 19174 10708 169 11003 118 21550 11322 46382 10251 10108 88916 28780 90695 100060 10319 76494 59792 10107 11178 64992 20838 10108 10105 19174 119 146 112 181 169 42126 101248 10108 23905 10129 39129 117 10473 10531 10124 26133 146 10894 10472 44856 169 64999 14763 119 30120 10240 17702 10124 14807 10237 10188 10105 12558 26591 13051 21721 10108 10105 54892 13221 10877 117 10111 146 10894 10115 112 188 10347 39797 23457 10169 10531 10135 15127 10833 35173 119 146 112 181 24376 71869 10531 18379 10134 13745 10155 169 101476 10142 138 33977 16122 11440 119 14535 13028 112 11639 169 12342 11750 10862 11059 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] One of the dum ##bes ##t movies in the history of cinema . Wait , I take that back - - this movie can ' t be included in any category related to " cinema " ; it belongs in categories like " waste " , tur ##ds , or similar categories . Iron ##ically , it ' s even _ about _ two gar ##ba ##gem ##en . The movie is " Men At Work " , a light ##weight crime comedy starring the Esteve ##z ( She ##en ) brothers from 1990 . Set ##ting aside the asi ##nine and im ##pla ##usi ##ble plot line , bad acting , bad dialogue , poorly executed st ##unt ##s and sl ##aps ##tic ##k , [SEP]


INFO:tensorflow:input_ids: 101 11340 10108 10105 54892 16216 10123 39129 10106 10105 11486 10108 18458 119 87519 117 146 13574 10189 12014 118 118 10531 18379 10944 112 188 10347 12742 10106 11178 29737 16382 10114 107 18458 107 132 10271 61437 10106 43398 11850 107 59158 107 117 32461 13268 117 10345 13213 43398 119 19247 52917 117 10271 112 187 13246 168 10978 168 10551 47243 10537 20531 10136 119 10117 18379 10124 107 13026 11699 25641 107 117 169 15765 31869 22564 25737 27519 10105 110062 10305 113 11149 10136 114 28764 10188 10420 119 14245 12141 95167 10105 21744 65528 10111 10211 48590 15780 11203 32473 12117 117 15838 25086 117 15838 51077 117 93530 45955 28780 20631 10107 10111 38523 76591 13275 10174 117 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] This was the very first movie I ever saw in my life back in 1974 or 1975 . I was 4 years old at the time and saw it at a drive - in theatre . I did not gras ##p that this would be a classic at the time ( I went to sleep about twenty minutes into the movie ) . After seeing it on the television - along with two of my other favourite movies Car Was ##h ( my favourite movie ) and The Wi ##z which seemed to come on every year about the same time all together - about 40 , 50 , 75 times I knew that here was a movie that I would have as one of my [SEP]


INFO:tensorflow:input_ids: 101 10747 10134 10105 12558 10422 18379 146 17038 17112 10106 15127 12103 12014 10106 10723 10345 10665 119 146 10134 125 10855 12898 10160 10105 10635 10111 17112 10271 10160 169 23806 118 10106 28016 119 146 12172 10472 90097 10410 10189 10531 10894 10347 169 36592 10160 10105 10635 113 146 13446 10114 63658 10978 26051 15304 10708 10105 18379 114 119 11301 57039 10271 10135 10105 14162 118 12400 10169 10551 10108 15127 10684 80494 39129 23962 22034 10237 113 15127 80494 18379 114 10111 10117 52742 10305 10319 64676 10114 10678 10135 14234 10924 10978 10105 11561 10635 10435 14229 118 10978 10533 117 10462 117 11417 13465 146 46000 10189 19353 10134 169 18379 10189 146 10894 10529 10146 10464 10108 15127 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] De ##kal ##og Five was an interesting view ##ing experience for me , because of the question Ki ##es ##low ##ski seems to sub ##tly ask the audience . Three men are the focus of this chapter , and Ki ##es ##low ##ski present the two involved in murder with traits both good and bad ( In one ' s case , almost over ##w ##hel ##ming ##ly bad ) . With such vile characters , I found my ##self almost g ##lad that they would receive some sort of punishment . However , when the time comes for the murder ( And it ' s subsequent effect on the murder ##er ) , Ki ##es ##low ##ski takes an interesting angle and seems to ask [SEP]


INFO:tensorflow:input_ids: 101 10190 17463 12717 19268 10134 10151 64888 17904 10230 20627 10142 10911 117 12373 10108 10105 20210 28941 10171 27863 11401 34208 10114 13987 69253 63001 10105 26070 119 15139 10588 10301 10105 23195 10108 10531 39486 117 10111 28941 10171 27863 11401 12254 10105 10551 16247 10106 29448 10169 68986 11408 15198 10111 15838 113 10167 10464 112 187 13474 117 17122 10491 10874 31572 16405 10454 15838 114 119 12613 11049 77017 19174 117 146 11823 15127 43310 17122 175 19505 10189 10689 10894 26286 11152 20363 10108 80149 119 12209 117 10841 10105 10635 21405 10142 10105 29448 113 12689 10271 112 187 30335 18514 10135 10105 29448 10165 114 117 28941 10171 27863 11401 19135 10151 64888 30891 10111 34208 10114 63001 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] movie I have ever seen . Act ##ually I find it one of the more enter ##taining episodes of MS ##T 3000 I have seen . Not that it was good , but for anyone who has seen Man ##os : the Hands of Fate knows this one wasn ' t two bad . The monster in the movie looked terrible , everyone wore upset ##ting s ##wim suit ##s , and the plot was lau ##gha ##ble . I still don ' t have a c ##lue as to why they made the monster , they never really gave a good reason . The lead female had to be the s ##cra ##wnie ##st gal I have ever seen . They would have done better [SEP]


INFO:tensorflow:input_ids: 101 18379 146 10529 17038 15652 119 13968 79090 146 17860 10271 10464 10108 10105 10798 31006 70700 23604 10108 21018 11090 15335 146 10529 15652 119 16040 10189 10271 10134 15198 117 10473 10142 51747 10479 10393 15652 11343 10310 131 10105 50526 10108 49955 75354 10531 10464 65390 112 188 10551 15838 119 10117 76343 10106 10105 18379 59822 70032 117 48628 80602 96213 12141 187 80217 26315 10107 117 10111 10105 32473 10134 27207 102121 11203 119 146 12647 16938 112 188 10529 169 171 75483 10146 10114 31237 10689 11019 10105 76343 117 10689 14794 30181 15362 169 15198 27949 119 10117 14107 16762 10374 10114 10347 10105 187 40333 102622 10562 79332 146 10529 17038 15652 119 11696 10894 10529 20378 18322 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). Because the module is loaded with `trainable=True`, BERT's own weights are updated along with the new layer. This strategy of starting from a pre-trained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

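    # Dropout helps prevent overfitting. Note that, as written, it is applied
    # in every mode (train, eval, and predict), not only during training.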
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
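
To make the arithmetic in `create_model` concrete, here is a minimal pure-Python sketch of what the new layer computes for a single example: logits from the pooled output, a log-softmax, and the one-hot cross-entropy loss. The toy weights, the 3-dimensional `pooled` vector, and the helper names (`log_softmax`, `classify`) are illustrative assumptions, not values or functions from this notebook; the real hidden size comes from the BERT module, and dropout is left out.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def classify(pooled, weights, bias, label=None):
    """Mirror of the head in create_model for one example (no dropout).

    weights has shape [num_labels][hidden_size], matching output_weights.
    """
    # logits = pooled . weights^T + bias
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    if label is None:
        return predicted, log_probs
    # With a one-hot label, -sum(one_hot * log_probs) picks out one term.
    loss = -log_probs[label]
    return loss, predicted, log_probs

# Hypothetical toy values: hidden_size = 3, num_labels = 2.
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [-0.2, 0.1, 0.4]]
bias = [0.0, 0.0]

loss, predicted, log_probs = classify(pooled, weights, bias, label=1)
probs = [math.exp(lp) for lp in log_probs]  # exponentiate to get probabilities
print(predicted, round(loss, 4), [round(p, 4) for p in probs])
```

Since the model returns `log_probs`, exponentiating them (as in the last lines of the sketch) recovers class probabilities that sum to 1.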


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

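      # The 'probabilities' key holds log-probabilities (log_softmax output);
      # apply exp to recover probabilities.
      predictions = {
          'probabilities': log_probs,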
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
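
The metrics returned by `metric_fn` are all derived from the four confusion-matrix counts. As a rough illustration of how they relate, the following pure-Python sketch computes accuracy, precision, recall, and F1 for binary labels. The helper name `binary_metrics` and the sample labels are made up for this example; the notebook itself relies on `tf.metrics`, which accumulates these values across batches.

```python
def binary_metrics(labels, predictions):
    """Compute confusion counts and derived metrics for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "eval_accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }

# Hypothetical labels and predictions, for illustration only.
labels      = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(labels, predictions)
print(metrics)
```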


In [0]:
# Training hyperparameters.
# These are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
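
To illustrate what warmup means, here is a sketch of a learning-rate schedule with a linear ramp during warmup followed by linear decay to zero. This is an assumption about the general shape of such schedules, not a copy of what `bert.optimization.create_optimizer` does internally; the function name `learning_rate_at` and the step counts are hypothetical.

```python
def learning_rate_at(step, base_lr, num_train_steps, num_warmup_steps):
    """Linear ramp during warmup, then linear decay to zero (a common shape)."""
    if num_warmup_steps > 0 and step < num_warmup_steps:
        # Ramp up proportionally to how far we are through warmup.
        return base_lr * step / num_warmup_steps
    # After warmup, decay linearly so the rate reaches 0 at the last step.
    remaining = max(num_train_steps - step, 0)
    return base_lr * remaining / num_train_steps

# base_lr matches LEARNING_RATE; the step counts are example numbers
# chosen for illustration (10% of 500 steps spent in warmup).
for step in (0, 25, 50, 250, 500):
    print(step, learning_rate_at(step, 2e-5, 500, 50))
```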

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
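
As a worked example of the step arithmetic, assume 5,000 training examples (an assumption for illustration; the actual count is `len(train_features)`). With the hyperparameters above this gives 468 training steps and 46 warmup steps, and 468 agrees with the final checkpoint step reported in the training log.

```python
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

num_examples = 5000  # assumed size of train_features, for illustration

# Steps per epoch is num_examples / BATCH_SIZE; int() truncates the total.
num_train_steps = int(num_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

print(num_train_steps, num_warmup_steps)
```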

In [0]:
# Specify the output directory and how often to save summaries and checkpoints
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [19]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://translation310/outputdir', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f3c8a46e0b8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
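
To show what `drop_remainder` controls, here is a small pure-Python sketch of batching. It is not how `input_fn_builder` is implemented (that function builds a `tf.data` pipeline); the `batches` helper is a hypothetical stand-in that only mimics the batching behaviour.

```python
def batches(examples, batch_size, drop_remainder=False):
    """Yield consecutive batches; optionally drop a final short batch."""
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            return  # skip the incomplete final batch
        yield batch

examples = list(range(10))  # ten toy "features"
kept = list(batches(examples, batch_size=4, drop_remainder=False))
dropped = list(batches(examples, batch_size=4, drop_remainder=True))
print(len(kept), len(dropped))
```

With `drop_remainder=False` the last, shorter batch is kept so every example is used; dropping it guarantees that all batches have the same shape, which is the usual reason for setting it on TPUs.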

Now we train our model! For me, using a Colab notebook running on Google's GPUs, training took about 14 minutes; the run logged below finished in a little over 7 minutes.

In [21]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into gs://translation310/outputdir/model.ckpt.


INFO:tensorflow:loss = 0.7001082, step = 0


INFO:tensorflow:global_step/sec: 1.5515


INFO:tensorflow:loss = 0.4813712, step = 100 (64.455 sec)


INFO:tensorflow:global_step/sec: 2.00766


INFO:tensorflow:loss = 0.39755067, step = 200 (49.809 sec)


INFO:tensorflow:global_step/sec: 2.00949


INFO:tensorflow:loss = 0.326311, step = 300 (49.765 sec)


INFO:tensorflow:global_step/sec: 1.92355


INFO:tensorflow:loss = 0.18694788, step = 400 (51.986 sec)


INFO:tensorflow:Saving checkpoints for 468 into gs://translation310/outputdir/model.ckpt.


INFO:tensorflow:Loss for final step: 0.0064707845.


Training took time  0:07:18.931698
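
As a rough sanity check on the log above, the overall throughput can be compared with the `global_step/sec` values TensorFlow reported. The sketch below uses only numbers copied from the log; the variable names are illustrative, and attributing the gap to startup and checkpointing is an assumption rather than something the log states.

```python
from datetime import timedelta

# Values copied from the training log above.
total_steps = 468
elapsed = timedelta(minutes=7, seconds=18, microseconds=931698)
logged_rates = [1.5515, 2.00766, 2.00949, 1.92355]  # global_step/sec lines

# Overall rate includes everything between the start and end of train().
overall_rate = total_steps / elapsed.total_seconds()
mean_logged_rate = sum(logged_rates) / len(logged_rates)

# Time the steps alone would take at the logged rate, and what is left over.
stepping_seconds = total_steps / mean_logged_rate
overhead_seconds = elapsed.total_seconds() - stepping_seconds

print(f"overall rate:     {overall_rate:.3f} steps/sec")
print(f"mean logged rate: {mean_logged_rate:.3f} steps/sec")
print(f"time not explained by stepping: {overhead_seconds:.0f} sec")
```

The overall rate comes out well below the logged per-interval rate, which suggests that a sizeable share of the wall-clock time went to work outside the training steps themselves (plausibly graph construction, initialization, and checkpoint writes).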


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [23]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-02-04T07:16:49Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://translation310/outputdir/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2020-02-04-07:19:33
INFO:tensorflow:Saving dict for global step 468: auc = 0.8244763, eval_accuracy = 0.8246, f1_score = 0.8286774, false_negatives = 396.0, false_positives = 481.0, global_step = 468, loss = 0.60789233, precision = 0.8151422, recall = 0.84266984, true_negatives = 2002.0, true_positives = 2121.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://translation310/outputdir/model.ckpt-468


{'auc': 0.8244763,
 'eval_accuracy': 0.8246,
 'f1_score': 0.8286774,
 'false_negatives': 396.0,
 'false_positives': 481.0,
 'global_step': 468,
 'loss': 0.60789233,
 'precision': 0.8151422,
 'recall': 0.84266984,
 'true_negatives': 2002.0,
 'true_positives': 2121.0}
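
The summary metrics in this dictionary follow from the four confusion-matrix counts. A minimal sketch, using the standard definitions of accuracy, precision, recall, and F1 with the counts copied from the output above (the variable names are illustrative; `auc` is not recomputed because it depends on per-example scores that are not shown):

```python
# Confusion-matrix counts copied from the evaluation output above.
true_positives = 2121
true_negatives = 2002
false_positives = 481
false_negatives = 396

total = true_positives + true_negatives + false_positives + false_negatives

# Standard definitions of the summary metrics.
accuracy = (true_positives + true_negatives) / total
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1_score = 2 * precision * recall / (precision + recall)

print(f"examples:  {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1_score:  {f1_score:.4f}")
```

The recomputed values agree with `eval_accuracy`, `precision`, `recall`, and `f1_score` reported by the estimator to about four decimal places.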

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid is left empty and label=0 is a dummy value; it only needs to be a member of label_list
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # Return (sentence, per-class scores, predicted label name) for each input sentence
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
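
The logged example shows how a short sentence is padded out to a fixed length: the token ids are followed by zeros, and `input_mask` marks which positions hold real tokens. The following is a simplified sketch of that padding step, not the library's actual code. The helper name `pad_features` is hypothetical, and `MAX_SEQ_LENGTH = 128` is an assumed value; use whatever the notebook defines.

```python
MAX_SEQ_LENGTH = 128  # assumption: substitute the notebook's own value

# Token ids copied from the log above: [CLS] that movie was absolutely awful [SEP]
token_ids = [101, 2008, 3185, 2001, 7078, 9643, 102]

def pad_features(token_ids, max_seq_length):
    """Hypothetical helper: zero-pad ids, mask, and segment ids to a fixed length."""
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence is longer than max_seq_length")
    padding = [0] * (max_seq_length - len(token_ids))
    input_ids = token_ids + padding
    input_mask = [1] * len(token_ids) + padding   # 1 = real token, 0 = padding
    segment_ids = [0] * len(token_ids) + padding  # single sentence, so all zeros
    return input_ids, input_mask, segment_ids

input_ids, input_mask, segment_ids = pad_features(token_ids, MAX_SEQ_LENGTH)
print(input_ids[:10])
print(input_mask[:10])
print(segment_ids[:10])
```

Because only `text_a` is supplied here, every position belongs to segment 0; the mask lets the model ignore the padded positions.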

Voila! We have a sentiment classifier!

In [0]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]
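
The values in the `probabilities` arrays above are negative, which suggests they are log-probabilities rather than probabilities. A minimal sketch, assuming that interpretation, exponentiates each pair to recover values between 0 and 1 (the scores are copied from the output above; the variable names are illustrative):

```python
import math

labels = ["Negative", "Positive"]

# Scores copied from the prediction output above.
scores = [
    ("That movie was absolutely awful", [-4.9142293e-03, -5.3180690e+00]),
    ("The acting was a bit lacking", [-0.03325794, -3.4200459]),
    ("The film was creative and surprising", [-5.3589125e+00, -4.7171740e-03]),
    ("Absolutely fantastic!", [-5.0434084, -0.00647258]),
]

results = []
for sentence, log_probs in scores:
    probs = [math.exp(v) for v in log_probs]  # undo the log
    best = max(range(len(probs)), key=lambda i: probs[i])
    results.append((sentence, probs, labels[best]))
    print(f"{sentence!r}: {labels[best]} (p = {probs[best]:.3f})")
```

Each exponentiated pair sums to approximately 1, which is consistent with the log-probability reading, and the highest-probability class matches the label printed for each sentence.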