In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that has reshaped the state of the art for NLP tasks such as text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's python package.

In [3]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [4]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if there are any).

In [5]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


ModuleNotFoundError: No module named 'google.colab'

#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [6]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so the regex escapes (\d, \.) are not treated as string escapes.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df


In [7]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
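
The loader above recovers each review's rating from its file name. Here is a small standalone sketch of that step, assuming the `<id>_<rating>.txt` naming that the regex in `load_directory_data` expects. The helper `parse_rating` and the sample file names are made up for illustration:

```python
import re

# Mirrors the pattern used in load_directory_data: "<id>_<rating>.txt".
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")

def parse_rating(file_name):
    """Return the rating embedded in a review file name, or None if it doesn't match."""
    match = FILENAME_PATTERN.match(file_name)
    return match.group(1) if match else None

# Hypothetical file names, not taken from the actual dataset listing.
for name in ["0_9.txt", "127_3.txt", "README"]:
    print(name, "->", parse_rating(name))
```

Like the loader, this keeps the rating as a string. Unlike the loader, it returns `None` for a name that doesn't match instead of raising an `AttributeError`.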


To keep training fast, we'll sample 5000 examples from the train set and 5000 from the test set.

In [8]:
train = train.sample(5000)
test = test.sample(5000)

In [9]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative and 1 for positive).

In [10]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, which here is the `polarity` value (0 or 1).
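
To make the field mapping concrete without installing anything, here is a rough sketch that uses a hypothetical stand-in class, `ExampleSketch`, rather than the library's `InputExample`. The two rows are invented and only mimic the shape of our 'sentence' and 'polarity' columns:

```python
from typing import NamedTuple, Optional

class ExampleSketch(NamedTuple):
    """Hypothetical stand-in holding the four fields described above."""
    guid: Optional[str]
    text_a: str
    text_b: Optional[str]
    label: int

# Made-up rows shaped like the 'sentence' and 'polarity' columns.
rows = [
    {"sentence": "A rare sequel that beats the original.", "polarity": 1},
    {"sentence": "The acting is very bad indeed.", "polarity": 0},
]

# Single-sentence classification: text_b stays None for every example.
examples = [
    ExampleSketch(guid=None, text_a=row["sentence"], text_b=None, label=row["polarity"])
    for row in rows
]

for example in examples:
    print(example)
```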

In [11]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a few things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
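
If you're curious what the last few steps amount to, here is a simplified sketch, not the library's code. It wraps the tokens in the special markers, maps them to ids, and pads everything to a fixed length along with a mask and segment ids. The function `build_features` and the tiny `toy_vocab` are invented for illustration, and the fallback to `[UNK]` is a simplification assumed here:

```python
def build_features(tokens, vocab, max_seq_length):
    """Turn a token list into padded (input_ids, input_mask, segment_ids)."""
    # Reserve two slots for the [CLS] and [SEP] markers, truncating if needed.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    # In this sketch, tokens missing from the vocabulary fall back to [UNK].
    input_ids = [vocab.get(token, vocab["[UNK]"]) for token in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token, 0 marks padding
    segment_ids = [0] * len(input_ids)  # single-sentence input: everything is segment 0

    # Pad all three lists with zeros up to the fixed length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding

# Toy vocabulary; the real ids come from BERT's vocab file.
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "sally": 4, "says": 5, "hi": 6}

ids, mask, segments = build_features(["sally", "says", "hi"], toy_vocab, max_seq_length=8)
print(ids)       # [2, 4, 5, 6, 3, 0, 0, 0]
print(mask)      # [1, 1, 1, 1, 1, 0, 0, 0]
print(segments)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Inputs longer than the limit are cut so that the two markers still fit, which is why a fully used sequence has a mask of all ones.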




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [12]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [14]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']
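
The split of "tokenizer" into 'token' and '##izer' comes from greedy longest-match-first matching against the vocabulary. Below is a minimal sketch of that idea under stated assumptions: the function `wordpiece` and the five-entry `toy_vocab` are illustrative and are not the library's tokenizer or BERT's real vocabulary.

```python
def wordpiece(word, vocab, unk_token="[UNK]"):
    """Split one lowercase word into pieces by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it appears in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # nothing matched: treat the whole word as unknown
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary for illustration only.
toy_vocab = {"token", "##izer", "call", "##ing", "bert"}

print(wordpiece("tokenizer", toy_vocab))  # ['token', '##izer']
print(wordpiece("calling", toy_vocab))    # ['call', '##ing']
print(wordpiece("xyz", toy_vocab))        # ['[UNK]']
```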

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [None]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] tarzan and jane are living happily in the jungle . some men come looking for ivory and to take jane back to civilization . but jane loves tarzan and refuses to leave . one of the men falls in love with jane and is determined to take her back . . . even if that means killing tarzan . < br / > < br / > this is a ra ##rity - - a sequel that ' s better than the original . " tarzan , the ape man " of 1932 was good but had some dreadful special effects and sort of dragged . this one has much better effects and is a lot more adult . there is tons of b ##lat ##ant [SEP]


INFO:tensorflow:input_ids: 101 24566 1998 4869 2024 2542 11361 1999 1996 8894 1012 2070 2273 2272 2559 2005 11554 1998 2000 2202 4869 2067 2000 10585 1012 2021 4869 7459 24566 1998 10220 2000 2681 1012 2028 1997 1996 2273 4212 1999 2293 2007 4869 1998 2003 4340 2000 2202 2014 2067 1012 1012 1012 2130 2065 2008 2965 4288 24566 1012 1026 7987 1013 1028 1026 7987 1013 1028 2023 2003 1037 10958 15780 1011 1011 1037 8297 2008 1005 1055 2488 2084 1996 2434 1012 1000 24566 1010 1996 23957 2158 1000 1997 4673 2001 2204 2021 2018 2070 21794 2569 3896 1998 4066 1997 7944 1012 2023 2028 2038 2172 2488 3896 1998 2003 1037 2843 2062 4639 1012 2045 2003 6197 1997 1038 20051 4630 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this movie wasn ' t that bad when compared to the first two sequels to the original . it ' s directed by martin kit ##ross ##er of friday the 13th fame . the acting is very bad indeed , but the gore and special effects help make it interesting . that ##s one thing i like about screaming mad george ( make up effects artist for the film ) , his effects are so off - the - wall and bizarre that they will keep you watching a bad movie just to find out how crazy they ' re gonna get . the movie isn ' t really all that go ##ry , but there is an extremely nasty eye ##ball - mu ##nch ##ing [SEP]


INFO:tensorflow:input_ids: 101 2023 3185 2347 1005 1056 2008 2919 2043 4102 2000 1996 2034 2048 25815 2000 1996 2434 1012 2009 1005 1055 2856 2011 3235 8934 25725 2121 1997 5958 1996 6122 4476 1012 1996 3772 2003 2200 2919 5262 1010 2021 1996 13638 1998 2569 3896 2393 2191 2009 5875 1012 2008 2015 2028 2518 1045 2066 2055 7491 5506 2577 1006 2191 2039 3896 3063 2005 1996 2143 1007 1010 2010 3896 2024 2061 2125 1011 1996 1011 2813 1998 13576 2008 2027 2097 2562 2017 3666 1037 2919 3185 2074 2000 2424 2041 2129 4689 2027 1005 2128 6069 2131 1012 1996 3185 3475 1005 1056 2428 2035 2008 2175 2854 1010 2021 2045 2003 2019 5186 11808 3239 7384 1011 14163 12680 2075 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] some genre films need to be dressed up . this one was an exception . taken on its own merit , it ' s a dressed down version of the horror genre film . with minimal special effects , it manages to be a psychological study of sorts , with a simple yet existent ##ial theme - who gets hit by the bus , and why her ? it ' s not a great film , yet because there is little con ##tri ##ved about it , the story works . subtle , and all about the interactions of the characters . actually , there is one con ##tri ##vance in the opening scenes , but it may have been placed there to simply set the [SEP]


INFO:tensorflow:input_ids: 101 2070 6907 3152 2342 2000 2022 5102 2039 1012 2023 2028 2001 2019 6453 1012 2579 2006 2049 2219 7857 1010 2009 1005 1055 1037 5102 2091 2544 1997 1996 5469 6907 2143 1012 2007 10124 2569 3896 1010 2009 9020 2000 2022 1037 8317 2817 1997 11901 1010 2007 1037 3722 2664 25953 4818 4323 1011 2040 4152 2718 2011 1996 3902 1010 1998 2339 2014 1029 2009 1005 1055 2025 1037 2307 2143 1010 2664 2138 2045 2003 2210 9530 18886 7178 2055 2009 1010 1996 2466 2573 1012 11259 1010 1998 2035 2055 1996 10266 1997 1996 3494 1012 2941 1010 2045 2003 2028 9530 18886 21789 1999 1996 3098 5019 1010 2021 2009 2089 2031 2042 2872 2045 2000 3432 2275 1996 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] with the advent of the im ##db , this overlooked movie can now find an interested audience . why ? because users here who do a search on two - time academy award winner glen ##da jackson can find ' the return of the soldier ' among her credits . so can those checking out oscar winner julie christie . fans of ann - mar ##gre ##t can give the title a click , as will those looking into the career of the great alan bates . not to mention the added bonus of a movie with supporting heavyweight ##s ian holm and frank fin ##lay . any movie with so many notable ##s in it is rewarded by the im ##db , given all the [SEP]


INFO:tensorflow:input_ids: 101 2007 1996 13896 1997 1996 10047 18939 1010 2023 17092 3185 2064 2085 2424 2019 4699 4378 1012 2339 1029 2138 5198 2182 2040 2079 1037 3945 2006 2048 1011 2051 2914 2400 3453 8904 2850 4027 2064 2424 1005 1996 2709 1997 1996 5268 1005 2426 2014 6495 1012 2061 2064 2216 9361 2041 7436 3453 7628 13144 1012 4599 1997 5754 1011 9388 17603 2102 2064 2507 1996 2516 1037 11562 1010 2004 2097 2216 2559 2046 1996 2476 1997 1996 2307 5070 11205 1012 2025 2000 5254 1996 2794 6781 1997 1037 3185 2007 4637 8366 2015 4775 28925 1998 3581 10346 8485 1012 2151 3185 2007 2061 2116 3862 2015 1999 2009 2003 14610 2011 1996 10047 18939 1010 2445 2035 1996 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] it ' s very simple to qualify that movie : " a pure masterpiece " . this opinion is formulated for the following reasons : the performance of the actors , they seem to be citizens of that epoch , 1100 b . c . they personal ##ize perfectly the characters . a second reason is that the poetry expressed by homer ##e in his poem is well given by the production . among others the narration ##s made by the chorus give a particular atmosphere that makes us party of the artistic rendition . third reason : the rec ##ons ##ti ##tu ##tion of the decor is absolutely perfect , in mediterranean regions , where the action of the poem occurred . and most of [SEP]


INFO:tensorflow:input_ids: 101 2009 1005 1055 2200 3722 2000 7515 2008 3185 1024 1000 1037 5760 17743 1000 1012 2023 5448 2003 19788 2005 1996 2206 4436 1024 1996 2836 1997 1996 5889 1010 2027 4025 2000 2022 4480 1997 2008 25492 1010 22096 1038 1012 1039 1012 2027 3167 4697 6669 1996 3494 1012 1037 2117 3114 2003 2008 1996 4623 5228 2011 11525 2063 1999 2010 5961 2003 2092 2445 2011 1996 2537 1012 2426 2500 1996 21283 2015 2081 2011 1996 7165 2507 1037 3327 7224 2008 3084 2149 2283 1997 1996 6018 19187 1012 2353 3114 1024 1996 28667 5644 3775 8525 3508 1997 1996 25545 2003 7078 3819 1010 1999 7095 4655 1010 2073 1996 2895 1997 1996 5961 4158 1012 1998 2087 1997 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] the 221 episodes of " the lone ranger " were originally broadcast on abc from 1949 to 1957 ; and then for many years they played in local syndication . for most of the original broadcast years the series was abc ' s most watched piece of programming . < br / > < br / > the new dvd set from pop fl ##ix contains the first 16 episodes ( 15 sept - 29 dec 1949 ) and for some reason unknown to me episode 22 from the fifth season , for a total of 17 episodes ( the same 17 available on last year ' s mill creek entertainment release so these are probably in the public domain ) . these sets pretty much [SEP]


INFO:tensorflow:input_ids: 101 1996 19594 4178 1997 1000 1996 10459 11505 1000 2020 2761 3743 2006 5925 2013 4085 2000 3890 1025 1998 2059 2005 2116 2086 2027 2209 1999 2334 26973 1012 2005 2087 1997 1996 2434 3743 2086 1996 2186 2001 5925 1005 1055 2087 3427 3538 1997 4730 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 2047 4966 2275 2013 3769 13109 7646 3397 1996 2034 2385 4178 1006 2321 17419 1011 2756 11703 4085 1007 1998 2005 2070 3114 4242 2000 2033 2792 2570 2013 1996 3587 2161 1010 2005 1037 2561 1997 2459 4178 1006 1996 2168 2459 2800 2006 2197 2095 1005 1055 4971 3636 4024 2713 2061 2122 2024 2763 1999 1996 2270 5884 1007 1012 2122 4520 3492 2172 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] mikhail kala ##to ##zo ##v ' s the cranes are flying is a superb film . winner of the golden palm at cannes film festival , it has an excellent cinematography and performance by ta ##tya ##na sam ##oj ##lova , the only russian actor ever to win an award in cannes for a performance . she plays ve ##ron ##ika , a teenager in love with her boyfriend , happy and without pre ##oc ##cup ##ations , with plans of getting married . her life will get upside down when world war ii strikes and her boyfriend volunteers to the army . the film depicts the effect of war on a teenager love and on the people that stayed and saw their loved ones go [SEP]


INFO:tensorflow:input_ids: 101 11318 26209 3406 6844 2615 1005 1055 1996 27083 2024 3909 2003 1037 21688 2143 1012 3453 1997 1996 3585 5340 2012 14775 2143 2782 1010 2009 2038 2019 6581 16434 1998 2836 2011 11937 21426 2532 3520 29147 24221 1010 1996 2069 2845 3364 2412 2000 2663 2019 2400 1999 14775 2005 1037 2836 1012 2016 3248 2310 4948 7556 1010 1037 10563 1999 2293 2007 2014 6898 1010 3407 1998 2302 3653 10085 15569 10708 1010 2007 3488 1997 2893 2496 1012 2014 2166 2097 2131 14961 2091 2043 2088 2162 2462 9326 1998 2014 6898 7314 2000 1996 2390 1012 1996 2143 11230 1996 3466 1997 2162 2006 1037 10563 2293 1998 2006 1996 2111 2008 4370 1998 2387 2037 3866 3924 2175 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i just can ' t get it , freaks out on the planet are talking about this one as a must seen , they even dare to call it a ( s ) exploitation because the possessed girl is se ##du ##cing priests and is mast ##ur ##bat ##ing all time . don ' t let me laugh . i watched the movie , se ##du ##cing is only at the end of the movie and don ' t call it se ##du ##cing , it ' s just bad language that ' s she is talking . and the mast ##ur ##bation ##sc ##ene is a big laugh too , she tries to seduce her father while mast ##ur ##bat ##ing , let me be [SEP]


INFO:tensorflow:input_ids: 101 1045 2074 2064 1005 1056 2131 2009 1010 29526 2041 2006 1996 4774 2024 3331 2055 2023 2028 2004 1037 2442 2464 1010 2027 2130 8108 2000 2655 2009 1037 1006 1055 1007 14427 2138 1996 8679 2611 2003 7367 8566 6129 8656 1998 2003 15429 3126 14479 2075 2035 2051 1012 2123 1005 1056 2292 2033 4756 1012 1045 3427 1996 3185 1010 7367 8566 6129 2003 2069 2012 1996 2203 1997 1996 3185 1998 2123 1005 1056 2655 2009 7367 8566 6129 1010 2009 1005 1055 2074 2919 2653 2008 1005 1055 2016 2003 3331 1012 1998 1996 15429 3126 23757 11020 8625 2003 1037 2502 4756 2205 1010 2016 5363 2000 23199 2014 2269 2096 15429 3126 14479 2075 1010 2292 2033 2022 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a lot about usa the movie can be sum ##med up in its title . it draws parallels between the attitudes of this country in the face of war and a kind of hollywood - like false ##ness that g ##lor ##ifies things that shouldn ' t be g ##lor ##ified . i ' m not sure i agree with the filmmaker ' s take on recent events ( although , truth ##fully , i can ' t always tell exactly where he stands ) but i admire the unusual and artistic way of getting the point across . audio tracks of speeches , radio interviews , poetry etc . play as large a role here as visuals . most of the time the visuals of [SEP]


INFO:tensorflow:input_ids: 101 1037 2843 2055 3915 1996 3185 2064 2022 7680 7583 2039 1999 2049 2516 1012 2009 9891 18588 2090 1996 13818 1997 2023 2406 1999 1996 2227 1997 2162 1998 1037 2785 1997 5365 1011 2066 6270 2791 2008 1043 10626 14144 2477 2008 5807 1005 1056 2022 1043 10626 7810 1012 1045 1005 1049 2025 2469 1045 5993 2007 1996 12127 1005 1055 2202 2006 3522 2824 1006 2348 1010 3606 7699 1010 1045 2064 1005 1056 2467 2425 3599 2073 2002 4832 1007 2021 1045 19837 1996 5866 1998 6018 2126 1997 2893 1996 2391 2408 1012 5746 3162 1997 13867 1010 2557 7636 1010 4623 4385 1012 2377 2004 2312 1037 2535 2182 2004 26749 1012 2087 1997 1996 2051 1996 26749 1997 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a single mom , her son and daughter and their hip ##pie chick friend are camping in the woods . a muscle bound , mach ##ete wielding mania ##c in a yellow ski mask appears . he starts terror ##izing and sexually violating the family before murdering them with a mach ##ete . " wet wilderness " is loaded with ugly hardcore sex , forced inc ##est and b ##lat ##ant racism . it ' s as politically incorrect as xx ##x rough ##ies get . the score is stolen from seminal hitchcock ' s horror classic " psycho " and also " jaws " . the acting is hilarious ##ly awful , the editing is bad and there are some huge laps ##es in logic [SEP]


INFO:tensorflow:input_ids: 101 1037 2309 3566 1010 2014 2365 1998 2684 1998 2037 5099 14756 14556 2767 2024 13215 1999 1996 5249 1012 1037 6740 5391 1010 24532 12870 26974 29310 2278 1999 1037 3756 8301 7308 3544 1012 2002 4627 7404 6026 1998 12581 20084 1996 2155 2077 21054 2068 2007 1037 24532 12870 1012 1000 4954 9917 1000 2003 8209 2007 9200 13076 3348 1010 3140 4297 4355 1998 1038 20051 4630 14398 1012 2009 1005 1055 2004 10317 16542 2004 22038 2595 5931 3111 2131 1012 1996 3556 2003 7376 2013 20603 19625 1005 1055 5469 4438 1000 18224 1000 1998 2036 1000 16113 1000 1012 1996 3772 2003 26316 2135 9643 1010 1996 9260 2003 2919 1998 2045 2024 2070 4121 10876 2229 1999 7961 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` below does just that. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of starting from a pretrained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
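
To make the arithmetic in the new layer concrete, here is a minimal NumPy sketch of the same steps: a linear projection of the pooled output, a log-softmax, and a cross-entropy loss. It is only an illustration under stated assumptions: the inputs are random toy values rather than real BERT outputs, dropout is omitted, a plain normal initializer stands in for the truncated normal one, and the helper names (`log_softmax`, `classification_head`) are made up for this example.

```python
import numpy as np

def log_softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def classification_head(pooled_output, output_weights, output_bias, labels):
    """NumPy stand-in for the layer added on top of BERT (dropout omitted)."""
    # logits = pooled_output @ W^T + b, mirroring tf.matmul(..., transpose_b=True).
    logits = pooled_output @ output_weights.T + output_bias
    log_probs = log_softmax(logits)
    # One-hot encode the labels, then take the negative log-likelihood per example.
    one_hot = np.eye(output_weights.shape[0])[labels]
    per_example_loss = -(one_hot * log_probs).sum(axis=-1)
    predicted_labels = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted_labels, log_probs

# Toy, made-up inputs: a batch of 4 examples, hidden size 8, 2 labels.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 8))
weights = rng.normal(scale=0.02, size=(2, 8))
bias = np.zeros(2)
loss, preds, log_probs = classification_head(
    pooled, weights, bias, np.array([1, 0, 1, 0]))
print(loss, preds, np.exp(log_probs).sum(axis=-1))
```

With weights this small the two classes start out nearly equally likely, so the initial loss sits close to ln 2, which is what you would expect from an untrained binary classifier.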


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,  # log-probabilities, despite the key name
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
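
As a worked example of the step arithmetic above, the sketch below plugs in the hyperparameters from the previous cell. The count of 5,000 training examples is an assumption (it is consistent with the `Writing example 0 of 5000` log line and with the checkpoint at step 468). The `learning_rate_at` function is a hypothetical illustration of a linear warmup followed by a linear decay to zero; it is not the code inside `bert.optimization`.

```python
def compute_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    # Same arithmetic as the cell above, wrapped in a function.
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps

def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Hypothetical schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return init_lr * step / num_warmup_steps
    return init_lr * max(0.0, 1.0 - step / num_train_steps)

# Assumed dataset size: 5000 training examples.
train_steps, warmup_steps = compute_steps(5000, 32, 3.0, 0.1)
print(train_steps, warmup_steps)  # 468 46

for step in (0, 23, 46, 234, 468):
    print(step, learning_rate_at(step, 2e-5, train_steps, warmup_steps))
```

Under these assumptions the learning rate climbs from zero during the first 46 steps and then falls back to zero by step 468.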

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [55]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://bert-tfhub/aclImdb_v1', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fcedb507be0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and produces an input function (`input_fn`), which the Estimator calls to get batches of data. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
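
To see what `drop_remainder` changes, here is a small pure-Python sketch of how a single, non-repeating pass over a dataset (as in evaluation) splits into batches. The `batch_sizes` helper is hypothetical and the 5,000-example count is an assumption taken from the log output; the real batching happens inside the input function.

```python
def batch_sizes(num_examples, batch_size, drop_remainder):
    """Return the size of each batch produced in one pass over the data."""
    full, leftover = divmod(num_examples, batch_size)
    sizes = [batch_size] * full
    # Keep the final partial batch only when drop_remainder is False.
    if leftover and not drop_remainder:
        sizes.append(leftover)
    return sizes

keep = batch_sizes(5000, 32, drop_remainder=False)
drop = batch_sizes(5000, 32, drop_remainder=True)
print(len(keep), keep[-1])  # 157 batches, the last one holding 8 examples
print(len(drop), drop[-1])  # 156 batches, all of size 32
```

TPUs need every batch to have the same shape, which is why the partial batch is dropped there. On GPU the smaller final batch is fine, so no examples have to be discarded.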

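Now we train our model! For me, using a Colab notebook running on Google's GPUs, training took about 14 minutes. (The output below comes from a re-run: a checkpoint at `num_train_steps` already existed in the output directory, so the Estimator skipped training.)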

In [57]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Skipping training since max_steps has already saved.
Training took time  0:00:00.759709


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [59]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-12T21:04:20Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from gs://bert-tfhub/aclImdb_v1/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-12-21:06:05
INFO:tensorflow:Saving dict for global step 468: auc = 0.86659324, eval_accuracy = 0.8664, f1_score = 0.8659711, false_negatives = 375.0, false_positives = 293.0, global_step = 468, loss = 0.51870537, precision = 0.880457, recall = 0.8519542, true_negatives = 2174.0, true_positives = 2158.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: gs://bert-tfhub/aclImdb_v1/model.ckpt-468


{'auc': 0.86659324,
 'eval_accuracy': 0.8664,
 'f1_score': 0.8659711,
 'false_negatives': 375.0,
 'false_positives': 293.0,
 'global_step': 468,
 'loss': 0.51870537,
 'precision': 0.880457,
 'recall': 0.8519542,
 'true_negatives': 2174.0,
 'true_positives': 2158.0}
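
As a sanity check, the accuracy, precision, recall, and F1 score reported above can be recomputed from the four confusion-matrix counts. The sketch below uses the standard definitions; `metrics_from_counts` is a helper written for this example, not part of TensorFlow.

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Derive standard binary classification metrics from confusion counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1_score": 2 * precision * recall / (precision + recall),
    }

# Counts copied from the evaluation output above.
m = metrics_from_counts(tp=2158, tn=2174, fp=293, fn=375)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

The values agree with the evaluation dictionary to four decimal places, and the counts add up to the 5,000 test examples.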

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; they do not affect the predictions.
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [72]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]
INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

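Voila! We have a sentiment classifier! Each result holds the input sentence, the log-probabilities for the negative and positive classes, and the predicted label.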

In [73]:
predictions

[('That movie was absolutely awful',
  array([-4.9142293e-03, -5.3180690e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-0.03325794, -3.4200459 ], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-5.3589125e+00, -4.7171740e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-5.0434084 , -0.00647258], dtype=float32),
  'Positive')]
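
The arrays above are log-probabilities (the output of `log_softmax`), so exponentiating them gives ordinary probabilities that are easier to read. A minimal sketch, with the values copied from the output above:

```python
import numpy as np

# Log-probabilities copied from the predictions above, one row per sentence,
# ordered as [negative, positive].
log_probs = np.array([
    [-4.9142293e-03, -5.3180690e+00],
    [-0.03325794, -3.4200459],
    [-5.3589125e+00, -4.7171740e-03],
    [-5.0434084, -0.00647258],
])

# exp() undoes the log, giving probabilities that sum to 1 in each row.
probs = np.exp(log_probs)
for row in probs:
    print(f"negative={row[0]:.4f}  positive={row[1]:.4f}")
```

Read this way, the model puts more than 99% of the probability on its chosen class for three of the four sentences, and a little less for "The acting was a bit lacking".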