lab3_nlp2


In [0]:
# -*- coding: utf-8 -*-
# lab3_nl2p.py



Predicting Movie Reviews with BERT on TF Hub.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb

 Copyright 2019 Google Inc.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!


In [1]:


# __future__ imports must come before any other statement.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# Standard library
import codecs
import collections
import json
import os
import pprint
import re
import sys
from datetime import datetime

# Third-party
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from sklearn.model_selection import train_test_split

# Fetch a fresh copy of the BERT repository and make its modules importable.
!rm -rf bert
!git clone https://github.com/google-research/bert

sys.path.append('bert/')

# Modules from the cloned BERT repository
import modeling
import optimization
import run_classifier
import tokenization


"""In addition to the standard libraries we imported above, we'll need to 
install BERT's Python package, either by cloning the repository as above or with pip:
!pip install bert-tensorflow
"""


"""
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization
"""


W0527 13:05:44.589041 139757240096640 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


Cloning into 'bert'...
remote: Enumerating objects: 325, done.

In [2]:

"""Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).
"""

# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert0'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))




***** Model output directory: bert0 *****
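The choice between a local directory and a GCP bucket comes down to how the output path is built. The following is a minimal sketch of that logic in plain Python, without TensorFlow or Colab authentication; `resolve_output_dir` and the bucket name `my-bucket` are hypothetical names used only for illustration.

```python
def resolve_output_dir(output_dir, use_bucket=False, bucket=None):
    """Return a local path, or a gs:// URI when a bucket is requested."""
    if use_bucket:
        if not bucket:
            raise ValueError("bucket must be set when use_bucket is True")
        # Same format string as the cell above.
        return "gs://{}/{}".format(bucket, output_dir)
    return output_dir


print(resolve_output_dir("bert0"))
print(resolve_output_dir("bert0", use_bucket=True, bucket="my-bucket"))
```

Keeping the path construction in one small function makes it easy to switch storage locations without touching the rest of the notebook.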


In [3]:


"""#Data

First, let's download the dataset, hosted by Stanford. The code below, 
which downloads, extracts, and imports the IMDB Large Movie Review Dataset,
is borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).
"""

from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so that \d and \. reach the regex engine unchanged.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df

train, test = download_and_load_datasets()
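The loader above relies on the dataset's file naming scheme, in which each review file is named `<id>_<rating>.txt`, and captures the second number as the `sentiment` column. The sketch below isolates that parsing step. `parse_review_filename` is a hypothetical helper, the file names are made-up examples, and unlike the loader it converts both numbers to integers and rejects names that do not match.

```python
import re

# Same pattern as in load_directory_data, with an extra group for the id.
FILENAME_PATTERN = re.compile(r"(\d+)_(\d+)\.txt")


def parse_review_filename(file_name):
    """Split a name like '127_9.txt' into (review_id, rating)."""
    match = FILENAME_PATTERN.fullmatch(file_name)
    if match is None:
        raise ValueError("unexpected file name: {!r}".format(file_name))
    return int(match.group(1)), int(match.group(2))


for name in ["127_9.txt", "4_2.txt"]:
    print(name, "->", parse_review_filename(name))
```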

"""To keep training fast, we'll take a sample of 5000 train and test examples,
respectively."""

train = train.sample(5000)
test = test.sample(5000)

train.columns

"""For us, our input data is the 'sentence' column and our label is the 
'polarity' column (0 for negative and 1 for positive)."""

DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

"""#Data Preprocessing
We'll need to transform our data into a format BERT understands. This
involves two steps. First, we create `InputExample`s using the constructor
provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the
`sentence` column (`DATA_COLUMN`) in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship
between sentences (e.g. is `text_b` a translation of `text_a`? Is
`text_b` an answer to the question asked by `text_a`?). This doesn't
apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 (negative) or 1 (positive).
"""


In [0]:


# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
            text_a = x[DATA_COLUMN], 
            text_b = None, 
            label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: run_classifier.InputExample(guid=None, 
            text_a = x[DATA_COLUMN], 
            text_b = None, 
            label = x[LABEL_COLUMN]), axis = 1)
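To see what the `apply` calls above produce without installing BERT or pandas, here is a small stand-in. It assumes an example is just a record with the four fields passed to the constructor above; `ToyInputExample`, `make_examples`, and the two sample rows are hypothetical and are not part of the BERT library.

```python
import collections

# Stand-in record with the same four fields used in the cell above.
ToyInputExample = collections.namedtuple(
    "ToyInputExample", ["guid", "text_a", "text_b", "label"])


def make_examples(rows, text_key="sentence", label_key="polarity"):
    """Build one single-sentence example per row (text_b stays None)."""
    return [
        ToyInputExample(guid=None, text_a=row[text_key],
                        text_b=None, label=row[label_key])
        for row in rows
    ]


rows = [
    {"sentence": "A great film.", "polarity": 1},
    {"sentence": "Dull and far too long.", "polarity": 0},
]
examples = make_examples(rows)
for example in examples:
    print(example)
```

For a sentence-pair task, the second sentence would go into `text_b` instead of `None`.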



In [5]:


"""Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.

To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:
"""

# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

"""Great--we just learned that the BERT model we're using expects lowercase data 
(that's what is stored in tokenization_info["do_lower_case"]) and we also
loaded BERT's vocab file. We also created a tokenizer, which breaks words
into word pieces:"""

tokenizer.tokenize("This here's an example of using the BERT tokenizer")
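To make the WordPiece step more concrete, here is a toy sketch of greedy longest-match-first splitting. It is a simplified illustration and not BERT's `FullTokenizer`: the vocabulary `TOY_VOCAB` is made up, text is split on whitespace only (no punctuation handling), and the helper names `wordpiece` and `toy_tokenize` are hypothetical.

```python
TOY_VOCAB = {"sally", "says", "hi", "call", "##ing", "[UNK]"}


def wordpiece(word, vocab, unk_token="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word continuation
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def toy_tokenize(text, vocab):
    """Lowercase, split on whitespace, then apply WordPiece to each word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    return tokens


print(toy_tokenize("Sally says hi", TOY_VOCAB))
print(wordpiece("calling", TOY_VOCAB))
```

With the real vocabulary, frequent words stay whole while rare words are broken into several `##` pieces.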

"""Using our tokenizer, we'll call `run_classifier.convert_examples_to_features`
on our InputExamples to convert them into features BERT understands."""

# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = run_classifier.convert_examples_to_features(
    train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = run_classifier.convert_examples_to_features(
    test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
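The three arrays that appear in the log output for each example (`input_ids`, `input_mask`, `segment_ids`) can be sketched for the single-sentence case as follows. This is a simplified illustration under stated assumptions, not the code of `convert_examples_to_features`: `build_features` is a hypothetical helper and `TOY_IDS` is a tiny made-up vocabulary whose ids are copied from the log output of this notebook.

```python
# Tiny id table; the values match those printed in the notebook's log.
TOY_IDS = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102, "great": 2307, "movie": 3185}


def build_features(tokens, vocab, max_seq_length):
    """Truncate, add [CLS]/[SEP], map to ids, and zero-pad to a fixed length."""
    # Reserve two positions for the [CLS] and [SEP] tokens.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]
    input_ids = [vocab[token] for token in tokens]
    input_mask = [1] * len(input_ids)   # 1 = real token, 0 = padding
    segment_ids = [0] * len(input_ids)  # all 0: there is no second sentence
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


ids, mask, segments = build_features(["great", "movie"], TOY_IDS, 8)
print(ids)
print(mask)
print(segments)
```

This mirrors what the log shows: reviews longer than `MAX_SEQ_LENGTH` are cut off and end in `[SEP]` with a mask of all ones, while shorter reviews are padded with zeros in both `input_ids` and `input_mask`.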



W0527 13:07:40.602292 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


I0527 13:07:42.825060 139757240096640 saver.py:1483] Saver not created because there are no variables in the graph to restore


I0527 13:07:43.483568 139757240096640 run_classifier.py:774] Writing example 0 of 5000


I0527 13:07:43.506236 139757240096640 run_classifier.py:461] *** Example ***


I0527 13:07:43.509577 139757240096640 run_classifier.py:462] guid: None


I0527 13:07:43.513714 139757240096640 run_classifier.py:464] tokens: [CLS] as with most of the reviewers , i saw this on star ##z ! on ##de ##man ##d . after watching the preview with my girlfriend , she decided not to watch it from how bad the preview watched . i , on the other hand , thought it looked weird enough to warrant a watching . i mean , the design of dr . me ##so alone warrant ##ed at least a brief sweep over this title . after watching it , i can say that while there are some interesting aspects to it ( namely the brows ##ing over the notebook ##s and trying to figure out the inc ##omp ##re ##hen ##sible story ) , it ' s best to pass over this [SEP]


I0527 13:07:43.518305 139757240096640 run_classifier.py:465] input_ids: 101 2004 2007 2087 1997 1996 15814 1010 1045 2387 2023 2006 2732 2480 999 2006 3207 2386 2094 1012 2044 3666 1996 19236 2007 2026 6513 1010 2016 2787 2025 2000 3422 2009 2013 2129 2919 1996 19236 3427 1012 1045 1010 2006 1996 2060 2192 1010 2245 2009 2246 6881 2438 2000 10943 1037 3666 1012 1045 2812 1010 1996 2640 1997 2852 1012 2033 6499 2894 10943 2098 2012 2560 1037 4766 11740 2058 2023 2516 1012 2044 3666 2009 1010 1045 2064 2360 2008 2096 2045 2024 2070 5875 5919 2000 2009 1006 8419 1996 11347 2075 2058 1996 14960 2015 1998 2667 2000 3275 2041 1996 4297 25377 2890 10222 19307 2466 1007 1010 2009 1005 1055 2190 2000 3413 2058 2023 102


I0527 13:07:43.521541 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:07:43.525310 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.528647 139757240096640 run_classifier.py:468] label: 0 (id = 0)


I0527 13:07:43.534255 139757240096640 run_classifier.py:461] *** Example ***


I0527 13:07:43.537409 139757240096640 run_classifier.py:462] guid: None


I0527 13:07:43.540393 139757240096640 run_classifier.py:464] tokens: [CLS] simply put , this is the best movie to come out of michigan since . . . well , ever ! evil dead eat your heart out , hatred of a minute was some of the odd ##est , and best cinema to be seen by this reviewer in a long time . i recommend this movie to anyone who is in need of a head trip , or a good case of the willie ##s ! [SEP]


I0527 13:07:43.543315 139757240096640 run_classifier.py:465] input_ids: 101 3432 2404 1010 2023 2003 1996 2190 3185 2000 2272 2041 1997 4174 2144 1012 1012 1012 2092 1010 2412 999 4763 2757 4521 2115 2540 2041 1010 11150 1997 1037 3371 2001 2070 1997 1996 5976 4355 1010 1998 2190 5988 2000 2022 2464 2011 2023 12027 1999 1037 2146 2051 1012 1045 16755 2023 3185 2000 3087 2040 2003 1999 2342 1997 1037 2132 4440 1010 2030 1037 2204 2553 1997 1996 9893 2015 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.546776 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.549862 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.552854 139757240096640 run_classifier.py:468] label: 1 (id = 1)


I0527 13:07:43.574145 139757240096640 run_classifier.py:461] *** Example ***


I0527 13:07:43.577268 139757240096640 run_classifier.py:462] guid: None


I0527 13:07:43.580789 139757240096640 run_classifier.py:464] tokens: [CLS] the mor ##bid catholic writer gerard rev ##e ( je ##ro ##en k ##ra ##bbe ) that is homosexual , alcoholic and has frequent visions of death is invited to give a lecture in the literature club of v ##lis ##sing ##en . while in the railway station in amsterdam , he feels a non - corresponded attraction to a handsome man that embark ##s in another train . gerard is introduced to the treasurer of the club and beau ##tic ##ian christine hal ##ss ##lag ( renee so ##ute ##ndi ##jk ) , who is a wealthy widow that owns the beauty shop sphinx , and they have one night stand . on the next morning , gerard sees the picture of christine ' s [SEP]


I0527 13:07:43.583541 139757240096640 run_classifier.py:465] input_ids: 101 1996 22822 17062 3234 3213 11063 7065 2063 1006 15333 3217 2368 1047 2527 19473 1007 2008 2003 15667 1010 14813 1998 2038 6976 12018 1997 2331 2003 4778 2000 2507 1037 8835 1999 1996 3906 2252 1997 1058 6856 7741 2368 1012 2096 1999 1996 2737 2276 1999 7598 1010 2002 5683 1037 2512 1011 27601 8432 2000 1037 8502 2158 2008 28866 2015 1999 2178 3345 1012 11063 2003 3107 2000 1996 10211 1997 1996 2252 1998 17935 4588 2937 10941 11085 4757 17802 1006 17400 2061 10421 16089 15992 1007 1010 2040 2003 1037 7272 7794 2008 8617 1996 5053 4497 27311 1010 1998 2027 2031 2028 2305 3233 1012 2006 1996 2279 2851 1010 11063 5927 1996 3861 1997 10941 1005 1055 102


I0527 13:07:43.586739 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:07:43.590145 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.592491 139757240096640 run_classifier.py:468] label: 1 (id = 1)


I0527 13:07:43.600671 139757240096640 run_classifier.py:461] *** Example ***


I0527 13:07:43.602866 139757240096640 run_classifier.py:462] guid: None


I0527 13:07:43.606664 139757240096640 run_classifier.py:464] tokens: [CLS] great movie - i loved it . great editing and use of the soundtrack . captures the real feeling of indian life , yet we can all relate . a well chose cast with some great characters . the movie develops all the characters so that you real care about them all and you feel like you know them . the use of the indian music and drums in some of the soccer scenes is great and the direction really works as everyone comes off as real and natural . you just can ' t help but to root for jess in this film ! the acting was really good , even the tomb ##oy ##ish walk and body posture of both leading ladies is very [SEP]


I0527 13:07:43.613145 139757240096640 run_classifier.py:465] input_ids: 101 2307 3185 1011 1045 3866 2009 1012 2307 9260 1998 2224 1997 1996 6050 1012 19566 1996 2613 3110 1997 2796 2166 1010 2664 2057 2064 2035 14396 1012 1037 2092 4900 3459 2007 2070 2307 3494 1012 1996 3185 11791 2035 1996 3494 2061 2008 2017 2613 2729 2055 2068 2035 1998 2017 2514 2066 2017 2113 2068 1012 1996 2224 1997 1996 2796 2189 1998 3846 1999 2070 1997 1996 4715 5019 2003 2307 1998 1996 3257 2428 2573 2004 3071 3310 2125 2004 2613 1998 3019 1012 2017 2074 2064 1005 1056 2393 2021 2000 7117 2005 12245 1999 2023 2143 999 1996 3772 2001 2428 2204 1010 2130 1996 8136 6977 4509 3328 1998 2303 16819 1997 2119 2877 6456 2003 2200 102


I0527 13:07:43.616868 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:07:43.621525 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.625504 139757240096640 run_classifier.py:468] label: 1 (id = 1)


I0527 13:07:43.643249 139757240096640 run_classifier.py:461] *** Example ***


I0527 13:07:43.647411 139757240096640 run_classifier.py:462] guid: None


I0527 13:07:43.651366 139757240096640 run_classifier.py:464] tokens: [CLS] what boo ##b at mgm thought it would be a good idea to place the stud ##ly clark gable in the role of a salvation army worker ? ? ironically enough , another handsome future star , cary grant , also played a salvation army guy just two years later in the highly over ##rated she done him wrong . i guess in hind ##sight it ' s pretty easy to see the folly of these roles , but i still wonder who thought that salvation army guys are " hot " and who could look at these dash ##ing men and see them as realistic representations of the parts they played . a long time ago , i used to work for a sister organization [SEP]


I0527 13:07:43.655851 139757240096640 run_classifier.py:465] input_ids: 101 2054 22017 2497 2012 15418 2245 2009 2052 2022 1037 2204 2801 2000 2173 1996 16054 2135 5215 13733 1999 1996 2535 1997 1037 12611 2390 7309 1029 1029 18527 2438 1010 2178 8502 2925 2732 1010 20533 3946 1010 2036 2209 1037 12611 2390 3124 2074 2048 2086 2101 1999 1996 3811 2058 9250 2016 2589 2032 3308 1012 1045 3984 1999 17666 25807 2009 1005 1055 3492 3733 2000 2156 1996 26272 1997 2122 4395 1010 2021 1045 2145 4687 2040 2245 2008 12611 2390 4364 2024 1000 2980 1000 1998 2040 2071 2298 2012 2122 11454 2075 2273 1998 2156 2068 2004 12689 15066 1997 1996 3033 2027 2209 1012 1037 2146 2051 3283 1010 1045 2109 2000 2147 2005 1037 2905 3029 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:07:43.659954 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:07:43.662861 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0527 13:07:43.666211 139757240096640 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


I0527 13:08:10.483784 139757240096640 run_classifier.py:774] Writing example 0 of 5000


INFO:tensorflow:*** Example ***


I0527 13:08:10.493512 139757240096640 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0527 13:08:10.496396 139757240096640 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] " a texas community is be ##set with a rash of mysterious killings involving some of the students from the local college . the sheriff investigating the death discovers the startling identity of the killer responsible for the murders . a nasa experiment involving cosmic rays has mu ##tated an ape and turned it into an un ##sto ##ppa ##ble killing machine with a thirst for blood , " according to the dvd sleeve ' s syn ##opsis . < br / > < br / > or , could the creature really be a mu ##tated alligator returning from a space - bound " noah ' s ark " ? < br / > < br / > a long opening , with laugh ##ably [SEP]


I0527 13:08:10.498203 139757240096640 run_classifier.py:464] tokens: [CLS] " a texas community is be ##set with a rash of mysterious killings involving some of the students from the local college . the sheriff investigating the death discovers the startling identity of the killer responsible for the murders . a nasa experiment involving cosmic rays has mu ##tated an ape and turned it into an un ##sto ##ppa ##ble killing machine with a thirst for blood , " according to the dvd sleeve ' s syn ##opsis . < br / > < br / > or , could the creature really be a mu ##tated alligator returning from a space - bound " noah ' s ark " ? < br / > < br / > a long opening , with laugh ##ably [SEP]


INFO:tensorflow:input_ids: 101 1000 1037 3146 2451 2003 2022 13462 2007 1037 23438 1997 8075 16431 5994 2070 1997 1996 2493 2013 1996 2334 2267 1012 1996 6458 11538 1996 2331 9418 1996 19828 4767 1997 1996 6359 3625 2005 1996 9916 1012 1037 9274 7551 5994 14448 9938 2038 14163 16238 2019 23957 1998 2357 2009 2046 2019 4895 16033 13944 3468 4288 3698 2007 1037 21810 2005 2668 1010 1000 2429 2000 1996 4966 10353 1005 1055 19962 22599 1012 1026 7987 1013 1028 1026 7987 1013 1028 2030 1010 2071 1996 6492 2428 2022 1037 14163 16238 28833 4192 2013 1037 2686 1011 5391 1000 7240 1005 1055 15745 1000 1029 1026 7987 1013 1028 1026 7987 1013 1028 1037 2146 3098 1010 2007 4756 8231 102


I0527 13:08:10.499875 139757240096640 run_classifier.py:465] input_ids: 101 1000 1037 3146 2451 2003 2022 13462 2007 1037 23438 1997 8075 16431 5994 2070 1997 1996 2493 2013 1996 2334 2267 1012 1996 6458 11538 1996 2331 9418 1996 19828 4767 1997 1996 6359 3625 2005 1996 9916 1012 1037 9274 7551 5994 14448 9938 2038 14163 16238 2019 23957 1998 2357 2009 2046 2019 4895 16033 13944 3468 4288 3698 2007 1037 21810 2005 2668 1010 1000 2429 2000 1996 4966 10353 1005 1055 19962 22599 1012 1026 7987 1013 1028 1026 7987 1013 1028 2030 1010 2071 1996 6492 2428 2022 1037 14163 16238 28833 4192 2013 1037 2686 1011 5391 1000 7240 1005 1055 15745 1000 1029 1026 7987 1013 1028 1026 7987 1013 1028 1037 2146 3098 1010 2007 4756 8231 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:08:10.501436 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:08:10.504165 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0527 13:08:10.507714 139757240096640 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0527 13:08:10.521964 139757240096640 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0527 13:08:10.523954 139757240096640 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] who else other than tr ##oma can take the classic tragedy and change it around to today ##s standards ? ? ? ? no one . . . . in my opinion the leonardo di ##cap ##rio one sucked . tr ##ome ##on & juliet is a definite stretch from the original shakes ##per ##an tragedy , but it holds up well . its sick , dem ##ented , twisted , but yet insane ##ly funny and fulfilling . for the most part it follows the true romeo and juliet story , but many tr ##oma elements are added . will keenan gives a great performance as tr ##ome ##o . the acting is solid and the story is great . many people look past these [SEP]


I0527 13:08:10.526721 139757240096640 run_classifier.py:464] tokens: [CLS] who else other than tr ##oma can take the classic tragedy and change it around to today ##s standards ? ? ? ? no one . . . . in my opinion the leonardo di ##cap ##rio one sucked . tr ##ome ##on & juliet is a definite stretch from the original shakes ##per ##an tragedy , but it holds up well . its sick , dem ##ented , twisted , but yet insane ##ly funny and fulfilling . for the most part it follows the true romeo and juliet story , but many tr ##oma elements are added . will keenan gives a great performance as tr ##ome ##o . the acting is solid and the story is great . many people look past these [SEP]


INFO:tensorflow:input_ids: 101 2040 2842 2060 2084 19817 9626 2064 2202 1996 4438 10576 1998 2689 2009 2105 2000 2651 2015 4781 1029 1029 1029 1029 2053 2028 1012 1012 1012 1012 1999 2026 5448 1996 14720 4487 17695 9488 2028 8631 1012 19817 8462 2239 1004 13707 2003 1037 15298 7683 2013 1996 2434 10854 4842 2319 10576 1010 2021 2009 4324 2039 2092 1012 2049 5305 1010 17183 14088 1010 6389 1010 2021 2664 9577 2135 6057 1998 21570 1012 2005 1996 2087 2112 2009 4076 1996 2995 12390 1998 13707 2466 1010 2021 2116 19817 9626 3787 2024 2794 1012 2097 26334 3957 1037 2307 2836 2004 19817 8462 2080 1012 1996 3772 2003 5024 1998 1996 2466 2003 2307 1012 2116 2111 2298 2627 2122 102


I0527 13:08:10.528858 139757240096640 run_classifier.py:465] input_ids: 101 2040 2842 2060 2084 19817 9626 2064 2202 1996 4438 10576 1998 2689 2009 2105 2000 2651 2015 4781 1029 1029 1029 1029 2053 2028 1012 1012 1012 1012 1999 2026 5448 1996 14720 4487 17695 9488 2028 8631 1012 19817 8462 2239 1004 13707 2003 1037 15298 7683 2013 1996 2434 10854 4842 2319 10576 1010 2021 2009 4324 2039 2092 1012 2049 5305 1010 17183 14088 1010 6389 1010 2021 2664 9577 2135 6057 1998 21570 1012 2005 1996 2087 2112 2009 4076 1996 2995 12390 1998 13707 2466 1010 2021 2116 19817 9626 3787 2024 2794 1012 2097 26334 3957 1037 2307 2836 2004 19817 8462 2080 1012 1996 3772 2003 5024 1998 1996 2466 2003 2307 1012 2116 2111 2298 2627 2122 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:08:10.531694 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:08:10.537033 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0527 13:08:10.539916 139757240096640 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0527 13:08:10.551032 139757240096640 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0527 13:08:10.553193 139757240096640 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] one would think that a film based on the life of the japanese author yuki ##o mis ##hima would be a da ##unt ##ing if not impossible task . however paul sc ##hra ##der has indeed made a film " about " mis ##hima that is both superb & complex . while it is not a literal biography , sc ##hra ##der & his co - screenwriter leonard sc ##hard ##er ( his brother ) have taken several incidents from his life , including his su ##cide and crafted what can best be described as incident ##al table ##aus that are visually sparse and stunning . mis ##hima ' s homosexuality is almost not there , due to legal threats from his widow , but in [SEP]


I0527 13:08:10.555801 139757240096640 run_classifier.py:464] tokens: [CLS] one would think that a film based on the life of the japanese author yuki ##o mis ##hima would be a da ##unt ##ing if not impossible task . however paul sc ##hra ##der has indeed made a film " about " mis ##hima that is both superb & complex . while it is not a literal biography , sc ##hra ##der & his co - screenwriter leonard sc ##hard ##er ( his brother ) have taken several incidents from his life , including his su ##cide and crafted what can best be described as incident ##al table ##aus that are visually sparse and stunning . mis ##hima ' s homosexuality is almost not there , due to legal threats from his widow , but in [SEP]


INFO:tensorflow:input_ids: 101 2028 2052 2228 2008 1037 2143 2241 2006 1996 2166 1997 1996 2887 3166 24924 2080 28616 16369 2052 2022 1037 4830 16671 2075 2065 2025 5263 4708 1012 2174 2703 8040 13492 4063 2038 5262 2081 1037 2143 1000 2055 1000 28616 16369 2008 2003 2119 21688 1004 3375 1012 2096 2009 2003 2025 1037 18204 8308 1010 8040 13492 4063 1004 2010 2522 1011 11167 7723 8040 11783 2121 1006 2010 2567 1007 2031 2579 2195 10444 2013 2010 2166 1010 2164 2010 10514 27082 1998 19275 2054 2064 2190 2022 2649 2004 5043 2389 2795 20559 2008 2024 17453 20288 1998 14726 1012 28616 16369 1005 1055 15949 2003 2471 2025 2045 1010 2349 2000 3423 8767 2013 2010 7794 1010 2021 1999 102


I0527 13:08:10.558618 139757240096640 run_classifier.py:465] input_ids: 101 2028 2052 2228 2008 1037 2143 2241 2006 1996 2166 1997 1996 2887 3166 24924 2080 28616 16369 2052 2022 1037 4830 16671 2075 2065 2025 5263 4708 1012 2174 2703 8040 13492 4063 2038 5262 2081 1037 2143 1000 2055 1000 28616 16369 2008 2003 2119 21688 1004 3375 1012 2096 2009 2003 2025 1037 18204 8308 1010 8040 13492 4063 1004 2010 2522 1011 11167 7723 8040 11783 2121 1006 2010 2567 1007 2031 2579 2195 10444 2013 2010 2166 1010 2164 2010 10514 27082 1998 19275 2054 2064 2190 2022 2649 2004 5043 2389 2795 20559 2008 2024 17453 20288 1998 14726 1012 28616 16369 1005 1055 15949 2003 2471 2025 2045 1010 2349 2000 3423 8767 2013 2010 7794 1010 2021 1999 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:08:10.565415 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:08:10.567356 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0527 13:08:10.571854 139757240096640 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0527 13:08:10.578521 139757240096640 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0527 13:08:10.584171 139757240096640 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] the acting is excellent in this film , with some great actors . it was fun to see fred mc ##mur ##ray as a young man . this is not a comedy . it ' s a drama and the apparently comedic instances are pit ##iful . this is not a comedy . it ' s a drama and the apparently comedic instances are pit ##iful , and some of them appear forced and con ##tri ##ved . it ' s in the script , though , not the fault of the acting . < br / > < br / > the 10 line requirement forces me to write some more . . . hmm ##m . loved carole lombard ' s my man godfrey [SEP]


I0527 13:08:10.587460 139757240096640 run_classifier.py:464] tokens: [CLS] the acting is excellent in this film , with some great actors . it was fun to see fred mc ##mur ##ray as a young man . this is not a comedy . it ' s a drama and the apparently comedic instances are pit ##iful . this is not a comedy . it ' s a drama and the apparently comedic instances are pit ##iful , and some of them appear forced and con ##tri ##ved . it ' s in the script , though , not the fault of the acting . < br / > < br / > the 10 line requirement forces me to write some more . . . hmm ##m . loved carole lombard ' s my man godfrey [SEP]


INFO:tensorflow:input_ids: 101 1996 3772 2003 6581 1999 2023 2143 1010 2007 2070 2307 5889 1012 2009 2001 4569 2000 2156 5965 11338 20136 9447 2004 1037 2402 2158 1012 2023 2003 2025 1037 4038 1012 2009 1005 1055 1037 3689 1998 1996 4593 21699 12107 2024 6770 18424 1012 2023 2003 2025 1037 4038 1012 2009 1005 1055 1037 3689 1998 1996 4593 21699 12107 2024 6770 18424 1010 1998 2070 1997 2068 3711 3140 1998 9530 18886 7178 1012 2009 1005 1055 1999 1996 5896 1010 2295 1010 2025 1996 6346 1997 1996 3772 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 2184 2240 9095 2749 2033 2000 4339 2070 2062 1012 1012 1012 17012 2213 1012 3866 24348 23441 1005 1055 2026 2158 18238 102


I0527 13:08:10.592128 139757240096640 run_classifier.py:465] input_ids: 101 1996 3772 2003 6581 1999 2023 2143 1010 2007 2070 2307 5889 1012 2009 2001 4569 2000 2156 5965 11338 20136 9447 2004 1037 2402 2158 1012 2023 2003 2025 1037 4038 1012 2009 1005 1055 1037 3689 1998 1996 4593 21699 12107 2024 6770 18424 1012 2023 2003 2025 1037 4038 1012 2009 1005 1055 1037 3689 1998 1996 4593 21699 12107 2024 6770 18424 1010 1998 2070 1997 2068 3711 3140 1998 9530 18886 7178 1012 2009 1005 1055 1999 1996 5896 1010 2295 1010 2025 1996 6346 1997 1996 3772 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 2184 2240 9095 2749 2033 2000 4339 2070 2062 1012 1012 1012 17012 2213 1012 3866 24348 23441 1005 1055 2026 2158 18238 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:08:10.596144 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:08:10.599484 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0527 13:08:10.604682 139757240096640 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0527 13:08:10.616205 139757240096640 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0527 13:08:10.619487 139757240096640 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] simply i just watched this movie just because of sarah & am also giving these 4 stars just because of her , on the other side this movie was easily one of the worst movies i have ever seen . the ##act ##ing was horrible . the script was un ##ins ##pired . this was a movie that kept contra ##dict ##ing itself . the film was sloppy and uno ##ri ##ginal . its not like i was expecting a good film . just something to give me a jump or two . this did not even do that . < br / > < br / > he worst thing is that , the more i think about the overall plot , the less sense [SEP]


I0527 13:08:10.621941 139757240096640 run_classifier.py:464] tokens: [CLS] simply i just watched this movie just because of sarah & am also giving these 4 stars just because of her , on the other side this movie was easily one of the worst movies i have ever seen . the ##act ##ing was horrible . the script was un ##ins ##pired . this was a movie that kept contra ##dict ##ing itself . the film was sloppy and uno ##ri ##ginal . its not like i was expecting a good film . just something to give me a jump or two . this did not even do that . < br / > < br / > he worst thing is that , the more i think about the overall plot , the less sense [SEP]


INFO:tensorflow:input_ids: 101 3432 1045 2074 3427 2023 3185 2074 2138 1997 4532 1004 2572 2036 3228 2122 1018 3340 2074 2138 1997 2014 1010 2006 1996 2060 2217 2023 3185 2001 4089 2028 1997 1996 5409 5691 1045 2031 2412 2464 1012 1996 18908 2075 2001 9202 1012 1996 5896 2001 4895 7076 21649 1012 2023 2001 1037 3185 2008 2921 24528 29201 2075 2993 1012 1996 2143 2001 28810 1998 27776 3089 24965 1012 2049 2025 2066 1045 2001 8074 1037 2204 2143 1012 2074 2242 2000 2507 2033 1037 5376 2030 2048 1012 2023 2106 2025 2130 2079 2008 1012 1026 7987 1013 1028 1026 7987 1013 1028 2002 5409 2518 2003 2008 1010 1996 2062 1045 2228 2055 1996 3452 5436 1010 1996 2625 3168 102


I0527 13:08:10.625213 139757240096640 run_classifier.py:465] input_ids: 101 3432 1045 2074 3427 2023 3185 2074 2138 1997 4532 1004 2572 2036 3228 2122 1018 3340 2074 2138 1997 2014 1010 2006 1996 2060 2217 2023 3185 2001 4089 2028 1997 1996 5409 5691 1045 2031 2412 2464 1012 1996 18908 2075 2001 9202 1012 1996 5896 2001 4895 7076 21649 1012 2023 2001 1037 3185 2008 2921 24528 29201 2075 2993 1012 1996 2143 2001 28810 1998 27776 3089 24965 1012 2049 2025 2066 1045 2001 8074 1037 2204 2143 1012 2074 2242 2000 2507 2033 1037 5376 2030 2048 1012 2023 2106 2025 2130 2079 2008 1012 1026 7987 1013 1028 1026 7987 1013 1028 2002 5409 2518 2003 2008 1010 1996 2062 1045 2228 2055 1996 3452 5436 1010 1996 2625 3168 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


I0527 13:08:10.628087 139757240096640 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0527 13:08:10.630860 139757240096640 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0527 13:08:10.633194 139757240096640 run_classifier.py:468] label: 0 (id = 0)
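Each logged example above consists of `tokens`, `input_ids`, `input_mask`, and `segment_ids`. The following is a minimal sketch of how such features could be assembled for a single-sentence input. It is not the library's implementation: `TOY_VOCAB` and `build_features` are hypothetical names, and the vocabulary holds only a handful of ids copied from the log output, where a real run would use BERT's full vocabulary file.

```python
# Hypothetical toy vocabulary; the ids are copied from the log output above.
TOY_VOCAB = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102,
             "the": 1996, "acting": 3772, "is": 2003, "excellent": 6581}


def build_features(tokens, max_seq_length):
    """Truncates and pads one single-sentence example to max_seq_length."""
    # Reserve two positions for the [CLS] and [SEP] markers.
    tokens = tokens[:max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [TOY_VOCAB[t] for t in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token
    segment_ids = [0] * len(input_ids)  # single sentence, so all segment 0

    # Pad every list with zeros up to the fixed sequence length.
    pad = max_seq_length - len(input_ids)
    input_ids += [0] * pad
    input_mask += [0] * pad
    segment_ids += [0] * pad
    return input_ids, input_mask, segment_ids


ids, mask, segs = build_features(
    ["the", "acting", "is", "excellent"], max_seq_length=8)
print(ids)   # [101, 1996, 3772, 2003, 6581, 102, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
print(segs)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

In the logs above every `input_mask` entry is 1, presumably because each review is long enough to be truncated to the full sequence length, so no padding is needed.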


In [6]:


"""# Creating a model

Now that we've prepared our data, let's focus on building a model.
`create_model` does just this below. First, it loads the BERT TF
Hub module again (this time to extract the computation graph).
Next, it creates a single new layer that will be trained to adapt BERT
to our sentiment task (i.e. classifying whether a movie review is
positive or negative). This strategy of adapting an already
pretrained model to a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).
"""

def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. keep_prob=0.9 drops 10% of the units;
    # newer TF 1.x releases prefer the equivalent `rate=0.1`.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the cross-entropy loss
    # between the predictions and the actual labels.
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
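To make the arithmetic of the new layer concrete, here is a minimal pure-Python sketch of the same steps for a single example: a linear layer, a log-softmax, and the one-hot cross-entropy loss. It is an illustration under stated assumptions, not the notebook's TensorFlow graph: `log_softmax` and `classification_head` are hypothetical helper names, the weights are hand-picked rather than drawn from a truncated normal, and dropout is omitted.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def classification_head(pooled, weights, bias, label):
    """Single example: pooled is [hidden], weights is [num_labels][hidden]."""
    # logits = pooled . weights^T + bias
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    # Equivalent to -sum(one_hot(label) * log_probs).
    loss = -log_probs[label]
    return loss, predicted, log_probs


pooled = [0.5, -1.0, 2.0]      # stand-in for BERT's pooled output
weights = [[0.1, 0.0, -0.2],   # row for label 0
           [-0.1, 0.3, 0.4]]   # row for label 1
bias = [0.0, 0.0]

loss, predicted, log_probs = classification_head(pooled, weights, bias, label=1)
print(predicted)                            # 1
print(sum(math.exp(lp) for lp in log_probs))  # approximately 1.0
```

Because the second return value is a log-probability vector, exponentiating it recovers class probabilities that sum to one.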

"""Next we'll wrap our model function in a `model_fn_builder` function that 
adapts our model to work for training, evaluation, and prediction."""

# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns a `model_fn` closure for `tf.estimator.Estimator`."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` passed to `tf.estimator.Estimator`."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

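      # Note: 'probabilities' holds log-probabilities (the log-softmax
      # outputs); apply exp to recover probabilities.
      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)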

  # Return the actual model function in the closure
  return model_fn
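The evaluation metrics collected in `metric_fn` all derive from the four confusion-matrix counts. As a rough illustration, the sketch below computes them with the standard textbook definitions in plain Python. `binary_metrics` is a hypothetical helper, and TensorFlow's streaming metrics accumulate across batches and may differ in detail (for example in how the F1 score or AUC is approximated), so this is a reference for the definitions rather than a reimplementation.

```python
def binary_metrics(labels, predictions):
    """Computes confusion counts and derived metrics for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "true_positives": tp, "true_negatives": tn,
        "false_positives": fp, "false_negatives": fn,
        "eval_accuracy": accuracy, "precision": precision,
        "recall": recall, "f1_score": f1,
    }


# Made-up labels and predictions, purely for illustration.
metrics = binary_metrics(labels=[1, 0, 1, 1, 0, 0],
                         predictions=[1, 0, 0, 1, 1, 0])
print(metrics)
```

With these made-up inputs there are two true positives, two true negatives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.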

# Training hyperparameters.
# These values are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training during which the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

# Compute the number of training and warmup steps from the dataset size,
# batch size, and number of epochs
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
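As a worked example of the step arithmetic, the sketch below plugs in the hyperparameters above together with an assumed training-set size of 5000 (the example count that appears in the feature-conversion log; whether the training set has the same size is an assumption). It also sketches a learning-rate schedule with linear warmup followed by linear decay to zero, which is how the BERT optimizer is commonly described. `compute_steps` and `learning_rate_at` are hypothetical helpers, not part of the `optimization` module.

```python
def compute_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    """Mirrors the num_train_steps / num_warmup_steps arithmetic."""
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps


def learning_rate_at(step, base_lr, num_train_steps, num_warmup_steps):
    """Assumed schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return base_lr * step / num_warmup_steps
    return base_lr * (1.0 - step / num_train_steps)


# 5000 examples is an assumption; substitute len(train_features).
train_steps, warmup_steps = compute_steps(5000, 32, 3.0, 0.1)
print(train_steps, warmup_steps)  # 468 46

for step in (0, 23, 46, 234, 468):
    print(step, learning_rate_at(step, 2e-5, train_steps, warmup_steps))
```

Under these assumptions, 5000 / 32 * 3 = 468.75, which truncates to 468 training steps, and 10% of that truncates to 46 warmup steps.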

# Specify the output directory and how often to save summaries and checkpoints
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

"""Next we create an input builder function that takes our training feature set (`train_features`) and produces an input function that feeds batches to the model. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators)."""

# Create an input function for training. Set drop_remainder=True only when
# running on TPUs; here we train without a TPU, so it stays False.
train_input_fn = run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
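To illustrate what `drop_remainder` controls, here is a small pure-Python sketch of batching with and without it. This is only an analogy: `batches` is a hypothetical generator, whereas the real `input_fn_builder` returns an input function that builds a TensorFlow dataset pipeline. Dropping the final partial batch matters on TPUs because they expect every batch to have the same fixed shape.

```python
def batches(examples, batch_size, drop_remainder):
    """Yields consecutive batches; optionally drops a short final batch."""
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            return  # discard the incomplete final batch
        yield batch


examples = list(range(10))
sizes_keep = [len(b) for b in batches(examples, 4, drop_remainder=False)]
sizes_drop = [len(b) for b in batches(examples, 4, drop_remainder=True)]
print(sizes_keep)  # [4, 4, 2]
print(sizes_drop)  # [4, 4]
```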

"""Now we train our model! For me, using a Colab notebook running on Google's
GPUs, my training time was about 14 minutes."""


INFO:tensorflow:Using config: {'_model_dir': 'bert0', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1b7e6935f8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0527 13:08:52.314161 139757240096640 estimator.py:201] Using config: {'_model_dir': 'bert0', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1b7e6935f8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [0]:


print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took", datetime.now() - current_time)


Beginning Training!
INFO:tensorflow:Calling model_fn.


I0527 13:09:04.599845 139757240096640 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0527 13:09:07.950111 139757240096640 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0527 13:09:08.090862 139757240096640 deprecation.py:506] From <ipython-input-6-3ae63f565c4f>:47: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0527 13:09:08.142363 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0527 13:09:08.236835 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0527 13:09:17.460424 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.


I0527 13:09:18.798507 139757240096640 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0527 13:09:18.805979 139757240096640 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0527 13:09:22.967493 139757240096640 monitored_session.py:222] Graph was finalized.


Instructions for updating:
Use standard file APIs to check for files with this prefix.


W0527 13:09:22.970812 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.


INFO:tensorflow:Restoring parameters from bert0/model.ckpt-0


I0527 13:09:22.979394 139757240096640 saver.py:1270] Restoring parameters from bert0/model.ckpt-0


Instructions for updating:
Use standard file utilities to get mtimes.


W0527 13:09:34.382104 139757240096640 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.


INFO:tensorflow:Running local_init_op.


I0527 13:09:34.928727 139757240096640 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0527 13:09:35.261353 139757240096640 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into bert0/model.ckpt.


I0527 13:09:46.510717 139757240096640 basic_session_run_hooks.py:594] Saving checkpoints for 0 into bert0/model.ckpt.


INFO:tensorflow:loss = 0.7691344, step = 1


I0527 13:10:46.788367 139757240096640 basic_session_run_hooks.py:249] loss = 0.7691344, step = 1


INFO:tensorflow:global_step/sec: 0.0220635


I0527 14:26:19.153298 139757240096640 basic_session_run_hooks.py:680] global_step/sec: 0.0220635


INFO:tensorflow:loss = 0.6038867, step = 101 (4532.371 sec)


I0527 14:26:19.159367 139757240096640 basic_session_run_hooks.py:247] loss = 0.6038867, step = 101 (4532.371 sec)
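
The two figures in the log lines above agree with each other: 100 steps in 4532.371 seconds is roughly 0.022 steps per second. As a rough sketch, the same numbers can be turned into a time estimate for a longer run. The helpers `steps_per_second` and `estimate_remaining` and the total of 1000 steps are hypothetical, chosen only for illustration; they are not values or functions from this notebook.

```python
from datetime import timedelta


def steps_per_second(steps, seconds):
    """Throughput in the same unit the Estimator logs: steps per second."""
    return steps / seconds


def estimate_remaining(current_step, total_steps, rate):
    """Estimated wall-clock time left, assuming a constant rate (steps/sec)."""
    remaining_steps = max(total_steps - current_step, 0)
    return timedelta(seconds=remaining_steps / rate)


# Values copied from the log: going from step 1 to step 101 took 4532.371 s.
rate = steps_per_second(100, 4532.371)
print(f"global_step/sec: {rate:.7f}")

# HYPOTHETICAL total; substitute your own num_train_steps here.
example_total_steps = 1000
print("Estimated time left:",
      estimate_remaining(101, example_total_steps, rate))
```

At this rate even a few hundred steps take hours, which may be worth checking before committing to a full run.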


In [0]:


"""Now let's use our test data to see how well our model did:"""

test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

estimator.evaluate(input_fn=test_input_fn, steps=None)

"""Now let's write code to make predictions on new sentences:"""

def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid is left empty and label=0 is just a dummy value, since the true label is unknown at prediction time
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

predictions = getPrediction(pred_sentences)

"""Voila! We have a sentiment classifier!"""

predictions
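
Each entry in `predictions` is a (sentence, scores, label) tuple. In the original Google notebook the values stored under the 'probabilities' key come from a log-softmax, so they are log-probabilities rather than probabilities. Assuming this copy keeps that model function unchanged, exponentiating them gives values that sum to 1. The sketch below shows that conversion; `to_probabilities`, `describe`, and the sample scores are made up for illustration and are not output from the trained model.

```python
import math

LABELS = ["Negative", "Positive"]


def to_probabilities(log_probs):
    """Convert log-softmax outputs into ordinary probabilities."""
    return [math.exp(lp) for lp in log_probs]


def describe(sentence, log_probs):
    """Summarize one prediction as its most likely label and probability."""
    probs = to_probabilities(log_probs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return f"{sentence!r}: {LABELS[best]} ({probs[best]:.1%})"


# ILLUSTRATIVE scores only, not taken from a real run.
sample_log_probs = [math.log(0.1), math.log(0.9)]
print(describe("Absolutely fantastic!", sample_log_probs))
```

With real results, the same helper could be applied to each tuple returned by `getPrediction`, passing the sentence and the score array.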

end notebook