<a href="https://colab.research.google.com/github/kod11/bert/blob/master/target_as_stance_b.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

The original version of this notebook trained a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. This adaptation instead predicts the stance (FAVOR, AGAINST, or NONE) that a tweet takes toward a given target. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

EDIT: all 3 labels

In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

W0319 13:28:56.674965 140377472046976 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [3]:
!pip install bert-tensorflow



In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).
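
For a purely local directory, the same delete-then-recreate behaviour can be sketched with the standard library alone. This is only an illustration, not the notebook's code: `reset_output_dir` is a hypothetical helper, and it does not handle `gs://` paths, which is one reason the cell that follows uses `tf.gfile` instead.

```python
import os
import shutil
import tempfile

def reset_output_dir(path, do_delete):
    """Hypothetical helper: optionally wipe `path`, then make sure it exists."""
    if do_delete:
        # ignore_errors=True covers the case where the directory does not exist yet.
        shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path, exist_ok=True)
    return path

# Demo in a throwaway location so nothing real is deleted.
base = tempfile.mkdtemp()
out = os.path.join(base, "output_directory")

reset_output_dir(out, do_delete=True)
open(os.path.join(out, "model.ckpt"), "w").close()  # pretend checkpoint file

reset_output_dir(out, do_delete=False)
kept = os.listdir(out)      # the pretend checkpoint survives

reset_output_dir(out, do_delete=True)
cleared = os.listdir(out)   # the directory is empty again
```

With `do_delete=False` the existing files are left in place, which mirrors the case where training resumes from checkpoints already in the directory.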

In [5]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'output_directory'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = True #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: output_directory *****


#Data

The original tutorial downloaded the IMDB Large Movie Review Dataset, hosted by Stanford, using code borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub). Here, we instead read the stance data from two local tab-separated files, `trainCopyAll.tsv` and `testdataAll.tsv`. The leftover IMDB helpers below are not used.

In [0]:
from tensorflow import keras
import os
import re
BERT_CASE = "uncased"

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["tweet"] = []
  data["stance"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["tweet"].append(f.read())
      data["stance"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Leftover from the IMDB tutorial (not called in this notebook):
# merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
#def download_and_load_datasets(force_download=False):
  #dataset = tf.keras.utils.get_file(
      #fname="aclImdb.tar.gz", 
      #origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      #extract=True)
def download_and_load_datasets():
  #train_df = load_dataset("content/hillaryTrain.tsv")
  #test_df = load_dataset("content/hillaryTest.tsv")
  #train_df = pd.read_csv("hillaryTrain.tsv", sep="\t")
  #test_df = pd.read_csv("hillaryTest.tsv", sep="\t")
  train_df = pd.read_csv("trainCopyAll.tsv", sep="\t",encoding='latin1')
  test_df = pd.read_csv("testdataAll.tsv", sep="\t",encoding='latin1')
  train_df.columns = ["id","target","tweets","stance","againstwho","sentiment"]
  test_df.columns = ["id","target","tweets","stance","againstwho","sentiment"]
  #remove rows with no stance
  #train_df = train_df[train_df.stance != "NONE"]
  #test_df = test_df[test_df.stance != "NONE"]
  #shuffle
  train_df = train_df.sample(frac=1)
  #test_df = test_df.sample(frac=1)
  return train_df, test_df


In [0]:
train, test = download_and_load_datasets()

In [36]:
train.columns
test.columns
print(test.to_string())

         id                            target                                             tweets   stance againstwho sentiment
0     10002                           Atheism  RT @prayerbullets: I remove Nehushtan -previou...  AGAINST     TARGET   NEITHER
1     10003                           Atheism  @Brainman365 @heidtjj @BenjaminLives I have so...  AGAINST     TARGET  POSITIVE
2     10004                           Atheism  #God is utterly powerless without Human interv...  AGAINST     TARGET  NEGATIVE
3     10005                           Atheism  @David_Cameron   Miracles of #Multiculturalism...  AGAINST      OTHER  NEGATIVE
4     10006                           Atheism  This world needs a tight group hug. Tight enou...  AGAINST     TARGET  POSITIVE
5     10007                           Atheism  Morality is not derived from religion, it prec...  AGAINST     TARGET  POSITIVE
6     10008                           Atheism  A Godly husband  - knows you - trusts you - lo...  AGAINST     T

For us, the input text is the 'tweets' column, the stance target is the 'target' column, and the label is the 'stance' column (FAVOR, AGAINST, or NONE).
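
Before training, it can help to confirm that the label column only contains the values the classifier is told about, and to see how balanced the classes are. The sketch below uses a made-up list of stance values in place of the real column; only the three label names come from this notebook.

```python
from collections import Counter

# The three stance labels used by this notebook.
label_list = ["FAVOR", "AGAINST", "NONE"]

# Made-up values standing in for the 'stance' column of the training data.
stances = ["AGAINST", "FAVOR", "AGAINST", "NONE", "AGAINST"]

# How many examples carry each label.
counts = Counter(stances)

# Any value here would have no entry in the label list.
unexpected = sorted(set(stances) - set(label_list))
```

An empty `unexpected` list means every row can be mapped to a label id later on.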

In [0]:
DATA_COLUMN = 'tweets'
LABEL_COLUMN = 'stance'
TARGET_COLUMN = 'target'
ID_COLUMN = 'id'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = ["FAVOR","AGAINST", "NONE"]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `tweets` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between two pieces of text (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). Here we pass the stance target (the `target` column) as `text_b`, so the model sees each tweet paired with the target it is judged against.
- `label` is the label for our example, here FAVOR, AGAINST, or NONE.
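
To make the field mapping concrete, here is a small sketch that does not need the BERT package. `ExampleSketch` is a hypothetical stand-in with the same four field names as `InputExample`, and the rows are made up. As in the cell that follows, the tweet goes in `text_a` and its target goes in `text_b`.

```python
from collections import namedtuple

# Hypothetical stand-in for bert.run_classifier.InputExample (same field names).
ExampleSketch = namedtuple("ExampleSketch", ["guid", "text_a", "text_b", "label"])

# Made-up rows that use this notebook's column names.
rows = [
    {"id": 1, "target": "Atheism", "tweets": "An example tweet.", "stance": "AGAINST"},
    {"id": 2, "target": "Hillary Clinton", "tweets": "Another example.", "stance": "NONE"},
]

# One example per row: tweet as text_a, target as text_b, stance as label.
examples = [
    ExampleSketch(guid=r["id"], text_a=r["tweets"], text_b=r["target"], label=r["stance"])
    for r in rows
]
```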

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid= x[ID_COLUMN], # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = x[TARGET_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=x[ID_COLUMN], 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = x[TARGET_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a few things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
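
For intuition, steps 1 to 3 can be sketched in plain Python as a greedy longest-match-first split against a vocabulary. This is a simplified illustration, not the library's tokenizer: it splits on whitespace only (no punctuation handling), and `toy_vocab`, `wordpiece`, and `tokenize` are names invented for this sketch.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into pieces from `vocab`."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces

def tokenize(text, vocab, do_lower_case=True):
    """Lowercase (optionally), split on whitespace, then split words into pieces."""
    if do_lower_case:
        text = text.lower()
    tokens = []
    for word in text.split():
        tokens.extend(wordpiece(word, vocab))
    return tokens

# Tiny made-up vocabulary, just large enough for the examples in the list above.
toy_vocab = {"sally", "says", "hi", "call", "##ing"}
tokens = tokenize("Sally says calling", toy_vocab)
unknown = tokenize("zebra", toy_vocab)
```

Here `tokens` comes out as `['sally', 'says', 'call', '##ing']`, and a word with no matching pieces becomes `['[UNK]']`.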




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [11]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_"+BERT_CASE+"_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0319 13:29:02.118002 140377472046976 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [12]:
tokenizer.tokenize("Hillary Clinton is a big fat hag")

['hillary', 'clinton', 'is', 'a', 'big', 'fat', 'ha', '##g']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
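
As a rough picture of what those features contain for a text pair, the sketch below builds `input_ids`, `input_mask`, `segment_ids`, and a label id by hand. It is an illustration under stated assumptions, not the library's implementation: `build_features` is a hypothetical helper, and `toy_vocab` holds only a handful of token ids copied from the log output in this notebook.

```python
def build_features(tokens_a, tokens_b, label, label_list, vocab, max_seq_length):
    """Hypothetical helper: lay out [CLS] a [SEP] b [SEP] and pad to a fixed length."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    # Leave room for [CLS] and two [SEP] tokens; trim the longer side first.
    while len(tokens_a) + len(tokens_b) > max_seq_length - 3:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], text_a and the first [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [vocab[t] for t in tokens]
    input_mask = [1] * len(input_ids)  # 1 marks real tokens, 0 marks padding

    pad = max_seq_length - len(input_ids)
    input_ids += [0] * pad
    input_mask += [0] * pad
    segment_ids += [0] * pad

    label_id = label_list.index(label)  # position of the label in label_list
    return input_ids, input_mask, segment_ids, label_id

# Token ids copied from the log output in this notebook; everything else is made up.
toy_vocab = {"[CLS]": 101, "[SEP]": 102, "god": 2643, "is": 2003,
             "at": 2012, "##hei": 26036, "##sm": 6491}

input_ids, input_mask, segment_ids, label_id = build_features(
    ["god", "is"], ["at", "##hei", "##sm"], "AGAINST",
    ["FAVOR", "AGAINST", "NONE"], toy_vocab, max_seq_length=10)
```

The padded zeros at the end of all three lists are what the long runs of `0` in the log output correspond to, just with a length of 128 instead of 10.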

In [13]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 2812


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 176


INFO:tensorflow:tokens: [CLS] i will not allow the accuse ##r to accuse me , for i am washed and clean ##sed by the blood of the lamb - rev . 1 : 5 ; 7 : 14 [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 1045 2097 2025 3499 1996 26960 2099 2000 26960 2033 1010 2005 1045 2572 8871 1998 4550 6924 2011 1996 2668 1997 1996 12559 1011 7065 1012 1015 1024 1019 1025 1021 1024 2403 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 119


INFO:tensorflow:tokens: [CLS] leaving christianity enables you to love the people you once rejected . # free ##thi ##nk ##er # christianity [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 2975 7988 12939 2017 2000 2293 1996 2111 2017 2320 5837 1012 1001 2489 15222 8950 2121 1001 7988 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: FAVOR (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 2358


INFO:tensorflow:tokens: [CLS] good morning patriots , let us continue to pray for this great nation and the # un ##born on this great day god has given us ! ! # cc ##ot [SEP] legal ##ization of abortion [SEP]


INFO:tensorflow:input_ids: 101 2204 2851 11579 1010 2292 2149 3613 2000 11839 2005 2023 2307 3842 1998 1996 1001 4895 10280 2006 2023 2307 2154 2643 2038 2445 2149 999 999 1001 10507 4140 102 3423 3989 1997 11324 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 2208


INFO:tensorflow:tokens: [CLS] @ mi ##zer ##ello @ meet ##the ##press at least he didn ' t lose an embassy , cover up a murder , or burn govt emails . [SEP] hillary clinton [SEP]


INFO:tensorflow:input_ids: 101 1030 2771 6290 15350 1030 3113 10760 20110 2012 2560 2002 2134 1005 1056 4558 2019 8408 1010 3104 2039 1037 4028 1010 2030 6402 22410 22028 1012 102 18520 7207 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 2658


INFO:tensorflow:tokens: [CLS] why do we think we can outs ##mart # satan when he has been at it for over thousands of years ? # sc ##ot ##us # marriage # isis # love ##win ##s [SEP] legal ##ization of abortion [SEP]


INFO:tensorflow:input_ids: 101 2339 2079 2057 2228 2057 2064 21100 22345 1001 16795 2043 2002 2038 2042 2012 2009 2005 2058 5190 1997 2086 1029 1001 8040 4140 2271 1001 3510 1001 18301 1001 2293 10105 2015 102 3423 3989 1997 11324 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: NONE (id = 2)


INFO:tensorflow:Writing example 0 of 1248


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 10002


INFO:tensorflow:tokens: [CLS] rt @ prayer ##bu ##llet ##s : i remove ne ##hus ##hta ##n - previous moves of god that have become idols , from the high places - 2 kings 18 : 4 [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 19387 1030 7083 8569 22592 2015 1024 1045 6366 11265 9825 22893 2078 1011 3025 5829 1997 2643 2008 2031 2468 24438 1010 2013 1996 2152 3182 1011 1016 5465 2324 1024 1018 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 10003


INFO:tensorflow:tokens: [CLS] @ brain ##man ##36 ##5 @ he ##id ##t ##j ##j @ benjamin ##li ##ves i have sought the truth of my soul and found it strong enough to stand on its own merits . [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 1030 4167 2386 21619 2629 1030 2002 3593 2102 3501 3501 1030 6425 3669 6961 1045 2031 4912 1996 3606 1997 2026 3969 1998 2179 2009 2844 2438 2000 3233 2006 2049 2219 22617 1012 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 10004


INFO:tensorflow:tokens: [CLS] # god is utterly powerless without human intervention . . . [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 1001 2643 2003 12580 25192 2302 2529 8830 1012 1012 1012 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 10005


INFO:tensorflow:tokens: [CLS] @ david _ cameron miracles of # multicultural ##ism miracles of shady 78 ##6 # ta ##qi ##ya # ta ##wr ##iya # ja ##zi ##ya # ka ##fi ##rs # dh ##im ##mi # jihad # allah [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 1030 2585 1035 7232 17861 1997 1001 27135 2964 17861 1997 22824 6275 2575 1001 11937 14702 3148 1001 11937 13088 8717 1001 14855 5831 3148 1001 10556 8873 2869 1001 28144 5714 4328 1001 24815 1001 16455 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 10006


INFO:tensorflow:tokens: [CLS] this world needs a tight group hug . tight enough to relieve them from all this anger and hate . # make ##pe ##ace ##with ##ea ##cho ##ther [SEP] at ##hei ##sm [SEP]


INFO:tensorflow:input_ids: 101 2023 2088 3791 1037 4389 2177 8549 1012 4389 2438 2000 15804 2068 2013 2035 2023 4963 1998 5223 1012 1001 2191 5051 10732 24415 5243 9905 12399 102 2012 26036 6491 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: AGAINST (id = 1)


#Creating a model

Now that we've prepared our data, let's build a model. `create_model` below does just that. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our classification task (here, predicting whether a tweet's stance toward a target is FAVOR, AGAINST, or NONE). This strategy of starting from a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the stance data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels.
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
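
To make the loss computation concrete, here is a minimal plain-Python sketch of the same arithmetic that `create_model` performs after the BERT output: log-softmax over the logits, a one-hot label, and the mean of `-sum(one_hot * log_probs)`. It leaves out BERT, dropout, and the weight matrix, and the logits are hypothetical values, not output from this notebook.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]

def cross_entropy(batch_logits, label_ids):
    """Returns (mean loss, argmax labels) for a batch of logits."""
    losses, predicted = [], []
    for logits, label in zip(batch_logits, label_ids):
        log_probs = log_softmax(logits)
        # One-hot encode the label, as tf.one_hot does in create_model.
        one_hot = [1.0 if i == label else 0.0 for i in range(len(logits))]
        losses.append(-sum(o * lp for o, lp in zip(one_hot, log_probs)))
        predicted.append(max(range(len(logits)), key=lambda i: log_probs[i]))
    return sum(losses) / len(losses), predicted

# Hypothetical logits for two tweets over (FAVOR, AGAINST, NONE).
example_logits = [[0.2, 2.0, -1.0], [0.0, 0.0, 0.0]]
example_labels = [1, 2]
mean_loss, predicted_labels = cross_entropy(example_logits, example_labels)
print(mean_loss, predicted_labels)
```

Note that all-equal logits give a per-example loss of ln 3 (about 1.0986) with three labels, which is close to the step-0 loss of 1.1723 reported in the training log.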


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        '''f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)'''
        return {"eval_accuracy": accuracy}
      
            
      
      #try to only consider labels for and against
      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            loss=loss,
            train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      # Note: 'probabilities' holds log-probabilities (the log_softmax output).
      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
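
The commented-out metrics and the note about considering only the "for" and "against" labels point toward a per-class score rather than plain accuracy. As one possible reading of that note, here is a plain-Python sketch that computes F1 for FAVOR and AGAINST separately and averages the two, computed offline from lists of label ids. The function names are hypothetical, and the id assignment (FAVOR = 0, AGAINST = 1, NONE = 2) is an assumption based on the label order used later in the notebook.

```python
def f1_for_class(gold, predicted, cls):
    """F1 score for a single class id, from parallel lists of label ids."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, predicted) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, predicted) if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def favor_against_f1(gold, predicted, favor_id=0, against_id=1):
    """Average of the FAVOR and AGAINST F1 scores; NONE is not scored directly."""
    return (f1_for_class(gold, predicted, favor_id)
            + f1_for_class(gold, predicted, against_id)) / 2

# Hypothetical gold and predicted label ids for six tweets.
gold_ids = [0, 0, 1, 1, 1, 2]
pred_ids = [0, 1, 1, 1, 2, 2]
score = favor_against_f1(gold_ids, pred_ids)
print(score)
```

NONE still affects the score indirectly: a FAVOR tweet predicted as NONE lowers FAVOR recall.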


In [0]:
# Training hyperparameters.
# These are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training during which the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
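
To make the step arithmetic concrete, here is a small sketch. The training-set size of 2810 is a hypothetical value (the notebook does not print `len(train_features)`); it is chosen only because it yields the 263 steps that appear in the training log. The `learning_rate_at` helper is likewise hypothetical: it illustrates a linear-warmup, linear-decay schedule, which is an assumption about what `bert.optimization.create_optimizer` applies rather than a description of the library.

```python
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

num_examples = 2810  # hypothetical size, not taken from the notebook
num_train_steps = int(num_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

def learning_rate_at(step):
    """Linear warmup to LEARNING_RATE, then linear decay to zero (assumed schedule)."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    return LEARNING_RATE * (1.0 - step / num_train_steps)

print(num_train_steps, num_warmup_steps)
for step in (0, 13, 26, 150, 263):
    print(step, learning_rate_at(step))
```

With these numbers, the first 26 steps ramp the learning rate up from zero, and the remaining steps decay it back toward zero by the final step.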

In [0]:
# Specify the output directory and how often to save checkpoints and summaries
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [19]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'output_directory', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fabb26122e8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function for the estimator. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
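
The `drop_remainder` flag decides whether the final, smaller batch is kept when the number of examples is not a multiple of the batch size. The sketch below illustrates that behaviour in plain Python; the `batches` helper is hypothetical and is not how `input_fn_builder` is implemented.

```python
def batches(items, batch_size, drop_remainder):
    """Splits items into consecutive batches, optionally dropping a short last batch."""
    out = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_remainder and out and len(out[-1]) < batch_size:
        out.pop()
    return out

features = list(range(70))  # stand-in for 70 feature records
kept = batches(features, 32, drop_remainder=False)
dropped = batches(features, 32, drop_remainder=True)
print([len(b) for b in kept])
print([len(b) for b in dropped])
```

With `drop_remainder=False`, as used here, every example is seen; with `True`, every batch has the same shape at the cost of skipping the leftover examples.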

Now we train our model! For me, using a Colab notebook running on Google's GPUs, training took about 8 minutes (see the timing printed at the end of the log below).

In [21]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


W0319 13:29:14.878632 140377472046976 deprecation.py:506] From <ipython-input-14-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0319 13:29:14.938968 140377472046976 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0319 13:29:15.038617 140377472046976 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


W0319 13:29:26.177613 140377472046976 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into output_directory/model.ckpt.


INFO:tensorflow:loss = 1.1723322, step = 0


INFO:tensorflow:global_step/sec: 0.598064


INFO:tensorflow:loss = 0.7733457, step = 100 (167.209 sec)


INFO:tensorflow:global_step/sec: 0.64498


INFO:tensorflow:loss = 0.3187404, step = 200 (155.044 sec)


INFO:tensorflow:Saving checkpoints for 263 into output_directory/model.ckpt.


INFO:tensorflow:Loss for final step: 0.22025728.


Training took time  0:08:04.019836


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [23]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-03-19T13:37:29Z


INFO:tensorflow:Graph was finalized.


W0319 13:37:31.861311 140377472046976 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.


INFO:tensorflow:Restoring parameters from output_directory/model.ckpt-263


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2019-03-19-13:37:55


INFO:tensorflow:Saving dict for global step 263: eval_accuracy = 0.7219551, global_step = 263, loss = 0.6675589


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 263: output_directory/model.ckpt-263


{'eval_accuracy': 0.7219551, 'global_step': 263, 'loss': 0.6675589}

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences, targets):
  labels = ["FAVOR", "AGAINST", "NONE"]
  # guid="" and label="NONE" are just placeholders
  input_examples = [
      run_classifier.InputExample(guid="", text_a=x, text_b=y, label="NONE")
      for x, y in zip(in_sentences, targets)]
  input_features = run_classifier.convert_examples_to_features(
      input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(
      features=input_features, seq_length=MAX_SEQ_LENGTH,
      is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, target, prediction['probabilities'], labels[prediction['labels']])
          for sentence, target, prediction in zip(in_sentences, targets, predictions)]
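
One detail worth keeping in mind when reading these results: the `'probabilities'` entry holds the `log_softmax` output, so the values are log-probabilities (all at or below zero). A minimal sketch of turning them back into probabilities follows; the helper names and the example values are hypothetical, not output from this model.

```python
import math

LABELS = ["FAVOR", "AGAINST", "NONE"]

def to_probabilities(log_probs):
    """Exponentiates log-probabilities to recover probabilities."""
    return [math.exp(v) for v in log_probs]

def describe(log_probs):
    """Returns the most likely label and its probability."""
    probs = to_probabilities(log_probs)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best], probs[best]

# Hypothetical log-probabilities for a single tweet.
example_log_probs = [math.log(0.2), math.log(0.7), math.log(0.1)]
label, confidence = describe(example_log_probs)
print(label, round(confidence, 3))
```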

In [0]:
def printTestDataForCompScript():
  labels = ["FAVOR", "AGAINST", "NONE"]
  predictions = estimator.predict(test_input_fn)
  # Pair each test tweet with its id, target, and predicted stance.
  pred = [(testId, testTarget, sentence, prediction['probabilities'], labels[prediction['labels']])
          for sentence, testId, testTarget, prediction
          in zip(test.tweets, test.id, test.target, predictions)]
  return pred

In [0]:
pred_sentences = [
  "our country is ready for a female prez, not ever hillary",
  "my vote is for hillary",
  "where are the emails hillary?",
  "she is a fraud",
  "retribution for benghazi",
  "#hillaryclinton have you ever told the truth?",
  "million bogus followers on twitter #hillaryclinton",
  "I like my hamburgers rare"
]

In [30]:
#predictions = getPrediction(pred_sentences)
#predictions = getPrediction(test.tweets,test.target)
predictions = printTestDataForCompScript()

INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from output_directory/model.ckpt-263


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


Voila! We have a stance classifier!

In [37]:
print("ID\tTarget\tTweet\tStance")
for x in predictions:
  print(str(x[0]) + "\t" + x[1] + "\t" + x[2] + "\t" + x[4])

ID	Target	Tweet	Stance
10002	Atheism	RT @prayerbullets: I remove Nehushtan -previous moves of God that have become idols, from the high places -2 Kings 18:4 	AGAINST
10003	Atheism	@Brainman365 @heidtjj @BenjaminLives I have sought the truth of my soul and found it strong enough to stand on its own merits. 	AGAINST
10004	Atheism	#God is utterly powerless without Human intervention... 	AGAINST
10005	Atheism	@David_Cameron   Miracles of #Multiculturalism   Miracles of shady 786  #Taqiya #Tawriya #Jaziya #Kafirs #Dhimmi #Jihad #Allah 	AGAINST
10006	Atheism	This world needs a tight group hug. Tight enough to relieve them from all this anger and hate. #MakePeaceWithEachOther 	NONE
10007	Atheism	Morality is not derived from religion, it precedes it. -Christopher 'The Hitch' Hitchens #freethinkers 	FAVOR
10008	Atheism	A Godly husband  - knows you - trusts you - loves you - respects you - honors you - supports you - wants you - appreciates you #God 	AGAINST
10009	Atheism	@SecularDutchess I'll b