<a href="https://colab.research.google.com/github/Jabzilla/news-with-bert/blob/master/Predicting_News_Articles_based_on_Movie_Reviews_with_BERT_on_TF_Hub.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

Much of this code was taken from [this tutorial by Google](https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb). Edits have been made to change the output to be between 0 and 1 and also to test the classifier on news articles.

In [0]:
%tensorflow_version 1.x
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime


TensorFlow 1.x selected.


In addition to the standard libraries imported above, we'll need to install BERT's Python package.

In [0]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load any existing model checkpoints from that directory.

In [0]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: OUTPUT_DIR_NAME *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
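
The regular expression in `load_directory_data` relies on the dataset's file naming, where each review is stored as `<id>_<rating>.txt`. As a minimal sketch of that parsing step, the snippet below uses a hypothetical helper `parse_rating` and a made-up filename; unlike the notebook code, it converts the rating to an integer and raises on unexpected names.

```python
import re

# Hypothetical helper (not part of the notebook): extract the rating
# from a filename shaped like "<id>_<rating>.txt".
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")

def parse_rating(file_name):
    match = FILENAME_PATTERN.match(file_name)
    if match is None:
        # The notebook code would fail with an AttributeError here instead.
        raise ValueError("unexpected file name: {}".format(file_name))
    return int(match.group(1))

example_rating = parse_rating("123_8.txt")  # made-up filename
print(example_rating)
```

Note that the notebook only uses this rating for the `sentiment` column; the `polarity` label comes from the `pos`/`neg` directory instead.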


In [0]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll take a random sample of 5000 examples from each of the train and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [0]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

In our case, the input data is the 'sentence' column and the label is the 'polarity' column (0 for negative, 1 for positive).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 or 1.
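
To make the role of each field concrete, here is a small sketch using a stand-in `ExampleSketch` tuple. It is not the real `InputExample` class, and the sentences are invented; it only illustrates how the fields would be filled for a single-sentence task versus a sentence-pair task.

```python
import collections

# Stand-in for the library's InputExample, for illustration only.
ExampleSketch = collections.namedtuple(
    "ExampleSketch", ["guid", "text_a", "text_b", "label"])

# Single-sentence classification (this notebook's case): text_b stays None.
single = ExampleSketch(
    guid=None, text_a="A charming little film.", text_b=None, label=1)

# Sentence-pair task (not used in this notebook): both text fields are filled.
pair = ExampleSketch(
    guid=None, text_a="Who directed it?", text_b="Carol Reed directed it.",
    label=1)

print(single)
print(pair)
```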

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do several things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Attach "mask" and "segment" ids to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
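
For intuition, steps 1 to 5 can be sketched in plain Python. This is a simplified illustration, not the library's implementation: `toy_vocab` is a made-up vocabulary, the helper names are hypothetical, and the real tokenizer also handles punctuation, Unicode cleanup, and a much larger vocabulary. The WordPiece step uses a greedy longest-match-first split.

```python
def wordpiece(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one lowercase word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word continuation
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

# Made-up vocabulary, for illustration only.
toy_vocab = {"[CLS]": 0, "[SEP]": 1, "[UNK]": 2, "sally": 3,
             "says": 4, "hi": 5, "call": 6, "##ing": 7}

def encode(text, vocab):
    tokens = ["[CLS]"]
    for word in text.lower().split():  # lowercase, then split on whitespace
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[token] for token in tokens]

tokens, ids = encode("Sally says calling", toy_vocab)
print(tokens)
print(ids)
```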




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [0]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [0]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
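
Each resulting feature has a fixed length: long inputs are truncated, short ones are padded with zeros, and an input mask records which positions hold real tokens. The sketch below illustrates that idea for a single sentence. The helper `to_fixed_length` is hypothetical and is not the library's code; 101 and 102 are the `[CLS]` and `[SEP]` ids in the uncased vocabulary.

```python
def to_fixed_length(piece_ids, max_seq_length, cls_id=101, sep_id=102, pad_id=0):
    """Hypothetical helper: build fixed-length ids, mask, and segment ids."""
    # Reserve two positions for [CLS] and [SEP], truncating if needed.
    piece_ids = piece_ids[:max_seq_length - 2]
    input_ids = [cls_id] + piece_ids + [sep_id]
    input_mask = [1] * len(input_ids)    # 1 marks a real token
    segment_ids = [0] * len(input_ids)   # single sentence: always segment 0
    padding = max_seq_length - len(input_ids)
    input_ids += [pad_id] * padding
    input_mask += [0] * padding          # 0 marks padding
    segment_ids += [0] * padding
    return input_ids, input_mask, segment_ids

ids, mask, segments = to_fixed_length([2023, 2003, 2204], max_seq_length=8)
print(ids)
print(mask)
print(segments)
```

With `MAX_SEQ_LENGTH = 128`, a review longer than 126 word pieces is cut off, so the classifier only sees the beginning of long reviews.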

In [0]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is the one in which the dim ##in ##utive ruth gordon plays an agatha - christie type of murder mystery author who locks her nephew by marriage into a safe . gordon believes that he murdered her niece and the young fellow dies of su ##ff ##ocation , while gordon is traveling back and forth to new york . he manages , however , to leave behind some clues , scratches on a couple of black safe deposit boxes and an improvised and well - hidden note . col ##umb ##o enters the case , suspects her at once , and solve ##s the mystery by simply using his supernatural mystical intuitive powers . oh , and marie ##tte hartley is on hand as gordon [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 1996 2028 1999 2029 1996 11737 2378 28546 7920 5146 3248 2019 23863 1011 13144 2828 1997 4028 6547 3166 2040 11223 2014 7833 2011 3510 2046 1037 3647 1012 5146 7164 2008 2002 7129 2014 12286 1998 1996 2402 3507 8289 1997 10514 4246 23909 1010 2096 5146 2003 7118 2067 1998 5743 2000 2047 2259 1012 2002 9020 1010 2174 1010 2000 2681 2369 2070 15774 1010 25980 2006 1037 3232 1997 2304 3647 12816 8378 1998 2019 19641 1998 2092 1011 5023 3602 1012 8902 25438 2080 8039 1996 2553 1010 13172 2014 2012 2320 1010 1998 9611 2015 1996 6547 2011 3432 2478 2010 11189 17529 29202 4204 1012 2821 1010 1998 5032 4674 20955 2003 2006 2192 2004 5146 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] when you look back at another bad nightmare sequel like freddy ' s revenge , you have to at least give it some credit for trying something new . and although the dream child is more enjoyable it offers absolutely nothing new to the series . yes , there ' s the creative deaths as usual , like a kid becoming part of a comic book and facing " super freddy " but even scenes like that aren ' t used to their full potential and the parts without freddy are just boring . < br / > < br / > this marked the official death of scar ##iness to the series . freddy seems to be the comedic relief now . . . but [SEP]


INFO:tensorflow:input_ids: 101 2043 2017 2298 2067 2012 2178 2919 10103 8297 2066 19343 1005 1055 7195 1010 2017 2031 2000 2012 2560 2507 2009 2070 4923 2005 2667 2242 2047 1012 1998 2348 1996 3959 2775 2003 2062 22249 2009 4107 7078 2498 2047 2000 1996 2186 1012 2748 1010 2045 1005 1055 1996 5541 6677 2004 5156 1010 2066 1037 4845 3352 2112 1997 1037 5021 2338 1998 5307 1000 3565 19343 1000 2021 2130 5019 2066 2008 4995 1005 1056 2109 2000 2037 2440 4022 1998 1996 3033 2302 19343 2024 2074 11771 1012 1026 7987 1013 1028 1026 7987 1013 1028 2023 4417 1996 2880 2331 1997 11228 9961 2000 1996 2186 1012 19343 3849 2000 2022 1996 21699 4335 2085 1012 1012 1012 2021 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] oliver ! the musical is a favorite of mine . the music , the characters , the story . it all just seems perfect . in this rendition of the timeless classic novel turned stage musical , director carol reed brings the broadway hit to life on the movie screen . < br / > < br / > the transition from musical to movie musical is not an easy one . you have to have the right voices , the right set , the right script , and the right play . all signs point to yes for this play . it almost appears that it was written for the screen ! < br / > < br / > our story takes place in [SEP]


INFO:tensorflow:input_ids: 101 6291 999 1996 3315 2003 1037 5440 1997 3067 1012 1996 2189 1010 1996 3494 1010 1996 2466 1012 2009 2035 2074 3849 3819 1012 1999 2023 19187 1997 1996 27768 4438 3117 2357 2754 3315 1010 2472 8594 7305 7545 1996 5934 2718 2000 2166 2006 1996 3185 3898 1012 1026 7987 1013 1028 1026 7987 1013 1028 1996 6653 2013 3315 2000 3185 3315 2003 2025 2019 3733 2028 1012 2017 2031 2000 2031 1996 2157 5755 1010 1996 2157 2275 1010 1996 2157 5896 1010 1998 1996 2157 2377 1012 2035 5751 2391 2000 2748 2005 2023 2377 1012 2009 2471 3544 2008 2009 2001 2517 2005 1996 3898 999 1026 7987 1013 1028 1026 7987 1013 1028 2256 2466 3138 2173 1999 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] when you go at an open air cinema under the greek summer night you usually don ' t care what the movie is ! edison started really good with some good effort from the singers - who - want - to - be actors and a once again great morgan freeman but . . . ( in a movie there is usually a good start to catch audience , done , a bit boring yet story filling middle of the movie that is more about characters and less about action , done , and the third part is something really good so that you can remember the movie . . . ) when you see 30 elite police officers ( packed with weapons that can demo [SEP]


INFO:tensorflow:input_ids: 101 2043 2017 2175 2012 2019 2330 2250 5988 2104 1996 3306 2621 2305 2017 2788 2123 1005 1056 2729 2054 1996 3185 2003 999 17046 2318 2428 2204 2007 2070 2204 3947 2013 1996 8453 1011 2040 1011 2215 1011 2000 1011 2022 5889 1998 1037 2320 2153 2307 5253 11462 2021 1012 1012 1012 1006 1999 1037 3185 2045 2003 2788 1037 2204 2707 2000 4608 4378 1010 2589 1010 1037 2978 11771 2664 2466 8110 2690 1997 1996 3185 2008 2003 2062 2055 3494 1998 2625 2055 2895 1010 2589 1010 1998 1996 2353 2112 2003 2242 2428 2204 2061 2008 2017 2064 3342 1996 3185 1012 1012 1012 1007 2043 2017 2156 2382 7069 2610 3738 1006 8966 2007 4255 2008 2064 9703 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a good film , and one i ' ll watch a number of times . rich ( the previous comment ##er ) is right : there is much more going on here than is clear from the title boards , and i have to wonder how much has suffered in translation . were there more in the original ? or was a native - language audience expected to lip - read more ? or - - since the screenplay was written by the author of the novel on which this was based - - was this a currently popular story with which the audience was already very familiar ? in short , very worth a look , but it probably requires more work from contemporary viewers [SEP]


INFO:tensorflow:input_ids: 101 1037 2204 2143 1010 1998 2028 1045 1005 2222 3422 1037 2193 1997 2335 1012 4138 1006 1996 3025 7615 2121 1007 2003 2157 1024 2045 2003 2172 2062 2183 2006 2182 2084 2003 3154 2013 1996 2516 7923 1010 1998 1045 2031 2000 4687 2129 2172 2038 4265 1999 5449 1012 2020 2045 2062 1999 1996 2434 1029 2030 2001 1037 3128 1011 2653 4378 3517 2000 5423 1011 3191 2062 1029 2030 1011 1011 2144 1996 9000 2001 2517 2011 1996 3166 1997 1996 3117 2006 2029 2023 2001 2241 1011 1011 2001 2023 1037 2747 2759 2466 2007 2029 1996 4378 2001 2525 2200 5220 1029 1999 2460 1010 2200 4276 1037 2298 1010 2021 2009 2763 5942 2062 2147 2013 3824 7193 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] by far this is the worst halloween movie ever made . the acting is bad , except for paul rudd , and donald please ##nce . the girl who played kara ( forgot her name ) was ok , but overall this movie was basically a big let ##down . nothing moved the story forward , it lacked substance , and the scares that made halloween and h ##20 so good . all and all , skip this movie , it ' s not worth the price of rental . [SEP]


INFO:tensorflow:input_ids: 101 2011 2521 2023 2003 1996 5409 14414 3185 2412 2081 1012 1996 3772 2003 2919 1010 3272 2005 2703 25298 1010 1998 6221 3531 5897 1012 1996 2611 2040 2209 13173 1006 9471 2014 2171 1007 2001 7929 1010 2021 3452 2023 3185 2001 10468 1037 2502 2292 7698 1012 2498 2333 1996 2466 2830 1010 2009 10858 9415 1010 1998 1996 29421 2008 2081 14414 1998 1044 11387 2061 2204 1012 2035 1998 2035 1010 13558 2023 3185 1010 2009 1005 1055 2025 4276 1996 3976 1997 12635 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] . . . the first ? kill ##joy 1 . but here ' s the review of kill ##joy 2 : < br / > < br / > ( contains spoil ##ers , so be ##ware readers ) < br / > < br / > oh my . oh , my , my , my . i ' ll start off with telling you that i had no hopes in the least bit that this movie would be good . considering that kill ##joy ( the first movie ) is without a doubt the worst movie ever made , the sequel didn ' t have much promise . < br / > < br / > as expected , it didn ' t deliver [SEP]


INFO:tensorflow:input_ids: 101 1012 1012 1012 1996 2034 1029 3102 24793 1015 1012 2021 2182 1005 1055 1996 3319 1997 3102 24793 1016 1024 1026 7987 1013 1028 1026 7987 1013 1028 1006 3397 27594 2545 1010 2061 2022 8059 8141 1007 1026 7987 1013 1028 1026 7987 1013 1028 2821 2026 1012 2821 1010 2026 1010 2026 1010 2026 1012 1045 1005 2222 2707 2125 2007 4129 2017 2008 1045 2018 2053 8069 1999 1996 2560 2978 2008 2023 3185 2052 2022 2204 1012 6195 2008 3102 24793 1006 1996 2034 3185 1007 2003 2302 1037 4797 1996 5409 3185 2412 2081 1010 1996 8297 2134 1005 1056 2031 2172 4872 1012 1026 7987 1013 1028 1026 7987 1013 1028 2004 3517 1010 2009 2134 1005 1056 8116 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is a pretty obscure , dumb horror movie set in the 1970s ever ##gl ##ades . it is really stupid and lame for the first half , then it actually starts to get good for the last half . there is a scene with the hero running to save his friends interspersed with shots of a church group singing , i don ' t know . it is me ##sm ##eri ##zing . i was impressed with the night time scenes , because it actually looked like night , unlike most low budget horror films where it still looks like daytime . i feel like the director was really talented but was working with a mini ##scu ##le budget and a tough schedule . there [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 1037 3492 14485 1010 12873 5469 3185 2275 1999 1996 3955 2412 23296 18673 1012 2009 2003 2428 5236 1998 20342 2005 1996 2034 2431 1010 2059 2009 2941 4627 2000 2131 2204 2005 1996 2197 2431 1012 2045 2003 1037 3496 2007 1996 5394 2770 2000 3828 2010 2814 25338 2007 7171 1997 1037 2277 2177 4823 1010 1045 2123 1005 1056 2113 1012 2009 2003 2033 6491 11124 6774 1012 1045 2001 7622 2007 1996 2305 2051 5019 1010 2138 2009 2941 2246 2066 2305 1010 4406 2087 2659 5166 5469 3152 2073 2009 2145 3504 2066 12217 1012 1045 2514 2066 1996 2472 2001 2428 10904 2021 2001 2551 2007 1037 7163 28817 2571 5166 1998 1037 7823 6134 1012 2045 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] craig brewer grew up in tennessee , it is evident in his movie . forget the black guy on white girl action . it happens , but it isn ' t samuel l . jackson on christina ric ##ci . more importantly this movie is about the values and culture of the people in this tennessee town . how they deal with divorce , abandonment , sexual abuse and psychological disorders . while shrink ##s make millions in the cities of the north , midwest and west coast , the town minister , who also gr ##apple ##s with his own problems , becomes the counselor and media ##tor . it is a interesting concept and one that may not settle well with everyone . < [SEP]


INFO:tensorflow:input_ids: 101 7010 18710 3473 2039 1999 5298 1010 2009 2003 10358 1999 2010 3185 1012 5293 1996 2304 3124 2006 2317 2611 2895 1012 2009 6433 1010 2021 2009 3475 1005 1056 5212 1048 1012 4027 2006 12657 26220 6895 1012 2062 14780 2023 3185 2003 2055 1996 5300 1998 3226 1997 1996 2111 1999 2023 5298 2237 1012 2129 2027 3066 2007 8179 1010 22290 1010 4424 6905 1998 8317 10840 1012 2096 22802 2015 2191 8817 1999 1996 3655 1997 1996 2167 1010 13608 1998 2225 3023 1010 1996 2237 2704 1010 2040 2036 24665 23804 2015 2007 2010 2219 3471 1010 4150 1996 17220 1998 2865 4263 1012 2009 2003 1037 5875 4145 1998 2028 2008 2089 2025 7392 2092 2007 3071 1012 1026 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] sam elliot is brilliant as a tough san francisco detective charlie fallon . when his partner is killed while meeting with an informant fallon snaps , beats the informant to death , and dump ##s his body in a river . the next day fallon is assigned a rookie partner , and given the task of investigating the informant ##s murder . sam elliot does a good job of portraying a man who tortured by the guilt of his own murderous actions , and grief over the death of his partner who may have been involved in police corruption . [SEP]


INFO:tensorflow:input_ids: 101 3520 11759 2003 8235 2004 1037 7823 2624 3799 6317 4918 16443 1012 2043 2010 4256 2003 2730 2096 3116 2007 2019 28694 16443 20057 1010 10299 1996 28694 2000 2331 1010 1998 15653 2015 2010 2303 1999 1037 2314 1012 1996 2279 2154 16443 2003 4137 1037 8305 4256 1010 1998 2445 1996 4708 1997 11538 1996 28694 2015 4028 1012 3520 11759 2515 1037 2204 3105 1997 17274 1037 2158 2040 12364 2011 1996 8056 1997 2010 2219 25303 4506 1010 1998 9940 2058 1996 2331 1997 2010 4256 2040 2089 2031 2042 2920 1999 2610 7897 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` below does just that. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e., classifying whether a movie review is positive or negative). This strategy of starting from a pre-trained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. Note that it is applied in every mode
    # here, including evaluation and prediction.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're training or evaluating, compute the loss between predicted and actual labels
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
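
To make the new layer's arithmetic concrete, here is a minimal NumPy sketch of the same steps: a linear layer on the pooled output, a log-softmax, and a one-hot cross-entropy loss. It is an illustration only, not the notebook's TensorFlow graph. The toy sizes, the random inputs, and the helper names `log_softmax` and `classification_head` are assumptions, a plain normal distribution stands in for the truncated normal initializer, and dropout is left out.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def classification_head(pooled_output, output_weights, output_bias, labels):
    """Linear layer -> log-probabilities -> mean cross-entropy loss."""
    num_labels = output_weights.shape[0]
    # Same shapes as in create_model: weights are [num_labels, hidden_size].
    logits = pooled_output @ output_weights.T + output_bias  # [batch, num_labels]
    log_probs = log_softmax(logits)
    one_hot_labels = np.eye(num_labels)[labels]               # [batch, num_labels]
    per_example_loss = -(one_hot_labels * log_probs).sum(axis=-1)
    predicted_labels = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted_labels, log_probs

# Toy inputs (hypothetical sizes): batch of 4, hidden size 8, 2 labels.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 8))
weights = rng.normal(scale=0.02, size=(2, 8))
bias = np.zeros(2)
toy_labels = np.array([0, 1, 1, 0])

loss, preds, log_probs = classification_head(pooled, weights, bias, toy_labels)
print(loss, preds)
```

With all-zero weights both classes get probability 0.5, so the loss equals `log(2)`, roughly 0.69. That is close to the loss the training log reports at step 0.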


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Compute train and warmup steps from batch size
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training during which the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
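
As a worked example of this arithmetic, the sketch below plugs in the hyperparameters above. The training set size of 5000 is an assumption, taken from the `Writing example 0 of 5000` log lines; it is consistent with the final checkpoint being saved at step 468. The function `sketch_learning_rate` is a hypothetical, simplified picture of linear warmup followed by linear decay, not the BERT optimizer itself.

```python
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
num_train_examples = 5000  # assumption: inferred from the logs, not stated in the notebook

num_train_steps = int(num_train_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

def sketch_learning_rate(step):
    """Simplified schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    return LEARNING_RATE * (1.0 - step / num_train_steps)

print(num_train_steps, num_warmup_steps)
for step in (0, 23, 46, 234, 468):
    print(step, sketch_learning_rate(step))
```

Because of the `int()` truncation, 468.75 steps become 468, so the last partial step of the third epoch is not run.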

In [0]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [0]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'OUTPUT_DIR_NAME', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0db4f3ca90>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. Set drop_remainder=True only when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
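
To show what the `drop_remainder` flag changes, here is a plain-Python sketch of batching with and without it. This is not how the BERT input function is implemented (that builds a `tf.data` pipeline); the helper `batches` and the 5000 stand-in records are assumptions for illustration.

```python
def batches(examples, batch_size, drop_remainder):
    """Yield consecutive batches; optionally drop a final short batch."""
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            return  # discard the incomplete last batch
        yield batch

examples = list(range(5000))  # stand-in for 5000 feature records
kept = list(batches(examples, 32, drop_remainder=False))
dropped = list(batches(examples, 32, drop_remainder=True))

print(len(kept), len(kept[-1]))  # number of batches, size of the last one
print(len(dropped))
```

With `drop_remainder=False` the last batch is simply smaller than the rest, which is fine on a GPU. TPUs need fixed batch shapes, which is why the flag is set there.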

Now we train our model! For this run, using a Colab notebook running on Google's GPUs, training took just under 5 minutes.

In [0]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into OUTPUT_DIR_NAME/model.ckpt.


INFO:tensorflow:loss = 0.74320954, step = 0


INFO:tensorflow:global_step/sec: 1.6589


INFO:tensorflow:loss = 0.43701357, step = 100 (60.282 sec)


INFO:tensorflow:global_step/sec: 2.11205


INFO:tensorflow:loss = 0.29057312, step = 200 (47.350 sec)


INFO:tensorflow:global_step/sec: 2.11216


INFO:tensorflow:loss = 0.041003473, step = 300 (47.345 sec)


INFO:tensorflow:global_step/sec: 2.11221


INFO:tensorflow:loss = 0.006000277, step = 400 (47.342 sec)


INFO:tensorflow:Saving checkpoints for 468 into OUTPUT_DIR_NAME/model.ckpt.


INFO:tensorflow:Loss for final step: 0.0019985293.


Training took time  0:04:47.645333


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-04-24T13:17:07Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from OUTPUT_DIR_NAME/model.ckpt-468


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2020-04-24-13:17:37


INFO:tensorflow:Saving dict for global step 468: auc = 0.8652201, eval_accuracy = 0.8652, f1_score = 0.8655227, false_negatives = 347.0, false_positives = 327.0, global_step = 468, loss = 0.53787345, precision = 0.86899036, recall = 0.86208266, true_negatives = 2157.0, true_positives = 2169.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: OUTPUT_DIR_NAME/model.ckpt-468


{'auc': 0.8652201,
 'eval_accuracy': 0.8652,
 'f1_score': 0.8655227,
 'false_negatives': 347.0,
 'false_positives': 327.0,
 'global_step': 468,
 'loss': 0.53787345,
 'precision': 0.86899036,
 'recall': 0.86208266,
 'true_negatives': 2157.0,
 'true_positives': 2169.0}
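
The precision, recall, F1, and accuracy values above all follow from the four confusion-matrix counts. As a quick sanity check, here is a minimal sketch that recomputes them in plain Python. The helper name `metrics_from_counts` is our own (hypothetical) and is not part of the notebook or the BERT package.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Recompute summary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)          # of predicted positives, how many were right
    recall = tp / (tp + fn)             # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "eval_accuracy": accuracy,
    }

# Counts copied from the evaluation output above
m = metrics_from_counts(tp=2169, fp=327, tn=2157, fn=347)
print(m)
```

The recomputed values should agree with the estimator's output up to float32 rounding.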

Now let's write code to make predictions on new sentences:

In [0]:
import numpy as np  # np.exp turns the model's log-probabilities into values between 0 and 1

def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; the true label is unknown at prediction time
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # Each result is (sentence, [P(negative), P(positive)], predicted label name)
  return [(sentence, np.exp(prediction['probabilities']), labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [0]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] that movie was absolutely awful [SEP]


INFO:tensorflow:input_ids: 101 2008 3185 2001 7078 9643 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] the acting was a bit lacking [SEP]


INFO:tensorflow:input_ids: 101 1996 3772 2001 1037 2978 11158 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] the film was creative and surprising [SEP]


INFO:tensorflow:input_ids: 101 1996 2143 2001 5541 1998 11341 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] absolutely fantastic ! [SEP]


INFO:tensorflow:input_ids: 101 7078 10392 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from OUTPUT_DIR_NAME/model.ckpt-468


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


Voila! We have a sentiment classifier!

In [0]:
predictions

[('That movie was absolutely awful',
  array([0.99871325, 0.00128672], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([0.9929542, 0.0070458], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([0.00162481, 0.9983752 ], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([0.00249779, 0.99750227], dtype=float32),
  'Positive')]
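
The `np.exp` call in `getPrediction` is what puts the scores between 0 and 1. Assuming the model's `probabilities` entry holds log-softmax values (which fits the fact that each pair above sums to roughly 1), exponentiating recovers ordinary probabilities. Here is a minimal standard-library sketch of that relationship; the logits are made up and the `log_softmax` helper is our own, not the notebook's model code.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax for a list of raw scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

# Hypothetical raw scores for one sentence: [negative, positive]
logits = [3.2, -3.4]
log_probs = log_softmax(logits)               # all values are <= 0
probs = [math.exp(lp) for lp in log_probs]    # values in (0, 1) that sum to 1

class_names = ["Negative", "Positive"]
label = class_names[probs.index(max(probs))]
print(probs, label)
```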

Okay, cool, it all works, so now let's load our news dataset from Google Drive to use with our classifier!

In [0]:
from google.colab import drive
drive.mount('/content/drive')


Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


In [0]:
import json

# Load the news articles (a JSON list of objects, each with a 'body' field) from Drive
with open('/content/drive/My Drive/all-data2.json', 'r') as f:
  obj = json.load(f)

# Collect the article text to classify
body_list = [a['body'] for a in obj]

predictions = getPrediction(body_list)


INFO:tensorflow:Writing example 0 of 20778


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] a powerful group of institutional investors is urging aviv ##a to accelerate an overhaul of its strategy as the ft ##se - 100 ins ##urer faces renewed discontent in the city over its performance . sky news has learnt that the investor forum whose members collectively manage assets worth about £2 ##1 ##t ##n has told aviv ##a that it must set out a credible long - term plan to increase its value later this week . the forum which has established itself during the last five years as one of the city ' s most important platforms for engagement between large shareholders and major companies is understood to have written to aviv ##a within the last few weeks . the move is a bold one [SEP]


INFO:tensorflow:input_ids: 101 1037 3928 2177 1997 12148 9387 2003 14328 12724 2050 2000 23306 2019 18181 1997 2049 5656 2004 1996 3027 3366 1011 2531 16021 27595 5344 9100 27648 1999 1996 2103 2058 2049 2836 1012 3712 2739 2038 20215 2008 1996 14316 7057 3005 2372 13643 6133 7045 4276 2055 21853 2487 2102 2078 2038 2409 12724 2050 2008 2009 2442 2275 2041 1037 23411 2146 1011 2744 2933 2000 3623 2049 3643 2101 2023 2733 1012 1996 7057 2029 2038 2511 2993 2076 1996 2197 2274 2086 2004 2028 1997 1996 2103 1005 1055 2087 2590 7248 2005 8147 2090 2312 15337 1998 2350 3316 2003 5319 2000 2031 2517 2000 12724 2050 2306 1996 2197 2261 3134 1012 1996 2693 2003 1037 7782 2028 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] a powerful group of institutional investors is urging aviv ##a to accelerate an overhaul of its strategy as the ft ##se - 100 ins ##urer faces renewed discontent in the city over its performance . sky news has learnt that the investor forum whose members collectively manage assets worth about £2 ##1 ##tr ##n has told aviv ##a that it must set out a credible long - term plan to increase its value later this week . the forum which has established itself during the last five years as one of the city ' s most important platforms for engagement between large shareholders and major companies is understood to have written to aviv ##a within the last few weeks . the move is a bold one [SEP]


INFO:tensorflow:input_ids: 101 1037 3928 2177 1997 12148 9387 2003 14328 12724 2050 2000 23306 2019 18181 1997 2049 5656 2004 1996 3027 3366 1011 2531 16021 27595 5344 9100 27648 1999 1996 2103 2058 2049 2836 1012 3712 2739 2038 20215 2008 1996 14316 7057 3005 2372 13643 6133 7045 4276 2055 21853 2487 16344 2078 2038 2409 12724 2050 2008 2009 2442 2275 2041 1037 23411 2146 1011 2744 2933 2000 3623 2049 3643 2101 2023 2733 1012 1996 7057 2029 2038 2511 2993 2076 1996 2197 2274 2086 2004 2028 1997 1996 2103 1005 1055 2087 2590 7248 2005 8147 2090 2312 15337 1998 2350 3316 2003 5319 2000 2031 2517 2000 12724 2050 2306 1996 2197 2261 3134 1012 1996 2693 2003 1037 7782 2028 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] a 310 - year - old violin worth £2 ##500 ##00 which was stolen after its owner accidentally left it on a train is now back with the musician . the antique instrument was left by stephen morris on the london victoria to or ##ping ##ton service when he got off at peng ##e east on tuesday 22 october . professional musician mr morris had put out an appeal for its safe return and said it was devastating to lose the instrument i ' ve been playing for 20 years . now he has t ##wee ##ted : my violin is home safe and sound ! thanks for the overwhelming support x . following last month ' s incident british transport police said another man had [SEP]


INFO:tensorflow:input_ids: 101 1037 17196 1011 2095 1011 2214 6710 4276 21853 29345 8889 2029 2001 7376 2044 2049 3954 9554 2187 2009 2006 1037 3345 2003 2085 2067 2007 1996 5455 1012 1996 14361 6602 2001 2187 2011 4459 6384 2006 1996 2414 3848 2000 2030 4691 2669 2326 2043 2002 2288 2125 2012 26473 2063 2264 2006 9857 2570 2255 1012 2658 5455 2720 6384 2018 2404 2041 2019 5574 2005 2049 3647 2709 1998 2056 2009 2001 14886 2000 4558 1996 6602 1045 1005 2310 2042 2652 2005 2322 2086 1012 2085 2002 2038 1056 28394 3064 1024 2026 6710 2003 2188 3647 1998 2614 999 4283 2005 1996 10827 2490 1060 1012 2206 2197 3204 1005 1055 5043 2329 3665 2610 2056 2178 2158 2018 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] seventeen people have been arrested on suspicion of being involved with an international human trafficking group as almost 30 women were led to safety . those arrested are between 17 and 50 years of age and include 14 men and three women the metropolitan police said . all are in custody at a central london police station . scotland yard raided properties in red ##bridge have ##ring barking and da ##gen ##ham new ##ham brent ##wood and tower hamlets in the early hours of thursday . those held were arrested suspicion of modern slavery controlling prostitution class a drug offences and section five firearm offences relating to a stung ##un . an additional four warrant ##s were executed simultaneously in romania with one man being attested [SEP]


INFO:tensorflow:input_ids: 101 9171 2111 2031 2042 4727 2006 10928 1997 2108 2920 2007 2019 2248 2529 11626 2177 2004 2471 2382 2308 2020 2419 2000 3808 1012 2216 4727 2024 2090 2459 1998 2753 2086 1997 2287 1998 2421 2403 2273 1998 2093 2308 1996 4956 2610 2056 1012 2035 2024 1999 9968 2012 1037 2430 2414 2610 2276 1012 3885 4220 18784 5144 1999 2417 6374 2031 4892 19372 1998 4830 6914 3511 2047 3511 12895 3702 1998 3578 21631 1999 1996 2220 2847 1997 9432 1012 2216 2218 2020 4727 10928 1997 2715 8864 9756 15016 2465 1037 4319 18421 1998 2930 2274 23646 18421 8800 2000 1037 19280 4609 1012 2019 3176 2176 10943 2015 2020 6472 7453 1999 6339 2007 2028 2158 2108 18470 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] at least 20 people have been hurt in a serious crash in cambridgeshire between a mini ##bus and a car emergency services have said . more than 20 firefighters were at the scene along with ambulance ##s and police . in a statement cambridgeshire police said : multiple people are involved and some are seriously injured . the roads going in either direction at the junction have been closed and motor ##ists are advised to avoid the area . casualties are being taken to add ##en ##brook ##e ' s and hi ##nch ##ing ##brook ##e hospitals a spokesperson for the east of england ambulance service said . police have said the crash happened on the b1 ##0 ##40 some ##rs ##ham road at the junction [SEP]


INFO:tensorflow:input_ids: 101 2012 2560 2322 2111 2031 2042 3480 1999 1037 3809 5823 1999 24197 2090 1037 7163 8286 1998 1037 2482 5057 2578 2031 2056 1012 2062 2084 2322 21767 2020 2012 1996 3496 2247 2007 10771 2015 1998 2610 1012 1999 1037 4861 24197 2610 2056 1024 3674 2111 2024 2920 1998 2070 2024 5667 5229 1012 1996 4925 2183 1999 2593 3257 2012 1996 5098 2031 2042 2701 1998 5013 5130 2024 9449 2000 4468 1996 2181 1012 8664 2024 2108 2579 2000 5587 2368 9697 2063 1005 1055 1998 7632 12680 2075 9697 2063 8323 1037 15974 2005 1996 2264 1997 2563 10771 2326 2056 1012 2610 2031 2056 1996 5823 3047 2006 1996 29491 2692 12740 2070 2869 3511 2346 2012 1996 5098 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 10000 of 20778


INFO:tensorflow:Writing example 20000 of 20778


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from OUTPUT_DIR_NAME/model.ckpt-468


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.



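The log output above prints each example as `tokens`, `input_ids`, `input_mask` and `segment_ids`. The following is a minimal sketch of how such features are typically assembled for a single-sentence input, not the `bert` package's own implementation. The function name `build_features`, the toy vocabulary and the `max_seq_length` value are assumptions made for illustration; the five ids in the toy vocabulary are copied from the start of the logged example, and padding with 0 is assumed.

```python
def build_features(tokens, vocab, max_seq_length):
    """Hypothetical helper: turn word-piece tokens into padded BERT-style features."""
    # Reserve two positions for [CLS] and [SEP], truncating longer inputs.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    # Map each token to its vocabulary id.
    input_ids = [vocab[token] for token in tokens]
    # The mask is 1 for real tokens and 0 for padding.
    input_mask = [1] * len(input_ids)
    # A single-sentence input uses segment 0 throughout.
    segment_ids = [0] * len(input_ids)

    # Pad all three lists with zeros up to max_seq_length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Toy vocabulary (assumption): ids taken from the first tokens of the log above.
toy_vocab = {"[CLS]": 101, "[SEP]": 102, "at": 2012, "least": 2560,
             "20": 2322, "people": 2111}

ids, mask, segments = build_features(
    ["at", "least", "20", "people"], toy_vocab, max_seq_length=8)
print(ids)
print(mask)
print(segments)
```

In the logged example the mask is all ones, which suggests the article was long enough to be truncated to the full sequence length rather than padded.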
In [0]:
predictions

In [0]:
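from google.colab import auth
from googleapiclient.discovery import build

# Authenticate this Colab session so the Drive API can act on the user's behalf.
auth.authenticate_user()
# Build a client for version 3 of the Google Drive API.
drive_service = build('drive', 'v3')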

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect
    from google.appengine.api import memcache
ModuleNotFoundError: No module named 'google.appengine'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module>
    from oauth2client.contrib.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module>
    from oauth2client.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.locked_file'

During handling of the above exception, another exceptio

In [0]:
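import numpy as geek  # only used by the commented-out alternative below

# Write one prediction per line to a local text file that will be uploaded to Drive.
with open('/tmp/to_upload.txt', 'w') as f:
  for each in predictions:
    # Alternative formats that were tried and left disabled:
    #body_data = geek.array_str(each[1])
    #body_data=each[2]
    #body_data = ','.join(body_data)
    body_data = str(each)
    f.write(body_data)
    f.write("\n")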

In [0]:
from googleapiclient.http import MediaFileUpload

# Name and MIME type the file will have on Google Drive.
file_metadata = {
  'name': 'rawSentiTest3.txt',
  'mimeType': 'text/plain'
}
# Wrap the local file for a resumable upload.
media = MediaFileUpload('/tmp/to_upload.txt',
                        mimetype='text/plain',
                        resumable=True)
# Create the file on Drive and request only its ID in the response.
created = drive_service.files().create(body=file_metadata,
                                       media_body=media,
                                       fields='id').execute()
print('File ID: {}'.format(created.get('id')))

File ID: 1nmLCw13Hdy0o8VIrFi7wxhXydKcdmDDr
