# Predicting News Category With BERT in TensorFlow

---

Bidirectional Encoder Representations from Transformers, or BERT for short, is a popular NLP model from Google known for producing state-of-the-art results on a wide variety of NLP tasks.

Natural Language Processing (NLP) plays a central role in Artificial Intelligence. Much of the data available today is in the form of text, so a powerful text processing system is critical.

In this article we implement a multi-class classifier using BERT.

---

##### Pre-Requisites:

* An understanding of BERT
---

## About Dataset

For this article, we will use MachineHack’s Predict The News Category Hackathon data. The data consists of a collection of news articles categorized into four sections. The features of the dataset are as follows:

Size of training set: 7,628 records
Size of test set: 2,748 records

FEATURES:

STORY:  A part of the main content of the article to be published as a piece of news.
SECTION: The genre/category the STORY falls in.

Each story falls into one of four distinct sections, labelled as follows:
Politics: 0
Technology: 1
Entertainment: 2
Business: 3
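
The label scheme above can be captured in a small lookup table. The sketch below is illustrative only: the names `SECTION_TO_ID`, `ID_TO_SECTION`, and `section_name` are hypothetical helpers, not part of the hackathon data or the BERT library.

```python
# Mapping copied from the label list above; the helper names are hypothetical.
SECTION_TO_ID = {
    "Politics": 0,
    "Technology": 1,
    "Entertainment": 2,
    "Business": 3,
}

# Inverse mapping, useful for turning a predicted id back into a section name.
ID_TO_SECTION = {label_id: name for name, label_id in SECTION_TO_ID.items()}


def section_name(label_id):
    """Return the section name for a numeric label id."""
    return ID_TO_SECTION[label_id]


print(section_name(2))  # Entertainment
```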


## Importing Necessary Libraries

In [1]:
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime
from sklearn.model_selection import train_test_split

print("tensorflow version : ", tf.__version__)
print("tensorflow_hub version : ", hub.__version__)

tensorflow version :  1.15.0
tensorflow_hub version :  0.7.0


In [2]:
#Importing BERT modules
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




## Setting The Output Directory
---
While fine-tuning the model, we will save the training checkpoints and the model in an output directory so that we can use the trained model for our predictions later.

The following code block sets an output directory:



In [3]:
# Set the output directory for saving model file
OUTPUT_DIR = 'bert'

## Loading The Data
---
We will now load the data, hold out 20% of it as a test set, and split the remainder into training and validation sets.


In [4]:
data = pd.read_csv("../nlp-for-future-data/reviews1000.csv")
train, test = train_test_split(data, test_size=0.2, random_state=100)

train, val =  train_test_split(train, test_size = 0.2, random_state = 100)

In [5]:
#Training set sample
train

| | Unnamed: 0 | score | review |
|---|---|---|---|
| 581 | 95206 | 1.0 | Just couldn't bring myself to finish this extr... |
| 920 | 407239 | 3.0 | I'll preface this by saying I have not read Al... |
| 113 | 682747 | 5.0 | One may take issue with some of Schaeffer's ph... |
| 657 | 281468 | 2.0 | I listen to audio books frequently.The audio q... |
| 609 | 414007 | 3.0 | I had no idea what this book was about when I ... |
| ... | ... | ... | ... |
| 501 | 235146 | 2.0 | Though I have been a large fan of AD&amp;D for... |
| 491 | 724526 | 5.0 | I read this book many years ago.. Today it rem... |
| 750 | 573046 | 4.0 | It was a brutal time with brutal people living... |
| 906 | 625280 | 5.0 | The downside of any audio book is that it does... |


In [6]:
#Test set sample
test

| | Unnamed: 0 | score | review |
|---|---|---|---|
| 249 | 289062 | 2.0 | Fletcher is a talented writer with a knack for... |
| 353 | 168337 | 2.0 | After just finishing Herbert's Santaroga Barri... |
| 537 | 209421 | 2.0 | I prefer the REAL Madonna; the one she sold to... |
| 424 | 123432 | 1.0 | The Queen of NE brings her completely slanted ... |
| 564 | 200094 | 2.0 | In trying to get books imported to my newly-pu... |
| ... | ... | ... | ... |
| 684 | 126696 | 1.0 | This item was purchased on October 18, and was... |
| 644 | 224084 | 2.0 | Limiting my comments to the audio version of E... |
| 110 | 375914 | 3.0 | While not the first book of the series, this i... |
| 28 | 309657 | 3.0 | Tailchaser's Song was an okay book. Nothing ex... |


In [7]:
print("Training Set Shape :", train.shape)
print("Validation Set Shape :", val.shape)
print("Test Set Shape :", test.shape)

Training Set Shape : (640, 3)
Validation Set Shape : (160, 3)
Test Set Shape : (200, 3)
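
The shapes above follow from applying a 20% hold-out twice. As a rough check, the sketch below assumes the CSV has 1,000 rows (640 + 160 + 200) and that the held-out size is rounded up, which is our understanding of how `train_test_split` treats a fractional `test_size`. The helper `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_rows, holdout_percent):
    """Return (kept, held_out) sizes for one split; held_out is rounded up."""
    held_out = math.ceil(n_rows * holdout_percent / 100)
    return n_rows - held_out, held_out


n_rows = 1000  # assumed total, inferred from the printed shapes
train_val, n_test = split_sizes(n_rows, 20)   # first split: hold out the test set
n_train, n_val = split_sizes(train_val, 20)   # second split: hold out the validation set

print(n_train, n_val, n_test)  # 640 160 200
```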


In [8]:
#Features in the dataset
train.columns

Index(['Unnamed: 0', 'score', 'review'], dtype='object')

In [9]:
#unique classes
list(train.score.unique())

[1.0, 3.0, 5.0, 2.0, 4.0]

In [11]:
DATA_COLUMN = 'review'
LABEL_COLUMN = 'score'
label_list = list(train.score.unique())

## Data Preprocessing

The BERT model accepts only a specific type of input, and datasets are usually structured to have the following four features:

* guid : A unique id that represents an observation.
* text_a : The text we need to classify into given categories
* text_b : Used when training a model to understand the relationship between two sentences; it does not apply to single-text classification problems.
* label: It consists of the labels or classes or categories that a given text belongs to.
 
In our dataset we have text_a and label. The following code block will create objects for each of the above mentioned features for all the records in our dataset using the InputExample class provided in the BERT library.
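
To make the four fields concrete, here is a minimal stand-in written as a dataclass. It is not the library's `InputExample` class; `ExampleSketch` and the two sample rows are made up for illustration, and only the field names follow the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleSketch:
    """Stand-in with the four fields described above (not the BERT class)."""
    guid: Optional[str]
    text_a: str
    text_b: Optional[str] = None
    label: Optional[float] = None


# Made-up rows that mimic the 'review' and 'score' columns.
rows = [
    {"review": "An okay book.", "score": 3.0},
    {"review": "Loved every page.", "score": 5.0},
]

# One example per row; text_b stays None for single-text classification.
examples = [
    ExampleSketch(guid=None, text_a=row["review"], text_b=None, label=row["score"])
    for row in rows
]

print(examples[0].text_a, examples[0].label)
```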


In [12]:
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None,
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

val_InputExamples = val.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

In [13]:
train_InputExamples

581    <bert.run_classifier.InputExample object at 0x...
920    <bert.run_classifier.InputExample object at 0x...
113    <bert.run_classifier.InputExample object at 0x...
657    <bert.run_classifier.InputExample object at 0x...
609    <bert.run_classifier.InputExample object at 0x...
                             ...                        
501    <bert.run_classifier.InputExample object at 0x...
491    <bert.run_classifier.InputExample object at 0x...
750    <bert.run_classifier.InputExample object at 0x...
906    <bert.run_classifier.InputExample object at 0x...
893    <bert.run_classifier.InputExample object at 0x...
Length: 640, dtype: object

In [None]:
print("Row 0 - guid of training set : ", train_InputExamples.iloc[0].guid)
print("\n__________\nRow 0 - text_a of training set : ", train_InputExamples.iloc[0].text_a)
print("\n__________\nRow 0 - text_b of training set : ", train_InputExamples.iloc[0].text_b)
print("\n__________\nRow 0 - label of training set : ", train_InputExamples.iloc[0].label)

We will now get down to business with the pretrained BERT. In this example we will use the `bert_uncased_L-12_H-768_A-12/1` model. To check all available versions click [here](https://tfhub.dev/s?network-architecture=transformer&publisher=google).

We will use the vocab.txt file shipped with the model to map the words in the dataset to indexes. The loaded BERT model was trained on uncased (lowercase) text, so the data we feed to the model should also be lowercase.
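
To make the vocabulary lookup concrete, here is a toy version of a greedy longest-match-first WordPiece split, which is our understanding of what the BERT tokenizer does after lowercasing. The vocabulary below is made up for illustration, and the helpers `wordpiece` and `toy_tokenize` are hypothetical; the real tokenizer also handles punctuation and unusual characters.

```python
# Hypothetical mini-vocabulary; the real vocab.txt is far larger.
TOY_VOCAB = {"become", "to", "acc", "##ust", "##om"}


def wordpiece(word, vocab, unk_token="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a '##' prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def toy_tokenize(text, vocab):
    """Lowercase, split on whitespace, then apply WordPiece to each word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    return tokens


print(toy_tokenize("become Accustom to", TOY_VOCAB))
# ['become', 'acc', '##ust', '##om', 'to']
```

The split of "accustom" into `acc`, `##ust`, `##om` mirrors the tokenized sample printed later in this article.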

---

The following code block loads the pre-trained BERT model and initializes a tokenizer object for tokenizing the texts.


In [15]:

# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


In [16]:
#Here is what the tokenised sample of the first training set observation looks like
print(tokenizer.tokenize(train_InputExamples.iloc[0].text_a))

['just', 'couldn', "'", 't', 'bring', 'myself', 'to', 'finish', 'this', 'extremely', 'disturbing', 'read', '.', 'yes', 'the', 'strange', 'dialect', 'of', 'funky', 'slang', 'is', 'very', 'off', 'putting', 'at', 'first', ',', 'but', 'that', 'is', 'the', 'least', 'of', 'my', 'problems', 'with', 'this', 'book', 'and', 'something', 'a', 'chapter', 'or', 'so', 'in', 'you', 'start', 'to', 'become', 'acc', '##ust', '##om', 'to', '.', 'now', ',', 'by', 'the', 'time', 'i', 'got', 'to', 'chapter', '4', ',', 'wherein', 'main', 'character', '15', 'year', 'old', 'alex', ',', 'rape', '##s', 'two', '10', 'year', 'old', 'girls', '(', 'mind', 'you', 'this', 'takes', 'place', 'the', 'day', 'following', 'a', 'home', 'invasion', 'and', 'gang', 'rape', 'performed', 'by', 'the', 'young', 'narrator', 'and', 'his', 'pal', '##s', ')', 'i', 'was', 'done', '.', 'i', 'know', 'clock', '##work', 'orange', 'is', 'a', 'cult', 'classic', 'and', 'all', ',', 'but', 'this', 'is', 'just', 'not', 'entertainment', ',', 'at',

We will now format our text into the input features that the BERT model expects. We will also set a maximum sequence length, which fixes the length of every input feature.
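
The conversion can be sketched on plain lists. The sketch assumes the special-token ids 101 (`[CLS]`), 102 (`[SEP]`), and 0 (padding) used by the uncased BERT vocabulary; the three token ids in the example and the helper `build_features` are hypothetical and only illustrate truncation, padding, and the two extra arrays.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # assumed ids for [CLS], [SEP], and padding


def build_features(token_ids, max_seq_length):
    """Add [CLS]/[SEP], truncate, and zero-pad a single-text example."""
    token_ids = token_ids[: max_seq_length - 2]   # leave room for [CLS] and [SEP]
    input_ids = [CLS_ID] + token_ids + [SEP_ID]
    input_mask = [1] * len(input_ids)             # 1 marks a real token
    segment_ids = [0] * len(input_ids)            # single text: all zeros
    n_pad = max_seq_length - len(input_ids)
    input_ids += [PAD_ID] * n_pad
    input_mask += [0] * n_pad                     # 0 marks padding
    segment_ids += [0] * n_pad
    return input_ids, input_mask, segment_ids


# Hypothetical token ids for a three-token text, padded to length 8.
ids, mask, segments = build_features([2074, 2481, 3288], max_seq_length=8)
print(ids)       # [101, 2074, 2481, 3288, 102, 0, 0, 0]
print(mask)      # [1, 1, 1, 1, 1, 0, 0, 0]
print(segments)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Texts longer than the limit are cut off, which is why the logged examples later in the article end with `[SEP]` in the middle of a sentence.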

In [None]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128

# Convert our train and validation features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

val_features = bert.run_classifier.convert_examples_to_features(val_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

In [63]:
#Example on first observation in the training set
print("Sentence : ", train_InputExamples.iloc[0].text_a)
print("-"*30)
print("Tokens : ", tokenizer.tokenize(train_InputExamples.iloc[0].text_a))
print("-"*30)
print("Input IDs : ", train_features[0].input_ids)
print("-"*30)
print("Input Masks : ", train_features[0].input_mask)
print("-"*30)
print("Segment IDs : ", train_features[0].segment_ids)

Sentence :  Just couldn't bring myself to finish this extremely disturbing read. Yes the strange dialect of funky slang is very off putting at first, but that is the least of my problems with this book and something a chapter or so in you start to become accustom to. Now, by the time I got to chapter 4, wherein main character 15 year old Alex, rapes two 10 year old girls (mind you this takes place the day following a home invasion and gang rape performed by the young narrator and his pals) I was done. I know Clockwork Orange is a cult classic and all, but this is just not entertainment, at least not in my opinion.
------------------------------
Tokens :  ['just', 'couldn', "'", 't', 'bring', 'myself', 'to', 'finish', 'this', 'extremely', 'disturbing', 'read', '.', 'yes', 'the', 'strange', 'dialect', 'of', 'funky', 'slang', 'is', 'very', 'off', 'putting', 'at', 'first', ',', 'but', 'that', 'is', 'the', 'least', 'of', 'my', 'problems', 'with', 'this', 'book', 'and', 'something', 'a', 'ch

## Creating A Multi-Class Classifier Model


In [64]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  
  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the classification data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
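
The loss computed inside `create_model` is a softmax cross-entropy. The same arithmetic can be sketched on plain Python lists; the helper names below are hypothetical and the logits are made up, so this is an illustration of the formula rather than the model code itself.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def cross_entropy(logits, label_id):
    """Negative log-probability of the true class, via a one-hot mask."""
    one_hot = [1.0 if i == label_id else 0.0 for i in range(len(logits))]
    log_probs = log_softmax(logits)
    return -sum(o * lp for o, lp in zip(one_hot, log_probs))


def predict(logits):
    """Index of the largest log-probability (the predicted label id)."""
    log_probs = log_softmax(logits)
    return max(range(len(log_probs)), key=log_probs.__getitem__)


# Five classes with identical logits: every class gets probability 1/5.
uniform_loss = cross_entropy([0.0] * 5, label_id=2)
print(round(uniform_loss, 4))  # 1.6094
```

A classifier that spreads probability evenly over five classes has a loss of ln 5, about 1.609, which is close to the loss of about 1.64 logged at the first training step.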


In [65]:
#A function that adapts our model to work for training, evaluation, and prediction.

# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

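      # Calculate evaluation metrics.
      # Note: the true/false positive/negative metrics below are binary metrics;
      # with more than two classes they treat label id 0 as negative and all
      # other ids as positive, so read them with caution.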
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        
        return {
            "eval_accuracy": accuracy,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
            }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [66]:
# Hyperparameters for fine-tuning
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 300
SAVE_SUMMARY_STEPS = 100

# Compute train and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)
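
With 640 training examples, the step counts above work out to 60 training steps and 6 warmup steps. The sketch below repeats that arithmetic and adds a hypothetical `learning_rate_at` helper showing linear warmup followed by linear decay to zero, which is our understanding of the schedule used by `bert.optimization.create_optimizer`; treat the schedule shape as an assumption.

```python
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
NUM_TRAIN_EXAMPLES = 640  # training set size printed earlier

# Same formulas as in the cell above.
num_train_steps = int(NUM_TRAIN_EXAMPLES / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)


def learning_rate_at(step):
    """Linear warmup to LEARNING_RATE, then linear decay to zero (a sketch)."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    return LEARNING_RATE * (1 - step / num_train_steps)


print(num_train_steps, num_warmup_steps)  # 60 6
for step in (0, 3, 6, 30, 60):
    print(step, learning_rate_at(step))
```

The 60 steps match the final checkpoint number (`model.ckpt-60`) in the training log.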

In [67]:
#Initializing the model and the estimator
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'bert', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 300, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000017E280A5D88>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}



We will now create an input builder function that takes our training feature set (`train_features`) and produces a generator. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).
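
The batching idea behind such an input function can be sketched with a plain Python generator. The real `input_fn_builder` builds a `tf.data` pipeline rather than a Python generator; the `batches` helper below is hypothetical and only illustrates how `drop_remainder` affects the last, possibly incomplete, batch.

```python
def batches(features, batch_size, drop_remainder=False):
    """Yield consecutive batches; optionally drop a short final batch."""
    for start in range(0, len(features), batch_size):
        batch = features[start:start + batch_size]
        if drop_remainder and len(batch) < batch_size:
            break  # fixed-size batches only, as TPUs require
        yield batch


toy_features = list(range(10))  # stand-in for a list of feature objects

print([len(b) for b in batches(toy_features, 4)])                       # [4, 4, 2]
print([len(b) for b in batches(toy_features, 4, drop_remainder=True)])  # [4, 4]
```

With 640 training examples and a batch size of 32, every batch is full, so `drop_remainder` makes no difference for the training set here.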

In [68]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)

# Create an input function for validating. drop_remainder = True for using TPUs.
val_input_fn = run_classifier.input_fn_builder(
    features=val_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

## Training & Evaluating

In [69]:
#Training the model
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from bert\model.ckpt-0
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into bert\model.ckpt.
INFO:tensorflow:loss = 1.6408644, step = 1
INFO:tensorflow:Saving checkpoints for 60 into bert\model.ckpt.
INFO:tensorflow:Loss for final step: 1.1739153.
Training took time  0:57:26.763550


In [70]:
#Evaluating the model with Validation set
estimator.evaluate(input_fn=val_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2022-10-31T11:10:44Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from bert\model.ckpt-60
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2022-10-31-11:11:45
INFO:tensorflow:Saving dict for global step 60: eval_accuracy = 0.4125, false_negatives = 22.0, false_positives = 8.0, global_step = 60, loss = 1.3597462, true_negatives = 25.0, true_positives = 105.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 60: bert\model.ckpt-60

{'eval_accuracy': 0.4125,
 'false_negatives': 22.0,
 'false_positives': 8.0,
 'loss': 1.3597462,
 'true_negatives': 25.0,
 'true_positives': 105.0,
 'global_step': 60}

After training for 3 epochs (60 steps) on 640 examples, the model reaches an evaluation accuracy of about 41% (0.4125) on the validation set.
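
For a five-class problem, a per-class breakdown is often more informative than binary true/false positive counts. The sketch below computes accuracy and confusion counts on plain lists; the labels and predictions are made up for illustration and are not results from this model, and the helper names are hypothetical.

```python
from collections import Counter


def accuracy(labels, predictions):
    """Fraction of predictions that equal the true label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def confusion_counts(labels, predictions):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(labels, predictions))


# Made-up review scores and predictions, for illustration only.
y_true = [1.0, 3.0, 5.0, 2.0, 5.0, 4.0]
y_pred = [1.0, 2.0, 5.0, 2.0, 4.0, 4.0]

print(round(accuracy(y_true, y_pred), 4))
for (true_label, predicted_label), count in sorted(confusion_counts(y_true, y_pred).items()):
    print(true_label, predicted_label, count)
```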

## Predicting For Test Set

In [106]:
"""Politics: 0
Technology: 1
Entertainment: 2
Business: 3"""

# A method to get predictions
def getPrediction(in_sentences):

  #Transforming the test data into BERT accepted form (label = 1 is only a placeholder required by InputExample)
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 1) for x in in_sentences]
  
  #Creating input features for Test data
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)

  #Predicting the classes 
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predicts = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'],prediction['labels'], label_list[prediction['labels']]) for sentence, prediction in zip(in_sentences, predicts)]
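
Note that the `'probabilities'` entry returned by the model holds log-probabilities (`log_probs`). A minimal sketch of turning them into probabilities and a review score follows; the log-probabilities are made up, the helper `decode_prediction` is hypothetical, and `label_list` is copied from the order printed earlier in this article.

```python
import math

label_list = [1.0, 3.0, 5.0, 2.0, 4.0]  # order printed earlier in the article


def decode_prediction(log_probs, labels):
    """Return the most likely label and its probability."""
    probs = [math.exp(lp) for lp in log_probs]           # undo the log
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax over classes
    return labels[best], probs[best]


# Made-up log-probabilities for one review.
toy_log_probs = [math.log(p) for p in (0.1, 0.2, 0.5, 0.15, 0.05)]

score, confidence = decode_prediction(toy_log_probs, label_list)
print(score, round(confidence, 2))  # 5.0 0.5
```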

In [107]:
pred_sentences = list(test['review'])

In [108]:
# Note: calling as_default() outside a `with` block only returns a context manager;
# it does not change the default graph by itself.
tf.Graph().as_default()

<contextlib._GeneratorContextManager at 0x17e2a66dcc8>

In [109]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 200


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] fletcher is a talented writer with a kn ##ack for finding unique adventures to part ##ake in . he has some excellent earlier work . this book had a lot of promise and i rather enjoyed the first half of the book . i figured if the first 50 % of the river can be an interesting read for me , the grand canyon would surely be a special adventure . during the first half , we are treated to descriptions of beautiful areas , wildlife and so ##jo ##rn ##ing . but vo ##ila , by the time he traveled the grand canyon , all he could talk about was 1 ) the other raft ##ers ( bad and good ) that were floating in [SEP]


INFO:tensorflow:input_ids: 101 10589 2003 1037 10904 3213 2007 1037 14161 8684 2005 4531 4310 7357 2000 2112 13808 1999 1012 2002 2038 2070 6581 3041 2147 1012 2023 2338 2018 1037 2843 1997 4872 1998 1045 2738 5632 1996 2034 2431 1997 1996 2338 1012 1045 6618 2065 1996 2034 2753 1003 1997 1996 2314 2064 2022 2019 5875 3191 2005 2033 1010 1996 2882 8399 2052 7543 2022 1037 2569 6172 1012 2076 1996 2034 2431 1010 2057 2024 5845 2000 13271 1997 3376 2752 1010 6870 1998 2061 5558 6826 2075 1012 2021 29536 11733 1010 2011 1996 2051 2002 6158 1996 2882 8399 1010 2035 2002 2071 2831 2055 2001 1015 1007 1996 2060 21298 2545 1006 2919 1998 2204 1007 2008 2020 8274 1999 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] after just finishing herbert ' s santa ##ro ##ga barrier and jumping straight into this one , i had high expectations , especially after reading others ' reviews . throughout the book i found there was far too much time given to detail - latch ##es , lever ##s , knob ##s , control panels , etc . . . unless you ' re someone who ' s served aboard a nuclear sub , it ' s impossible to paint a mental picture of what was going on half the time in this book . and the ending was . . well , not much of a bang . i was expecting a big ' twist ' - something i ' d never expect , which [SEP]


INFO:tensorflow:input_ids: 101 2044 2074 5131 7253 1005 1055 4203 3217 3654 8803 1998 8660 3442 2046 2023 2028 1010 1045 2018 2152 10908 1010 2926 2044 3752 2500 1005 4391 1012 2802 1996 2338 1045 2179 2045 2001 2521 2205 2172 2051 2445 2000 6987 1011 25635 2229 1010 15929 2015 1010 16859 2015 1010 2491 9320 1010 4385 1012 1012 1012 4983 2017 1005 2128 2619 2040 1005 1055 2366 7548 1037 4517 4942 1010 2009 1005 1055 5263 2000 6773 1037 5177 3861 1997 2054 2001 2183 2006 2431 1996 2051 1999 2023 2338 1012 1998 1996 4566 2001 1012 1012 2092 1010 2025 2172 1997 1037 9748 1012 1045 2001 8074 1037 2502 1005 9792 1005 1011 2242 1045 1005 1040 2196 5987 1010 2029 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] i prefer the real madonna ; the one she sold to us for so many years don ##ning another disguise . she has become one trend ##y dog ##matic woman . gee ##z , reading this book leaves me wondering why she is taking the fame away from other children ' s book writers ? people mind ##lessly go buy her book because of who she is , not due to the substance . the english roses is very white / pri ##ss ##y and preach ##y . give your hard earned cash to marginal , unknown brilliant children ' s book writers that will never have a chance next to madonna . give your kids the gift of learning what true literature really is . [SEP]


INFO:tensorflow:input_ids: 101 1045 9544 1996 2613 11284 1025 1996 2028 2016 2853 2000 2149 2005 2061 2116 2086 2123 5582 2178 14249 1012 2016 2038 2468 2028 9874 2100 3899 12644 2450 1012 20277 2480 1010 3752 2023 2338 3727 2033 6603 2339 2016 2003 2635 1996 4476 2185 2013 2060 2336 1005 1055 2338 4898 1029 2111 2568 10895 2175 4965 2014 2338 2138 1997 2040 2016 2003 1010 2025 2349 2000 1996 9415 1012 1996 2394 10529 2003 2200 2317 1013 26927 4757 2100 1998 25250 2100 1012 2507 2115 2524 3687 5356 2000 14785 1010 4242 8235 2336 1005 1055 2338 4898 2008 2097 2196 2031 1037 3382 2279 2000 11284 1012 2507 2115 4268 1996 5592 1997 4083 2054 2995 3906 2428 2003 1012 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] the queen of ne brings her completely slant ##ed view to the scott peterson case . what is really co ##uri ##ous is how she managed to get 40 , 000 pages of discovery when at the same time the defense only had 27 , 000 pages . and of course , being in a major player in crime ##rata ##in ##ment she is fee to im ##bell ##ish and suppose ( much like the prose ##ction did ) to tell a simple tale of murder . except it ' s a tale of behavior . nothing new here folks , move along . some actual proof anyone committed a crime would be a much better read . [SEP]


INFO:tensorflow:input_ids: 101 1996 3035 1997 11265 7545 2014 3294 27474 2098 3193 2000 1996 3660 12001 2553 1012 2054 2003 2428 2522 9496 3560 2003 2129 2016 3266 2000 2131 2871 1010 2199 5530 1997 5456 2043 2012 1996 2168 2051 1996 3639 2069 2018 2676 1010 2199 5530 1012 1998 1997 2607 1010 2108 1999 1037 2350 2447 1999 4126 14660 2378 3672 2016 2003 7408 2000 10047 17327 4509 1998 6814 1006 2172 2066 1996 12388 7542 2106 1007 2000 2425 1037 3722 6925 1997 4028 1012 3272 2009 1005 1055 1037 6925 1997 5248 1012 2498 2047 2182 12455 1010 2693 2247 1012 2070 5025 6947 3087 5462 1037 4126 2052 2022 1037 2172 2488 3191 1012 102 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] in trying to get books imported to my newly - purchased kind ##le , this one made it to the kind ##le but was not really a title i wanted . [SEP]


INFO:tensorflow:input_ids: 101 1999 2667 2000 2131 2808 10964 2000 2026 4397 1011 4156 2785 2571 1010 2023 2028 2081 2009 2000 1996 2785 2571 2021 2001 2025 2428 1037 2516 1045 2359 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 0)


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from bert\model.ckpt-60


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.
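
The logged features above follow a fixed layout: `input_ids` is padded with zeros up to a maximum sequence length, `input_mask` marks real tokens with 1 and padding with 0, and `segment_ids` stays all zeros because each example is a single sentence. The following is a minimal sketch of that layout, not the BERT library's own conversion code; `build_features` and `max_seq_length` are hypothetical names chosen for illustration, and the short id list reuses the first few ids of the last logged example, closed with `[SEP]` (102).

```python
# Sketch of BERT-style padding for a single-sentence input.
# `build_features` is a hypothetical helper, not part of the bert package.
def build_features(token_ids, max_seq_length=128):
    """Pad token ids to a fixed length and build the matching mask and segment ids."""
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence longer than max_seq_length")
    pad = max_seq_length - len(token_ids)
    input_ids = list(token_ids) + [0] * pad        # 0 is used as the padding id
    input_mask = [1] * len(token_ids) + [0] * pad  # 1 = real token, 0 = padding
    segment_ids = [0] * max_seq_length             # single sentence: all segment 0
    return input_ids, input_mask, segment_ids


# "[CLS] in trying to get books [SEP]" using ids seen in the log above.
ids = [101, 1999, 2667, 2000, 2131, 2808, 102]
input_ids, input_mask, segment_ids = build_features(ids, max_seq_length=10)
print(input_ids)
print(input_mask)
print(segment_ids)
```

A small `max_seq_length` is used here only to keep the printed lists short.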


In [110]:
# Split each prediction tuple into the encoded label (position 2)
# and the actual label (position 3).
enc_labels = [p[2] for p in predictions]
act_labels = [p[3] for p in predictions]

In [117]:
from sklearn.metrics import accuracy_score, mean_squared_error

print("     MSE:", mean_squared_error(test.score, act_labels))
print("Accuracy:", accuracy_score(test.score, act_labels))

     MSE: 1.775
Accuracy: 0.435
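
To make the two numbers concrete, here is a minimal pure-Python sketch of what `accuracy_score` and `mean_squared_error` compute. It assumes the `score` column holds ordinal ratings, so a prediction that is off by several points is penalized more by MSE than by accuracy, which only counts exact matches. The helper names and the toy labels are made up for illustration and are not the article's data.

```python
# Hypothetical helpers mirroring the definitions of the two metrics.
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy ratings, invented for this example.
toy_true = [1, 2, 3, 4, 5, 5]
toy_pred = [1, 2, 5, 4, 1, 4]

print("     MSE:", mse(toy_true, toy_pred))
print("Accuracy:", accuracy(toy_true, toy_pred))
```

In this toy case half of the predictions are exact, but the single prediction that is four points off contributes most of the squared error.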


# Reference:
Most of the code is adapted from the following notebook:

* https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb

