<a href="https://colab.research.google.com/github/SpencerPao/Natural-Language-Processing/blob/main/BERT/BERT_Code_Implementation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# [Clone Repository](https://github.com/google-research/bert) from Google Research
- The repository contains everything you need to get started with training and using BERT.
- A good chunk of this notebook is adapted from [this notebook](https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb).

In [None]:
!git clone https://github.com/google-research/bert.git

Cloning into 'bert'...
remote: Enumerating objects: 340, done.
remote: Total 340 (delta 0), reused 0 (delta 0), pack-reused 340
Receiving objects: 100% (340/340), 328.28 KiB | 8.42 MiB/s, done.
Resolving deltas: 100% (182/182), done.


In [None]:
# We want to use 1.X
% tensorflow_version 1.X

`%tensorflow_version` only switches the major version: 1.x or 2.x.
You set: `1.X`. This will be interpreted as: `1.x`.


TensorFlow 1.x selected.


In [None]:
# Load in our models and important packages.
from sklearn.model_selection import train_test_split
import tensorflow_hub as hub
import numpy as np
import pandas as pd
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import tensorflow as tf
from datetime import datetime

In [None]:
print(tf.__version__)

1.15.2


In [None]:
!pip install bert-tensorflow

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting bert-tensorflow
  Downloading bert_tensorflow-1.0.4-py2.py3-none-any.whl (64 kB)
     |████████████████████████████████| 64 kB 2.9 MB/s
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.4


In [None]:
import sys
sys.path.append("/content/bert")

In [None]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




# Let's get a base BERT Model

In [None]:
# We'll use the uncased base model, since our text is already lowercased (the uncased model lowercases all incoming raw text).
# model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
# tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Similarly, we can get the model straight from the source.


# Download the BERT base model.

# Instructions for getting the raw model weights:

# !curl https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip --output uncased_L-12_H-768_A-12.zip
# !unzip uncased_L-12_H-768_A-12.zip

# If you are on Windows and working on a local machine:
# !python -m wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
# import zipfile
# with zipfile.ZipFile('uncased_L-12_H-768_A-12.zip', 'r') as zip_ref:
#     zip_ref.extractall('bert/content/') # extract the files to this location.

In [None]:
# Read in the Twitter sentiment data (target is NEGATIVE or POSITIVE).
# The text has already been cleaned and preprocessed.
df = pd.read_csv('twitter_data.csv')
# Shuffle the rows and discard the old index.
df = df.sample(frac=1).reset_index(drop=True)
df

Unnamed: 0,target,ids,date,flag,user,text
0,POSITIVE,1557045453,Sun Apr 19 01:42:30 PDT 2009,NO_QUERY,prafulh,hamilton strategy get way top see gravel nice ...
1,NEGATIVE,1551515456,Sat Apr 18 09:15:40 PDT 2009,NO_QUERY,mellen59,good morning 5 30 sleep
2,NEGATIVE,1573640524,Tue Apr 21 02:08:05 PDT 2009,NO_QUERY,natashasaurus,yay one else updating anything
3,POSITIVE,1559874025,Sun Apr 19 12:16:06 PDT 2009,NO_QUERY,JessieValentine,cookie sam tomorrow see wait papa roach concert
4,NEGATIVE,1678266018,Sat May 02 06:07:37 PDT 2009,NO_QUERY,sidSicklePowers,sleepy voice hoarse
...,...,...,...,...,...,...
99995,POSITIVE,1468802525,Tue Apr 07 03:48:52 PDT 2009,NO_QUERY,yoritomo_reiko,okay work afternoon break actually fairly good...
99996,POSITIVE,1469395244,Tue Apr 07 06:14:54 PDT 2009,NO_QUERY,thedudeims,today madame beautiful
99997,NEGATIVE,1556930444,Sun Apr 19 01:04:20 PDT 2009,NO_QUERY,camfitz,dang study break almost already
99998,POSITIVE,1556551070,Sat Apr 18 23:17:32 PDT 2009,NO_QUERY,Kateless,coming tonight


In [None]:
df.target.unique()

array(['POSITIVE', 'NEGATIVE'], dtype=object)

In [None]:
# Encode the string labels as integers for the classifier.
decode_map = {"NEGATIVE": 0, "POSITIVE": 1}
def decode_sentiment(label):
    return decode_map[label]
df.target = df.target.apply(decode_sentiment)

In [None]:
df

Unnamed: 0,target,ids,date,flag,user,text
0,1,1557045453,Sun Apr 19 01:42:30 PDT 2009,NO_QUERY,prafulh,hamilton strategy get way top see gravel nice ...
1,0,1551515456,Sat Apr 18 09:15:40 PDT 2009,NO_QUERY,mellen59,good morning 5 30 sleep
2,0,1573640524,Tue Apr 21 02:08:05 PDT 2009,NO_QUERY,natashasaurus,yay one else updating anything
3,1,1559874025,Sun Apr 19 12:16:06 PDT 2009,NO_QUERY,JessieValentine,cookie sam tomorrow see wait papa roach concert
4,0,1678266018,Sat May 02 06:07:37 PDT 2009,NO_QUERY,sidSicklePowers,sleepy voice hoarse
...,...,...,...,...,...,...
99995,1,1468802525,Tue Apr 07 03:48:52 PDT 2009,NO_QUERY,yoritomo_reiko,okay work afternoon break actually fairly good...
99996,1,1469395244,Tue Apr 07 06:14:54 PDT 2009,NO_QUERY,thedudeims,today madame beautiful
99997,0,1556930444,Sun Apr 19 01:04:20 PDT 2009,NO_QUERY,camfitz,dang study break almost already
99998,1,1556551070,Sat Apr 18 23:17:32 PDT 2009,NO_QUERY,Kateless,coming tonight


In [None]:
# Replace any nulls with an empty string, then confirm that none remain.
df["text"] = df["text"].fillna("")
df['text'].isnull().values.any()

False

In [None]:
# Remember that I did a lot of pre-cleaning to get the dataframe into this state. If you are curious how I did that,
# the Sentiment Analysis video is here: https://www.youtube.com/watch?v=CzRrD76pnVY&t=785s
df.target.value_counts()

1    50000
0    50000
Name: target, dtype: int64

In [None]:
train = df.sample(frac=0.8, random_state=200)  # random_state is a seed value
test = df.drop(train.index)

# We need to convert the Training and Testing Data into a BERT Format
More information can be found in the [Google AI example](https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb), from which I used a good chunk of code.
- guid is an ID for each row
- text_a is the text we want to classify
- text_b is an optional second text, used only for sentence-pair tasks such as QA; we leave it empty here
- label is the class of our example (0/1, or true/false)
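
As a rough illustration of these four fields, the sketch below uses a stand-in class rather than the real `bert.run_classifier.InputExample`. The name `ExampleSketch` is hypothetical, and the sentence-pair text is made up purely to show where `text_b` would be used.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExampleSketch:
    """Hypothetical stand-in holding the four fields described above."""
    guid: Optional[str]
    text_a: str
    text_b: Optional[str] = None
    label: Optional[int] = None

# Single-sentence classification (this notebook): text_b stays empty.
tweet = ExampleSketch(guid=None, text_a="sleepy voice hoarse", label=0)

# Sentence-pair task (made-up data): text_b carries the second sequence.
pair = ExampleSketch(guid="pair-0",
                     text_a="who wrote hamlet",
                     text_b="hamlet was written by shakespeare",
                     label=1)

print(tweet)
print(pair.text_b)
```

For sentiment classification only `text_a` and `label` matter, which is why the cells below pass `text_b = None`.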

In [None]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


In [None]:
# With that in mind, what does our tokenizer do? Let's look at an example.
tokenizer.tokenize("Make sure to like and subscribe!") # lowercases the text and splits it into word pieces found in the vocab_file

['make', 'sure', 'to', 'like', 'and', 'sub', '##scribe', '!']
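
The `##` prefix in the output marks a piece that continues the previous word. Below is a minimal sketch of the greedy longest-match-first splitting that WordPiece tokenizers use, assuming a tiny made-up vocabulary. `TOY_VOCAB` and `wordpiece_sketch` are illustrative names and are not part of the `bert` package; the real tokenizer also handles punctuation, whitespace and text normalization.

```python
# Toy vocabulary for illustration only; the real vocab file is far larger.
TOY_VOCAB = {"make", "sure", "to", "like", "and", "sub", "##scribe", "!", "[UNK]"}

def wordpiece_sketch(word, vocab=TOY_VOCAB, unk="[UNK]"):
    """Split one lowercased word into the longest matching vocabulary pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces

print(wordpiece_sketch("subscribe"))
print(wordpiece_sketch("like"))
```

With this toy vocabulary, `subscribe` is not a single entry, so it is split into `sub` and `##scribe`, mirroring the tokenizer output above.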

### BERT can't just accept raw text. We have to:
- Add special tokens to mark the start of the input and the end of each sentence
- Pass sequences of equal length by padding the shorter ones
    - The input mask uses zeros for pad tokens and ones for real tokens
- Lowercase the text (if using the uncased model)
- Tokenize words
- Break words into pieces where applicable
- Map the pieces to indices using the vocab file (downloaded with the model zip)
```
# for additional details run this:
!curl https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip --output uncased_L-12_H-768_A-12.zip
!unzip uncased_L-12_H-768_A-12.zip
```



### Important Tokens and Features for BERT
- Separator: ([SEP],102) - The marker for the end of a sentence
- Classification: ([CLS],101) - Added to the beginning of every input; the classifier reads BERT's output at this position
    - Add a linear layer at the end of the model if you have a regression task
- Padding: ([PAD],0)
- Unknown tokens: ([UNK],100)
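
To make the roles of these tokens concrete, here is a minimal sketch of how one single-sentence example becomes the fixed-length arrays that appear in the conversion log further down. The helper name `build_features_sketch` is hypothetical; the word-piece ids are copied from that log for the tweet `sleepy voice hoarse`, and a short length of 8 is used instead of 128 to keep the output readable. In the notebook itself this work is done by `convert_examples_to_features`.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids listed above for the uncased vocab

def build_features_sketch(piece_ids, max_seq_length):
    """Wrap word-piece ids in [CLS]/[SEP], then pad to a fixed length."""
    # Leave room for the two special tokens, truncating long inputs.
    piece_ids = piece_ids[: max_seq_length - 2]
    input_ids = [CLS_ID] + piece_ids + [SEP_ID]
    input_mask = [1] * len(input_ids)     # 1 = real token
    segment_ids = [0] * len(input_ids)    # single sentence, so every position is segment 0
    padding = max_seq_length - len(input_ids)
    input_ids += [PAD_ID] * padding
    input_mask += [0] * padding           # 0 = padding
    segment_ids += [0] * padding
    return input_ids, input_mask, segment_ids

ids, mask, segments = build_features_sketch([17056, 2376, 21221], max_seq_length=8)
print(ids)   # [101, 17056, 2376, 21221, 102, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```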

In [None]:
# This block transforms the raw text into a data format that BERT understands.
DATA_COLUMN = 'text'
LABEL_COLUMN = 'target'
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

In [None]:
label_list = [0,1]
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 80000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] super mega bored sunday prospect yu ##m food really suck [SEP]


INFO:tensorflow:input_ids: 101 3565 13164 11471 4465 9824 9805 2213 2833 2428 11891 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] got nicole amp kayla ti ##x twilight convention girls day [SEP]


INFO:tensorflow:input_ids: 101 2288 9851 23713 26491 14841 2595 13132 4680 3057 2154 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] wait amp see game fall asleep amp might might wake [SEP]


INFO:tensorflow:input_ids: 101 3524 23713 2156 2208 2991 6680 23713 2453 2453 5256 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] om ##g im baking cookies 3 00 morning mu ##aha ##ha ##ha reason can ##t get sleep [SEP]


INFO:tensorflow:input_ids: 101 18168 2290 10047 21522 16324 1017 4002 2851 14163 23278 3270 3270 3114 2064 2102 2131 3637 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] always wanted say quo ##t ri ##vet ##ing quo ##t something ri ##vet ##ing according new york times [SEP]


INFO:tensorflow:input_ids: 101 2467 2359 2360 22035 2102 15544 19510 2075 22035 2102 2242 15544 19510 2075 2429 2047 2259 2335 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 80000


INFO:tensorflow:Writing example 20000 of 80000


INFO:tensorflow:Writing example 30000 of 80000


INFO:tensorflow:Writing example 40000 of 80000


INFO:tensorflow:Writing example 50000 of 80000


INFO:tensorflow:Writing example 60000 of 80000


INFO:tensorflow:Writing example 70000 of 80000


INFO:tensorflow:Writing example 0 of 20000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] cookie sam tomorrow see wait papa roach concert [SEP]


INFO:tensorflow:input_ids: 101 17387 3520 4826 2156 3524 13008 20997 4164 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] sleepy voice hoarse [SEP]


INFO:tensorflow:input_ids: 101 17056 2376 21221 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] alma ##vi ##va poor brain sat sun [SEP]


INFO:tensorflow:input_ids: 101 11346 5737 3567 3532 4167 2938 3103 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] better admit tho o ##oo ##oo ##oo ##oo win one mw ##ha ##ha ##ha vote [SEP]


INFO:tensorflow:input_ids: 101 2488 6449 27793 1051 9541 9541 9541 9541 2663 2028 12464 3270 3270 3270 3789 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] love bubble ##t ##wee ##ts wish could 3 day [SEP]


INFO:tensorflow:input_ids: 101 2293 11957 2102 28394 3215 4299 2071 1017 2154 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 10000 of 20000
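
The logged example shows how each tokenized sentence becomes three fixed-length vectors. The following is a minimal sketch of that padding step, not the code from `run_classifier`; the helper name `pad_features` is hypothetical, and a maximum sequence length of 128 is an assumption based on the length of the logged vectors.

```python
# Minimal sketch of the padding step; max_seq_length=128 is an assumption.
def pad_features(token_ids, max_seq_length=128):
    """Pad token ids with zeros and build the matching mask and segment ids."""
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence longer than max_seq_length")
    pad = max_seq_length - len(token_ids)
    input_ids = token_ids + [0] * pad
    input_mask = [1] * len(token_ids) + [0] * pad  # 1 = real token, 0 = padding
    segment_ids = [0] * max_seq_length             # single-sentence input: all zeros
    return input_ids, input_mask, segment_ids

# Ids logged for "[CLS] love bubble ##t ##wee ##ts wish could 3 day [SEP]"
ids = [101, 2293, 11957, 2102, 28394, 3215, 4299, 2071, 1017, 2154, 102]
input_ids, input_mask, segment_ids = pad_features(ids)
print(len(input_ids), sum(input_mask))
```

The mask marks the 11 real tokens, so the model can ignore the padded positions.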


# Fine-tuning the model!
Steps:
- Load the BERT TF Hub module
- Add a single classification layer on top of BERT's pooled output
- Fine-tune on our own data

In [None]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own output layer to tune for the classification task.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting.
    # Note: as written, it is applied in every mode, including eval and predict.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
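
To make the arithmetic of the output layer concrete, here is a minimal NumPy sketch of the same steps: a linear layer on the pooled output, `log_softmax`, and a one-hot cross-entropy loss. It is an illustration under stated assumptions, not the TensorFlow graph itself: the batch size, hidden size, and random inputs are hypothetical, and dropout is left out.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def classification_head(pooled, weights, bias, labels):
    """Linear layer + log-softmax + mean cross-entropy, mirroring create_model."""
    logits = pooled @ weights.T + bias               # shape: [batch, num_labels]
    log_probs = log_softmax(logits)
    one_hot = np.eye(weights.shape[0])[labels]       # one-hot encode the labels
    per_example_loss = -(one_hot * log_probs).sum(axis=-1)
    predicted = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted, log_probs

rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 8))                     # hypothetical: batch of 4, hidden size 8
weights = rng.normal(scale=0.02, size=(2, 8))        # small random init, 2 labels
bias = np.zeros(2)
labels = np.array([0, 1, 1, 0])

loss, preds, log_probs = classification_head(pooled, weights, bias, labels)
print(loss, preds)
```

With a freshly initialized two-class head, the loss sits close to ln 2 ≈ 0.693, which is in line with the loss of about 0.71 logged at step 0 of training.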



In [None]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for tf.estimator.Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for tf.estimator.Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

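      # Note: despite the key name, 'probabilities' holds log-probabilities
      # (the log_softmax output of create_model).
      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }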
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


# Configuration Block

In [None]:
OUTPUT_DIR = "bert/output/"
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
# Compute # train and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
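
As a rough illustration of what these settings imply, the sketch below computes the step counts and a warmup-then-decay learning rate. Two assumptions are made here: the training set has 80,000 examples (which is what the 7,500 training steps in the log imply for batch size 32 and 3 epochs), and the schedule is linear warmup followed by linear decay to zero, which is my understanding of `bert.optimization.create_optimizer`. The function `learning_rate_at` is hypothetical and not part of the BERT repository.

```python
# Assumption: 80,000 training examples, consistent with the 7,500 steps in the log.
NUM_TRAIN_EXAMPLES = 80_000
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
LEARNING_RATE = 2e-5

num_train_steps = int(NUM_TRAIN_EXAMPLES / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

def learning_rate_at(step):
    """Linear warmup to LEARNING_RATE, then linear decay to zero (sketch)."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    step = min(step, num_train_steps)
    return LEARNING_RATE * (1.0 - step / num_train_steps)

print(num_train_steps, num_warmup_steps)
for step in (0, 375, 750, 3750, 7500):
    print(step, learning_rate_at(step))
```

Under these assumptions the learning rate climbs for the first 750 steps and then falls steadily until training ends.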

In [None]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [None]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})

INFO:tensorflow:Using config: {'_model_dir': 'bert/output/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f7c37cb8250>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}




In [None]:
# Create an input function for training. Set drop_remainder=True only when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
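
To show what `drop_remainder` changes, here is a small pure-Python sketch of the resulting batch sizes. The helper `batch_sizes` is hypothetical and only illustrates the batching behaviour; it is not how `input_fn_builder` is implemented.

```python
def batch_sizes(num_examples, batch_size, drop_remainder):
    """Return the size of each batch produced from num_examples items."""
    full_batches, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_remainder:
        sizes.append(remainder)  # keep the final, smaller batch
    return sizes

# Hypothetical dataset of 100 examples with a batch size of 32.
print(batch_sizes(100, 32, drop_remainder=False))
print(batch_sizes(100, 32, drop_remainder=True))
```

TPUs need fixed batch shapes, which is why the partial final batch is dropped there; on CPU or GPU it can be kept so that no examples are skipped.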

In [None]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

Instructions for updating:
Deprecated in favor of operator or tf.math.divide.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into bert/output/model.ckpt.
INFO:tensorflow:loss = 0.71315, step = 0
INFO:tensorflow:global_step/sec: 1.02003
INFO:tensorflow:loss = 0.64536226, step = 100 (98.042 sec)
INFO:tensorflow:global_step/sec: 1.18951
INFO:tensorflow:loss = 0.5008812, step = 200 (84.069 sec)
INFO:tensorflow:global_step/sec: 1.19284
INFO:tensorflow:loss = 0.5338428, step = 300 (83.834 sec)
INFO:tensorflow:global_step/sec: 1.19356
INFO:tensorflow:loss = 0.45169708, step = 400 (83.785 sec)
INFO:tensorflow:Saving checkpoints for 500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10828
INFO:tensorflow:loss = 0.6212359, step = 500 (90.230 sec)
INFO:tensorflow:global_step/sec: 1.19228
INFO:tensorflow:loss = 0.4510832, step = 600 (83.871 sec)
INFO:tensorflow:global_step/sec: 1.19273
INFO:tensorflow:loss = 0.5331078, step = 700 (83.843 sec)
INFO:tensorflow:global_step/sec: 1.19556
INFO:tensorflow:loss = 0.40050077, step = 800 (83.643 sec)
INFO:tensorflow:global_step/sec: 1.19334
INFO:tensorflow:loss = 0.6152002, step = 900 (83.797 sec)
INFO:tensorflow:Saving checkpoints for 1000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10908
INFO:tensorflow:loss = 0.5020639, step = 1000 (90.166 sec)
INFO:tensorflow:global_step/sec: 1.19244
INFO:tensorflow:loss = 0.33082983, step = 1100 (83.863 sec)
INFO:tensorflow:global_step/sec: 1.19302
INFO:tensorflow:loss = 0.5725663, step = 1200 (83.818 sec)
INFO:tensorflow:global_step/sec: 1.1957
INFO:tensorflow:loss = 0.4809195, step = 1300 (83.636 sec)
INFO:tensorflow:global_step/sec: 1.19551
INFO:tensorflow:loss = 0.34697297, step = 1400 (83.643 sec)
INFO:tensorflow:Saving checkpoints for 1500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.1106
INFO:tensorflow:loss = 0.38781005, step = 1500 (90.041 sec)
INFO:tensorflow:global_step/sec: 1.19328
INFO:tensorflow:loss = 0.41235858, step = 1600 (83.803 sec)
INFO:tensorflow:global_step/sec: 1.19249
INFO:tensorflow:loss = 0.2820285, step = 1700 (83.858 sec)
INFO:tensorflow:global_step/sec: 1.19638
INFO:tensorflow:loss = 0.49254096, step = 1800 (83.589 sec)
INFO:tensorflow:global_step/sec: 1.19468
INFO:tensorflow:loss = 0.3990028, step = 1900 (83.703 sec)
INFO:tensorflow:Saving checkpoints for 2000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11034
INFO:tensorflow:loss = 0.45510235, step = 2000 (90.061 sec)
INFO:tensorflow:global_step/sec: 1.19296
INFO:tensorflow:loss = 0.5877873, step = 2100 (83.827 sec)
INFO:tensorflow:global_step/sec: 1.19377
INFO:tensorflow:loss = 0.3703716, step = 2200 (83.769 sec)
INFO:tensorflow:global_step/sec: 1.19452
INFO:tensorflow:loss = 0.35837942, step = 2300 (83.716 sec)
INFO:tensorflow:global_step/sec: 1.19309
INFO:tensorflow:loss = 0.4385148, step = 2400 (83.813 sec)
INFO:tensorflow:Saving checkpoints for 2500 into bert/output/model.ckpt.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
INFO:tensorflow:global_step/sec: 1.10794
INFO:tensorflow:loss = 0.5829838, step = 2500 (90.260 sec)
INFO:tensorflow:global_step/sec: 1.19334
INFO:tensorflow:loss = 0.4515376, step = 2600 (83.798 sec)
INFO:tensorflow:global_step/sec: 1.19269
INFO:tensorflow:loss = 0.43112049, step = 2700 (83.842 sec)
INFO:tensorflow:global_step/sec: 1.19504
INFO:tensorflow:loss = 0.29153055, step = 2800 (83.682 sec)
INFO:tensorflow:global_step/sec: 1.19244
INFO:tensorflow:loss = 0.4544673, step = 2900 (83.863 sec)
INFO:tensorflow:Saving checkpoints for 3000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10921
INFO:tensorflow:loss = 0.51641184, step = 3000 (90.154 sec)
INFO:tensorflow:global_step/sec: 1.19242
INFO:tensorflow:loss = 0.2816838, step = 3100 (83.863 sec)
INFO:tensorflow:global_step/sec: 1.19282
INFO:tensorflow:loss = 0.25495774, step = 3200 (83.836 sec)
INFO:tensorflow:global_step/sec: 1.19402
INFO:tensorflow:loss = 0.116794735, step = 3300 (83.748 sec)
INFO:tensorflow:global_step/sec: 1.19403
INFO:tensorflow:loss = 0.32651222, step = 3400 (83.754 sec)
INFO:tensorflow:Saving checkpoints for 3500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10908
INFO:tensorflow:loss = 0.42072883, step = 3500 (90.163 sec)
INFO:tensorflow:global_step/sec: 1.19268
INFO:tensorflow:loss = 0.31767565, step = 3600 (83.844 sec)
INFO:tensorflow:global_step/sec: 1.19324
INFO:tensorflow:loss = 0.43696582, step = 3700 (83.804 sec)
INFO:tensorflow:global_step/sec: 1.19534
INFO:tensorflow:loss = 0.2561726, step = 3800 (83.658 sec)
INFO:tensorflow:global_step/sec: 1.19343
INFO:tensorflow:loss = 0.15342641, step = 3900 (83.789 sec)
INFO:tensorflow:Saving checkpoints for 4000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11221
INFO:tensorflow:loss = 0.44058436, step = 4000 (89.915 sec)
INFO:tensorflow:global_step/sec: 1.19196
INFO:tensorflow:loss = 0.24959008, step = 4100 (83.895 sec)
INFO:tensorflow:global_step/sec: 1.19225
INFO:tensorflow:loss = 0.31171513, step = 4200 (83.876 sec)
INFO:tensorflow:global_step/sec: 1.19515
INFO:tensorflow:loss = 0.21818657, step = 4300 (83.671 sec)
INFO:tensorflow:global_step/sec: 1.19548
INFO:tensorflow:loss = 0.23498183, step = 4400 (83.649 sec)
INFO:tensorflow:Saving checkpoints for 4500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11003
INFO:tensorflow:loss = 0.26109004, step = 4500 (90.090 sec)
INFO:tensorflow:global_step/sec: 1.19278
INFO:tensorflow:loss = 0.2543116, step = 4600 (83.833 sec)
INFO:tensorflow:global_step/sec: 1.19291
INFO:tensorflow:loss = 0.29457057, step = 4700 (83.831 sec)
INFO:tensorflow:global_step/sec: 1.19495
INFO:tensorflow:loss = 0.27470738, step = 4800 (83.686 sec)
INFO:tensorflow:global_step/sec: 1.19426
INFO:tensorflow:loss = 0.21253696, step = 4900 (83.734 sec)
INFO:tensorflow:Saving checkpoints for 5000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11168
INFO:tensorflow:loss = 0.28383166, step = 5000 (89.958 sec)
INFO:tensorflow:global_step/sec: 1.19345
INFO:tensorflow:loss = 0.37911904, step = 5100 (83.790 sec)
INFO:tensorflow:global_step/sec: 1.19349
INFO:tensorflow:loss = 0.08726579, step = 5200 (83.784 sec)
INFO:tensorflow:global_step/sec: 1.19428
INFO:tensorflow:loss = 0.18906696, step = 5300 (83.735 sec)
INFO:tensorflow:global_step/sec: 1.19552
INFO:tensorflow:loss = 0.14693402, step = 5400 (83.644 sec)
INFO:tensorflow:Saving checkpoints for 5500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10711
INFO:tensorflow:loss = 0.24884568, step = 5500 (90.323 sec)
INFO:tensorflow:global_step/sec: 1.19279
INFO:tensorflow:loss = 0.03818196, step = 5600 (83.839 sec)
INFO:tensorflow:global_step/sec: 1.19453
INFO:tensorflow:loss = 0.16148572, step = 5700 (83.713 sec)
INFO:tensorflow:global_step/sec: 1.19304
INFO:tensorflow:loss = 0.25675732, step = 5800 (83.817 sec)
INFO:tensorflow:global_step/sec: 1.19281
INFO:tensorflow:loss = 0.06425087, step = 5900 (83.836 sec)
INFO:tensorflow:Saving checkpoints for 6000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11084
INFO:tensorflow:loss = 0.43556958, step = 6000 (90.025 sec)
INFO:tensorflow:global_step/sec: 1.19084
INFO:tensorflow:loss = 0.009012176, step = 6100 (83.977 sec)
INFO:tensorflow:global_step/sec: 1.19333
INFO:tensorflow:loss = 0.10656994, step = 6200 (83.799 sec)
INFO:tensorflow:global_step/sec: 1.19559
INFO:tensorflow:loss = 0.18846416, step = 6300 (83.638 sec)
INFO:tensorflow:global_step/sec: 1.19512
INFO:tensorflow:loss = 0.12524271, step = 6400 (83.672 sec)
INFO:tensorflow:Saving checkpoints for 6500 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.10986
INFO:tensorflow:loss = 0.0096502565, step = 6500 (90.103 sec)
INFO:tensorflow:global_step/sec: 1.19244
INFO:tensorflow:loss = 0.09328323, step = 6600 (83.864 sec)
INFO:tensorflow:global_step/sec: 1.19252
INFO:tensorflow:loss = 0.016556237, step = 6700 (83.855 sec)
INFO:tensorflow:global_step/sec: 1.19323
INFO:tensorflow:loss = 0.0172479, step = 6800 (83.806 sec)
INFO:tensorflow:global_step/sec: 1.19606
INFO:tensorflow:loss = 0.10282295, step = 6900 (83.608 sec)
INFO:tensorflow:Saving checkpoints for 7000 into bert/output/model.ckpt.
INFO:tensorflow:global_step/sec: 1.11007
INFO:tensorflow:loss = 0.06020329, step = 7000 (90.082 sec)
INFO:tensorflow:global_step/sec: 1.19167
INFO:tensorflow:loss = 0.037838593, step = 7100 (83.917 sec)
INFO:tensorflow:global_step/sec: 1.19362
INFO:tensorflow:loss = 0.01255431, step = 7200 (83.782 sec)
INFO:tensorflow:global_step/sec: 1.19534
INFO:tensorflow:loss = 0.16215014, step = 7300 (83.657 sec)
INFO:tensorflow:global_step/sec: 1.1939
INFO:tensorflow:loss = 0.16918188, step = 7400 (83.759 sec)
INFO:tensorflow:Saving checkpoints for 7500 into bert/output/model.ckpt.
INFO:tensorflow:Loss for final step: 0.090980425.
Training took time  1:48:05.261179


# Testing

In [None]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [None]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2022-07-04T23:01:18Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from bert/output/model.ckpt-7500
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2022-07-04-23:04:28
INFO:tensorflow:Saving dict for global step 7500: auc = 0.7682979, eval_accuracy = 0.7682, f1_score = 0.7667773, false_negatives = 2459.0, false_positives = 2177.0, global_step = 7500, loss = 0.8033782, precision = 0.7778118, recall = 0.7560516, true_negatives = 7743.0, true_positives = 7621.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 7500: bert/output/model.ckpt-7500

{'auc': 0.7682979,
 'eval_accuracy': 0.7682,
 'f1_score': 0.7667773,
 'false_negatives': 2459.0,
 'false_positives': 2177.0,
 'global_step': 7500,
 'loss': 0.8033782,
 'precision': 0.7778118,
 'recall': 0.7560516,
 'true_negatives': 7743.0,
 'true_positives': 7621.0}
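
As a quick sanity check, the reported precision, recall, F1 and accuracy can be recomputed from the four confusion-matrix counts in the evaluation output above. This is only a minimal sketch using the standard binary-classification formulas; it does not call the estimator, and the variable names are chosen here for illustration.

```python
# Confusion-matrix counts copied from the evaluation output above.
tp, tn, fp, fn = 7621.0, 7743.0, 2177.0, 2459.0

precision = tp / (tp + fp)                      # share of predicted positives that are correct
recall = tp / (tp + fn)                         # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)      # share of all examples classified correctly

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f} accuracy={accuracy:.4f}")
# prints: precision=0.7778 recall=0.7561 f1=0.7668 accuracy=0.7682
```

The recomputed values agree with the `precision`, `recall`, `f1_score` and `eval_accuracy` entries in the dictionary, which suggests the metrics are the usual definitions over the 20,000 test examples.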

# Making Predictions on New Sentences

In [None]:
def getPrediction(in_sentences):
  """Return a (sentence, scores, label name) tuple for each input sentence."""
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; the label is not used when predicting.
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # Note: the 'probabilities' values printed further down are negative, so they appear to be log-probabilities.
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [None]:
pred_sentences = [
    "Like and subscribe!",
    "That was great!",
    "That was Bad!",
    "Well Done!"
]

In [None]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] like and sub ##scribe ! [SEP]
INFO:tensorflow:input_ids: 101 2066 1998 4942 29234 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that was great ! [SEP]
INFO:tensorflow:input_ids: 101 2008 2001 2307 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] that was bad ! [SEP]
INFO:tensorflow:input_ids: 101 2008 2001 2919 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] well done ! [SEP]
INFO:tensorflow:input_ids: 101 2092 2589 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from bert/output/model.ckpt-7500
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


In [None]:
predictions

[('Like and subscribe!',
  array([-4.196163  , -0.01516762], dtype=float32),
  'Positive'),
 ('That was great!',
  array([-6.2471666e+00, -1.9377756e-03], dtype=float32),
  'Positive'),
 ('That was Bad!', array([-0.3937208, -1.1225228], dtype=float32), 'Negative'),
 ('Well Done!',
  array([-5.5058508e+00, -4.0711625e-03], dtype=float32),
  'Positive')]
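
The score arrays in the output above are all negative, so they look like log-probabilities rather than probabilities (an assumption based on the printed values, not on inspecting the model function). Under that assumption, a minimal sketch for turning them back into ordinary probabilities is to exponentiate them; the `log_probs` array below is simply the printed values copied by hand.

```python
import numpy as np

# Score arrays copied from the `predictions` output above (assumed to be log-probabilities).
log_probs = np.array([
    [-4.196163, -0.01516762],            # "Like and subscribe!"
    [-6.2471666e+00, -1.9377756e-03],    # "That was great!"
    [-0.3937208, -1.1225228],            # "That was Bad!"
    [-5.5058508e+00, -4.0711625e-03],    # "Well Done!"
], dtype=np.float32)

probs = np.exp(log_probs)        # exponentiate to recover probabilities
print(probs.round(4))            # column 0 = "Negative", column 1 = "Positive"
print(probs.sum(axis=1))         # each row should be close to 1.0
print(probs.argmax(axis=1))      # index of the most likely class per sentence
```

Each exponentiated row sums to roughly 1, which is consistent with the log-probability reading, and the largest entry in each row matches the label returned by `getPrediction`.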

In [None]:
"""If you are curious about how to run the pretraining phase, i.e. masked language
modeling (MLM) and next sentence prediction (NSP), the commands below show how.
They are copied from the Google repo; you will need to change a few parameters for your own setup."""

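# Create pretraining data (assumes the repo was cloned into bert/ and $BERT_BASE_DIR points at a downloaded BERT model directory)

# !python bert/create_pretraining_data.py \
#   --input_file=bert/sample_text.txt \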
#   --output_file=/tmp/tf_examples.tfrecord \
#   --vocab_file=$BERT_BASE_DIR/vocab.txt \
#   --do_lower_case=True \
#   --max_seq_length=128 \
#   --max_predictions_per_seq=20 \
#   --masked_lm_prob=0.15 \
#   --random_seed=12345 \
#   --dupe_factor=5

# Run pretraining (MLM and NSP) on the generated TFRecord file

# !python bert/run_pretraining.py \
#   --input_file=/tmp/tf_examples.tfrecord \
#   --output_dir=/tmp/pretraining_output \
#   --do_train=True \
#   --do_eval=True \
#   --bert_config_file=$BERT_BASE_DIR/bert_config.json \
#   --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
#   --train_batch_size=32 \
#   --max_seq_length=128 \
#   --max_predictions_per_seq=20 \
#   --num_train_steps=20 \
#   --num_warmup_steps=10 \
#   --learning_rate=2e-5
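
To give a feel for what `--masked_lm_prob` and `--max_predictions_per_seq` control, here is a simplified sketch of BERT-style token masking. It is not the repository's implementation: `mask_tokens` and the toy `vocab` are hypothetical names made up for illustration, and the way the two flags cap the number of masked positions is an assumption. The 80% `[MASK]` / 10% unchanged / 10% random-token split follows the BERT paper.

```python
import random

def mask_tokens(tokens, vocab, masked_lm_prob=0.15, max_predictions_per_seq=20, seed=12345):
    """Simplified BERT-style masking; returns (masked tokens, {position: original token})."""
    rng = random.Random(seed)
    # Special tokens are never selected for masking.
    candidates = [i for i, tok in enumerate(tokens) if tok not in ("[CLS]", "[SEP]")]
    rng.shuffle(candidates)
    # Assumption: mask about masked_lm_prob of the tokens, at least 1, at most max_predictions_per_seq.
    num_to_predict = min(max_predictions_per_seq, max(1, int(round(len(tokens) * masked_lm_prob))))
    output = list(tokens)
    targets = {}
    for i in sorted(candidates[:num_to_predict]):
        targets[i] = tokens[i]           # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            output[i] = "[MASK]"         # 80%: replace with the mask token
        elif roll < 0.9:
            output[i] = tokens[i]        # 10%: keep the original token
        else:
            output[i] = rng.choice(vocab)  # 10%: replace with a random vocabulary token
    return output, targets

tokens = ["[CLS]", "that", "was", "great", "!", "[SEP]"]
masked, targets = mask_tokens(tokens, vocab=["good", "bad", "movie"])
print(masked)
print(targets)
```

With six tokens and `masked_lm_prob=0.15`, a single position is selected, so short sentences contribute only one MLM prediction each under this sketch.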