In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [2]:
!pip install scikit-learn pandas tensorflow_hub bert-tensorflow pathlib

Collecting pathlib
  Downloading https://files.pythonhosted.org/packages/ac/aa/9b065a76b9af472437a0059f77e8f962fe350438b927cb80184c32f075eb/pathlib-1.0.1.tar.gz (49kB)
    100% |████████████████████████████████| 51kB 2.6MB/s
Building wheels for collected packages: pathlib
  Running setup.py bdist_wheel for pathlib ... done
[?25h  Stored in directory: /home/ivantorubarov/.cache/pip/wheels/f9/b2/4a/68efdfe5093638a9918bd1bb734af625526e849487200aa171
Successfully built pathlib
Installing collected packages: pathlib
Successfully installed pathlib-1.0.1
You are using pip version 8.1.1, however version 19.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [None]:
!conda install scipy

Fetching package metadata ...........
Solving package specifications: .

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this Colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [4]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [2]:
!pip install bert-tensorflow

You are using pip version 9.0.1, however version 19.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [5]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

In [5]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [6]:
!mkdir -p bert_output


In [7]:
OUTPUT_DIR = "bert_output"

In [8]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'bert_output'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.NotFoundError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: bert_output *****


#Data

The [TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub) this section follows downloads the IMDB Large Movie Review Dataset, hosted by Stanford. The cells below instead pick an error type and an error ratio, then load the matching train and test JSON files from a local `Datasets` directory.

In [9]:
from tensorflow import keras
import os
import re

In [10]:
Conventional = ["Generic correction", "Punctuation", "Spelling", "Capitalisation", "Grammar", "Determiners", "Articles", "Quantifiers", "Verbs", "Tense", "Choice of tense", "Tense form", "Voice", "Modals", "Verb pattern", "Intransitive verb", "Transitive verb", "Reflexive verb", "Verb with as", "Ambitransitive verb", "Two verbal forms in the predicate", "Verb + Infinitive", "Verb + Gerund", "Verb + Infinitive OR Gerund", "Verb + Bare Infinitive", "Verb + Object/Addressee + Bare Infinitive", "Infinitive Restoration Alternation", "Verb + Participle", "Get + participle", "Complex-object verb", "Verbal idiom", "Prepositional or phrasal verb", "Dative verb with alternation", "Verb followed by a clause", "Verb + that/WH + Clause", "Verb + if/whether + clause", "Verb + that + Subjunctive clause", "Verb + it + Conj + Clause", "Participial construction", "Infinitive construction", "Gerund phrase", "Nouns", "Countable/uncountable", "Prepositional noun", "Possessive form of a noun", "Noun as an attribute", "Noun + Infinitive", "Noun number", "Prepositions", "Conjunctions", "Adjectives", "Comparative degree of adjectives", "Superlative degree of adjectives", "Prepositional adjective", "Adjective as a collective noun", "Adverbs", "Comparative degree of adverbs", "Superlative degree of adverbs", "Prepositional adverb", "Numerals", "Pronouns", "Agreement", "Word order", "Standard word order", "Emphatic shift", "Cleft sentence", "Interrogative word order", "Incomplete sentence", "Exclamation", "Title structure", "Note structure", "Conditionals", "Attributes", "Relative clause", "Defining relative clause", "Non-defining relative clause", "Coordinate relative clause", "Attributive participial construction", "Parallel constructions", "Negation", "Comparative construction", "Numerical comparison", "Confusion of structures", "Vocabulary", "Word choice", "Choice of lexical item", "Words often confused", "Choice of a part of lexical item", "Absence of certain components of a collocation", "Redundant word(s)", "Word formation", "Derivational affixes", "Formational suffix", "Formational prefix", "Confusion of categories", "Compound word", "Discourse", "Referential device", "Coherence", "Linking device", "Inappropriate register", "Absence of a component in clause or sentence", "Redundant component in clause or sentence", "Absence of necessary explanation or detail", "Deletion"]
Tags = ["Correction", "Punctuation", "Spelling", "Capitalisation", "Grammar", "Determiners", "Articles", "Quantifiers", "Verbs", "Tense", "Tense_choice", "Tense_form", "Voice", "Modals", "Verb_pattern", "Intransitive", "Transitive", "Reflexive_verb", "Presentation", "Ambitransitive", "Two_in_a_row", "Verb_Inf", "Verb_Gerund", "Verb_Inf_Gerund", "Verb_Bare_Inf", "Verb_object_bare", "Restoration_alter", "Verb_part", "Get_part", "Complex_obj", "Verbal_idiom", "Prepositional_verb", "Dative", "Followed_by_a_clause", "that_clause", "if_whether_clause", "that_subj_clause", "it_conj_clause", "Participial_constr", "Infinitive_constr", "Gerund_phrase", "Nouns", "Countable_uncountable", "Prepositional_noun", "Possessive", "Noun_attribute", "Noun_inf", "Noun_number", "Prepositions", "Conjunctions", "Adjectives", "Comparative_adj", "Superlative_adj", "Prepositional_adjective", "Adj_as_collective", "Adverbs", "Comparative_adv", "Superlative_adv", "Prepositional_adv", "Numerals", "Pronouns", "Agreement_errors", "Word_order", "Standard", "Emphatic", "Cleft", "Interrogative", "Abs_comp_clause", "Exclamation", "Title_structure", "Note_structure", "Conditionals", "Attributes", "Relative_clause", "Defining", "Non_defining", "Coordinate", "Attr_participial", "Lack_par_constr", "Negation", "Comparative_constr", "Numerical", "Confusion_of_structures", "Vocabulary", "Word_choice", "lex_item_choice", "Often_confused", "lex_part_choice", "Absence_comp_colloc", "Redundant", "Derivation", "Formational_affixes", "Suffix", "Prefix", "Category_confusion", "Compound_word", "Discourse", "Ref_device", "Coherence", "Linking_device", "Inappropriate_register", "Absence_comp_sent", "Redundant_comp", "Absence_explanation", "delete"]
translate_dict = dict(zip(Conventional, Tags))

error_type = "Choice of lexical item" #@param ["Spelling", "Choice of lexical item", "Deletion", "Prepositions", "Agreement", "Noun number", "Confusion of categories", "Referential device", "Capitalisation", "Words often confused"]
error_ratio = "AugmentedRatio" #@param ["ToFifteen", "AugmentedRatio"]

error_type = translate_dict[error_type]

It's about time we named our model.

In [11]:
modelname = "lex_item_choice_AugmentedRatio" # @param {type:"string"}

In [12]:
import shutil

# Copy the training split for the chosen error type into the working directory and load it.
filename = error_ratio+"/train/"+error_type+".json"
shutil.copy2('./Datasets/'+filename,'.')
trn = pd.read_json(error_type+".json").reset_index(drop=True)
trn["is_error"] = trn["is_error"].astype(int)  # store labels as 0/1 integers

# Do the same for the test split, saved under a separate file name.
filename = error_ratio+"/test/"+error_type+".json"
shutil.copy2('./Datasets/'+filename,'./'+error_type+'_test.json')
tst = pd.read_json(error_type+"_test.json").reset_index(drop=True)
tst["is_error"] = tst["is_error"].astype(int)

In [13]:
len(trn)

113830

Sampling a few thousand train and test examples would keep training fast, but here we use the full train and test sets as loaded.
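
If training on every row turns out to be too slow, a reproducible subsample is one option. The sketch below only picks row positions with the standard library; the helper name `sample_indices` and the seed value are illustrative assumptions, not part of this notebook.

```python
import random


def sample_indices(n_rows, k, seed=0):
    """Return k distinct row positions in sorted order (all rows if k >= n_rows)."""
    if k >= n_rows:
        return list(range(n_rows))
    rng = random.Random(seed)  # fixed seed so the same subsample is drawn every run
    return sorted(rng.sample(range(n_rows), k))


# 113830 is the training-set size reported by len(trn) above.
idx = sample_indices(113830, 5000)
print(len(idx))
```

With the DataFrames in this notebook, `trn.iloc[idx]` would select those rows; pandas' own `DataFrame.sample(n=5000, random_state=0)` is a more direct route to the same effect.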

In [14]:
train = trn
test = tst

In [15]:
train.columns

Index(['context', 'is_error', 'path', 'substring'], dtype='object')

Our input data is the 'context' column, with the 'substring' column as a second input, and our label is the 'is_error' column (1 if the example is marked as an error, 0 otherwise).

In [16]:
DATA_COLUMN = 'context'
SUBSTR_COLUMN = 'substring'
LABEL_COLUMN = 'is_error'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create `InputExample`s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `context` field in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). Here we pass the `substring` field as `text_b`, so the model sees the context and the substring as a pair.
- `label` is the label for our example, here the `is_error` value (0 or 1).

In [17]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = x[SUBSTR_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = x[SUBSTR_COLUMN], 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

In [18]:
train_InputExamples

0         <bert.run_classifier.InputExample object at 0x...
1         <bert.run_classifier.InputExample object at 0x...
2         <bert.run_classifier.InputExample object at 0x...
3         <bert.run_classifier.InputExample object at 0x...
4         <bert.run_classifier.InputExample object at 0x...
5         <bert.run_classifier.InputExample object at 0x...
6         <bert.run_classifier.InputExample object at 0x...
7         <bert.run_classifier.InputExample object at 0x...
8         <bert.run_classifier.InputExample object at 0x...
9         <bert.run_classifier.InputExample object at 0x...
10        <bert.run_classifier.InputExample object at 0x...
11        <bert.run_classifier.InputExample object at 0x...
12        <bert.run_classifier.InputExample object at 0x...
13        <bert.run_classifier.InputExample object at 0x...
14        <bert.run_classifier.InputExample object at 0x...
15        <bert.run_classifier.InputExample object at 0x...
16        <bert.run_classifier.InputExam

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
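
To make steps 1 to 5 a little more concrete, here is a minimal pure-Python sketch of greedy longest-match WordPiece splitting. The tiny `TOY_VOCAB` and the helper names `wordpiece` and `encode` are made up for illustration; the real vocabulary comes from the BERT module, and the library's tokenizer also handles punctuation and Unicode cleanup that this sketch skips. Lowercasing applies only to uncased models.

```python
# Hypothetical toy vocabulary: token -> index. The real BERT vocab is far larger.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "sally": 4, "says": 5, "hi": 6, "call": 7, "##ing": 8}


def wordpiece(word, vocab):
    """Split one word into WordPieces by greedy longest-match-first."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS]/[SEP], map to ids."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[t] for t in tokens]


tokens, ids = encode("Sally says hi calling", TOY_VOCAB)
print(tokens)
print(ids)
```

The inner loop shrinks the candidate from the right until it finds a vocabulary entry, which is why "calling" comes out as "call" followed by "##ing" rather than as a single unknown token.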




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [19]:
# This is a path to a cased (case-preserving) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_cased_L-12_H-768_A-12/1"

class NonMaskOmittingTokenizer(bert.tokenization.FullTokenizer):
  """FullTokenizer that keeps a literal "[MASK]" in the text as a single token."""

  # Pieces that the basic + WordPiece tokenizers produce for the text "[MASK]".
  MASK_PIECES = ['[', 'MA', '##S', '##K', ']']

  def tokenize(self, text):
    split_tokens = []
    for token in self.basic_tokenizer.tokenize(text):
      for sub_token in self.wordpiece_tokenizer.tokenize(token):
        split_tokens.append(sub_token)

    # Merge each run of MASK_PIECES back into one "[MASK]" token.
    # Slicing never reads past the end of the list, so a partial match
    # at the end of the sequence is left untouched.
    merged = []
    index = 0
    n = len(self.MASK_PIECES)
    while index < len(split_tokens):
      if split_tokens[index:index + n] == self.MASK_PIECES:
        merged.append("[MASK]")
        index += n
      else:
        merged.append(split_tokens[index])
        index += 1

    return merged

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return NonMaskOmittingTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


In [20]:
tokenizer.tokenize("This here's an example of using the BERT [MASK] tokenizer")

['This',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'B',
 '##ER',
 '##T',
 '[MASK]',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
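
As a rough picture of what that conversion produces for a sentence pair, the sketch below assembles `input_ids`, `input_mask`, and `segment_ids` from already-tokenized text. The function name `build_features` is an illustrative assumption, and the small id table is copied from the conversion log in this notebook; the library function additionally truncates long pairs and maps labels to ids, which this sketch does not do.

```python
def build_features(tokens_a, tokens_b, token_to_id, max_seq_length):
    """Assemble BERT-style inputs for a pair: [CLS] A [SEP] B [SEP] plus zero padding."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    segment_ids = [0] * len(tokens)            # segment 0: [CLS], A, first [SEP]
    tokens += tokens_b + ["[SEP]"]
    segment_ids += [1] * (len(tokens_b) + 1)   # segment 1: B and the final [SEP]
    if len(tokens) > max_seq_length:
        raise ValueError("pair is too long; this sketch does not truncate")

    input_ids = [token_to_id[t] for t in tokens]
    input_mask = [1] * len(input_ids)          # 1 marks real tokens, 0 marks padding

    # Zero-pad all three lists up to max_seq_length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


# Ids taken from the conversion log in this notebook (cased vocabulary).
toy_ids = {"[CLS]": 101, "[SEP]": 102, "[MASK]": 103,
           "People": 2563, "to": 1106, "do": 1202, "sport": 4799, "became": 1245}

ids, mask, segments = build_features(
    ["People", "[MASK]", "to", "do", "sport"], ["became"], toy_ids, max_seq_length=12)
print(ids)
print(mask)
print(segments)
```

The three lists always have the same length, `max_seq_length`, which is what lets examples of different lengths be batched together.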

In [21]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 113830


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] For example , in this year O ##lim ##pic Games encouraged people . People [MASK] to do sport more . That ` s why , it influenced on public health positively . [SEP] became [SEP]


INFO:tensorflow:input_ids: 101 1370 1859 117 1107 1142 1214 152 24891 20437 2957 6182 1234 119 2563 103 1106 1202 4799 1167 119 1337 169 188 1725 117 1122 4401 1113 1470 2332 14257 119 102 1245 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] The information given illustrate ##s the amount of investment in renewable energy given by two types of countries such as developed and developing countries from 2006 to 2013 , and it also shows , in comparison , the amount of investment as a world total over [MASK] period . It can be de ##duced from the graph that investments that were done in developed countries remained considerably higher than investments in developing countries from 2006 to 2013 . [SEP] similar [SEP]


INFO:tensorflow:input_ids: 101 1109 1869 1549 20873 1116 1103 2971 1104 5151 1107 17216 2308 1549 1118 1160 3322 1104 2182 1216 1112 1872 1105 4297 2182 1121 1386 1106 1381 117 1105 1122 1145 2196 117 1107 7577 117 1103 2971 1104 5151 1112 170 1362 1703 1166 103 1669 119 1135 1169 1129 1260 20196 1121 1103 10873 1115 12372 1115 1127 1694 1107 1872 2182 1915 9627 2299 1190 12372 1107 4297 2182 1121 1386 1106 1381 119 102 1861 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] The point where percentage in two countries matches concerns people with the income slightly higher than that accepted middle . In both countries people of this [MASK] spend approximately 4 per cent of their income on petrol . On the whole , the overall tendencies are significantly different in two countries but the most remarkable difference lies in the class of poor ##est people whereas the richest people behave similarly in the UK and in the USA . [SEP] class [SEP]


INFO:tensorflow:input_ids: 101 1109 1553 1187 6556 1107 1160 2182 2697 5365 1234 1114 1103 2467 2776 2299 1190 1115 3134 2243 119 1130 1241 2182 1234 1104 1142 103 4511 2324 125 1679 9848 1104 1147 2467 1113 19847 119 1212 1103 2006 117 1103 2905 23581 1132 5409 1472 1107 1160 2182 1133 1103 1211 9495 3719 2887 1107 1103 1705 1104 2869 2556 1234 6142 1103 20513 1234 18492 9279 1107 1103 1993 1105 1107 1103 3066 119 102 1705 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] The trends of g ##rat ##h ’ s development for Sweden and USA are nearly the same . The key difference is that USA ’ s t ##rand is having a stable period during 2000 - s - 2020 - s while Sweden has [MASK] rise and then slight fall in population age ##n 65 . This situation have caused difference in latest per ##sent ##s at 204 ##0s . [SEP] rapid [SEP]


INFO:tensorflow:input_ids: 101 1109 14652 1104 176 7625 1324 787 188 1718 1111 3865 1105 3066 1132 2212 1103 1269 119 1109 2501 3719 1110 1115 3066 787 188 189 13141 1110 1515 170 6111 1669 1219 1539 118 188 118 12795 118 188 1229 3865 1144 103 3606 1105 1173 6812 2303 1107 1416 1425 1179 2625 119 1188 2820 1138 2416 3719 1107 6270 1679 27408 1116 1120 21355 13031 119 102 6099 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] Modern world has really big problem , on the one hand , with health of human ##ices , but at the same time , has active develop of technology which try to reduce our problems q ##ui ##c ##ly . Clearly , that nowadays [MASK] can ’ t imagine our life without modern technology . So , some people have big problem with health , because they don ’ t know how to right connected with technology . [SEP] human [SEP]


INFO:tensorflow:input_ids: 101 4825 1362 1144 1541 1992 2463 117 1113 1103 1141 1289 117 1114 2332 1104 1769 18117 117 1133 1120 1103 1269 1159 117 1144 2327 3689 1104 2815 1134 2222 1106 4851 1412 2645 186 6592 1665 1193 119 19260 117 1115 20148 103 1169 787 189 5403 1412 1297 1443 2030 2815 119 1573 117 1199 1234 1138 1992 2463 1114 2332 117 1272 1152 1274 787 189 1221 1293 1106 1268 3387 1114 2815 119 102 1769 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:04:21.815592 139686153271040 tf_logging.py:115] input_ids: 101 4825 1362 1144 1541 1992 2463 117 1113 1103 1141 1289 117 1114 2332 1104 1769 18117 117 1133 1120 1103 1269 1159 117 1144 2327 3689 1104 2815 1134 2222 1106 4851 1412 2645 186 6592 1665 1193 119 19260 117 1115 20148 103 1169 787 189 5403 1412 1297 1443 2030 2815 119 1573 117 1199 1234 1138 1992 2463 1114 2332 117 1272 1152 1274 787 189 1221 1293 1106 1268 3387 1114 2815 119 102 1769 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:04:21.817413 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:04:21.819133 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:04:21.820859 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 113830


I0430 09:04:32.196543 139686153271040 tf_logging.py:115] Writing example 10000 of 113830


INFO:tensorflow:Writing example 20000 of 113830


I0430 09:04:42.307498 139686153271040 tf_logging.py:115] Writing example 20000 of 113830


INFO:tensorflow:Writing example 30000 of 113830


I0430 09:04:53.240358 139686153271040 tf_logging.py:115] Writing example 30000 of 113830


INFO:tensorflow:Writing example 40000 of 113830


I0430 09:05:03.534213 139686153271040 tf_logging.py:115] Writing example 40000 of 113830


INFO:tensorflow:Writing example 50000 of 113830


I0430 09:05:13.958012 139686153271040 tf_logging.py:115] Writing example 50000 of 113830


INFO:tensorflow:Writing example 60000 of 113830


I0430 09:05:24.302808 139686153271040 tf_logging.py:115] Writing example 60000 of 113830


INFO:tensorflow:Writing example 70000 of 113830


I0430 09:05:34.276668 139686153271040 tf_logging.py:115] Writing example 70000 of 113830


INFO:tensorflow:Writing example 80000 of 113830


I0430 09:05:44.732049 139686153271040 tf_logging.py:115] Writing example 80000 of 113830


INFO:tensorflow:Writing example 90000 of 113830


I0430 09:05:54.794809 139686153271040 tf_logging.py:115] Writing example 90000 of 113830


INFO:tensorflow:Writing example 100000 of 113830


I0430 09:06:04.753903 139686153271040 tf_logging.py:115] Writing example 100000 of 113830


INFO:tensorflow:Writing example 110000 of 113830


I0430 09:06:14.776854 139686153271040 tf_logging.py:115] Writing example 110000 of 113830


INFO:tensorflow:Writing example 0 of 28458


I0430 09:06:19.183376 139686153271040 tf_logging.py:115] Writing example 0 of 28458


INFO:tensorflow:*** Example ***


I0430 09:06:19.186730 139686153271040 tf_logging.py:115] *** Example ***


INFO:tensorflow:guid: None


I0430 09:06:19.188531 139686153271040 tf_logging.py:115] guid: None


INFO:tensorflow:tokens: [CLS] The first diagram gives information about products , which is delivered by rail . It is noticeable that the [MASK] feature of the first graph is percentage of transported metals ( 35 % ) . In contrast , road transportation has only 11 % of metals delivery , but it provides 28 % of manufactured goods transportation . [SEP] mean [SEP]


I0430 09:06:19.190332 139686153271040 tf_logging.py:115] tokens: [CLS] The first diagram gives information about products , which is delivered by rail . It is noticeable that the [MASK] feature of the first graph is percentage of transported metals ( 35 % ) . In contrast , road transportation has only 11 % of metals delivery , but it provides 28 % of manufactured goods transportation . [SEP] mean [SEP]


INFO:tensorflow:input_ids: 101 1109 1148 18217 3114 1869 1164 2982 117 1134 1110 4653 1118 4356 119 1135 1110 19178 1115 1103 103 2672 1104 1103 1148 10873 1110 6556 1104 9470 13237 113 2588 110 114 119 1130 5014 117 1812 6312 1144 1178 1429 110 1104 13237 6779 117 1133 1122 2790 1743 110 1104 7227 4817 6312 119 102 1928 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.192067 139686153271040 tf_logging.py:115] input_ids: 101 1109 1148 18217 3114 1869 1164 2982 117 1134 1110 4653 1118 4356 119 1135 1110 19178 1115 1103 103 2672 1104 1103 1148 10873 1110 6556 1104 9470 13237 113 2588 110 114 119 1130 5014 117 1812 6312 1144 1178 1429 110 1104 13237 6779 117 1133 1122 2790 1743 110 1104 7227 4817 6312 119 102 1928 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.193813 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.195450 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:06:19.196947 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0430 09:06:19.200273 139686153271040 tf_logging.py:115] *** Example ***


INFO:tensorflow:guid: None


I0430 09:06:19.201736 139686153271040 tf_logging.py:115] guid: None


INFO:tensorflow:tokens: [CLS] And the best punishment for a sports ##man is a banning from all the competitions for the rest of his / her life . On the other hand , everyone should have a chance to make himself better , to recognize mistakes and do all his best to [MASK] it . If there is an opportunity that if a sports ##man was banned , he would not repeat such actions anymore . [SEP] improve [SEP]


I0430 09:06:19.203145 139686153271040 tf_logging.py:115] tokens: [CLS] And the best punishment for a sports ##man is a banning from all the competitions for the rest of his / her life . On the other hand , everyone should have a chance to make himself better , to recognize mistakes and do all his best to [MASK] it . If there is an opportunity that if a sports ##man was banned , he would not repeat such actions anymore . [SEP] improve [SEP]


INFO:tensorflow:input_ids: 101 1262 1103 1436 7703 1111 170 2865 1399 1110 170 26380 1121 1155 1103 6025 1111 1103 1832 1104 1117 120 1123 1297 119 1212 1103 1168 1289 117 2490 1431 1138 170 2640 1106 1294 1471 1618 117 1106 6239 12572 1105 1202 1155 1117 1436 1106 103 1122 119 1409 1175 1110 1126 3767 1115 1191 170 2865 1399 1108 7548 117 1119 1156 1136 9488 1216 3721 4169 119 102 4607 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.205659 139686153271040 tf_logging.py:115] input_ids: 101 1262 1103 1436 7703 1111 170 2865 1399 1110 170 26380 1121 1155 1103 6025 1111 1103 1832 1104 1117 120 1123 1297 119 1212 1103 1168 1289 117 2490 1431 1138 170 2640 1106 1294 1471 1618 117 1106 6239 12572 1105 1202 1155 1117 1436 1106 103 1122 119 1409 1175 1110 1126 3767 1115 1191 170 2865 1399 1108 7548 117 1119 1156 1136 9488 1216 3721 4169 119 102 4607 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.207040 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.208435 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:06:19.209943 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0430 09:06:19.213336 139686153271040 tf_logging.py:115] *** Example ***


INFO:tensorflow:guid: None


I0430 09:06:19.214907 139686153271040 tf_logging.py:115] guid: None


INFO:tensorflow:tokens: [CLS] l ##hat the offered policy is not appropriate . There is no doubt that if we admit the requirement of equal number of students of both gender ##s we will also agree with the fact that intellectual or mental abilities of male and female [MASK] . Of course , centuries ago men and women didn ’ t have similar rights and opportunities . [SEP] differentiate [SEP]


I0430 09:06:19.216426 139686153271040 tf_logging.py:115] tokens: [CLS] l ##hat the offered policy is not appropriate . There is no doubt that if we admit the requirement of equal number of students of both gender ##s we will also agree with the fact that intellectual or mental abilities of male and female [MASK] . Of course , centuries ago men and women didn ’ t have similar rights and opportunities . [SEP] differentiate [SEP]


INFO:tensorflow:input_ids: 101 181 11220 1103 2356 2818 1110 1136 5806 119 1247 1110 1185 4095 1115 1191 1195 5890 1103 8875 1104 4463 1295 1104 1651 1104 1241 5772 1116 1195 1209 1145 5340 1114 1103 1864 1115 8066 1137 4910 7134 1104 2581 1105 2130 103 119 2096 1736 117 3944 2403 1441 1105 1535 1238 787 189 1138 1861 2266 1105 6305 119 102 23159 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.217973 139686153271040 tf_logging.py:115] input_ids: 101 181 11220 1103 2356 2818 1110 1136 5806 119 1247 1110 1185 4095 1115 1191 1195 5890 1103 8875 1104 4463 1295 1104 1651 1104 1241 5772 1116 1195 1209 1145 5340 1114 1103 1864 1115 8066 1137 4910 7134 1104 2581 1105 2130 103 119 2096 1736 117 3944 2403 1441 1105 1535 1238 787 189 1138 1861 2266 1105 6305 119 102 23159 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.219431 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.220911 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:06:19.222424 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0430 09:06:19.225574 139686153271040 tf_logging.py:115] *** Example ***


INFO:tensorflow:guid: None


I0430 09:06:19.227213 139686153271040 tf_logging.py:115] guid: None


INFO:tensorflow:tokens: [CLS] In spite of this , money is believed to be significant no less than happiness . The main argument in favor of money is people say that you won ’ t be happy before you earn a big n [MASK] of money . From this point of view , in our world it is money that di ##ct ##ates us the rules . [SEP] number [SEP]


I0430 09:06:19.228815 139686153271040 tf_logging.py:115] tokens: [CLS] In spite of this , money is believed to be significant no less than happiness . The main argument in favor of money is people say that you won ’ t be happy before you earn a big n [MASK] of money . From this point of view , in our world it is money that di ##ct ##ates us the rules . [SEP] number [SEP]


INFO:tensorflow:input_ids: 101 1130 8438 1104 1142 117 1948 1110 2475 1106 1129 2418 1185 1750 1190 9266 119 1109 1514 6171 1107 5010 1104 1948 1110 1234 1474 1115 1128 1281 787 189 1129 2816 1196 1128 7379 170 1992 183 103 1104 1948 119 1622 1142 1553 1104 2458 117 1107 1412 1362 1122 1110 1948 1115 4267 5822 5430 1366 1103 2995 119 102 1295 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.230333 139686153271040 tf_logging.py:115] input_ids: 101 1130 8438 1104 1142 117 1948 1110 2475 1106 1129 2418 1185 1750 1190 9266 119 1109 1514 6171 1107 5010 1104 1948 1110 1234 1474 1115 1128 1281 787 189 1129 2816 1196 1128 7379 170 1992 183 103 1104 1948 119 1622 1142 1553 1104 2458 117 1107 1412 1362 1122 1110 1948 1115 4267 5822 5430 1366 1103 2995 119 102 1295 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.231785 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.233246 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:06:19.234743 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0430 09:06:19.237751 139686153271040 tf_logging.py:115] *** Example ***


INFO:tensorflow:guid: None


I0430 09:06:19.239192 139686153271040 tf_logging.py:115] guid: None


INFO:tensorflow:tokens: [CLS] After this increase we can see steady fall from 206 ##0 to 210 ##0 . In fact the lowest level of extinction of species was [MASK] in 2000 , it was about 5000 dying out per million species . A more detailed look at the chart reveals that human impact makes up 81 , 3 per cent , that makes it the main threat to plant life . [SEP] fixed [SEP]


I0430 09:06:19.240746 139686153271040 tf_logging.py:115] tokens: [CLS] After this increase we can see steady fall from 206 ##0 to 210 ##0 . In fact the lowest level of extinction of species was [MASK] in 2000 , it was about 5000 dying out per million species . A more detailed look at the chart reveals that human impact makes up 81 , 3 per cent , that makes it the main threat to plant life . [SEP] fixed [SEP]


INFO:tensorflow:input_ids: 101 1258 1142 2773 1195 1169 1267 6386 2303 1121 20278 1568 1106 13075 1568 119 1130 1864 1103 6905 1634 1104 16137 1104 1530 1108 103 1107 1539 117 1122 1108 1164 13837 5694 1149 1679 1550 1530 119 138 1167 6448 1440 1120 1103 3481 7189 1115 1769 3772 2228 1146 5615 117 124 1679 9848 117 1115 2228 1122 1103 1514 4433 1106 2582 1297 119 102 4275 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.242319 139686153271040 tf_logging.py:115] input_ids: 101 1258 1142 2773 1195 1169 1267 6386 2303 1121 20278 1568 1106 13075 1568 119 1130 1864 1103 6905 1634 1104 16137 1104 1530 1108 103 1107 1539 117 1122 1108 1164 13837 5694 1149 1679 1550 1530 119 138 1167 6448 1440 1120 1103 3481 7189 1115 1769 3772 2228 1146 5615 117 124 1679 9848 117 1115 2228 1122 1103 1514 4433 1106 2582 1297 119 102 4275 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.243858 139686153271040 tf_logging.py:115] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0430 09:06:19.245270 139686153271040 tf_logging.py:115] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0430 09:06:19.246681 139686153271040 tf_logging.py:115] label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 28458


I0430 09:06:29.564913 139686153271040 tf_logging.py:115] Writing example 10000 of 28458


INFO:tensorflow:Writing example 20000 of 28458


I0430 09:06:40.346717 139686153271040 tf_logging.py:115] Writing example 20000 of 28458


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` below does just this. First, it loads the BERT TF Hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of starting from a pretrained model and training it further on a new task is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [22]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own classification layer to fine-tune on the task data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
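
To make the arithmetic inside `create_model` concrete, here is a minimal pure-Python sketch of the new layer: a linear projection of the pooled output, a log-softmax, and a cross-entropy loss. It is an illustration with made-up toy numbers, not the notebook's TensorFlow code; the names `log_softmax` and `classification_head` are hypothetical, and dropout is left out.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def classification_head(pooled, weights, bias, label):
    """Toy version of the layer added on top of BERT's pooled output.

    pooled:  hidden_size floats (stands in for bert_outputs["pooled_output"])
    weights: num_labels rows of hidden_size floats (like output_weights)
    bias:    num_labels floats (like output_bias)
    label:   integer class id
    """
    # logits = pooled @ weights^T + bias
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    # With one-hot labels, cross-entropy is minus the log-probability
    # assigned to the true class.
    loss = -log_probs[label]
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    return loss, predicted, log_probs

# Toy example: hidden_size = 3, num_labels = 2 (all values are made up).
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, -0.3], [0.0, -0.1, 0.4]]
bias = [0.0, 0.0]
loss, predicted, log_probs = classification_head(pooled, weights, bias, label=1)
print(predicted, round(loss, 4))
```

Because the labels are one-hot, `-tf.reduce_sum(one_hot_labels * log_probs, axis=-1)` in the cell above picks out the same single term as `-log_probs[label]` here.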


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [23]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for the Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for the Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          # Despite the key name, these are log-probabilities (log_softmax output).
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [24]:
# Hyperparameters for fine-tuning.
# These are copied from this Colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period at the start of training where the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [25]:
# Compute the number of training and warmup steps from the batch size.
# Note the trailing * 2, which doubles the number of training steps.
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS) * 2
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

print(num_train_steps)

21342
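
The step arithmetic above, and the effect of the warmup proportion, can be checked with a short pure-Python sketch. The schedule in `learning_rate_at` (linear warmup, then linear decay to zero) is my understanding of what `bert.optimization.create_optimizer` does and should be read as an assumption; both function names are hypothetical. The example count 113830 is taken from the log output above.

```python
def train_and_warmup_steps(num_examples, batch_size, num_epochs,
                           warmup_proportion, multiplier=1):
    """Mirror the notebook's arithmetic for the two step counts."""
    num_train_steps = int(num_examples / batch_size * num_epochs) * multiplier
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps

def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Assumed schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        # Warmup: scale the learning rate up from 0 towards init_lr.
        return init_lr * step / num_warmup_steps
    # After warmup: decay linearly, measured from step 0.
    step = min(step, num_train_steps)
    return init_lr * (1.0 - step / num_train_steps)

train_steps, warmup_steps = train_and_warmup_steps(
    113830, batch_size=32, num_epochs=3.0, warmup_proportion=0.1, multiplier=2)
print(train_steps, warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, train_steps):
    print(step, learning_rate_at(step, 2e-5, train_steps, warmup_steps))
```

With `WARMUP_PROPORTION = 0.1`, roughly the first tenth of the training steps are spent ramping the learning rate up before it starts to decay.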


In [26]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [27]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'bert_output', '_protocol': None, '_log_step_count_steps': 100, '_task_id': 0, '_save_summary_steps': 100, '_save_checkpoints_secs': None, '_save_checkpoints_steps': 500, '_experimental_distribute': None, '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_train_distribute': None, '_is_chief': True, '_master': '', '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f068b70f7b8>, '_tf_random_seed': None, '_evaluation_master': '', '_device_fn': None, '_task_type': 'worker', '_eval_distribute': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5}


I0430 09:07:10.390866 139686153271040 tf_logging.py:115] Using config: {'_model_dir': 'bert_output', '_protocol': None, '_log_step_count_steps': 100, '_task_id': 0, '_save_summary_steps': 100, '_save_checkpoints_secs': None, '_save_checkpoints_steps': 500, '_experimental_distribute': None, '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_train_distribute': None, '_is_chief': True, '_master': '', '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f068b70f7b8>, '_tf_random_seed': None, '_evaluation_master': '', '_device_fn': None, '_task_type': 'worker', '_eval_distribute': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5}


Next we create an input builder function that takes our training feature set (`train_features`) and returns an input function, which the Estimator calls to get batches of features. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).

In [28]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
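
The builder pattern used here can be sketched in plain Python: the builder captures the features and returns a function that takes `params` (with a `batch_size` entry, like the `params` passed to the Estimator) and yields batches. This is a simplified stand-in, not the `bert.run_classifier` implementation: `input_fn_builder_sketch` is a hypothetical name, it yields Python lists instead of a `tf.data.Dataset`, and it makes a single pass instead of repeating the data during training.

```python
import random

def input_fn_builder_sketch(features, is_training, drop_remainder, seed=0):
    """Hypothetical stand-in: returns an input function over `features`."""
    def input_fn(params):
        batch_size = params["batch_size"]
        order = list(range(len(features)))
        if is_training:
            # Training data is shuffled; evaluation data keeps its order.
            random.Random(seed).shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [features[i] for i in order[start:start + batch_size]]
            if drop_remainder and len(batch) < batch_size:
                break  # skip the final, smaller batch
            yield batch
    return input_fn

toy_features = list(range(10))
keep_all = input_fn_builder_sketch(toy_features, is_training=False,
                                   drop_remainder=False)
drop_last = input_fn_builder_sketch(toy_features, is_training=False,
                                    drop_remainder=True)
print([len(b) for b in keep_all({"batch_size": 4})])
print([len(b) for b in drop_last({"batch_size": 4})])
```

With `drop_remainder=False`, as in the cell above, the last partial batch is kept, so no training example is discarded; TPUs need fixed batch shapes, which is why the comment mentions `drop_remainder = True` for them.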

In [29]:
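# Sanity check: run a small matmul pinned to the GPU.
# allow_soft_placement falls back to another device if the GPU is unavailable;
# log_device_placement logs which device each op runs on.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)

with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# Pass the config to the session that actually runs the op.
with tf.Session(config=config) as sess:
    print(sess.run(c))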

[[22. 28.]
 [49. 64.]]


In [30]:
with tf.Session() as sess:
    x = tf.test.is_gpu_available()
x

True

In [1]:
from datetime import datetime

print('Beginning Training!')
current_time = datetime.now()
with tf.device('/gpu:0'):
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!

The cell above trains our model. For me, using a Colab notebook running on Google's GPUs, training took about 14 minutes.

Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [0]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-02-20T22:24:15Z
INFO:tensorflow:Graph was finalized.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from OUTPUT_DIR_NAME/model.ckpt-1874
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-02-20-22:27:05
INFO:tensorflow:Saving dict for global step 1874: auc = 0.59808373, eval_accuracy = 0.931, f1_score = 0.27368414, false_negatives = 459.0, false_positives = 231.0, global_step = 1874, loss = 0.39963886, precision = 0.3601108, recall = 0.22071308, true_negatives = 9180.0, true_positives = 130.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1874: OUTPUT_DIR_NAME/model.ckpt-1874


{'auc': 0.59808373,
 'eval_accuracy': 0.931,
 'f1_score': 0.27368414,
 'false_negatives': 459.0,
 'false_positives': 231.0,
 'global_step': 1874,
 'loss': 0.39963886,
 'precision': 0.3601108,
 'recall': 0.22071308,
 'true_negatives': 9180.0,
 'true_positives': 130.0}
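
The headline metrics can be recomputed from the four confusion-matrix counts reported above, which is a useful check on how they relate. The sketch below is plain Python; `metrics_from_counts` is a hypothetical helper, and it assumes every denominator is non-zero.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Derive standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    false_positive_rate = fp / (fp + tn)
    # Mean of the true positive rate and the true negative rate.
    balanced_accuracy = (recall + (1 - false_positive_rate)) / 2
    # Accuracy of a trivial model that always predicts the negative class.
    all_negative_accuracy = (fp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "eval_accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "all_negative_accuracy": all_negative_accuracy,
    }

# Counts taken from the evaluation output above.
m = metrics_from_counts(tp=130, fp=231, fn=459, tn=9180)
for name, value in m.items():
    print(name, round(value, 4))
```

Two things stand out. First, only 589 of the 10,000 evaluation examples are positive, so a model that always predicts the negative class would reach a higher accuracy than the 0.931 reported here; precision, recall and F1 are the more informative numbers. Second, the reported `auc` is very close to the balanced accuracy, which is what one would expect when AUC is computed from hard 0/1 predictions (as `metric_fn` does) instead of from probabilities.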

#Extracting the trained model


Now let's save our model:

In [0]:
# Export the model
def serving_input_fn():
  # The exported model receives serialized tf.Example protos and parses them
  # into the same features that were used for training.
  with tf.variable_scope("foo"):
    feature_spec = {
        "input_ids": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "input_mask": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "segment_ids": tf.FixedLenFeature([MAX_SEQ_LENGTH], tf.int64),
        "label_ids": tf.FixedLenFeature([], tf.int64),
    }
    serialized_tf_example = tf.placeholder(dtype=tf.string,
                                           shape=[None],
                                           name='input_example_tensor')
    # Callers feed a batch of serialized examples under the 'examples' key.
    receiver_tensors = {'examples': serialized_tf_example}
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

EXPORT_DIR = './Exported models/'+modelname
estimator._export_to_tpu = False  # this is important: export only the regular (non-TPU) graph
path = estimator.export_savedmodel(EXPORT_DIR, serving_input_fn)
print(path)

Check if we can load it correctly:

In [0]:
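from pathlib import Path

# Each export lives in its own timestamped subdirectory; skip temporary ones
# and take the most recent.
subdirs = [x for x in Path(EXPORT_DIR).iterdir()
           if x.is_dir() and 'temp' not in str(x)]
latest = str(sorted(subdirs)[-1])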
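
`export_savedmodel` writes each export into a subdirectory named after a Unix timestamp, so the last name in sorted order is the most recent export. The sketch below reproduces that selection on a throwaway directory. The timestamp names are made up for illustration, and `latest_export` is a hypothetical helper, not part of the notebook.

```python
import tempfile
from pathlib import Path

def latest_export(export_dir):
    """Return the newest timestamped export under export_dir as a string."""
    subdirs = [p for p in Path(export_dir).iterdir()
               if p.is_dir() and p.name.isdigit()]
    # Compare numerically so names of different lengths still order correctly.
    return str(max(subdirs, key=lambda p: int(p.name)))

with tempfile.TemporaryDirectory() as base:
    # Two finished exports and one leftover temporary directory (made-up names).
    for name in ["1550701500", "1550701800", "temp-1550701900"]:
        (Path(base) / name).mkdir()
    latest = latest_export(base)
    print(Path(latest).name)
```

Filtering on the directory name alone, rather than on the whole path string, avoids accidentally excluding every export when a parent directory happens to contain "temp".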

In [0]:
from tensorflow.contrib import predictor

predict_fn = predictor.from_saved_model(latest)

In [0]:
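def getPrediction(in_sentences):
  labels = ["Not an error", "Is an error"]
  # label=0 is just a dummy label; it is not used for prediction
  input_examples = [run_classifier.InputExample(guid="", text_a = x[0], text_b = x[1], label = 0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  # The exported model expects serialized tf.Example protos under the 'examples' key (see serving_input_fn).
  def int64(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))
  serialized = [tf.train.Example(features=tf.train.Features(feature={
      "input_ids": int64(f.input_ids), "input_mask": int64(f.input_mask),
      "segment_ids": int64(f.segment_ids), "label_ids": int64([f.label_id])})).SerializeToString()
      for f in input_features]
  # Assumes the exported signature returns 'probabilities' and 'labels' arrays with one row per sentence.
  predictions = predict_fn({'examples': serialized})
  return [(sentence, probs, labels[label]) for sentence, probs, label in zip(in_sentences, predictions['probabilities'], predictions['labels'])]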

In [0]:
import nltk
import re
import itertools

nltk.download('perluniprops')
nltk.download('punkt')

[nltk_data] Downloading package perluniprops to /root/nltk_data...
[nltk_data]   Package perluniprops is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [0]:
from datetime import datetime

from nltk.tokenize import TweetTokenizer
tknzr = TweetTokenizer()

# Note: nltk.tokenize.moses was removed in NLTK 3.3; with newer NLTK versions,
# the sacremoses package provides a MosesDetokenizer instead.
from nltk.tokenize.moses import MosesDetokenizer
detokenizer = MosesDetokenizer()

def annotate(text):
  """Return [token, label] pairs for `text`; label 1 marks a predicted error."""
  current_time = datetime.now()
  # Collapse every run of whitespace into a single space before splitting.
  raw = re.sub(r'\s+', ' ', text)
  sentences = nltk.sent_tokenize(raw)
  wordsoup = [tknzr.tokenize(sentence) for sentence in sentences]
  annotation_set = []  # [masked context, original token] pairs sent to the model
  checking_ids = []    # position in `tokens` of each entry in annotation_set
  tokens = []          # [token, label] for every token in the text
  for i, tokenized_sentence in enumerate(wordsoup):
    for j, token in enumerate(tokenized_sentence):
      if not re.search(r'[a-zA-Z]', token):
        # Tokens without letters (punctuation, numbers) are never checked.
        tokens.append([token, 0])
      else:
        tokens.append([token, None])
        # Context: up to two sentences before, the current sentence with the
        # token replaced by [MASK], and up to two sentences after.
        si = max(i - 2, 0)
        masked = tokenized_sentence[:j] + ['[MASK]'] + tokenized_sentence[j+1:]
        context = wordsoup[si:i] + [masked] + wordsoup[i+1:i+3]
        flat = [item for sublist in context for item in sublist]
        entry = detokenizer.detokenize(flat, return_str=True)
        annotation_set.append([entry, token])
        checking_ids.append(len(tokens) - 1)
  predictions = getPrediction(annotation_set)
  for idx, prediction in zip(checking_ids, predictions):
    tokens[idx][1] = 1 if prediction[-1] == 'Is an error' else 0
  print('\n')
  print("Annotation took time ", datetime.now() - current_time)
  return tokens

def print_annotated_webpage(tokens, title):
  out = '<html>\n<head>\n<title>'+title+'</title>\n<meta charset="utf-8">\n<style type="text/css">\n.blue {\n\tbackground: #a8d1ff;\n\tdisplay: inline-block;\n}\n</style>\n</head>\n<body>'
  outsoup = []
  for token in tokens:
    if token[1] == 1:
      token[0] = '<div class="blue">'+token[0]+'</div>'
    outsoup.append(token[0])
  bodystring = detokenizer.detokenize(outsoup, return_str=True)
  out += bodystring
  out += '</body>\n</html>'
  return out
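
For each word, `annotate` builds a context made of up to two preceding sentences, the current sentence with the word replaced by `[MASK]`, and up to two following sentences, and pairs that context with the original word. Below is a minimal sketch of just that windowing step. It uses pre-tokenized toy sentences and a plain `" ".join` in place of the Moses detokenizer; both simplifications are assumptions for illustration, and `masked_context` is a hypothetical name.

```python
def masked_context(sentences, i, j, before=2, after=2):
    """Mask token j of sentence i and return it with the neighbouring sentences."""
    start = max(i - before, 0)
    current = sentences[i]
    masked = current[:j] + ["[MASK]"] + current[j + 1:]
    window = sentences[start:i] + [masked] + sentences[i + 1:i + 1 + after]
    flat = [token for sentence in window for token in sentence]
    # Return the text with the word hidden, plus the word that was hidden.
    return " ".join(flat), current[j]

toy = [["A", "."], ["B", "."], ["C", "."], ["D", "cat", "."],
       ["E", "."], ["F", "."], ["G", "."]]
context, word = masked_context(toy, i=3, j=1)
print(context)
print(word)
```

The pair `(context, word)` corresponds to `text_a` and `text_b` in `getPrediction`: the model sees the surrounding text with a gap and the word that filled it.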

In [0]:
import json

curname = "wsites_" + Model_name + ".json"

test_texts = {'DTi_50_2': """It is widely known that music labels and film makers lose a great deal of money every year from illegal copying and free internet sharing. Some people say that people who do that should be punished, some think that such men are the new «Robin Hoods». Let us take a look at this problem.
On the one hand, illegal copying is prohibited in almost all civilized countries and there is a reason for that. Music, books and films are considered as an intellectual property and it is quite uderstandable because artists are getting paid for composing, painting and film making only if their products sell, and if they do not have enough money for their living and creating they will just get another job, and this is why the law of intellectual property exists. It helps artists to get money they deserve and to have enough funds to make a quality product.
On the other hand, nowadays it is not so simple as it may look like at first. Musicians do not often need labels to record their masterpieces anymore because personal computers went so far that now you can record some high-quality sound right in your living room so you do not need to rent a studio for that. As for film makers, they get millions just by dressing the main character in a big logo T-shirt of some reach corporation. In fact, money, that a film company is paid for commercial, can sometimes fully cover the film making expences. 
To conclude, I would like to say that we should obey the law of intellectual property. If we want to live in a respectful society, but sometimes, I think we should also ask ourselves what we are paying for.""",
'VSa_105_1': """The graph provides information about development of the book sales system in 4 different countries (USA, Germany, China, UK) in 2014 and predicts future perspectives of this industry in 2018.
Generally speaking, it is observed a great increase of eBook on the market in the USA in 2018, which will surpass printed sources. By contrast, in other countries, except the UK, printed literature will remain constant position.
Despite the fact that print dominated on the book market of the USA in 2014 and exceeded eBook by a half (10,5 to 5,5 respectively), in 2018 it is forecasted that benefit of print literature will shrink on 3 billion US dollars. At the same time the income of eBook will grow on 3 billion US dollars and will be the most selling source of literature (8,5 billion $ of Ebook as against 7,5 billion $ of print).
In other countries the book market is not so developed as it in the USA. However, it is forecasted that in 2018 the ebook sales will slightly overtake printed book sales (2,3 to 2 respectively).
In Germany and China printed books will be the most popular kind of literature (6 to 4,2 respectively), especially in Germany print indicators remain stable.
But it also be a small growth in sales of Ebooks in these countries (from 1 to 1,5 in German and from 0,5 to 1 in China).""",
'NMya_90_1': """On the first graph we can see information about maximal and minimal average temperatures in Yakutsk. Graphs for both, the minimum and maximum have a parabolic appearance: they both have a steady increase until the hottest month of the year (july) and after that they both have as constant decrease down to december. Analazing this whole graph, you can point out several things. First of all, the difference of the highest and lowest temperatures between the coldest month and the hottest month is sixty degrees and fifty degrees respectively. The biggest difference between maximum and minimum temperatures is seen on march and it is about seventeen degrees. According to this graph, july is always the hottest month and january is the coldest.

On the second graph we can see similar temperature comparsions for Rio de Janeiro. Two lines represent minimum and maximum average temperatures for every month of the year. According to this graph, temperatures in Rio don’t change much throughout the year. the maximal difference of minimum and maximum temperature is on january, july and august and is approximately only seven degrees. For this city the coldest months are june and july, and the hottest are january and february.

Comparing the two graphs we notice this: average year temperature in rio is almost constant, comparing to Yakutsk and is never less then 18 °C, and the coldest and the hottest months in Rio are the hottest and the coldest in Yakutsk.
""",
'NChe_16_2': """In reacent time a lot of people claims that modern technology is a curse that leads to a health problems. It is no use to argue with this statement.
For begining,  many doctors already said, that such things like computers can caus a lot of prolems with heath. So, one of the most famous illness is damaged eyes. Then kids or adults spend a load of time next to monitor they starting to lose eyes sharphes. What is more, some scientiest claiws that some deuices cau lead euen cancer. There is no uniqe poiut betwine doctors, any way there is such kind of danger.
As for my, I find this problems not so unreducable. In general, all health desiases caused by modern techuologies were caused by ouer-use of this technologies and useing it in anpropriate way. So, if people start to control time that they spend with there deuices it will help to recduce a lot of problems with eyes. As a next step, we should understand that lightning aroun us is also very important. That means, that modern technologies haue some bad influens on people, but we can also dicrease that bad effect.
To sum up, wiede use of modern technology shown us that there are not only benefits in deuelopmeut of deuices, but also the great danger. But corect use of it can reduce to the minimum all harmful effects""",
'ESha_3_2': """Nowadays it is becoming cosier to express yourself by different ways. Some people do it by using words, some by using pictures of films. But is it clear to allow artist to do and to act how they want? 


There are so many ways to express your thoghts and feelings. People from ancient times show their ideas and thoughts by paintings and music. A lot of paintings of famous authors are held now in different galleries and big amount of people see them every day. But in long time ago just really talanted people become famous and well known artists.


Now situation is different. Every person can become an artist. Sometimes, people don't think how their ideas and works would influence other people's minds. Too many untalanted and unproffesional people create music, movies and paintings. It is really hard for a good artist to show himself in such amount of untalanted people.


In my opinion, government should allowed to express ideas and thoughts stronly really good artists. I their ideas are clear and they have something to show and tell people, they should do it. But if artist's works don't have any idea or logic or it can't bring anything good to public, there is nothing to show.


Every person have a chance to show itself by any way he choose. But at first, we should think, if he really wants it and if she really have something to say and show to bublicity. Only really good works should be shown to people, because it can influence them and their minds a lot.
"""}

outie = [print_annotated_webpage(annotate(test_texts[tt]), tt) for tt in test_texts]
with open(curname, 'w', encoding="utf-8") as w:
  json.dump(outie, w)
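
`json.dump(outie, w)` stores the pages as a bare list, so the text IDs are not kept in the file. If the IDs are needed later, one option is to store a mapping from ID to page instead. A minimal sketch, with short placeholder strings standing in for the real pages; the variable names are hypothetical:

```python
import json

# Placeholder pages; in the notebook these would come from print_annotated_webpage.
pages = {"DTi_50_2": "<html>first</html>", "VSa_105_1": "<html>second</html>"}

# ensure_ascii=False keeps non-ASCII characters readable in the output.
encoded = json.dumps(pages, ensure_ascii=False)
restored = json.loads(encoded)
print(sorted(restored))
```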

INFO:tensorflow:Writing example 0 of 290
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] [MASK] is widely known that music labels and film makers lose a great deal of money every year from illegal copying and free internet sharing . some people say that people who do that should be punished , some think that such men are the new « robin hood ##s » . let us take a look at this problem . [SEP] it [SEP]
INFO:tensorflow:input_ids: 101 103 2003 4235 2124 2008 2189 10873 1998 2143 11153 4558 1037 2307 3066 1997 2769 2296 2095 2013 6206 24731 1998 2489 4274 6631 1012 2070 2111 2360 2008 2111 2040 2079 2008 2323 2022 14248 1010 2070 2228 2008 2107 2273 2024 1996 2047 1077 5863 7415 2015 1090 1012 2292 2149 2202 1037 2298 2012 2023 3291 1012 102 2009 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

In [0]:
# Create & upload a file.
uploaded = drive.CreateFile({'title': curname})
uploaded.SetContentFile('./'+curname)
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 16_pmUKOHS5lKXM2o-1QyVcPQohalCRZd


# Preparing existing model for REALEC production