<a href="https://colab.research.google.com/github/kenmikanmi/bert_playground/blob/master/Predicting_Movie_Reviews_with_BERT_on_TF_Hub_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

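# About this notebook
- This notebook is a modified version of the original [Predicting Movie Review Sentiment with BERT on TF Hub](https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb#scrollTo=xiYrZKaHwV81), with translation and added explanations where appropriate.

- BERT is a model proposed by Google AI Research that achieves state-of-the-art (SoTA) results on tasks such as text classification, translation, summarization, and question answering.
- BERT has been added to [TF Hub](https://www.tensorflow.org/hub), so it can be used as a TensorFlow module. This module can be swapped in for an existing feature extractor.
- BERT is also fast to [fine-tune](http://wiki.fast.ai/index.php/Fine_tuning) and performs well when fine-tuned.
- In this notebook, we run inference (sentiment prediction) on the IMDB movie review dataset.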

In [1]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [2]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [3]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization




Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, enable USE_BUCKET, set a directory name in OUTPUT_DIR, and put the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [5]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'BERT_PLAYGROUND'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: BERT_PLAYGROUND *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this TensorFlow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      # Raw string so that \d and \. are regex escapes, not Python string escapes.
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df
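
The loader above reads each review's rating out of its file name, which follows the pattern `<id>_<rating>.txt`. As a minimal sketch of that parsing step in isolation (the helper name `parse_review_filename` and the sample file name are hypothetical, chosen only for illustration):

```python
import re

# Same pattern as the loader, with the review id captured as well.
FILENAME_PATTERN = re.compile(r"(\d+)_(\d+)\.txt")


def parse_review_filename(file_name):
    """Return (review_id, rating) parsed from a name like '127_8.txt'."""
    match = FILENAME_PATTERN.fullmatch(file_name)
    if match is None:
        # Fail loudly instead of raising AttributeError on None.group().
        raise ValueError("unexpected file name: {!r}".format(file_name))
    return int(match.group(1)), int(match.group(2))


print(parse_review_filename("127_8.txt"))  # (127, 8)
```

Checking for a failed match explicitly gives a clearer error than calling `.group(1)` on `None` if a stray file ever ends up in the directory.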


In [7]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz




To keep training fast, we'll randomly sample 5,000 examples each from the train and test sets.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [9]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative, 1 for positive).

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We need to transform the data into a format that BERT can accept.

First, we use the constructor provided by the BERT library to create `InputExample` objects.

Arguments:
- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, which in this case is 0 (negative) or 1 (positive).
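
To make the roles of these arguments concrete, here is a minimal sketch that uses a hypothetical stand-in, `SimpleInputExample`, rather than the library class itself. It only mirrors the four fields described above; the helper names and sample texts are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in with the same four fields as the arguments above.
SimpleInputExample = namedtuple(
    "SimpleInputExample", ["guid", "text_a", "text_b", "label"])


def make_single_sentence_example(text, label, guid=None):
    """Single-text classification: text_b stays empty (None)."""
    return SimpleInputExample(guid=guid, text_a=text, text_b=None, label=label)


def make_sentence_pair_example(first, second, label, guid=None):
    """Sentence-pair tasks fill in both text_a and text_b."""
    return SimpleInputExample(guid=guid, text_a=first, text_b=second, label=label)


review = make_single_sentence_example("A remarkable film.", label=1)
pair = make_sentence_pair_example("Who directed it?", "John Woo did.", label=1)
print(review.text_b is None, pair.text_b)
```

For sentiment classification only the first form is needed, which is why the cell below passes `text_b = None` for every row.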

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
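
For intuition about steps 1 to 3, the sketch below implements a greedy longest-match-first split over a tiny, made-up vocabulary. It is an illustration under stated assumptions, not the library's tokenizer: `TOY_VOCAB`, `wordpiece_tokenize`, and `toy_tokenize` are hypothetical names, the real vocabulary comes from the vocab file, and punctuation handling is omitted.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into the longest pieces found in vocab, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            return [unk_token]  # no piece matched at this position
        pieces.append(current)
        start = end
    return pieces


def toy_tokenize(text, vocab):
    """Lowercase, split on whitespace, then apply the WordPiece split."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_tokenize(word, vocab))
    return tokens


# Toy vocabulary for illustration only.
TOY_VOCAB = {"sally", "says", "hi", "call", "##ing", "token", "##izer"}

print(toy_tokenize("Sally says hi", TOY_VOCAB))
print(wordpiece_tokenize("calling", TOY_VOCAB))
```

The `##` prefix is what lets the pieces be glued back into the original word later, since it distinguishes a word-internal piece from a token that starts a new word.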




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT TF Hub module:

In [12]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what is stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [13]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
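
As a rough sketch of what each converted feature contains for a single-sentence input, the code below builds `input_ids`, `input_mask`, and `segment_ids` by hand. `build_features` and `TOY_IDS` are hypothetical names; the ids are the ones that appear in this notebook's log output for these tokens. The library function also handles `text_b` and label mapping, which this sketch leaves out.

```python
# Token ids as they appear in this notebook's log output.
TOY_IDS = {"[CLS]": 101, "[SEP]": 102, "horrible": 9202, "movie": 3185}


def build_features(tokens, token_to_id, max_seq_length):
    """Return (input_ids, input_mask, segment_ids), each max_seq_length long."""
    # Leave room for [CLS] and [SEP], truncating long inputs.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [token_to_id[token] for token in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token
    segment_ids = [0] * len(input_ids)  # single sentence: all segment 0

    # Pad every list with zeros up to the fixed length.
    padding = [0] * (max_seq_length - len(input_ids))
    return input_ids + padding, input_mask + padding, segment_ids + padding


ids, mask, segments = build_features(
    ["horrible", "horrible", "movie"], TOY_IDS, max_seq_length=8)
print(ids)
print(mask)
print(segments)
```

This layout matches the logged examples: reviews longer than the limit are cut off before `[SEP]`, while shorter ones end in a run of zeros in both `input_ids` and `input_mask`.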

In [14]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)







INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] i think it took a lot of guts for her to come forward like that . it is unfortunate that when a celebrity suffers that is what helps people most . but , in her case , what she did was remarkable . i have been in the mental health field for five years and i think it is great that mental illness is not a terrible word anymore and i believe she helped . i always thought she was great and always will . i am glad that she wrote this book and that the movie was made . she is a remarkable lady and i hope she continues to act . she has been through a lot and has faced it . i would [SEP]


INFO:tensorflow:input_ids: 101 1045 2228 2009 2165 1037 2843 1997 18453 2005 2014 2000 2272 2830 2066 2008 1012 2009 2003 15140 2008 2043 1037 8958 17567 2008 2003 2054 7126 2111 2087 1012 2021 1010 1999 2014 2553 1010 2054 2016 2106 2001 9487 1012 1045 2031 2042 1999 1996 5177 2740 2492 2005 2274 2086 1998 1045 2228 2009 2003 2307 2008 5177 7355 2003 2025 1037 6659 2773 4902 1998 1045 2903 2016 3271 1012 1045 2467 2245 2016 2001 2307 1998 2467 2097 1012 1045 2572 5580 2008 2016 2626 2023 2338 1998 2008 1996 3185 2001 2081 1012 2016 2003 1037 9487 3203 1998 1045 3246 2016 4247 2000 2552 1012 2016 2038 2042 2083 1037 2843 1998 2038 4320 2009 1012 1045 2052 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this is a great " small " film . i say " small " because it doesn ' t have a hundred guns firing or a dozen explosions , as in a john woo film . great performances by roy sc ##hei ##der and the three " bad guys " . john frank ##en ##heimer seems to have more luck with small productions these days . the film is very easy to watch , the story is more of a yarn than a washing machine - - instead of everything going around and around , it seems as though things just get worse as the plot thick ##ens . wonderful ending , very positive . i never read the elm ##ore leonard book , but it [SEP]


INFO:tensorflow:input_ids: 101 2023 2003 1037 2307 1000 2235 1000 2143 1012 1045 2360 1000 2235 1000 2138 2009 2987 1005 1056 2031 1037 3634 4409 7493 2030 1037 6474 18217 1010 2004 1999 1037 2198 15854 2143 1012 2307 4616 2011 6060 8040 26036 4063 1998 1996 2093 1000 2919 4364 1000 1012 2198 3581 2368 18826 3849 2000 2031 2062 6735 2007 2235 5453 2122 2420 1012 1996 2143 2003 2200 3733 2000 3422 1010 1996 2466 2003 2062 1997 1037 27158 2084 1037 12699 3698 1011 1011 2612 1997 2673 2183 2105 1998 2105 1010 2009 3849 2004 2295 2477 2074 2131 4788 2004 1996 5436 4317 6132 1012 6919 4566 1010 2200 3893 1012 1045 2196 3191 1996 17709 5686 7723 2338 1010 2021 2009 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] or ##ca starts as crust ##y irish sea captain nolan ( richard harris ) & his crew are trying to capture a great white shark so they can sell it for big bucks , unfortunately when a ha ##ples ##s marine biologist called ken ( robert carr ##adi ##ne ) comes under attack from it the shark is killed by a killer whale , this raises nolan ' s interest in killer whales & decides he want ' s to catch one of them instead . however while trying to do so he catches a pregnant female & injuries it to the extent she ab ##ort ##s her un ##born foe ##tus on deck which makes a mess & en ##rage ##s her mate , nolan [SEP]


INFO:tensorflow:input_ids: 101 2030 3540 4627 2004 19116 2100 3493 2712 2952 13401 1006 2957 5671 1007 1004 2010 3626 2024 2667 2000 5425 1037 2307 2317 11420 2061 2027 2064 5271 2009 2005 2502 14189 1010 6854 2043 1037 5292 21112 2015 3884 21477 2170 6358 1006 2728 12385 17190 2638 1007 3310 2104 2886 2013 2009 1996 11420 2003 2730 2011 1037 6359 13156 1010 2023 13275 13401 1005 1055 3037 1999 6359 17967 1004 7288 2002 2215 1005 1055 2000 4608 2028 1997 2068 2612 1012 2174 2096 2667 2000 2079 2061 2002 11269 1037 6875 2931 1004 6441 2009 2000 1996 6698 2016 11113 11589 2015 2014 4895 10280 22277 5809 2006 5877 2029 3084 1037 6752 1004 4372 24449 2015 2014 6775 1010 13401 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] in this movie the year 202 ##2 looks much like the seventies . this is amusing at first , but soon the viewer perceive ##s how very different that decade ##nt futuristic world is despite the appearances , how many things that we take for granted could become unavailable . < br / > < br / > characters often interact in a peculiar way , with no ta ##ct or manners or respect . i believe this is intentional , not bad acting . after all , who witnessed the social changes in the 60s and 70s may well assume that by 202 ##2 an over ##pop ##ulated city ' s inhabitants behave like that . < br / > < br / > i [SEP]


INFO:tensorflow:input_ids: 101 1999 2023 3185 1996 2095 16798 2475 3504 2172 2066 1996 26232 1012 2023 2003 19142 2012 2034 1010 2021 2574 1996 13972 23084 2015 2129 2200 2367 2008 5476 3372 28971 2088 2003 2750 1996 3922 1010 2129 2116 2477 2008 2057 2202 2005 4379 2071 2468 20165 1012 1026 7987 1013 1028 1026 7987 1013 1028 3494 2411 11835 1999 1037 14099 2126 1010 2007 2053 11937 6593 2030 14632 2030 4847 1012 1045 2903 2023 2003 21249 1010 2025 2919 3772 1012 2044 2035 1010 2040 9741 1996 2591 3431 1999 1996 20341 1998 17549 2089 2092 7868 2008 2011 16798 2475 2019 2058 16340 8898 2103 1005 1055 4864 16582 2066 2008 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] there are no reasons of taking this documentary serious and there are four reasons for that : < br / > < br / > 1 ) the people who made this documentary ( including the director and the producer ) are serbs or of serbian origin , therefore the criteria of neutrality fails . for instance , they mentioned that the diaspora croats ( the so called " us ##tase " ) played a huge part in the fall of yugoslavia , but they didn ' t mention that there were equal serbian organizations as well ( ce ##t ##nik ##s ) ! for you who aren ' t that familiar with balkan w ##w ##2 history : the serbian so called " ce ##t [SEP]


INFO:tensorflow:input_ids: 101 2045 2024 2053 4436 1997 2635 2023 4516 3809 1998 2045 2024 2176 4436 2005 2008 1024 1026 7987 1013 1028 1026 7987 1013 1028 1015 1007 1996 2111 2040 2081 2023 4516 1006 2164 1996 2472 1998 1996 3135 1007 2024 16757 2030 1997 6514 4761 1010 3568 1996 9181 1997 21083 11896 1012 2005 6013 1010 2027 3855 2008 1996 18239 26222 1006 1996 2061 2170 1000 2149 18260 1000 1007 2209 1037 4121 2112 1999 1996 2991 1997 8936 1010 2021 2027 2134 1005 1056 5254 2008 2045 2020 5020 6514 4411 2004 2092 1006 8292 2102 8238 2015 1007 999 2005 2017 2040 4995 1005 1056 2008 5220 2007 17581 1059 2860 2475 2381 1024 1996 6514 2061 2170 1000 8292 2102 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Writing example 0 of 5000


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] horrible horrible movie , i still can ' t believe my friend talked me into seeing this ! no plot , bad acting , un ##fu ##nn ##y scenes , and very very stupid dialogue . all i have to say is that this movie is the worst movie i have seen and it ' s worse than halloween iii which i gave 0 stars too . so i give it 0 stars and a 0 out of 10 , well on here a 1 , but you get the point . [SEP]


INFO:tensorflow:input_ids: 101 9202 9202 3185 1010 1045 2145 2064 1005 1056 2903 2026 2767 5720 2033 2046 3773 2023 999 2053 5436 1010 2919 3772 1010 4895 11263 10695 2100 5019 1010 1998 2200 2200 5236 7982 1012 2035 1045 2031 2000 2360 2003 2008 2023 3185 2003 1996 5409 3185 1045 2031 2464 1998 2009 1005 1055 4788 2084 14414 3523 2029 1045 2435 1014 3340 2205 1012 2061 1045 2507 2009 1014 3340 1998 1037 1014 2041 1997 2184 1010 2092 2006 2182 1037 1015 1010 2021 2017 2131 1996 2391 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a young doctor and his wife are suddenly expecting a child . both are disturbed about a two hour memory laps ##e on the night of conception . interesting twist on an hackney ##ed story . very good f / x and interesting editing . jillian mc ##w ##hir ##ter is outstanding in a cast that features arnold vo ##sl ##oo , wil ##ford br ##im ##ley and brad do ##uri ##f . br ##im ##ley brings normal ##cy to the out ##land ##ish . ku ##dos to director brian yu ##z ##na . [SEP]


INFO:tensorflow:input_ids: 101 1037 2402 3460 1998 2010 2564 2024 3402 8074 1037 2775 1012 2119 2024 12491 2055 1037 2048 3178 3638 10876 2063 2006 1996 2305 1997 13120 1012 5875 9792 2006 2019 28425 2098 2466 1012 2200 2204 1042 1013 1060 1998 5875 9260 1012 27286 11338 2860 11961 3334 2003 5151 1999 1037 3459 2008 2838 7779 29536 14540 9541 1010 19863 3877 7987 5714 3051 1998 8226 2079 9496 2546 1012 7987 5714 3051 7545 3671 5666 2000 1996 2041 3122 4509 1012 13970 12269 2000 2472 4422 9805 2480 2532 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] lack ##awan ##na blues is an entertaining , eng ##ross ##ing , emotionally - charged hbo - tv movie based on the childhood memories of actor ruben santiago - hudson ( who also appears in a small role ) . this joy ##ous motion picture experience is centered around santiago - hudson ' s childhood guardian , rachel " nanny " crosby , a strong , big - hearted black woman who ran a boarding house in upstate new york during the 1950 ' s . nanny was a one - woman social service organization whose boarding house was filled with drunk ##s , derelict ##s , cr ##ip ##ples , drug addict ##s , mis ##fi ##ts , and everyone else in town who needed [SEP]


INFO:tensorflow:input_ids: 101 3768 25903 2532 5132 2003 2019 14036 1010 25540 25725 2075 1010 14868 1011 5338 14633 1011 2694 3185 2241 2006 1996 5593 5758 1997 3364 19469 8728 1011 6842 1006 2040 2036 3544 1999 1037 2235 2535 1007 1012 2023 6569 3560 4367 3861 3325 2003 8857 2105 8728 1011 6842 1005 1055 5593 6697 1010 5586 1000 19174 1000 14282 1010 1037 2844 1010 2502 1011 18627 2304 2450 2040 2743 1037 9405 2160 1999 29530 2047 2259 2076 1996 3925 1005 1055 1012 19174 2001 1037 2028 1011 2450 2591 2326 3029 3005 9405 2160 2001 3561 2007 7144 2015 1010 28839 2015 1010 13675 11514 21112 1010 4319 26855 2015 1010 28616 8873 3215 1010 1998 3071 2842 1999 2237 2040 2734 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] a group of do ##uche - bag teenagers go up to an old mining town in hopes of finding gold nu ##gg ##ets . the one hitch in the hair - brain ##ed scheme is that the ancient supernatural miner whom the gold belongs to doesn ' t wish to part with his treasure so easily and so begins to dispatch the inter ##lo ##pers accordingly . < br / > < br / > literally cl ##iche - sp ##rou ##ting dial ##og , horrible acting , some insane ##ly terrible ' southern dialect ' and a lame un ##me ##mora ##ble killer who resembles jeep ##ers creep ##ers ( without the aforementioned ' s pre ##di ##le ##ction of young boys naturally ) combine [SEP]


INFO:tensorflow:input_ids: 101 1037 2177 1997 2079 19140 1011 4524 12908 2175 2039 2000 2019 2214 5471 2237 1999 8069 1997 4531 2751 16371 13871 8454 1012 1996 2028 27738 1999 1996 2606 1011 4167 2098 5679 2003 2008 1996 3418 11189 18594 3183 1996 2751 7460 2000 2987 1005 1056 4299 2000 2112 2007 2010 8813 2061 4089 1998 2061 4269 2000 18365 1996 6970 4135 7347 11914 1012 1026 7987 1013 1028 1026 7987 1013 1028 6719 18856 17322 1011 11867 22494 3436 13764 8649 1010 9202 3772 1010 2070 9577 2135 6659 1005 2670 9329 1005 1998 1037 20342 4895 4168 22122 3468 6359 2040 12950 14007 2545 19815 2545 1006 2302 1996 17289 1005 1055 3653 4305 2571 7542 1997 2402 3337 8100 1007 11506 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: None


INFO:tensorflow:tokens: [CLS] this feels very stil ##ted and patron ##izing to a great extent . the whole plot is extremely forced - especially the " gallant " effort to save the college from ruin , and the moral ##istic over ##tone ( especially by the leading lady ) gr ##ates a bit . < br / > < br / > but there are one or two comic moments that do help relieve the boredom , and the dancing is quite fun ( especially for alleged amateurs - ha , ha ! ) < br / > < br / > the shop proprietor and the young guy doing spectacular tap dancing were particular highlights . and i liked peter hayes impressions of charles laugh ##ton and ronald [SEP]


INFO:tensorflow:input_ids: 101 2023 5683 2200 25931 3064 1998 9161 6026 2000 1037 2307 6698 1012 1996 2878 5436 2003 5186 3140 1011 2926 1996 1000 26984 1000 3947 2000 3828 1996 2267 2013 10083 1010 1998 1996 7191 6553 2058 5524 1006 2926 2011 1996 2877 3203 1007 24665 8520 1037 2978 1012 1026 7987 1013 1028 1026 7987 1013 1028 2021 2045 2024 2028 2030 2048 5021 5312 2008 2079 2393 15804 1996 29556 1010 1998 1996 5613 2003 3243 4569 1006 2926 2005 6884 24361 1011 5292 1010 5292 999 1007 1026 7987 1013 1028 1026 7987 1013 1028 1996 4497 21584 1998 1996 2402 3124 2725 12656 11112 5613 2020 3327 11637 1012 1998 1045 4669 2848 10192 19221 1997 2798 4756 2669 1998 8923 102


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


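# Creating the model
Now that the data is ready, we move on to building the model.
First, we define the `create_model` function below. It loads the BERT module from `TF Hub` and adds a single new layer on top of it. This layer takes BERT's output and performs the sentiment classification task. Because the module is loaded with `trainable=True`, BERT's own weights are updated together with the new layer; this approach is called fine-tuning.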

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own output layer to fine-tune for sentiment classification.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting. As written, it is applied in every
    # mode, including evaluation and prediction.
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want the predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
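
The layer that `create_model` adds on top of `pooled_output` can be reproduced in a few lines of NumPy, which may help when checking shapes. This is a minimal sketch rather than the notebook's code: the helper names (`log_softmax`, `classification_head`) and the random toy inputs are made up for illustration, and dropout is omitted.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def classification_head(pooled_output, output_weights, output_bias, labels):
    """Dense layer plus cross-entropy loss, mirroring create_model without dropout.

    pooled_output:  [batch, hidden_size]
    output_weights: [num_labels, hidden_size]
    output_bias:    [num_labels]
    labels:         [batch] integer class ids
    """
    num_labels = output_weights.shape[0]
    logits = pooled_output @ output_weights.T + output_bias
    log_probs = log_softmax(logits)
    one_hot_labels = np.eye(num_labels)[labels]
    per_example_loss = -(one_hot_labels * log_probs).sum(axis=-1)
    predicted_labels = log_probs.argmax(axis=-1)
    return per_example_loss.mean(), predicted_labels, log_probs


# Hypothetical toy inputs: batch of 4, hidden size 8, 2 labels.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 8))
weights = rng.normal(scale=0.02, size=(2, 8))
bias = np.zeros(2)
loss, predicted, log_probs = classification_head(
    pooled, weights, bias, np.array([0, 1, 1, 0]))
```

Exponentiating each row of `log_probs` gives class probabilities that sum to 1, which is a quick sanity check on the head.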


Next, we define the `model_fn_builder` function, which wraps the function defined above.

It prepares the model separately for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for tf.estimator.Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for tf.estimator.Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

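      # Calculate evaluation metrics. Every metric here, including AUC, is
      # computed from the hard predicted labels rather than from probabilities.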
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,  # note: these are log-probabilities
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn
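
The builder used above is an ordinary Python closure: the outer function captures the hyperparameters, and the inner function is what the Estimator later calls with a `mode`. The sketch below shows only that control flow in plain Python. `toy_model_fn_builder`, the mode strings, and the returned dictionaries are hypothetical stand-ins, not TensorFlow objects.

```python
def toy_model_fn_builder(num_labels, learning_rate):
    """Return a function that remembers the hyperparameters it was built with."""

    def toy_model_fn(features, mode):
        # Prediction needs no loss; train and eval share the loss path.
        if mode == "predict":
            return {"mode": mode, "outputs": ["predictions"]}
        spec = {"mode": mode, "outputs": ["loss"], "num_labels": num_labels}
        if mode == "train":
            spec["outputs"].append("train_op")
            spec["learning_rate"] = learning_rate
        else:
            spec["outputs"].append("eval_metric_ops")
        return spec

    return toy_model_fn


toy_model_fn = toy_model_fn_builder(num_labels=2, learning_rate=2e-5)
train_spec = toy_model_fn(features={}, mode="train")
eval_spec = toy_model_fn(features={}, mode="eval")
predict_spec = toy_model_fn(features={}, mode="predict")
```

The same shape appears in `model_fn`: one branch for prediction, and a shared branch for training and evaluation that differs only in what it returns.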


In [0]:
# Compute train and warmup steps from batch size
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases; it usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100
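
To make the warmup setting concrete, here is a small sketch of how the learning rate could evolve over training. It assumes the schedule in `bert.optimization.create_optimizer` is a linear warmup followed by a linear decay to zero. The function name `learning_rate_at` is made up, and the default step counts (468 and 46) are illustrative values that match the 468 steps reported in this notebook's training log.

```python
def learning_rate_at(step, init_lr=2e-5, num_train_steps=468, num_warmup_steps=46):
    """Learning rate at a given global step under warmup plus linear decay."""
    if step < num_warmup_steps:
        # Warmup: ramp linearly from 0 toward init_lr.
        return init_lr * step / num_warmup_steps
    # After warmup: decay linearly so the rate reaches 0 at num_train_steps.
    clipped_step = min(step, num_train_steps)
    return init_lr * (1.0 - clipped_step / num_train_steps)


for step in (0, 23, 46, 234, 468):
    print(step, learning_rate_at(step))
```

Under these assumptions the rate starts at zero, peaks slightly below `LEARNING_RATE` when warmup ends, and returns to zero at the final step.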

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
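
As a worked example of the arithmetic in this cell, the sketch below plugs in the hyperparameters defined earlier. It assumes 5,000 training examples, which is consistent with the 468 steps that the training log reports; `num_train_examples` is a stand-in for `len(train_features)`.

```python
BATCH_SIZE = 32
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1
num_train_examples = 5000  # assumed size of train_features

# 5000 / 32 = 156.25 steps per epoch; 3 epochs = 468.75, truncated to 468.
num_train_steps = int(num_train_examples / BATCH_SIZE * NUM_TRAIN_EPOCHS)
# 10% of 468 = 46.8, truncated to 46.
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

print(num_train_steps, num_warmup_steps)
```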

In [0]:
# Specify the output directory and how often (in steps) to save checkpoints
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [21]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'BERT_PLAYGROUND', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f66a05f17f0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}



Next we create an input builder function that takes our training feature set (`train_features`) and produces an input function for the Estimator.
This appears to be a common design pattern in TensorFlow. See [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. drop_remainder=True is needed when using TPUs; it is False here.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)
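
The effect of `drop_remainder` can be shown with plain Python lists. This is only a sketch of the batching rule: `batch_examples` is a hypothetical helper, and it ignores the shuffling and repetition that the real input function performs during training.

```python
def batch_examples(examples, batch_size, drop_remainder):
    """Split examples into batches, optionally discarding a short final batch."""
    batches = [examples[i:i + batch_size]
               for i in range(0, len(examples), batch_size)]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


examples = list(range(5000))  # stand-in for 5,000 feature records
kept = batch_examples(examples, 32, drop_remainder=False)
dropped = batch_examples(examples, 32, drop_remainder=True)

print(len(kept), len(kept[-1]))  # number of batches, size of the last batch
print(len(dropped))              # number of batches when the remainder is dropped
```

With `drop_remainder=False`, the final short batch is kept, so no example is discarded during a single pass over the data.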

The next cell runs the training. It reportedly takes about 14 minutes on a Google Colab GPU (the run logged below finished in about 5 minutes).

In [23]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into BERT_PLAYGROUND/model.ckpt.


INFO:tensorflow:loss = 0.7471657, step = 0


INFO:tensorflow:global_step/sec: 1.55657


INFO:tensorflow:loss = 0.30168575, step = 100 (64.249 sec)


INFO:tensorflow:global_step/sec: 2.10597


INFO:tensorflow:loss = 0.35019258, step = 200 (47.485 sec)


INFO:tensorflow:global_step/sec: 2.10804


INFO:tensorflow:loss = 0.02110035, step = 300 (47.433 sec)


INFO:tensorflow:global_step/sec: 2.10761


INFO:tensorflow:loss = 0.006973612, step = 400 (47.447 sec)


INFO:tensorflow:Saving checkpoints for 468 into BERT_PLAYGROUND/model.ckpt.


INFO:tensorflow:Loss for final step: 0.0045530777.


Training took time  0:05:02.901069


Let's check how well the model has learned, using the test data:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [25]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-02-27T05:33:00Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from BERT_PLAYGROUND/model.ckpt-468


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2020-02-27-05:33:32


INFO:tensorflow:Saving dict for global step 468: auc = 0.86697274, eval_accuracy = 0.8672, f1_score = 0.8710678, false_negatives = 285.0, false_positives = 379.0, global_step = 468, loss = 0.5168713, precision = 0.85545385, recall = 0.88726264, true_negatives = 2093.0, true_positives = 2243.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 468: BERT_PLAYGROUND/model.ckpt-468


{'auc': 0.86697274,
 'eval_accuracy': 0.8672,
 'f1_score': 0.8710678,
 'false_negatives': 285.0,
 'false_positives': 379.0,
 'global_step': 468,
 'loss': 0.5168713,
 'precision': 0.85545385,
 'recall': 0.88726264,
 'true_negatives': 2093.0,
 'true_positives': 2243.0}

The results above show a binary classification accuracy of about $86.7$%.
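
As a sanity check, the summary metrics can be recomputed from the four confusion-matrix counts in the evaluation output. The snippet below is a minimal sketch using the standard definitions of accuracy, precision, recall, and F1; the variable names are ours and are not part of the notebook.

```python
# Confusion-matrix counts copied from the evaluation output above.
tp, tn, fp, fn = 2243.0, 2093.0, 379.0, 285.0

total = tp + tn + fp + fn                 # number of evaluated test examples
accuracy = (tp + tn) / total              # fraction of correct predictions
precision = tp / (tp + fp)                # predicted positives that are truly positive
recall = tp / (tp + fn)                   # true positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"n={total:.0f} accuracy={accuracy:.4f} "
      f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The recomputed values agree with the reported `eval_accuracy`, `precision`, `recall`, and `f1_score` to within rounding.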

Below, we define a function that lets us run inference on our own data.

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; the true label is unknown at inference time
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # Return one (sentence, model scores, predicted label name) tuple per input sentence
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

# Here's the playground!

Enter any sentences you like and let the model judge whether each one is positive or negative:

In [0]:
pred_sentences = [
  "Fujitsu Zinrai technologies are great! We recommend them to our partner companies.",
  "President Tokita announced free dress code. I think it lead our office environment more comfortable",
  "Fujitsu do everything in conservative manner. I don't like it.",
  "Fujitsu announced new DA architecture which improves late computing speed and bad performance."
]

In [44]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] fuji ##tsu z ##in ##rai technologies are great ! we recommend them to our partner companies . [SEP]


INFO:tensorflow:input_ids: 101 20933 10422 1062 2378 14995 6786 2024 2307 999 2057 16755 2068 2000 2256 4256 3316 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] president to ##kit ##a announced free dress code . i think it lead our office environment more comfortable [SEP]


INFO:tensorflow:input_ids: 101 2343 2000 23615 2050 2623 2489 4377 3642 1012 1045 2228 2009 2599 2256 2436 4044 2062 6625 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] fuji ##tsu do everything in conservative manner . i don ' t like it . [SEP]


INFO:tensorflow:input_ids: 101 20933 10422 2079 2673 1999 4603 5450 1012 1045 2123 1005 1056 2066 2009 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:*** Example ***


INFO:tensorflow:guid: 


INFO:tensorflow:tokens: [CLS] fuji ##tsu announced new da architecture which improves late computing speed and bad performance . [SEP]


INFO:tensorflow:input_ids: 101 20933 10422 2623 2047 4830 4294 2029 24840 2397 9798 3177 1998 2919 2836 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from BERT_PLAYGROUND/model.ckpt-468


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


The sentiment analysis results:

In [45]:
predictions

[('Fujitsu Zinrai technologies are great! We recommend them to our partner companies.',
  array([-6.2377424e+00, -1.9562172e-03], dtype=float32),
  'Positive'),
 ('President Tokita announced free dress code. I think it lead our office environment more comfortable',
  array([-4.5449243 , -0.01067782], dtype=float32),
  'Positive'),
 ("Fujitsu do everything in conservative manner. I don't like it.",
  array([-2.6818283e-03, -5.9226103e+00], dtype=float32),
  'Negative'),
 ('Fujitsu announced new DA architecture which improves late computing speed and bad performance.',
  array([-0.00668551, -5.01116   ], dtype=float32),
  'Negative')]
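
The numbers returned under `'probabilities'` are all negative, so they cannot be probabilities as such. They appear to be natural-log probabilities: exponentiating each pair gives two values that sum to roughly 1. The sketch below, which makes that assumption explicit, converts the four score pairs copied from the output above into ordinary probabilities. The names `log_probs` and `probs` are illustrative and not part of the notebook.

```python
import math

# Score pairs copied from the `predictions` output above,
# ordered as [Negative, Positive].
log_probs = [
    [-6.2377424e+00, -1.9562172e-03],
    [-4.5449243, -0.01067782],
    [-2.6818283e-03, -5.9226103e+00],
    [-0.00668551, -5.01116],
]
labels = ["Negative", "Positive"]

# Assumption: the scores are natural-log probabilities, so exp() recovers them.
probs = [[math.exp(v) for v in row] for row in log_probs]

for row in probs:
    best = row.index(max(row))  # index of the more likely class
    print(f"{labels[best]}: P(Negative)={row[0]:.4f}, "
          f"P(Positive)={row[1]:.4f}, sum={sum(row):.4f}")
```

Read this way, the model assigns more than 98% probability to its chosen class for every one of the four sentences, including the last one, whose wording ("improves late computing speed and bad performance") mixes positive and negative cues.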