<a href="https://colab.research.google.com/github/szukiyu/tensorflow/blob/master/Predicting_Movie_Reviews_with_BERT_on_TF_Hub_ipynb_%E3%81%AE%E3%82%B3%E3%83%94%E3%83%BC.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing Tensorflow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMO and GloVE. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in Tensorflow with tf hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [0]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's python package.

In [22]:
!pip install bert-tensorflow



In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to rewrite the OUTPUT_DIR if it exists. Otherwise, Tensorflow will load existing model checkpoints from that directory (if they exist).

In [72]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME3'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = True #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = True #@param {type:"boolean"}
BUCKET = 'bert_szukiyu' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: gs://bert_szukiyu/OUTPUT_DIR_NAME3 *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [0]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match("\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df


In [0]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll take a sample of 5000 train and test examples, respectively.

In [0]:
train = train.sample(5000)
test = test.sample(5000)

In [0]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

In [0]:
print(train)

                                                sentence sentiment  polarity
8223   Back to the roots with "like it is in heaven" ...        10         1
2442   Seeing as I hate reading long essays hoping to...         4         0
20735  Farrah Fawcett has spent the better part of he...         8         1
18877  What seemed as a good premise for a movie...un...         1         0
19804  There aren't many overcoming-the-odds stories ...         8         1
10044  John Leguizemo, a wonderful comic actor, is a ...         3         0
2242   "Pickup On South Street" is a high speed drama...         7         1
3896   A solid B movie.<br /><br />I like Jake Weber....         7         1
17678  What a crime...<br /><br />You forgot to brush...         1         0
12791  When you actually find a video game to be scar...        10         1
5284   I wrote a review of this movie further down af...         1         0
8937   Following the release of Cube 2: Hypercube (20...         3         0

For us, our input data is the 'sentence' column and our label is the 'polarity' column (0, 1 for negative and positive, respecitvely)

In [0]:
from google_drive_downloader import GoogleDriveDownloader as gdd

tgz_fname = "ldcc-20140209.tar.gz"


In [7]:
# Google driverからダウンロード
gdd.download_file_from_google_drive(file_id="1b-llzNQdmKIp0FYMwzGOKmXdQUNpNXC8",
                                   dest_path="./ldcc-20140209.tar.gz", unzip=False)

Downloading 1b-llzNQdmKIp0FYMwzGOKmXdQUNpNXC8 into ./ldcc-20140209.tar.gz... Done.


In [8]:
# 株式会社ロンウィットからダウンロード
import urllib.request
tgz_url = "https://www.rondhuit.com/download/ldcc-20140209.tar.gz"
urllib.request.urlretrieve(tgz_url, "ldcc-20140209.tar.gz")

('ldcc-20140209.tar.gz', <http.client.HTTPMessage at 0x7fde26e38898>)

In [0]:
import tarfile
import csv
import re

target_genre = ["it-life-hack", "kaden-channel"]

zero_fnames = []
one_fnames = []
tsv_fname = "all.tsv"

brackets_tail = re.compile('【[^】]*】$')
brackets_head = re.compile('^【[^】]*】')

def remove_brackets(inp):
    output = re.sub(brackets_head, '',
                   re.sub(brackets_tail, '', inp))
    return output

def read_title(f):
    # 2行スキップ
    next(f)
    next(f)
    title = next(f) # 3行目を返す
    title = remove_brackets(title.decode('utf-8'))
    return title[:-1]

with tarfile.open(tgz_fname, encoding="utf-8") as tf:
    # 対象ファイルの選定
    for ti in tf:
        # ライセンスファイルはスキップ
        if "LICENSE.txt" in ti.name:
            continue
        if target_genre[0] in ti.name and ti.name.endswith(".txt"):
            zero_fnames.append(ti.name)
            continue
        if target_genre[1] in ti.name and ti.name.endswith(".txt"):
            one_fnames.append(ti.name)
    with open(tsv_fname, "w", encoding='utf-8') as wf:
        writer = csv.writer(wf, delimiter='\t')
        # ラベル 0
        for name in zero_fnames:
            f = tf.extractfile(name)
            title = read_title(f)
            row = [target_genre[0], 0, '', title]
            writer.writerow(row)
        # ラベル 1
        for name in one_fnames:
            f = tf.extractfile(name)
            title = read_title(f)
            row = [target_genre[1], 1, '', title]
            writer.writerow(row)


In [0]:
import random

random.seed(100)
with open("all.tsv", 'r', encoding='utf-8') as f, open("rand-all.tsv", "w", encoding='utf-8') as wf:
    lines = f.readlines()
    random.shuffle(lines)
    for line in lines:
        wf.write(line)

In [11]:
random.seed(101)

train_fname, dev_fname, test_fname = ["train.tsv", "dev.tsv", "test.tsv"]
with open("rand-all.tsv", encoding='utf-8') as f, open(train_fname, "w", encoding='utf-8') as tf, open(dev_fname, "w", encoding='utf-8') as df, open(test_fname, "w", encoding='utf-8') as ef:
    ef.write("class\tsentence\n")
    for line in f:
        print("line=",line)
        v = random.randint(0, 9)
        if v == 8:
            df.write(line)
        elif v == 9:
            row = line.split('\t')
 
            ef.write("\t".join([row[1], row[3]]))
        else:
            tf.write(line)

line= it-life-hack	0		何かとネットが騒がしかった！　不審なAndroidアプリ騒動までを振り返る

line= kaden-channel	1		ソニーだいじょうぶ？　PS Vitaの質問ページがちょっと変で話題になり、ソニーは慌てて修正

line= kaden-channel	1		絨毯もフローリングもスチームで「洗う」ＦＳＭ１２００

line= kaden-channel	1		お風呂場がベター!?　iPhoneの保護シールがうまく貼れない人にヒント

line= it-life-hack	0		24時間しょこたん三昧　本日0時より「中川翔子が売ってみた！独り24時間テレビ」ニコニコ生放送開始

line= kaden-channel	1		『癒し』と『安眠サポート』—部屋中をほんのり桜色に照らすＬＥＤ照明シャープＤＬ−Ｃ６０４Ｖ

line= kaden-channel	1		忘年会シーズンは要注意！　Facebook上にある76％は恥ずかしい写真だった

line= it-life-hack	0		あの名作に隠された秘密！スラムダンクの秘密　完全版

line= it-life-hack	0		一連の流れを紹介！ Google playで出回った不審なAndroidアプリ問題

line= kaden-channel	1		タムロンのEマウントレンズを純正レンズと比べてみた

line= kaden-channel	1		普通の小学生に戻ってほしい？　人気子役の愛菜ちゃんが消えてほしい芸能人１位に

line= it-life-hack	0		先週気になったニュースや話題を振り返る「ITフラッシュバック」

line= it-life-hack	0		iPhone用オモシロアプリからお役立ちアプリまで一挙紹介

line= it-life-hack	0		駆け込み購入で「しまった」を挽回できる！古い液晶テレビを最新にする作戦

line= kaden-channel	1		安全でスタイリッシュ！タイガー電気ケトル「PCD-A型」で賢く節電

line= kaden-channel	1		以前の方が面白かった？　学生が選ぶ「復活してほしいバラエティ」に納得の声

line= kaden-channel	1		どっちが問題？　韓国ロケで

In [0]:
import pandas as pd

In [0]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

In [0]:
train = pd.read_csv("train.tsv", delimiter='\t')
test =  pd.read_csv("test.tsv", delimiter='\t')
train.columns = ['line1','polarity','space','sentence']
test.columns = ['polarity','sentence']


In [0]:
print(train[['polarity','sentence']])

      polarity                                           sentence
0            1               お風呂場がベター!?　iPhoneの保護シールがうまく貼れない人にヒント
1            0  24時間しょこたん三昧　本日0時より「中川翔子が売ってみた！独り24時間テレビ」ニコニコ生放送開始
2            1      『癒し』と『安眠サポート』—部屋中をほんのり桜色に照らすＬＥＤ照明シャープＤＬ−Ｃ６０４Ｖ
3            0                          あの名作に隠された秘密！スラムダンクの秘密　完全版
4            1                          タムロンのEマウントレンズを純正レンズと比べてみた
5            1             普通の小学生に戻ってほしい？　人気子役の愛菜ちゃんが消えてほしい芸能人１位に
6            0                    先週気になったニュースや話題を振り返る「ITフラッシュバック」
7            0                      iPhone用オモシロアプリからお役立ちアプリまで一挙紹介
8            0                駆け込み購入で「しまった」を挽回できる！古い液晶テレビを最新にする作戦
9            1                  安全でスタイリッシュ！タイガー電気ケトル「PCD-A型」で賢く節電
10           1               以前の方が面白かった？　学生が選ぶ「復活してほしいバラエティ」に納得の声
11           1              どっちが問題？　韓国ロケで激怒する中国人女優と、それをバッシングする韓国人
12           0    次世代セキュリティソフト「ノートン360 v6」を徹底検証！ITライフハック編集長が使ってみた
13           0                   Macの次期OS「Mountain Lion」の真の狙いとは？ 
14        

In [15]:
print(test[['polarity','sentence']])

     polarity                                           sentence
0           0              何かとネットが騒がしかった！　不審なAndroidアプリ騒動までを振り返る
1           0          一連の流れを紹介！ Google playで出回った不審なAndroidアプリ問題
2           1                        連載●After Effects 天国への階段 第9回
3           0                     Mac必須アプリの使いこなし術！パソコンの最新情報をチェック
4           1                 任天堂「Ｗｉｉ　Ｕ」は年末商戦期に発売決定　−　販売エリアは日米欧豪
5           0                    スマホもバッテリーパックもコレ1台でOK！欲張りなマルチ充電器
6           1                     元旦深夜０時に「ミラーファイト」新作がYouTubeで公開！
7           0             ストップ違法コピー！ BSA世界ソフトウェア違法コピー調査2011の結果発表
8           0                               必要なファイルを素早く取り出すマル秘ワザ
9           1  スタイリッシュな男性と、きらめく女性向け！　パナソニックグループ、単３形・単４形「エネループ...
10          0             これを付ければ速くなる！？　AMDプラットフォームに最適化されたメモリー登場
11          0                 タブの多段表示やウィンドウのシングル表示　Firefoxを強化しよう
12          1                         使うほど髪にハリツヤ、モッズ・ヘアサロンのドライヤー
13          1       2012年はモバイルプロジェクター元年！　新時代のプロジェクターはこんなに進化している！
14          0            

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create  `InputExample`'s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case, is the `Request` field in our Dataframe. 
- `text_b` is used if we're training a model to understand the relationship between sentences (i.e. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, i.e. True, False

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (i.e. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (i.e. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.




In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/gdrive


To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [75]:
# This is a path to an uncased (all lowercase) version of BERT
#BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_multi_cased_L-12_H-768_A-12/1"
def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0519 08:36:41.754153 140593909376896 saver.py:1483] Saver not created because there are no variables in the graph to restore


Great--we just learned that the BERT model we're using expects lowercase data (that's what stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [26]:
tokenizer.tokenize("私の名前は鈴木友です。")

['私', 'の', '名', '前', 'は', '鈴', '木', '友', 'で', '##す', '。']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.

In [76]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 1405


I0519 08:36:49.920398 140593909376896 run_classifier.py:774] Writing example 0 of 1405


INFO:tensorflow:*** Example ***


I0519 08:36:49.928479 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:49.930512 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] お 風 呂 場 が ##ベ ##ター ! ? iPhone ##の 保 護 シ ##ール ##が ##う ##ま ##く 貼 れ ##ない 人 に ##ヒ ##ント [SEP]


I0519 08:36:49.932982 140593909376896 run_classifier.py:464] tokens: [CLS] お 風 呂 場 が ##ベ ##ター ! ? iPhone ##の 保 護 シ ##ール ##が ##う ##ま ##く 貼 れ ##ない 人 に ##ヒ ##ント [SEP]


INFO:tensorflow:input_ids: 101 1910 8445 2806 3116 1912 111825 19054 106 136 37167 10634 2312 7288 2006 15396 10898 22526 27058 18825 7434 1976 15355 2179 1943 76145 39104 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.935753 140593909376896 run_classifier.py:465] input_ids: 101 1910 8445 2806 3116 1912 111825 19054 106 136 37167 10634 2312 7288 2006 15396 10898 22526 27058 18825 7434 1976 15355 2179 1943 76145 39104 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.937911 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.939808 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0519 08:36:49.941481 140593909376896 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0519 08:36:49.944000 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:49.945822 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 24 時 間 し ##ょ ##こ ##た ##ん 三 昧 本 日 0 時 より 「 中 川 翔 子 が 売 って ##み ##た ！ 独 り ##24 時 間 テレビ 」 ニ ##コ ##ニ ##コ 生 放 送 開 始 [SEP]


I0519 08:36:49.947583 140593909376896 run_classifier.py:464] tokens: [CLS] 24 時 間 し ##ょ ##こ ##た ##ん 三 昧 本 日 0 時 より 「 中 川 翔 子 が 売 って ##み ##た ！ 独 り ##24 時 間 テレビ 」 ニ ##コ ##ニ ##コ 生 放 送 開 始 [SEP]


INFO:tensorflow:input_ids: 101 10233 4388 8137 1923 111809 28442 20058 18628 2077 4377 4476 4348 121 4388 14029 1890 2104 3579 6441 3350 1912 3175 14813 17575 20058 10055 5435 1974 53398 4388 8137 14297 1891 2026 19571 52923 19571 5600 4284 7719 8133 3268 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.949475 140593909376896 run_classifier.py:465] input_ids: 101 10233 4388 8137 1923 111809 28442 20058 18628 2077 4377 4476 4348 121 4388 14029 1890 2104 3579 6441 3350 1912 3175 14813 17575 20058 10055 5435 1974 53398 4388 8137 14297 1891 2026 19571 52923 19571 5600 4284 7719 8133 3268 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.951238 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.953075 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:36:49.954712 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:36:49.957020 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:49.958748 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 『 癒 し 』 と 『 安 眠 サ ##ポート 』 [UNK] 部 屋 中 を ##ほ ##ん ##の ##り 桜 色 に 照 [UNK] 照 明 [UNK] [SEP]


I0519 08:36:49.960336 140593909376896 run_classifier.py:464] tokens: [CLS] 『 癒 し 』 と 『 安 眠 サ ##ポート 』 [UNK] 部 屋 中 を ##ほ ##ん ##の ##り 桜 色 に 照 [UNK] 照 明 [UNK] [SEP]


INFO:tensorflow:input_ids: 101 1892 5702 1923 1893 1940 1892 3378 5770 2004 110704 1893 100 7831 3490 2104 1980 111804 18628 10634 14244 4595 6670 1943 5337 100 5337 4368 100 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.962134 140593909376896 run_classifier.py:465] input_ids: 101 1892 5702 1923 1893 1940 1892 3378 5770 2004 110704 1893 100 7831 3490 2104 1980 111804 18628 10634 14244 4595 6670 1943 5337 100 5337 4368 100 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.963913 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.965632 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0519 08:36:49.967418 140593909376896 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0519 08:36:49.969487 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:49.971229 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] あ ##の 名 作 に 隠 された 秘 密 ！ ス ##ラム ##ダ ##ンク ##の 秘 密 完 全 版 [SEP]


I0519 08:36:49.973070 140593909376896 run_classifier.py:464] tokens: [CLS] あ ##の 名 作 に 隠 された 秘 密 ！ ス ##ラム ##ダ ##ンク ##の 秘 密 完 全 版 [SEP]


INFO:tensorflow:input_ids: 101 1904 10634 2774 2259 1943 8254 10952 5956 3417 10055 2008 41649 35412 57619 10634 5956 3417 3380 2448 5396 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.974840 140593909376896 run_classifier.py:465] input_ids: 101 1904 10634 2774 2259 1943 8254 10952 5956 3417 10055 2008 41649 35412 57619 10634 5956 3417 3380 2448 5396 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.976664 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.978410 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:36:49.980613 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:36:49.982694 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:49.984494 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] タ ##ム ##ロン ##の ##E ##マ ##ウン ##ト ##レン ##ズ ##を 純 正 レ ##ンズ ##と 比 べて ##み ##た [SEP]


I0519 08:36:49.986209 140593909376896 run_classifier.py:464] tokens: [CLS] タ ##ム ##ロン ##の ##E ##マ ##ウン ##ト ##レン ##ズ ##を 純 正 レ ##ンズ ##と 比 べて ##み ##た [SEP]


INFO:tensorflow:input_ids: 101 2014 14750 55287 10634 11259 22820 53669 13913 61631 14685 11377 6189 4791 2059 78992 11662 4839 86660 17575 20058 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.988016 140593909376896 run_classifier.py:465] input_ids: 101 2014 14750 55287 10634 11259 22820 53669 13913 61631 14685 11377 6189 4791 2059 78992 11662 4839 86660 17575 20058 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.989780 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:49.991914 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0519 08:36:49.994742 140593909376896 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:Writing example 0 of 151


I0519 08:36:50.453114 140593909376896 run_classifier.py:774] Writing example 0 of 151


INFO:tensorflow:*** Example ***


I0519 08:36:50.455386 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:50.463955 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 何 か ##と ##ネット ##が 騒 が ##し ##かった ！ 不 審 な ##A ##ndro ##id ##ア ##プリ 騒 動 まで ##を 振 り 返 る [SEP]


I0519 08:36:50.466380 140593909376896 run_classifier.py:464] tokens: [CLS] 何 か ##と ##ネット ##が 騒 が ##し ##かった ！ 不 審 な ##A ##ndro ##id ##ア ##プリ 騒 動 まで ##を 振 り 返 る [SEP]


INFO:tensorflow:input_ids: 101 2253 1911 11662 50217 10898 8530 1912 14803 52310 10055 2080 3434 1942 10738 78908 11249 18226 85551 8530 2621 14218 11377 4117 1974 7698 1975 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.468498 140593909376896 run_classifier.py:465] input_ids: 101 2253 1911 11662 50217 10898 8530 1912 14803 52310 10055 2080 3434 1942 10738 78908 11249 18226 85551 8530 2621 14218 11377 4117 1974 7698 1975 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.471620 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.474609 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:36:50.477725 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:36:50.482547 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:50.485291 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 一 連 の 流 れを 紹 介 ！ Google play ##で 出 回 った 不 審 な ##A ##ndro ##id ##ア ##プリ 問 題 [SEP]


I0519 08:36:50.487608 140593909376896 run_classifier.py:464] tokens: [CLS] 一 連 の 流 れを 紹 介 ！ Google play ##で 出 回 った 不 審 な ##A ##ndro ##id ##ア ##プリ 問 題 [SEP]


INFO:tensorflow:input_ids: 101 2072 7742 1946 4982 82227 6205 2188 10055 13888 12253 12236 2527 2999 12290 2080 3434 1942 10738 78908 11249 18226 85551 2893 8398 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.490448 140593909376896 run_classifier.py:465] input_ids: 101 2072 7742 1946 4982 82227 6205 2188 10055 13888 12253 12236 2527 2999 12290 2080 3434 1942 10738 78908 11249 18226 85551 2893 8398 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.493975 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.497133 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:36:50.500270 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:36:50.503530 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:50.506726 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 連 載 ● ##A ##fter Effects 天 国 への 階 段 第 9 回 [SEP]


I0519 08:36:50.509326 140593909376896 run_classifier.py:464] tokens: [CLS] 連 載 ● ##A ##fter Effects 天 国 への 階 段 第 9 回 [SEP]


INFO:tensorflow:input_ids: 101 7742 7612 1859 10738 33163 66453 3198 3014 14853 8244 4823 6063 130 2999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.511986 140593909376896 run_classifier.py:465] input_ids: 101 7742 7612 1859 10738 33163 66453 3198 3014 14853 8244 4823 6063 130 2999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.515160 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.518004 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0519 08:36:50.520596 140593909376896 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0519 08:36:50.523989 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:50.526725 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] Mac 必 須 ア ##プリ ##の 使 い ##こ ##な ##し 術 ！ パ ##ソ ##コン ##の 最 新 情 報 を ##チ ##ェ ##ック [SEP]


I0519 08:36:50.529227 140593909376896 run_classifier.py:464] tokens: [CLS] Mac 必 須 ア ##プリ ##の 使 い ##こ ##な ##し 術 ！ パ ##ソ ##コン ##の 最 新 情 報 を ##チ ##ェ ##ック [SEP]


INFO:tensorflow:input_ids: 101 16917 3793 8377 1985 85551 10634 2275 1906 28442 22946 14803 7071 10055 2032 76599 60723 10634 4458 4333 3878 3115 1980 33499 67221 18060 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.532064 140593909376896 run_classifier.py:465] input_ids: 101 16917 3793 8377 1985 85551 10634 2275 1906 28442 22946 14803 7071 10055 2032 76599 60723 10634 4458 4333 3878 3115 1980 33499 67221 18060 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.534319 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.538145 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:36:50.541208 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:36:50.545081 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0519 08:36:50.548128 140593909376896 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] 任 天 堂 「 [UNK] [UNK] 」 は 年 末 商 戦 期 に 発 売 決 定 − 販 売 エ ##リア ##は 日 米 欧 豪 [SEP]


I0519 08:36:50.551180 140593909376896 run_classifier.py:464] tokens: [CLS] 任 天 堂 「 [UNK] [UNK] 」 は 年 末 商 戦 期 に 発 売 決 定 − 販 売 エ ##リア ##は 日 米 欧 豪 [SEP]


INFO:tensorflow:input_ids: 101 2212 3198 3102 1890 100 100 1891 1947 3642 4475 2890 3984 4470 1943 5712 3175 4898 3388 1803 7422 3175 1991 21530 11588 4348 6140 4776 7406 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.555763 140593909376896 run_classifier.py:465] input_ids: 101 2212 3198 3102 1890 100 100 1891 1947 3642 4475 2890 3984 4470 1943 5712 3175 4898 3388 1803 7422 3175 1991 21530 11588 4348 6140 4776 7406 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.558373 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:36:50.560288 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0519 08:36:50.562448 140593909376896 run_classifier.py:468] label: 1 (id = 1)


#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of using a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_outputs" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for politeness data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the probabiltiies.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for TPUEstimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Compute train and warmup steps from batch size
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where hte learning rate 
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 1500
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute # train and warmup steps from batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

In [0]:
# Specify outpit directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [84]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


INFO:tensorflow:Using config: {'_model_dir': 'gs://bert_szukiyu/OUTPUT_DIR_NAME3', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fdd38a87ba8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0519 08:37:24.747343 140593909376896 estimator.py:201] Using config: {'_model_dir': 'gs://bert_szukiyu/OUTPUT_DIR_NAME3', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fdd38a87ba8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Next we create an input builder function that takes our training feature set (`train_features`) and produces a generator. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators).

In [0]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)

Now we train our model! For me, using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

In [86]:
print(f'Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0519 08:37:33.564462 140593909376896 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0519 08:37:36.597781 140593909376896 saver.py:1483] Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


I0519 08:37:45.563351 140593909376896 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0519 08:37:45.567019 140593909376896 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0519 08:37:53.170431 140593909376896 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0519 08:38:05.085232 140593909376896 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0519 08:38:05.389353 140593909376896 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt.


I0519 08:38:18.102727 140593909376896 basic_session_run_hooks.py:594] Saving checkpoints for 0 into gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt.


INFO:tensorflow:loss = 0.6878727, step = 0


I0519 08:39:33.209519 140593909376896 basic_session_run_hooks.py:249] loss = 0.6878727, step = 0


INFO:tensorflow:global_step/sec: 0.984316


I0519 08:41:14.802105 140593909376896 basic_session_run_hooks.py:680] global_step/sec: 0.984316


INFO:tensorflow:loss = 0.23841092, step = 100 (101.600 sec)


I0519 08:41:14.809239 140593909376896 basic_session_run_hooks.py:247] loss = 0.23841092, step = 100 (101.600 sec)


INFO:tensorflow:Saving checkpoints for 131 into gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt.


I0519 08:41:41.649446 140593909376896 basic_session_run_hooks.py:594] Saving checkpoints for 131 into gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt.


INFO:tensorflow:Loss for final step: 0.012240674.


I0519 08:42:46.456567 140593909376896 estimator.py:359] Loss for final step: 0.012240674.


Training took time  0:05:14.195205


Now let's use our test data to see how well our model did:

In [0]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [88]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

INFO:tensorflow:Calling model_fn.


I0519 08:43:12.591289 140593909376896 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0519 08:43:16.275843 140593909376896 saver.py:1483] Saver not created because there are no variables in the graph to restore
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


INFO:tensorflow:Done calling model_fn.


I0519 08:43:25.400309 140593909376896 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-05-19T08:43:25Z


I0519 08:43:25.426205 140593909376896 evaluation.py:257] Starting evaluation at 2019-05-19T08:43:25Z


INFO:tensorflow:Graph was finalized.


I0519 08:43:26.854354 140593909376896 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Restoring parameters from gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


I0519 08:43:27.117405 140593909376896 saver.py:1270] Restoring parameters from gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


INFO:tensorflow:Running local_init_op.


I0519 08:45:07.232523 140593909376896 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0519 08:45:07.553804 140593909376896 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2019-05-19-08:45:11


I0519 08:45:11.004026 140593909376896 evaluation.py:277] Finished evaluation at 2019-05-19-08:45:11


INFO:tensorflow:Saving dict for global step 131: auc = 0.8857569, eval_accuracy = 0.8874172, f1_score = 0.89570546, false_negatives = 5.0, false_positives = 12.0, global_step = 131, loss = 0.31906718, precision = 0.85882354, recall = 0.9358974, true_negatives = 61.0, true_positives = 73.0


I0519 08:45:11.006292 140593909376896 estimator.py:1979] Saving dict for global step 131: auc = 0.8857569, eval_accuracy = 0.8874172, f1_score = 0.89570546, false_negatives = 5.0, false_positives = 12.0, global_step = 131, loss = 0.31906718, precision = 0.85882354, recall = 0.9358974, true_negatives = 61.0, true_positives = 73.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 131: gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


I0519 08:45:21.540326 140593909376896 estimator.py:2039] Saving 'checkpoint_path' summary for global step 131: gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


{'auc': 0.8857569,
 'eval_accuracy': 0.8874172,
 'f1_score': 0.89570546,
 'false_negatives': 5.0,
 'false_positives': 12.0,
 'global_step': 131,
 'loss': 0.31906718,
 'precision': 0.85882354,
 'recall': 0.9358974,
 'true_negatives': 61.0,
 'true_positives': 73.0}

Now let's write code to make predictions on new sentences:

In [0]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences] # here, "" is just a dummy label
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [0]:
pred_sentences = [
  "今日は良い一日だ",
  "株価は上がる",
  "今日は天気が悪いな",
  "初めて彼女ができました"
]

In [94]:
predictions = getPrediction(pred_sentences)

INFO:tensorflow:Writing example 0 of 4


I0519 08:49:40.039969 140593909376896 run_classifier.py:774] Writing example 0 of 4


INFO:tensorflow:*** Example ***


I0519 08:49:40.043618 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: 


I0519 08:49:40.045492 140593909376896 run_classifier.py:462] guid: 


INFO:tensorflow:tokens: [CLS] 今 日 は 良 い 一 日 だ [SEP]


I0519 08:49:40.050465 140593909376896 run_classifier.py:464] tokens: [CLS] 今 日 は 良 い 一 日 だ [SEP]


INFO:tensorflow:input_ids: 101 2187 4348 1947 6667 1906 2072 4348 1932 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.053718 140593909376896 run_classifier.py:465] input_ids: 101 2187 4348 1947 6667 1906 2072 4348 1932 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.056386 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.059374 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:49:40.063015 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:49:40.069602 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: 


I0519 08:49:40.072299 140593909376896 run_classifier.py:462] guid: 


INFO:tensorflow:tokens: [CLS] 株 価 は 上 がる [SEP]


I0519 08:49:40.073594 140593909376896 run_classifier.py:464] tokens: [CLS] 株 価 は 上 がる [SEP]


INFO:tensorflow:input_ids: 101 4575 2289 1947 2078 73901 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.074884 140593909376896 run_classifier.py:465] input_ids: 101 4575 2289 1947 2078 73901 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.076582 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.078108 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:49:40.079416 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:49:40.080803 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: 


I0519 08:49:40.082157 140593909376896 run_classifier.py:462] guid: 


INFO:tensorflow:tokens: [CLS] 今 日 は 天 気 が 悪 い ##な [SEP]


I0519 08:49:40.083431 140593909376896 run_classifier.py:464] tokens: [CLS] 今 日 は 天 気 が 悪 い ##な [SEP]


INFO:tensorflow:input_ids: 101 2187 4348 1947 3198 4854 1912 3871 1906 22946 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.085182 140593909376896 run_classifier.py:465] input_ids: 101 2187 4348 1947 3198 4854 1912 3871 1906 22946 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.086597 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.087975 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:49:40.089223 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:*** Example ***


I0519 08:49:40.090927 140593909376896 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: 


I0519 08:49:40.092356 140593909376896 run_classifier.py:462] guid: 


INFO:tensorflow:tokens: [CLS] 初 めて 彼 女 が ##で ##き ##ま ##した [SEP]


I0519 08:49:40.094287 140593909376896 run_classifier.py:464] tokens: [CLS] 初 めて 彼 女 が ##で ##き ##ま ##した [SEP]


INFO:tensorflow:input_ids: 101 2547 19945 3760 3235 1912 12236 16838 27058 16923 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.095736 140593909376896 run_classifier.py:465] input_ids: 101 2547 19945 3760 3235 1912 12236 16838 27058 16923 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.097176 140593909376896 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0519 08:49:40.098812 140593909376896 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)


I0519 08:49:40.099939 140593909376896 run_classifier.py:468] label: 0 (id = 0)


INFO:tensorflow:Calling model_fn.


I0519 08:49:41.142229 140593909376896 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0519 08:49:44.635598 140593909376896 saver.py:1483] Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


I0519 08:49:44.849048 140593909376896 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Graph was finalized.


I0519 08:49:45.842766 140593909376896 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Restoring parameters from gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


I0519 08:49:46.117480 140593909376896 saver.py:1270] Restoring parameters from gs://bert_szukiyu/OUTPUT_DIR_NAME3/model.ckpt-131


INFO:tensorflow:Running local_init_op.


I0519 08:50:41.207892 140593909376896 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0519 08:50:41.302258 140593909376896 session_manager.py:493] Done running local_init_op.


Voila! We have a sentiment classifier!

In [95]:
predictions

[('今日は良い一日だ', array([-0.06588927, -2.752544  ], dtype=float32), 'Negative'),
 ('株価は上がる', array([-3.9052618 , -0.02034113], dtype=float32), 'Positive'),
 ('今日は天気が悪いな', array([-4.1553664 , -0.01580427], dtype=float32), 'Positive'),
 ('初めて彼女ができました', array([-4.8279076 , -0.00803548], dtype=float32), 'Positive')]