# Happiness Source Prediction

I am going to use BERT (Bidirectional Encoder Representations from Transformers) for this task. It is a neural network architecture designed by Google researchers that has substantially advanced the state of the art on NLP tasks such as text classification, translation, summarization, and question answering. Here I fine-tune it to classify each happy moment by its source of happiness.

In [1]:
# imports 

import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime
from sklearn.preprocessing import LabelEncoder

W0419 22:01:47.086790 140089183954816 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14


Installing BERT's Python package.

In [2]:
!pip install bert-tensorflow

Collecting bert-tensorflow
  Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)
    100% |████████████████████████████████| 71kB 3.4MB/s 
Installing collected packages: bert-tensorflow
Successfully installed bert-tensorflow-1.0.1


In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Loading the data into Google Colab.

In [4]:
from google.colab import files
uploaded = files.upload()

Saving hm_train.csv to hm_train.csv


In [0]:
import io
train_data = pd.read_csv(io.BytesIO(uploaded['hm_train.csv']))

In [6]:
from google.colab import files
uploaded = files.upload()

Saving hm_test.csv to hm_test.csv


In [0]:
test_data =  pd.read_csv(io.BytesIO(uploaded['hm_test.csv']))

In [0]:
lb_predicted_category = LabelEncoder()
train_data["predicted_category_code"] = lb_predicted_category.fit_transform(train_data["predicted_category"])

In [9]:
train_data.head()

| Unnamed: 0 | hmid | reflection_period | cleaned_hm | num_sentence | predicted_category | predicted_category_code |
|---|---|---|---|---|---|---|
| 0 | 27673 | 24h | I went on a successful date with someone I fel... | 1 | affection | 1 |
| 1 | 27674 | 24h | I was happy when my son got 90% marks in his e... | 1 | affection | 1 |
| 2 | 27675 | 24h | I went to the gym this morning and did yoga. | 1 | exercise | 4 |
| 3 | 27676 | 24h | We had a serious talk with some friends of our... | 2 | bonding | 2 |
| 4 | 27677 | 24h | I went with grandchildren to butterfly display... | 1 | affection | 1 |


In [10]:
train_data.predicted_category_code.unique()

array([1, 4, 2, 5, 0, 3, 6])

In [0]:
train_data['predicted_category_code'] = train_data['predicted_category_code'].astype(int)

Our input text is in the 'cleaned_hm' column, and our label is the 'predicted_category_code' column.

In [0]:
DATA_COLUMN = 'cleaned_hm'
LABEL_COLUMN = 'predicted_category_code'

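# label_list is the list of integer labels produced by LabelEncoder, which
# numbers the categories in alphabetical order: 0 = achievement, 1 = affection,
# 2 = bonding, 3 = enjoy_the_moment, 4 = exercise, 5 = leisure, 6 = nature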
label_list = [0, 1, 2, 3, 4, 5, 6]
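
To double-check which integer stands for which category, the mapping can be reproduced by hand. The sketch below is plain Python that mimics how scikit-learn's LabelEncoder numbers its classes (in sorted order); the names `code_of` and `name_of` are illustrative and not part of this notebook. In the notebook itself, `lb_predicted_category.classes_` should hold the same sorted list.

```python
# The seven category names that appear in the 'predicted_category' column.
categories = [
    "affection", "exercise", "bonding", "leisure",
    "achievement", "enjoy_the_moment", "nature",
]

# LabelEncoder sorts the unique class names and numbers them from 0.
classes = sorted(set(categories))

# Lookup tables in both directions (hypothetical helper names).
code_of = {name: code for code, name in enumerate(classes)}
name_of = {code: name for name, code in code_of.items()}

for code in sorted(name_of):
    print(code, name_of[code])
```

The result agrees with the rows shown by train_data.head(): affection is 1, bonding is 2, and exercise is 4. A table like `name_of` is also handy for turning predicted codes back into category names.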

### Data Preprocessing

We'll need to transform our data into a format BERT understands. This involves two steps. First, we create InputExample objects using the constructor provided in the BERT library.

- text_a is the text we want to classify, which in this case is the 'cleaned_hm' column of our DataFrame.
- text_b is used if we're training a model to understand the relationship between sentences (e.g. is text_b a translation of text_a? Is text_b an answer to the question asked by text_a?). This doesn't apply to our task, so we can leave text_b blank.
- label is the label for our example, here the integer code of the category (e.g. 5 for leisure, 6 for nature).

In [0]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train_data.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this task
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

Next, we need to preprocess our data so that it matches the data BERT was trained on. This takes several steps, all of which are handled by the Python library:

1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (i.e. "I love nature" -> ['i', 'love', 'nature'])
3. Break words into WordPieces (i.e. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "[CLS]" and "[SEP]" tokens
6. Append "mask" and "segment" ids to each input
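
To make these steps concrete, here is a toy sketch in plain Python. It is not the BERT library's code: `TOY_VOCAB`, `wordpiece`, and `toy_features` are hypothetical names, the vocabulary is made up, and punctuation splitting is left out. It only illustrates lowercasing, greedy longest-match WordPiece splitting, the special tokens, id lookup, and padding with a mask.

```python
# Toy vocabulary: hypothetical, NOT BERT's real vocab file.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
             "i", "love", "nature", "call", "##ing"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(TOY_VOCAB)}


def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into WordPieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of a word
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched: whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def toy_features(text, max_seq_length=8):
    """Return (tokens, input_ids, input_mask, segment_ids) for one text."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, TOKEN_TO_ID))
    # Leave room for the closing [SEP] token.
    tokens = tokens[: max_seq_length - 1] + ["[SEP]"]
    input_ids = [TOKEN_TO_ID[t] for t in tokens]
    input_mask = [1] * len(input_ids)          # 1 = real token
    padding = [0] * (max_seq_length - len(input_ids))
    segment_ids = [0] * max_seq_length         # single sentence: all zeros
    return tokens, input_ids + padding, input_mask + padding, segment_ids


print(toy_features("I love calling nature"))
```

The padded `input_ids`, the 0/1 `input_mask`, and the all-zero `segment_ids` have the same layout as the features printed in the conversion log further down, only with a much shorter sequence length.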

To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [14]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
    """Get the vocab file and casing info from the Hub module."""
    with tf.Graph().as_default():
        bert_module = hub.Module(BERT_MODEL_HUB)
        tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
        with tf.Session() as sess:
            vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])

    return bert.tokenization.FullTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

Instructions for updating:
Colocations handled automatically by placer.


W0419 22:03:01.336816 140089183954816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:3632: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0419 22:03:03.505524 140089183954816 saver.py:1483] Saver not created because there are no variables in the graph to restore


In [15]:
tokenizer.tokenize("Let's see if the tokenizer is working or not")

['let',
 "'",
 's',
 'see',
 'if',
 'the',
 'token',
 '##izer',
 'is',
 'working',
 'or',
 'not']

Using our tokenizer, we'll call run_classifier.convert_examples_to_features on our InputExamples to convert them into features BERT understands.

In [16]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128

# Convert our training examples to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

INFO:tensorflow:Writing example 0 of 60321


I0419 22:03:06.210250 140089183954816 run_classifier.py:774] Writing example 0 of 60321


INFO:tensorflow:*** Example ***


I0419 22:03:06.214629 140089183954816 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0419 22:03:06.217499 140089183954816 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] i went on a successful date with someone i felt sympathy and connection with . [SEP]


I0419 22:03:06.219586 140089183954816 run_classifier.py:464] tokens: [CLS] i went on a successful date with someone i felt sympathy and connection with . [SEP]


INFO:tensorflow:input_ids: 101 1045 2253 2006 1037 3144 3058 2007 2619 1045 2371 11883 1998 4434 2007 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.221567 140089183954816 run_classifier.py:465] input_ids: 101 1045 2253 2006 1037 3144 3058 2007 2619 1045 2371 11883 1998 4434 2007 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.223455 140089183954816 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.225509 140089183954816 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0419 22:03:06.227090 140089183954816 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0419 22:03:06.229145 140089183954816 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0419 22:03:06.230811 140089183954816 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] i was happy when my son got 90 % marks in his examination [SEP]


I0419 22:03:06.232396 140089183954816 run_classifier.py:464] tokens: [CLS] i was happy when my son got 90 % marks in his examination [SEP]


INFO:tensorflow:input_ids: 101 1045 2001 3407 2043 2026 2365 2288 3938 1003 6017 1999 2010 7749 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.234189 140089183954816 run_classifier.py:465] input_ids: 101 1045 2001 3407 2043 2026 2365 2288 3938 1003 6017 1999 2010 7749 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.236294 140089183954816 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.237812 140089183954816 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0419 22:03:06.239482 140089183954816 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:*** Example ***


I0419 22:03:06.241514 140089183954816 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0419 22:03:06.243638 140089183954816 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] i went to the gym this morning and did yoga . [SEP]


I0419 22:03:06.245256 140089183954816 run_classifier.py:464] tokens: [CLS] i went to the gym this morning and did yoga . [SEP]


INFO:tensorflow:input_ids: 101 1045 2253 2000 1996 9726 2023 2851 1998 2106 13272 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.246793 140089183954816 run_classifier.py:465] input_ids: 101 1045 2253 2000 1996 9726 2023 2851 1998 2106 13272 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.248518 140089183954816 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.250577 140089183954816 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 4 (id = 4)


I0419 22:03:06.252462 140089183954816 run_classifier.py:468] label: 4 (id = 4)


INFO:tensorflow:*** Example ***


I0419 22:03:06.254774 140089183954816 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0419 22:03:06.256754 140089183954816 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] we had a serious talk with some friends of ours who have been fl ##ak ##y lately . they understood and we had a good evening hanging out . [SEP]


I0419 22:03:06.258399 140089183954816 run_classifier.py:464] tokens: [CLS] we had a serious talk with some friends of ours who have been fl ##ak ##y lately . they understood and we had a good evening hanging out . [SEP]


INFO:tensorflow:input_ids: 101 2057 2018 1037 3809 2831 2007 2070 2814 1997 14635 2040 2031 2042 13109 4817 2100 9906 1012 2027 5319 1998 2057 2018 1037 2204 3944 5689 2041 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.260077 140089183954816 run_classifier.py:465] input_ids: 101 2057 2018 1037 3809 2831 2007 2070 2814 1997 14635 2040 2031 2042 13109 4817 2100 9906 1012 2027 5319 1998 2057 2018 1037 2204 3944 5689 2041 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.261682 140089183954816 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.263325 140089183954816 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 2 (id = 2)


I0419 22:03:06.264862 140089183954816 run_classifier.py:468] label: 2 (id = 2)


INFO:tensorflow:*** Example ***


I0419 22:03:06.267252 140089183954816 run_classifier.py:461] *** Example ***


INFO:tensorflow:guid: None


I0419 22:03:06.268779 140089183954816 run_classifier.py:462] guid: None


INFO:tensorflow:tokens: [CLS] i went with grandchildren to butterfly display at cr ##oh ##n conservatory [SEP]


I0419 22:03:06.270433 140089183954816 run_classifier.py:464] tokens: [CLS] i went with grandchildren to butterfly display at cr ##oh ##n conservatory [SEP]


INFO:tensorflow:input_ids: 101 1045 2253 2007 13628 2000 9112 4653 2012 13675 11631 2078 11879 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.272073 140089183954816 run_classifier.py:465] input_ids: 101 1045 2253 2007 13628 2000 9112 4653 2012 13675 11631 2078 11879 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.273609 140089183954816 run_classifier.py:466] input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


I0419 22:03:06.275197 140089183954816 run_classifier.py:467] segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 1 (id = 1)


I0419 22:03:06.276764 140089183954816 run_classifier.py:468] label: 1 (id = 1)


INFO:tensorflow:Writing example 10000 of 60321


I0419 22:03:10.950135 140089183954816 run_classifier.py:774] Writing example 10000 of 60321


INFO:tensorflow:Writing example 20000 of 60321


I0419 22:03:15.436576 140089183954816 run_classifier.py:774] Writing example 20000 of 60321


INFO:tensorflow:Writing example 30000 of 60321


I0419 22:03:19.943905 140089183954816 run_classifier.py:774] Writing example 30000 of 60321


INFO:tensorflow:Writing example 40000 of 60321


I0419 22:03:24.387527 140089183954816 run_classifier.py:774] Writing example 40000 of 60321


INFO:tensorflow:Writing example 50000 of 60321


I0419 22:03:29.060139 140089183954816 run_classifier.py:774] Writing example 50000 of 60321


INFO:tensorflow:Writing example 60000 of 60321


I0419 22:03:33.036381 140089183954816 run_classifier.py:774] Writing example 60000 of 60321


### Creating a model

1. Load the BERT tf hub module again (this time to extract the computation graph).
2. Create a single new layer that will be trained to adapt BERT to our classification task (i.e. identifying the source of happiness).



In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels, num_labels):
    """Creates a classification model."""

    bert_module = hub.Module(BERT_MODEL_HUB, trainable=True)
    bert_inputs = dict(input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids)
    bert_outputs = bert_module(inputs=bert_inputs, signature="tokens",as_dict=True)

    # Use "pooled_output" for classification tasks on an entire sentence.
    # Use "sequence_outputs" for token-level output.
    output_layer = bert_outputs["pooled_output"]
    hidden_size = output_layer.shape[-1].value

    # Create our own layer to tune for happiness data.
    output_weights = tf.get_variable( "output_weights", [num_labels, hidden_size],
                                     initializer=tf.truncated_normal_initializer(stddev=0.02))

    output_bias = tf.get_variable("output_bias", [num_labels], initializer=tf.zeros_initializer())

    with tf.variable_scope("loss"):

        # Dropout helps prevent overfitting
        output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

        logits = tf.matmul(output_layer, output_weights, transpose_b=True)
        logits = tf.nn.bias_add(logits, output_bias)
        log_probs = tf.nn.log_softmax(logits, axis=-1)

        # Convert labels into one-hot encoding
        one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

        predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
        # If we're predicting, we want predicted labels and the log probabilities.
        if is_predicting:
            return (predicted_labels, log_probs)

        # If we're train/eval, compute loss between predicted and actual label
        per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
        loss = tf.reduce_mean(per_example_loss)
        return (loss, predicted_labels, log_probs)
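
The loss at the end of create_model is the usual softmax cross-entropy. As a rough illustration, the plain-Python sketch below repeats the same arithmetic for a single example with made-up logits; the helper names are hypothetical and no TensorFlow is involved.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def one_hot(label, depth):
    """One-hot vector with a 1.0 at position `label`."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def per_example_loss(logits, label):
    """Negative log-probability assigned to the true label."""
    log_probs = log_softmax(logits)
    targets = one_hot(label, len(logits))
    return -sum(t * lp for t, lp in zip(targets, log_probs))


# An untrained head that scores all 7 classes equally.
uniform_logits = [0.0] * 7
print(per_example_loss(uniform_logits, label=3))  # equals ln(7)

# The predicted label is the index of the largest log-probability.
logits = [0.1, 2.0, -1.0, 0.3, 0.0, 0.5, -0.2]
log_probs = log_softmax(logits)
print(log_probs.index(max(log_probs)))
```

A classifier that guesses uniformly over seven classes has a loss of ln 7, about 1.95, which is in the same range as the step-0 loss of 2.0785 reported in the training log.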

Next, we wrap our model function in a model_fn_builder function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps, num_warmup_steps):
    """Returns `model_fn` closure for tf.estimator.Estimator."""
    def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
        """The `model_fn` for tf.estimator.Estimator."""

        input_ids = features["input_ids"]
        input_mask = features["input_mask"]
        segment_ids = features["segment_ids"]
        label_ids = features["label_ids"]

        is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)

        # TRAIN and EVAL
        if not is_predicting:

            (loss, predicted_labels, log_probs) = create_model(is_predicting, input_ids, 
                                                               input_mask, segment_ids, label_ids, num_labels)

            train_op = bert.optimization.create_optimizer(loss, learning_rate, num_train_steps, 
                                                          num_warmup_steps, use_tpu=False)

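            # Calculate evaluation metrics.
            # Note: apart from accuracy, these tf.metrics functions are designed for
            # binary labels, so with seven classes their values need to be read with caution.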
            def metric_fn(label_ids, predicted_labels):
                
                accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
                f1_score = tf.contrib.metrics.f1_score(label_ids, predicted_labels)
                auc = tf.metrics.auc(label_ids, predicted_labels)
                recall = tf.metrics.recall(label_ids, predicted_labels)
                precision = tf.metrics.precision(label_ids, predicted_labels) 
                
                true_pos = tf.metrics.true_positives(label_ids, predicted_labels)
                true_neg = tf.metrics.true_negatives(label_ids, predicted_labels)   
                false_pos = tf.metrics.false_positives(label_ids, predicted_labels)  
                false_neg = tf.metrics.false_negatives(label_ids, predicted_labels)
                
                return {"eval_accuracy": accuracy, "f1_score": f1_score, "auc": auc, 
                        "precision": precision, "recall": recall, "true_positives": true_pos, 
                        "true_negatives": true_neg, "false_positives": false_pos, 
                        "false_negatives": false_neg}

            eval_metrics = metric_fn(label_ids, predicted_labels)

            if mode == tf.estimator.ModeKeys.TRAIN:
                return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
            else:
                return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics)
        
        else:
            (predicted_labels, log_probs) = create_model(is_predicting, input_ids, input_mask,
                                                         segment_ids, label_ids, num_labels)

            # Note: 'probabilities' holds log-probabilities (the log_softmax output).
            predictions = {'probabilities': log_probs, 'labels': predicted_labels}
            
            return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    # Return the actual model function in the closure
    return model_fn
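
As far as I know, the precision, recall, and F1 metrics used in metric_fn treat labels as binary, so for seven classes a per-class, macro-averaged score computed from the predicted labels may be more informative. The sketch below is one way to do that in plain Python; `macro_f1` is a hypothetical helper and is not used elsewhere in this notebook.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Average the one-vs-rest F1 score over all classes."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)  # class never predicted or never present
    return sum(f1_scores) / num_classes


# Made-up labels for illustration only.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 1]
print(macro_f1(y_true, y_pred, num_classes=3))
```

With the real model, `y_true` would be the 'predicted_category_code' values of a held-out set and `y_pred` the labels returned by the estimator, with `num_classes=len(label_list)`.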

In [0]:
# Training hyperparameters
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases; this usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 5000
SAVE_SUMMARY_STEPS = 100

In [0]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
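
For the 60321 training examples reported in the conversion log, the step counts work out as below. The sketch also shows a learning-rate schedule with linear warmup followed by linear decay to zero, which is my understanding of what the bert-tensorflow optimizer does; treat `learning_rate_at` as an illustrative assumption rather than the library's code.

```python
# Values from this notebook; 60321 is the example count shown in the log.
NUM_EXAMPLES = 60321
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
WARMUP_PROPORTION = 0.1

num_train_steps = int(NUM_EXAMPLES / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
print(num_train_steps, num_warmup_steps)


def learning_rate_at(step):
    """Assumed schedule: linear warmup, then linear decay to zero."""
    if step < num_warmup_steps:
        return LEARNING_RATE * step / num_warmup_steps
    clipped = min(step, num_train_steps)
    return LEARNING_RATE * (1.0 - clipped / num_train_steps)


for step in (0, num_warmup_steps // 2, num_warmup_steps, num_train_steps):
    print(step, learning_rate_at(step))
```

This gives 5655 training steps, of which the first 565 are warmup. At the roughly 1.17 steps per second seen in the training log, the full run would take on the order of 80 minutes.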

In [0]:
run_config = tf.estimator.RunConfig(
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [22]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})



W0419 22:03:33.262340 140089183954816 estimator.py:1760] Using temporary folder as model directory: /tmp/tmp40mtjlkj


INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp40mtjlkj', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f68a07e4b00>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


I0419 22:03:33.266431 140089183954816 estimator.py:201] Using config: {'_model_dir': '/tmp/tmp40mtjlkj', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f68a07e4b00>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [0]:
# Create an input function for training. drop_remainder should be True when using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)

### Training the model

In [24]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

Beginning Training!
INFO:tensorflow:Calling model_fn.


I0419 22:04:05.031569 140089183954816 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0419 22:04:09.187640 140089183954816 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


W0419 22:04:09.339473 140089183954816 deprecation.py:506] From <ipython-input-17-20cd867c9bb4>:22: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0419 22:04:09.404798 140089183954816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/learning_rate_decay_v2.py:321: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Instructions for updating:
Use tf.cast instead.


W0419 22:04:09.504775 140089183954816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


Instructions for updating:
Use tf.cast instead.


W0419 22:04:18.655939 140089183954816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:455: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp40mtjlkj/model.ckpt.
INFO:tensorflow:loss = 2.0785513, step = 0
INFO:tensorflow:global_step/sec: 1.03418
INFO:tensorflow:loss = 0.4943433, step = 100 (96.696 sec)
INFO:tensorflow:global_step/sec: 1.17101
INFO:tensorflow:loss = 0.43459374, step = 200 (85.396 sec)
INFO:tensorflow:global_step/sec: 1.17364
INFO:tensorflow:loss = 0.55894077, step = 300 (85.205 sec)
INFO:tensorflow:global_step/sec: 1.17366
INFO:tensorflow:loss = 0.6865324, step = 400 (85.203 sec)
INFO:tensorflow:global_step/sec: 1.17498
INFO:tensorflow:loss = 0.48819864, step = 500 (85.108 sec)
INFO:tensorflow:global_step/sec: 1.17623
INFO:tensorflow:loss = 0.13297513, step = 600 (85.022 sec)
INFO:tensorflow:global_step/sec: 1.1758
INFO:tensorflow:loss = 0.3712653, step = 700 (85.044 sec)
INFO:tensorflow:global_step/sec: 1.17697
INFO:tensorflow:loss = 0.32117924, step = 800 (84.971 sec)
INFO:tensorflow:global_step/sec: 1.17778
INFO:tensorflow:loss = 0.38167685, step = 900 (84.902 sec)
INFO:tensorflow:global_step/sec: 1.17438
INFO:tensorflow:loss = 0.38030726, step = 1000 (85.153 sec)
INFO:tensorflow:global_step/sec: 1.17168
INFO:tensorflow:loss = 0.19110858, step = 1100 (85.347 sec)
INFO:tensorflow:global_step/sec: 1.17277
INFO:tensorflow:loss = 0.289229, step = 1200 (85.264 sec)
INFO:tensorflow:global_step/sec: 1.17303
INFO:tensorflow:loss = 0.0908238, step = 1300 (85.251 sec)
INFO:tensorflow:global_step/sec: 1.17409
INFO:tensorflow:loss = 0.46103156, step = 1400 (85.175 sec)
INFO:tensorflow:global_step/sec: 1.17198
INFO:tensorflow:loss = 0.2972251, step = 1500 (85.321 sec)
INFO:tensorflow:global_step/sec: 1.17826
INFO:tensorflow:loss = 0.48737085, step = 1600 (84.875 sec)
INFO:tensorflow:global_step/sec: 1.17529
INFO:tensorflow:loss = 0.26944524, step = 1700 (85.086 sec)
INFO:tensorflow:global_step/sec: 1.1753
INFO:tensorflow:loss = 0.19136311, step = 1800 (85.083 sec)
INFO:tensorflow:Saving checkpoints for 1885 into /tmp/tmp40mtjlkj/model.ckpt.
INFO:tensorflow:Loss for final step: 0.27776176.


Training took time  0:28:57.709335
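The per-step loss values in the training log are noisy, so it can help to pull them out and look at the trend. The following is a minimal sketch using only the standard library; `parse_losses` and `log_text` are hypothetical names, and the sample lines are copied from the log above.

```python
import re

# A few lines copied from the training log above (not the full log).
log_text = """\
INFO:tensorflow:loss = 2.0785513, step = 0
INFO:tensorflow:global_step/sec: 1.03418
INFO:tensorflow:loss = 0.4943433, step = 100 (96.696 sec)
INFO:tensorflow:loss = 0.43459374, step = 200 (85.396 sec)
INFO:tensorflow:Loss for final step: 0.27776176.
"""

# Match only the "loss = X, step = N" lines emitted by the logging hook.
LOSS_RE = re.compile(r"^INFO:tensorflow:loss = ([0-9.]+), step = (\d+)", re.MULTILINE)


def parse_losses(text):
    """Return a list of (step, loss) pairs found in the log text."""
    return [(int(step), float(loss)) for loss, step in LOSS_RE.findall(text)]


history = parse_losses(log_text)
for step, loss in history:
    print(f"step {step:>5}: loss {loss:.4f}")
```

Each logged loss is for a single mini-batch, so individual values can jump around even when the overall trend is downward.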


Let's write code to make predictions on the test data.

In [0]:
def getPrediction(in_sentences):
  # Category names in LabelEncoder order (alphabetical), so index i is the name for code i.
  labels = ['achievement','affection','bonding','enjoy_the_moment','exercise','leisure','nature']

  # guid="" is a placeholder and label=0 is a dummy value, since the test data has no labels.
  input_examples = [run_classifier.InputExample(guid="", text_a=x, text_b=None, label=0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  # To inspect scores as well, return (sentence, prediction['probabilities'], label) tuples instead.
  return [labels[prediction['labels']] for prediction in predictions]
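The hard-coded `labels` list only works because it matches the order in which `LabelEncoder` numbered the categories: it sorts the unique class names and assigns codes 0, 1, 2, and so on. A minimal plain-Python sketch of that mapping follows; `train_categories` is a made-up stand-in for `train_data["predicted_category"]`, and `decode` is a hypothetical helper that is not part of the notebook.

```python
# Made-up stand-in for train_data["predicted_category"].
train_categories = [
    "affection", "affection", "achievement", "bonding", "leisure",
    "enjoy_the_moment", "exercise", "nature", "achievement",
]

# Sorting the unique names reproduces the code order LabelEncoder uses.
labels = sorted(set(train_categories))
code_to_label = dict(enumerate(labels))


def decode(codes):
    """Translate integer class codes back to category names."""
    return [code_to_label[code] for code in codes]


print(code_to_label)
print(decode([1, 0, 6]))
```

In the notebook itself, `lb_predicted_category.inverse_transform(codes)` should give the same names without a hand-written list.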

In [26]:
predictions = getPrediction(test_data.cleaned_hm)

INFO:tensorflow:Writing example 0 of 40213
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] i spent the weekend in chicago with my friends . [SEP]


INFO:tensorflow:input_ids: 101 1045 2985 1996 5353 1999 3190 2007 2026 2814 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] we moved back into our house after a re ##mo ##del . we had lived in a hotel for 9 months due to our home being severely damaged in a tornado . [SEP]


INFO:tensorflow:input_ids: 101 2057 2333 2067 2046 2256 2160 2044 1037 2128 5302 9247 1012 2057 2018 2973 1999 1037 3309 2005 1023 2706 2349 2000 2256 2188 2108 8949 5591 1999 1037 11352 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] my fiance proposed to me in front of my family in the beginning of march . [SEP]


INFO:tensorflow:input_ids: 101 2026 19154 3818 2000 2033 1999 2392 1997 2026 2155 1999 1996 2927 1997 2233 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] i ate lobster at a fancy restaurant with some friends . [SEP]


INFO:tensorflow:input_ids: 101 1045 8823 27940 2012 1037 11281 4825 2007 2070 2814 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] i went out to a nice restaurant on a date with my wife . it was a very popular restaurant and we could not get a reservation . but i have a friend who owns a famous hamburger place next door to this restaurant . he was able to get us a table ! we had a great table , great service , great food , and they even com ##ped most of our dinner , so we paid almost nothing ! [SEP]


INFO:tensorflow:input_ids: 101 1045 2253 2041 2000 1037 3835 4825 2006 1037 3058 2007 2026 2564 1012 2009 2001 1037 2200 2759 4825 1998 2057 2071 2025 2131 1037 11079 1012 2021 1045 2031 1037 2767 2040 8617 1037 3297 24575 2173 2279 2341 2000 2023 4825 1012 2002 2001 2583 2000 2131 2149 1037 2795 999 2057 2018 1037 2307 2795 1010 2307 2326 1010 2307 2833 1010 1998 2027 2130 4012 5669 2087 1997 2256 4596 1010 2061 2057 3825 2471 2498 999 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


INFO:tensorflow:label: 0 (id = 0)
INFO:tensorflow:Writing example 10000 of 40213
INFO:tensorflow:Writing example 20000 of 40213
INFO:tensorflow:Writing example 30000 of 40213
INFO:tensorflow:Writing example 40000 of 40213
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
W0419 22:33:13.240444 140089183954816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /tmp/tmp40mtjlkj/model.ckpt-1885
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
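The example records logged for this cell show how each sentence is turned into fixed-length features: the WordPiece ids are followed by zeros up to the maximum sequence length, `input_mask` marks the real tokens with 1, and `segment_ids` is all zeros because there is no second sentence. A minimal sketch of that padding step follows. `pad_features` is a hypothetical helper, not the library's `convert_examples_to_features`, and the maximum length is assumed here to be 128.

```python
def pad_features(token_ids, max_seq_length=128):
    """Pad already-tokenized ids (including [CLS] and [SEP]) to a fixed length.

    Sketch only: assumes the sequence already fits within max_seq_length.
    """
    if len(token_ids) > max_seq_length:
        raise ValueError("sequence is longer than max_seq_length")

    padding = [0] * (max_seq_length - len(token_ids))
    return {
        "input_ids": list(token_ids) + padding,          # real ids, then zeros
        "input_mask": [1] * len(token_ids) + padding,    # 1 = real token, 0 = padding
        "segment_ids": [0] * max_seq_length,             # single sentence: all zeros
    }


# Ids of the first logged example: "[CLS] i spent the weekend in chicago with my friends . [SEP]"
first_example_ids = [101, 1045, 2985, 1996, 5353, 1999, 3190, 2007, 2026, 2814, 1012, 102]
features = pad_features(first_example_ids)
print(len(features["input_ids"]), sum(features["input_mask"]))
```

The mask lets the model ignore the padded positions, so short and long sentences can share one batch shape.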


### Creating the submission file

In [0]:
subm = {'hmid': test_data['hmid'],
        'predicted_category': predictions}

In [0]:
submission = pd.DataFrame(subm)

In [0]:
submission.to_csv('submission_happiness_v6.csv', index=False)

In [0]:
from google.colab import files
files.download("submission_happiness_v6.csv")
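Before uploading a submission, a quick sanity check of the file can catch a wrong header, a missing row, or an unexpected category name. The sketch below uses only the standard library; `validate_submission` is a hypothetical helper, and the sample rows (including the `hmid` values) are invented for illustration.

```python
import csv
import io

VALID_CATEGORIES = {
    "achievement", "affection", "bonding", "enjoy_the_moment",
    "exercise", "leisure", "nature",
}


def validate_submission(csv_text, expected_rows):
    """Return a list of problems found in the submission text (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    if reader.fieldnames != ["hmid", "predicted_category"]:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    rows = list(reader)
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    for row in rows:
        if row["predicted_category"] not in VALID_CATEGORIES:
            problems.append(f"hmid {row['hmid']}: unknown category {row['predicted_category']!r}")
    return problems


# Invented sample in the same two-column layout as the submission file.
sample = "hmid,predicted_category\n1,bonding\n2,affection\n3,happy\n"
print(validate_submission(sample, expected_rows=3))
```

For the real file, the text could be read from `submission_happiness_v6.csv` and `expected_rows` set to `len(test_data)`.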