#### Baseline Approach

I choose to use a DNN Regressor for my first model. Neural networks with multiple layers have been shown to be very flexible in fitting responses that depend nonlinearly on many predictors. I assume that dengue case counts can be mapped from environmental features by some unknown nonlinear function, so a DNN Regressor should be able to capture information about this mapping. To limit overfitting, I will stick to about 3 hidden layers, though I will experiment with more, testing model accuracy on a validation set. For this first pass, all real-valued features are used.

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
import time
import datetime
import math

import warnings
warnings.filterwarnings("ignore")

from sklearn import metrics
from sklearn import model_selection

tf.logging.set_verbosity(tf.logging.INFO)

In [19]:
train_features = pd.read_csv('dengue_features_train.csv')
train_labels = pd.read_csv('dengue_labels_train.csv')
test_features = pd.read_csv('dengue_features_test.csv')

In [20]:
# splitting by city

train_features_sj = train_features[:936]
train_features_iq = train_features[936:]

train_labels_sj = train_labels[:936]
train_labels_iq = train_labels[936:]

test_features_sj = test_features[:260]
test_features_iq = test_features[260:]
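The hard-coded offsets above (936 and 260) only work while the CSV files keep their current row order. A minimal sketch of splitting on the `city` column instead is shown here. It assumes the column holds the labels `'sj'` and `'iq'` (the values are not shown in this notebook), and `split_by_city` is a hypothetical helper name. The toy frame stands in for the real file.

```python
import pandas as pd

def split_by_city(df, city_col="city"):
    """Return one independent DataFrame per city label (hypothetical helper)."""
    # .copy() gives each piece its own data, so adding columns later
    # does not trigger pandas' SettingWithCopyWarning.
    return {city: group.copy() for city, group in df.groupby(city_col)}

# Toy frame standing in for dengue_features_train.csv (values are made up).
toy = pd.DataFrame({
    "city": ["sj", "sj", "iq"],
    "week_start_date": ["1990-04-30", "1990-05-07", "2000-07-01"],
})
parts = split_by_city(toy)
toy_sj, toy_iq = parts["sj"], parts["iq"]
```

Because each piece is a copy, a later assignment such as `toy_sj['datetime'] = ...` modifies only that piece.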

In [21]:
# introducing a datetime feature that contains the Unix timestamp (in seconds, divided by 1000)

def get_timestamp(features):
    dt = []
    for date in features['week_start_date']:
        dt.append(time.mktime(datetime.datetime.strptime(date, '%Y-%m-%d').timetuple()) / 1000)
    return dt

sj_times = get_timestamp(train_features_sj)
iq_times = get_timestamp(train_features_iq)
sj_times_test = get_timestamp(test_features_sj)
iq_times_test = get_timestamp(test_features_iq)
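`time.mktime` interprets the parsed date in the local timezone of the machine, so the exact values above can shift by a few hours between machines. A possible vectorized variant that does not depend on the local timezone is sketched below. `get_timestamp_utc` is a hypothetical name, and treating each date as midnight UTC is an assumption.

```python
import pandas as pd

def get_timestamp_utc(features, date_col="week_start_date"):
    """Seconds since 1970-01-01 (dates read as midnight UTC), divided by 1000."""
    dates = pd.to_datetime(features[date_col], format="%Y-%m-%d")
    # Dividing a timedelta by one second yields plain float seconds.
    seconds = (dates - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)
    return (seconds / 1000).tolist()

# Toy input with the same column name as the real data.
toy = pd.DataFrame({"week_start_date": ["1970-01-02", "1990-04-30"]})
toy_times = get_timestamp_utc(toy)
```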

In [22]:
train_features_sj['datetime'] = sj_times
train_features_iq['datetime'] = iq_times
test_features_sj['datetime'] = sj_times_test
test_features_iq['datetime'] = iq_times_test

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In [23]:
# dropping the city feature and the other features previously used to describe time, now that I have a timestamp

train_features_sj = train_features_sj.drop(train_features_sj.columns[[0,1,2,3]], axis=1)
train_features_iq = train_features_iq.drop(train_features_iq.columns[[0,1,2,3]], axis=1)

test_features_sj = test_features_sj.drop(test_features_sj.columns[[0,1,2,3]], axis=1)
test_features_iq = test_features_iq.drop(test_features_iq.columns[[0,1,2,3]], axis=1)

In [24]:
# filling in missing data

train_features_sj.fillna(method='bfill', inplace=True)
train_features_iq.fillna(method='bfill', inplace=True)

test_features_sj.fillna(method='bfill', inplace=True)
test_features_iq.fillna(method='bfill', inplace=True)
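Back-filling alone leaves `NaN` in any trailing rows that have no later observation to copy from. A small sketch of one way to close that gap is to back-fill first and then forward-fill the remainder. `fill_gaps` and the `precip_mm` column are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd

def fill_gaps(df):
    """Back-fill first, then forward-fill whatever is still missing."""
    return df.bfill().ffill()

# The last two rows have no later value, so bfill alone would leave them NaN.
toy = pd.DataFrame({"precip_mm": [np.nan, 2.0, np.nan, np.nan]})
filled = fill_gaps(toy)
```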

In [25]:
train_labels_sj = train_labels_sj.total_cases
train_labels_iq = train_labels_iq.total_cases

In [26]:
# splitting data into training set and validation set

sj_x_train, sj_x_test, sj_y_train, sj_y_test = model_selection.train_test_split(train_features_sj, train_labels_sj,
                                                                               test_size=0.2, random_state=41)

iq_x_train, iq_x_test, iq_y_train, iq_y_test = model_selection.train_test_split(train_features_iq, train_labels_iq,
                                                                               test_size=0.2, random_state=41)
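The split above shuffles rows, so validation weeks are interleaved with training weeks. Since the rows are ordered in time, a chronological hold-out may be worth comparing against. The sketch below shows the idea on toy lists using `shuffle=False`, which keeps the last 20% of rows as the validation set. Whether this gives a better estimate for this problem is an assumption that would need checking.

```python
from sklearn import model_selection

# Toy stand-ins for time-ordered features and labels.
rows = list(range(10))
labels = [r * 2 for r in rows]

# shuffle=False keeps the original order: the first 80% train, the last 20% validate.
x_train, x_val, y_train, y_val = model_selection.train_test_split(
    rows, labels, test_size=0.2, shuffle=False)
```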

In [69]:
feature_columns_city = tf.contrib.learn.infer_real_valued_columns_from_input(train_features_sj)

In [70]:
dnnreg_sj = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])
dnnreg_iq = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])
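`tf.contrib` was removed in TensorFlow 2.x, so the cell above only runs on a 1.x installation. As a rough stand-in (not the same implementation, and not the model used in this notebook), scikit-learn's `MLPRegressor` can be given the same hidden-layer sizes. The data below is synthetic and only demonstrates the call pattern.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)
X_toy = rng.uniform(-1, 1, size=(200, 4))        # synthetic features, not dengue data
y_toy = np.sin(3 * X_toy[:, 0]) + X_toy[:, 1] ** 2

# Three hidden layers of 10, 20 and 20 ReLU units, mirroring hidden_units=[10, 20, 20].
mlp = MLPRegressor(hidden_layer_sizes=(10, 20, 20), activation="relu",
                   max_iter=500, random_state=0)
mlp.fit(X_toy, y_toy)
toy_pred = mlp.predict(X_toy)
```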

In [73]:
dnnreg_sj.fit(sj_x_train, sj_y_train, steps=500)

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Res

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x00000256F196B940>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=21, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x00000256E9442EA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [74]:
dnnreg_iq.fit(iq_x_train, iq_y_train, steps=600)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Sav

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x00000256F196BB70>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=21, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x00000256E9442EA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [75]:
predictions_sj = list(dnnreg_sj.predict(sj_x_test, as_iterable=True))
predictions_iq = list(dnnreg_iq.predict(iq_x_test, as_iterable=True))

Instructions for updating:
Please switch to predict_scores, or set `outputs` argument.
INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmpz8xri1wp\model.ckpt-1000
INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmpo3u3qlb7\model.ckpt-600


In [36]:
# turning predictions into integers, and removing negative values

def to_integer(predictions):
    predicts = []
    for prediction in predictions:
        prediction = int(prediction)
        if prediction < 0:
            prediction = 0
        predicts.append(prediction)
    return predicts

# the validation scores below need these integer predictions
pred_sj_int = to_integer(predictions_sj)
pred_iq_int = to_integer(predictions_iq)
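The same truncate-and-clip step can be written without an explicit loop. This is a sketch of an equivalent NumPy version. `to_integer_np` is a hypothetical name, and it assumes the predictions can be flattened into a one-dimensional float array.

```python
import numpy as np

def to_integer_np(predictions):
    """Clip negatives to 0, then truncate toward zero to get integer counts."""
    arr = np.asarray(predictions, dtype=float).ravel()
    return np.clip(arr, 0, None).astype(int).tolist()

toy_preds = to_integer_np([3.7, -2.1, 0.2, 12.0])
```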

In [77]:
# evaluating model on validation set

score_sj = metrics.mean_absolute_error(sj_y_test, pred_sj_int)
score_iq = metrics.mean_absolute_error(iq_y_test, pred_iq_int)

print('Loss sj: {0:f}'.format(score_sj))

Loss sj: 25.686170


In [78]:
print('Loss iq: {0:f}'.format(score_iq))

Loss iq: 10.480769


#### Re-initializing Models to Train Over Entire Training Data

In [28]:
dnnreg_sj = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])
dnnreg_iq = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x00000256EFC0CEB8>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': None}

In [29]:
# train_labels_sj already holds the total_cases column (see above)
dnnreg_sj.fit(train_features_sj, train_labels_sj, steps=500)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Sav

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x00000256EF629518>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=21, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x00000256E9442EA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [30]:
# train_labels_iq already holds the total_cases column (see above)
dnnreg_iq.fit(train_features_iq, train_labels_iq, steps=500)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Sav

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x00000256EF69F860>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=21, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x00000256E9442EA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [32]:
submissions_sj = list(dnnreg_sj.predict(test_features_sj, as_iterable=True))
submissions_iq = list(dnnreg_iq.predict(test_features_iq, as_iterable=True))

sub_sj_int = to_integer(submissions_sj)
sub_iq_int = to_integer(submissions_iq)

INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmpbc6upcog\model.ckpt-500
INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmp5bdhkzf8\model.ckpt-500


#### Idea for Improving from the Baseline 1

The baseline above got a best score of 28.0601.

The current network uses every feature originally in the data, except that the year, week of year, and week start date are replaced by a single datetime. Selecting a meaningful subset of the features could greatly help in separating the signal from the noise. In particular, avoiding pairs of variables that are strongly correlated with each other would reduce the number of degrees of freedom of the model, and with it the model's flexibility, lowering variance at the potential cost of more bias.
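One way to act on the correlated-variables idea is sketched below: compute the absolute pairwise correlations and drop the later column of every pair above a cutoff. `drop_correlated` is a hypothetical helper, and the 0.9 threshold is an arbitrary assumption, not a value tuned on the dengue data.

```python
import numpy as np
import pandas as pd

def drop_correlated(df, threshold=0.9):
    """Drop each column whose |correlation| with an earlier column exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so every pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)

# Toy data: b is an exact multiple of a, c is only weakly related to both.
toy = pd.DataFrame({"a": [1, 2, 3, 4, 5],
                    "b": [2, 4, 6, 8, 10],
                    "c": [5, 1, 4, 2, 3]})
reduced = drop_correlated(toy)
```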

#### Idea for Improving from the Baseline 2

Since dengue cases may in fact depend weakly on many predictors, boosting algorithms may be appropriate. sklearn.ensemble provides a GradientBoostingRegressor class, which may outperform a DNNRegressor in this context.
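A minimal sketch of how `GradientBoostingRegressor` could be tried is given below. It runs on synthetic data, not the dengue features, and the hyperparameters are untuned assumptions. It only illustrates fitting, predicting, and comparing the validation MAE against a constant-mean prediction.

```python
import numpy as np
from sklearn import metrics
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
X_toy = rng.uniform(0, 1, size=(300, 5))          # synthetic features
y_toy = 10 * X_toy[:, 0] + 5 * np.sin(6 * X_toy[:, 1]) + rng.normal(0, 0.5, 300)

# First 240 rows for training, last 60 for validation.
gbr = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
gbr.fit(X_toy[:240], y_toy[:240])

val_mae = metrics.mean_absolute_error(y_toy[240:], gbr.predict(X_toy[240:]))
# Reference point: always predicting the training mean.
baseline_mae = metrics.mean_absolute_error(
    y_toy[240:], np.full(60, y_toy[:240].mean()))
```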

#### Idea for Improving from the Baseline 3

Since dengue fever is a mosquito-borne illness, engineering a new feature that captures the prevalence of mosquitoes in a given week may prove to be powerful. For instance, if we know that mosquitoes are not active when temperatures fall outside a certain range, the new feature can depend on this. If we furthermore know how mosquito density correlates with things like humidity, the new feature can be made to capture all of this information while adding only a single degree of freedom to the model.
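As a sketch of what such a combined feature might look like, the function below returns the specific humidity when the average temperature lies inside an activity range and 0 otherwise. The range limits (295 K to 305 K) and the name `mosquito_index` are purely hypothetical placeholders, not entomological facts. Only the two column names come from the data set.

```python
import pandas as pd

def mosquito_index(df, temp_lo=295.0, temp_hi=305.0):
    """Humidity inside the assumed activity temperature range, 0 outside it."""
    temp = df["reanalysis_avg_temp_k"]
    humidity = df["reanalysis_specific_humidity_g_per_kg"]
    in_range = temp.between(temp_lo, temp_hi)   # inclusive on both ends
    return humidity.where(in_range, 0.0)

# Toy rows: too cold, inside the range, too hot.
toy = pd.DataFrame({
    "reanalysis_avg_temp_k": [290.0, 300.0, 310.0],
    "reanalysis_specific_humidity_g_per_kg": [10.0, 15.0, 20.0],
})
toy_index = mosquito_index(toy)
```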

### DNNRegressor on a smaller subset of data, and a new engineered feature

In [27]:
# selecting a subset of the features
X_sj = train_features_sj[['reanalysis_avg_temp_k', 
                          'reanalysis_dew_point_temp_k', 'reanalysis_min_air_temp_k', 
                          'reanalysis_specific_humidity_g_per_kg']]
y_sj = train_labels_sj
X_iq = train_features_iq[['reanalysis_avg_temp_k', 
                          'reanalysis_dew_point_temp_k', 'reanalysis_min_air_temp_k', 
                          'reanalysis_specific_humidity_g_per_kg']]
y_iq = train_labels_iq

# test X values
X_sj_test = test_features_sj[['reanalysis_avg_temp_k', 
                              'reanalysis_dew_point_temp_k', 'reanalysis_min_air_temp_k', 
                              'reanalysis_specific_humidity_g_per_kg']]

X_iq_test = test_features_iq[['reanalysis_avg_temp_k', 
                              'reanalysis_dew_point_temp_k', 'reanalysis_min_air_temp_k', 
                              'reanalysis_specific_humidity_g_per_kg']]

In [28]:
# new feature, 0 if cold (<300 Kelvin), 1 if warm

def is_warm(features):
    warm = []
    for observation in features['reanalysis_avg_temp_k']:
        if observation < 300:
            warm.append(0)
        else:
            warm.append(1)
    return warm

warmth_sj = is_warm(X_sj)
warmth_iq = is_warm(X_iq)
warmth_sj_test = is_warm(X_sj_test)
warmth_iq_test = is_warm(X_iq_test)

X_sj['warmth'] = warmth_sj
X_iq['warmth'] = warmth_iq
X_sj_test['warmth'] = warmth_sj_test
X_iq_test['warmth'] = warmth_iq_test

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
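The warning above appears because the new column is assigned to a slice of another DataFrame. A sketch of a vectorized variant that returns a new frame instead is shown below. `add_warmth` is a hypothetical name, and the comparison is written as "not below the threshold" so that it labels rows the same way as the loop in `is_warm`.

```python
import pandas as pd

def add_warmth(features, threshold_k=300):
    """Return a copy with a 'warmth' column: 0 if below threshold_k, else 1."""
    col = features["reanalysis_avg_temp_k"]
    warm = (~(col < threshold_k)).astype(int)
    # assign() builds a new DataFrame and leaves the input untouched.
    return features.assign(warmth=warm)

toy = pd.DataFrame({"reanalysis_avg_temp_k": [298.5, 300.0, 301.2]})
toy_warm = add_warmth(toy)
```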


In [29]:
feature_columns_city = tf.contrib.learn.infer_real_valued_columns_from_input(X_sj)



In [30]:
dnnreg_sj = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])
dnnreg_iq = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns_city, hidden_units=[10, 20, 20])

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001D44B7D9AC8>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': None}

##### The train/validation split would be done by rerunning the code above; it is omitted here.

In [31]:
dnnreg_sj.fit(X_sj, y_sj, steps=500)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Sav

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x000001D44B7D9978>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=5, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x000001D4494BAEA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [32]:
dnnreg_iq.fit(X_iq, y_iq, steps=500)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Sav

DNNRegressor(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._RegressionHead object at 0x000001D44E6CFE48>, 'hidden_units': [10, 20, 20], 'feature_columns': (_RealValuedColumn(column_name='', dimension=5, default_value=None, dtype=tf.float64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x000001D4494BAEA0>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [37]:
submissions_sj = list(dnnreg_sj.predict(X_sj_test, as_iterable=True))
submissions_iq = list(dnnreg_iq.predict(X_iq_test, as_iterable=True))

sub_sj_int = to_integer(submissions_sj)
sub_iq_int = to_integer(submissions_iq)

INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmp6k7n8jnd\model.ckpt-500
INFO:tensorflow:Restoring parameters from C:\Users\8050116\AppData\Local\Temp\tmpzn4osb4q\model.ckpt-500


## This model scores 27.3582, an improvement over the baseline!

In [38]:
for sub in sub_sj_int:
    print(sub)

33
33
33
33
33
34
34
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
34
35
35
35
34
33
33
33
33
32
32
32
33
33
33
33
33
32
32
33
32
32
32
32
32
33
33
33
33
33
33
34
34
34
34
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
34
35
34
33
33
33
33
33
33
33
33
33
33
34
33
33
34
34
33
33
33
34
35
34
34
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
34
33
33
33
32
33
33
33
32
33
33
33
33
33
33
33
32
32
32
32
33
33
33
33
33
33
33
33
34
34
34
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
34
34
33
33
33
33
33
33
33
32
32
33
33
32
33
33
33
32
33
33
33
33
33
33
33
34
34
33
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
34
33
33
33
33
33
32
33
33
33
33
33
33
33
32
32
33
33
33
33
33


In [39]:
for sub in sub_iq_int:
    print(sub)

7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
