In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
from __future__ import division, print_function, absolute_import

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
import seaborn as sns

# Using the Model to Make Predictions 

First, let's locate the latest exported model. The model file itself is called `saved_model.pb`.

In [None]:
import os

# strip() removes a trailing newline that would otherwise end up in the path
with open('temp_dir.txt') as file:
    temp_dir = file.read().strip()
model_dir = os.path.join(temp_dir, "models/export/exporter")
versions = !ls $model_dir
print("Versions: %s" % versions)
latest_version = max(versions)
latest_model = os.path.join(model_dir, str(latest_version))
!echo $latest_model
!ls $latest_model
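
Note that `max(versions)` compares the directory names as strings, so a name like `'999'` would rank above `'1000'`. This is harmless as long as all version names have the same number of digits. The following is a minimal sketch of a pure-Python alternative that compares the names numerically, assuming the exporter names its version directories with purely numeric timestamps. The helper name `latest_version_dir` and the demo directory names are made up for illustration, and the sketch requires Python 3.

```python
import os
import tempfile


def latest_version_dir(model_dir):
    """Return the path of the numerically largest version subdirectory."""
    # Keep only subdirectories whose names are purely numeric.
    versions = [
        name for name in os.listdir(model_dir)
        if name.isdigit() and os.path.isdir(os.path.join(model_dir, name))
    ]
    if not versions:
        raise FileNotFoundError("no numeric version directories in %s" % model_dir)
    # Compare as integers so that '1000' ranks above '999'.
    return os.path.join(model_dir, max(versions, key=int))


# Demonstration on a throwaway directory with made-up version names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("999", "1000", "temp-123"):
        os.mkdir(os.path.join(demo_dir, name))
    demo_latest = os.path.basename(latest_version_dir(demo_dir))
    print(demo_latest)
```

In this notebook the call would be `latest_model = latest_version_dir(model_dir)`.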

Create a predictor from that model (kept in the variable `estimator` below):

In [None]:
estimator = tf.contrib.predictor.from_saved_model(latest_model)

Use it to predict the humidity for two sample records:

In [None]:
sample = {
    'beta1': [[1.234],[1.234]],
    'beta2': [[1.234],[1.234]],
    'weekday': [[5], [6]],
    'hour': [[16], [17]]
}

In [None]:
estimator(sample)

### Verifying prediction quality against the test set

In [None]:
!ls $temp_dir

In [None]:
test_data = pd.read_csv(os.path.join(temp_dir, "signature_test.csv"))
test_data.head()

In [None]:
test_dict = test_data.drop('humidity', axis=1).to_dict(orient='list')
test_dict = { key: np.reshape(item, [-1,1]) for key, item in test_dict.items()}
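
The reshape in the cell above puts every feature into the same layout as the hand-written sample earlier (`[[5], [6]]`): one row per record and a single column per feature. Here is a small sketch of what that looks like, using made-up values in place of the real test set.

```python
import numpy as np

# Made-up stand-in for test_data.drop('humidity', axis=1).to_dict(orient='list')
columns = {
    'beta1': [1.234, 1.234, 1.234],
    'weekday': [5, 6, 0],
}

# Reshape each flat list of n values into an (n, 1) column: one row per record.
batch = {key: np.reshape(values, [-1, 1]) for key, values in columns.items()}

for key, array in batch.items():
    print(key, array.shape)
```

Every feature ends up with shape `(3, 1)` here, that is, three records with one value each.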

In [None]:
predicted = estimator(test_dict)
len(predicted['output'])

In [None]:
test_data['predicted'] = predicted['output'].reshape(-1)

In [None]:
test_data.head()

In [None]:
%matplotlib inline
test_data[:500].plot.scatter(x='humidity', y='predicted');

Our prediction strongly correlates with the measured humidity.
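
To put a number on that visual impression, one option is the Pearson correlation coefficient between measured and predicted values. This is a minimal sketch: the helper name `pearson_r` is an assumption for illustration, and the input values below are made up purely to show the call.

```python
import numpy as np


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long 1-D sequences."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is r.
    return float(np.corrcoef(measured, predicted)[0, 1])


# Made-up numbers, only to show the call. In this notebook one would pass
# test_data['humidity'] and test_data['predicted'] instead.
print(round(pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]), 6))
```

A value close to 1 indicates a strong positive linear relationship. It says nothing about a constant offset between the two series, so it complements the error histogram rather than replacing it.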

In [None]:
test_data['diff'] = test_data['humidity'] - test_data['predicted']

In [None]:
test_data['diff'].hist(bins=100);

The error distribution is convincing as well: the histogram of the remaining error looks close to Gaussian.
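
The histogram can be backed with a few summary statistics. The sketch below computes mean, standard deviation, skewness and excess kurtosis of the residuals; for a Gaussian distribution, skewness and excess kurtosis are both 0. The helper name `residual_summary` is an assumption for illustration, and the function does not handle the degenerate case of constant residuals (zero standard deviation).

```python
import numpy as np


def residual_summary(residuals):
    """Mean, standard deviation, skewness and excess kurtosis of residuals."""
    r = np.asarray(residuals, dtype=float)
    mean = r.mean()
    std = r.std()  # population standard deviation (ddof=0)
    z = (r - mean) / std  # standardized residuals
    return {
        'mean': float(mean),
        'std': float(std),
        'skewness': float(np.mean(z ** 3)),             # 0 for a symmetric distribution
        'excess_kurtosis': float(np.mean(z ** 4) - 3),  # 0 for a Gaussian
    }
```

In this notebook the call would be `residual_summary(test_data['diff'])`. A mean near 0 indicates no systematic bias, while skewness or excess kurtosis far from 0 would argue against the Gaussian reading.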

In [None]:
from matplotlib import pyplot as plt
plt.figure(figsize=(8,4))
sns.heatmap(test_data.pivot_table(
    index='weekday', columns='hour', 
    values='predicted', aggfunc='mean'), cmap='BuPu');

The heatmap of predictions, averaged over $\beta_1$ and $\beta_2$, clearly shows that the model has picked up the anomaly that can be observed Mon-Wed between 18:00 and 21:00 and Fri-Sat between 14:00 and 16:00.