# Using TensorFlow with H2O

This notebook shows how to use the TensorFlow backend to tackle a simple image classification problem.

We start by connecting to our H2O cluster:

In [12]:
import h2o
h2o.init(port=54321, nthreads=-1)

Checking whether there is an H2O instance running at http://localhost:54321. connected.


H2O cluster uptime:,40 secs
H2O cluster version:,3.11.0.99999
H2O cluster version age:,1 day
H2O cluster name:,fmilo
H2O cluster total nodes:,1
H2O cluster free memory:,12.48 Gb
H2O cluster total cores:,40
H2O cluster allowed cores:,40
H2O cluster status:,"accepting new members, healthy"
H2O connection url:,http://localhost:54321


Then we make sure that the H2O cluster has the DeepWater distribution:

In [13]:
from h2o.estimators.deepwater import H2ODeepWaterEstimator
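# Stop here if the cluster was not built with Deep Water support.
if not H2ODeepWaterEstimator.available():
    raise RuntimeError("H2O Deep Water is not available on this cluster")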

Load some Python utility libraries:

In [14]:
import sys, os
import os.path
import pandas as pd
import numpy as np
import random

Finally, we configure the IPython notebook to render plots and images inline:

In [15]:
%matplotlib inline
from IPython.display import Image, display, HTML
import matplotlib.pyplot as plt

## Configuration

Set the path to your H2O installation and download the 'bigdata' dataset by running `./gradlew syncBigdataLaptop` from the H2O source distribution.

In [18]:
H2O_PATH=os.path.expanduser("~/h2o-3/")
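
Because the dataset only exists after the sync step has run, it can help to check for the file before handing its path to H2O. The following is a minimal sketch; `dataset_file` is a hypothetical helper introduced here for illustration and is not part of the H2O API.

```python
import os.path


def dataset_file(h2o_path, relative_path):
    """Join the H2O checkout path and a dataset path, failing early if the file is missing.

    Hypothetical helper for this notebook, not an H2O function.
    """
    full_path = os.path.join(h2o_path, relative_path)
    if not os.path.isfile(full_path):
        raise IOError(
            "Dataset not found at %s; run ./gradlew syncBigdataLaptop first" % full_path
        )
    return full_path
```

With such a helper, the CSV used later in this notebook could be located with `dataset_file(H2O_PATH, "bigdata/laptop/deepwater/imagenet/cat_dog_mouse.csv")`.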

## Image Classification Task

H2O DeepWater allows you to specify a list of URIs (file paths) or URLs (links) to images, together with a response column that is either a class membership (enum) or a regression target (numeric).
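
To make the expected input concrete, here is a minimal sketch that builds such a two-column file (image path, class label) from a directory tree. The layout `root_dir/<label>/<image>` and the helper name `build_image_index` are assumptions made for illustration; the file is written without a header row, which matches the default `C1`, `C2` column names seen when the CSV is imported.

```python
import csv
import os

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def build_image_index(root_dir, csv_path):
    """Write one `path,label` row per image found under root_dir/<label>/."""
    rows = []
    for label in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, label)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        for name in sorted(os.listdir(class_dir)):
            if name.lower().endswith(IMAGE_EXTENSIONS):
                rows.append((os.path.join(class_dir, name), label))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows
```

The resulting CSV can then be loaded with `h2o.import_file`, using the first column as the predictor and the second as the response.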

For this example, we use a small dataset of 267 images in three classes: cat, dog, and mouse.

In [19]:
frame = h2o.import_file(os.path.join(H2O_PATH, "bigdata/laptop/deepwater/imagenet/cat_dog_mouse.csv"))
print(frame.dim)
print(frame.head(5))

Parse progress: |█████████████████████████████████████████████████████████| 100%
[267, 2]


C1,C2
bigdata/laptop/deepwater/imagenet/cat/102194502_49f003abd9.jpg,cat
bigdata/laptop/deepwater/imagenet/cat/11146807_00a5f35255.jpg,cat
bigdata/laptop/deepwater/imagenet/cat/1140846215_70e326f868.jpg,cat
bigdata/laptop/deepwater/imagenet/cat/114170569_6cbdf4bbdb.jpg,cat
bigdata/laptop/deepwater/imagenet/cat/1217664848_de4c7fc296.jpg,cat





To build a LeNet image classification model in H2O, specify `network="lenet"` and `backend="tensorflow"` to use the TensorFlow LeNet implementation:

In [20]:
model = H2ODeepWaterEstimator(epochs=500, network="lenet", backend="tensorflow")
model.train(x=[0], y=1, training_frame=frame)
model.show()

deepwater Model Build progress: |█████████████████████████████████████████| 100%
Model Details
H2ODeepWaterEstimator :  Deep Water
Model Key:  DeepWater_model_python_1479433139451_1
Status of Deep Learning Model: lenet, 4.9 MB, predicting C2, 3-class classification, 134,144 training samples, mini-batch size 32



,input_neurons,rate,momentum
,2352,0.0044086,0.99




ModelMetricsMultinomial: deepwater
** Reported on train data. **

MSE: 0.243990757963
RMSE: 0.493954206342
LogLoss: 0.877531867391
Mean Per-Class Error: 0.315439992422
Confusion Matrix: vertical: actual; across: predicted



cat,dog,mouse,Error,Rate
74.0,3.0,13.0,0.1777778,16 / 90
29.0,40.0,16.0,0.5294118,45 / 85
20.0,2.0,70.0,0.2391304,22 / 92
123.0,45.0,99.0,0.3108614,83 / 267


Top-3 Hit Ratios: 


k,hit_ratio
1,0.6891386
2,0.8876405
3,1.0


Scoring History: 


,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_logloss,training_classification_error
,2016-11-17 17:40:51,0.000 sec,,0.0,0,0.0,,,
,2016-11-17 17:40:54,4.655 sec,419 obs/sec,3.8352060,1,1024.0,0.8095751,7.6691595,0.6554307
,2016-11-17 17:40:59,9.717 sec,3967 obs/sec,111.2209738,29,29696.0,0.8095645,7.0473993,0.6554307
,2016-11-17 17:41:04,14.750 sec,4671 obs/sec,218.6067416,57,58368.0,0.6207586,1.0507603,0.5093633
,2016-11-17 17:41:09,19.802 sec,4965 obs/sec,325.9925094,85,87040.0,0.4939542,0.8775319,0.3108614
,2016-11-17 17:41:14,24.965 sec,5194 obs/sec,441.0486891,115,117760.0,0.8256194,12.4554604,0.6816479
,2016-11-17 17:41:17,27.778 sec,5267 obs/sec,502.4119850,131,134144.0,0.8256193,12.4109633,0.6816479
,2016-11-17 17:41:17,27.979 sec,5252 obs/sec,502.4119850,131,134144.0,0.4939542,0.8775319,0.3108614
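
The summary metrics in the output above follow directly from the confusion matrix. As a quick sanity check, this sketch recomputes the mean per-class error, the overall error rate, and the top-1 hit ratio from the counts printed for the LeNet model. The counts are copied from the output; the variable names are chosen here for illustration.

```python
# Confusion matrix rows copied from the LeNet output (rows: actual, columns: predicted).
classes = ["cat", "dog", "mouse"]
confusion = [
    [74, 3, 13],   # actual cat
    [29, 40, 16],  # actual dog
    [20, 2, 70],   # actual mouse
]

per_class_error = []
correct = 0
total = 0
for i, row in enumerate(confusion):
    row_total = sum(row)
    per_class_error.append(1.0 - row[i] / float(row_total))
    correct += row[i]
    total += row_total

mean_per_class_error = sum(per_class_error) / len(per_class_error)
overall_error = 1.0 - correct / float(total)
top1_hit_ratio = correct / float(total)

for name, err in zip(classes, per_class_error):
    print("%-5s error: %.7f" % (name, err))
print("mean per-class error: %.7f" % mean_per_class_error)
print("overall error:        %.7f" % overall_error)
print("top-1 hit ratio:      %.7f" % top1_hit_ratio)
```

Note that the mean per-class error averages the three row error rates with equal weight, while the overall error weights each class by its number of images, which is why the two values differ slightly.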


If you'd like to build your own TensorFlow network architecture, you can do that as well.
In this example, we define a single-layer linear classifier and again use the **TensorFlow** backend.
Models can be exchanged between H2O and TensorFlow because H2O uses TensorFlow's format for the model definition.

In [21]:
def simple_model(w, h, channels, classes):
    import json
    import tensorflow as tf    
    # always create a new graph inside ipython or
    # the default one will be used and can lead to
    # unexpected behavior
    graph = tf.Graph() 
    with graph.as_default():
        size = w * h * channels
        x = tf.placeholder(tf.float32, [None, size])
        W = tf.Variable(tf.zeros([size, classes]))
        b = tf.Variable(tf.zeros([classes]))
        y = tf.matmul(x, W) + b

        # labels
        y_ = tf.placeholder(tf.float32, [None, classes])
     
        # accuracy
        correct_prediction = tf.equal(tf.argmax(y, 1),
                                      tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        
        # train
        # pass logits and labels by keyword so the argument order cannot be mixed up
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
        train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
        
        tf.add_to_collection("train", train_step)
        # this is required by the h2o tensorflow backend
        global_step = tf.Variable(0, name="global_step", trainable=False)
        
        init = tf.initialize_all_variables()
        tf.add_to_collection("init", init)
        tf.add_to_collection("logits", y)
        saver = tf.train.Saver()
        meta = json.dumps({
                "inputs": {"batch_image_input": x.name, "categorical_labels": y_.name}, 
                "outputs": {"categorical_logits": y.name}, 
                "metrics": {"accuracy": accuracy.name, "total_loss": cross_entropy.name},
                "parameters": {"global_step": global_step.name},
        })
        print(meta)
        tf.add_to_collection("meta", meta)
        filename = "/tmp/lenet_tensorflow.meta"
        tf.train.export_meta_graph(filename, saver_def=saver.as_saver_def())
    return filename

In [22]:
filename = simple_model(28, 28, 3, classes=3)

{"metrics": {"total_loss": "Mean_1:0", "accuracy": "Mean:0"}, "inputs": {"categorical_labels": "Placeholder_1:0", "batch_image_input": "Placeholder:0"}, "parameters": {"global_step": "global_step:0"}, "outputs": {"categorical_logits": "add:0"}}
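
The printed metadata is plain JSON that maps each role (inputs, outputs, metrics, parameters) to a TensorFlow tensor name of the form `<op_name>:<output_index>`. As a small sketch, the string can be parsed and inspected with the standard library. The JSON below is copied from the output above; `split_tensor_name` is a hypothetical helper, and the list of sections is simply what appears in this output rather than a documented requirement of the backend.

```python
import json

# Metadata string copied from the output of simple_model above.
meta_json = (
    '{"metrics": {"total_loss": "Mean_1:0", "accuracy": "Mean:0"}, '
    '"inputs": {"categorical_labels": "Placeholder_1:0", "batch_image_input": "Placeholder:0"}, '
    '"parameters": {"global_step": "global_step:0"}, '
    '"outputs": {"categorical_logits": "add:0"}}'
)
meta = json.loads(meta_json)


def split_tensor_name(name):
    """Split a tensor name such as 'add:0' into (op_name, output_index)."""
    op_name, _, index = name.rpartition(":")
    return op_name, int(index)


for section in ("inputs", "outputs", "metrics", "parameters"):
    for role, tensor_name in sorted(meta[section].items()):
        op_name, index = split_tensor_name(tensor_name)
        print("%-10s %-20s op=%s output=%d" % (section, role, op_name, index))
```

Names such as `Placeholder:0` and `add:0` are TensorFlow's automatically generated defaults; passing an explicit `name=` argument when creating the placeholders would make the metadata easier to read.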


In [23]:
model = H2ODeepWaterEstimator(epochs=500,
                              network_definition_file=filename,  # specify the model
                              image_shape=[28, 28],  # provide expected (or matching) image size
                              channels=3,
                              backend="tensorflow")
model.train(x=[0], y=1, training_frame=frame)
model.show()

deepwater Model Build progress: |█████████████████████████████████████████| 100%
Model Details
H2ODeepWaterEstimator :  Deep Water
Model Key:  DeepWater_model_python_1479433139451_2
Status of Deep Learning Model: user, 41.7 KB, predicting C2, 3-class classification, 134,144 training samples, mini-batch size 32



,input_neurons,rate,momentum
,2352,0.0044086,0.99




ModelMetricsMultinomial: deepwater
** Reported on train data. **

MSE: 6.60075876885e+12
RMSE: 2569194.18668
LogLoss: -14.4921790248
Mean Per-Class Error: 0.0
Confusion Matrix: vertical: actual; across: predicted



cat,dog,mouse,Error,Rate
90.0,0.0,0.0,0.0,0 / 90
0.0,85.0,0.0,0.0,0 / 85
0.0,0.0,92.0,0.0,0 / 92
90.0,85.0,92.0,0.0,0 / 267


Top-3 Hit Ratios: 


k,hit_ratio
1,1.0
2,1.0
3,1.0


Scoring History: 


,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_logloss,training_classification_error
,2016-11-17 17:44:13,0.000 sec,,0.0,0,0.0,,,
,2016-11-17 17:44:15,1.936 sec,575 obs/sec,3.8352060,1,1024.0,1327446.7778600,,0.5655431
,2016-11-17 17:44:20,7.009 sec,7934 obs/sec,203.2659176,53,54272.0,2569194.1866800,-14.4921790,0.0
,2016-11-17 17:44:25,12.155 sec,8733 obs/sec,391.1910112,102,104448.0,2569194.1866800,-14.4921790,0.0
,2016-11-17 17:44:30,16.732 sec,8118 obs/sec,502.4119850,131,134144.0,2569194.1866800,-14.4921790,0.0
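
When reading the scoring history, it helps to know how the `epochs`, `iterations`, and `samples` columns relate. In the rows printed above, `epochs` equals `samples` divided by the 267 training rows, and every iteration accounts for 1024 samples. This is an observation drawn from these particular runs, not a documented guarantee; the sketch below checks it against values copied from the table.

```python
TRAINING_ROWS = 267  # number of rows in the cat/dog/mouse frame

# (iterations, samples, reported epochs), copied from the scoring history above.
history = [
    (1, 1024.0, 3.8352060),
    (53, 54272.0, 203.2659176),
    (102, 104448.0, 391.1910112),
    (131, 134144.0, 502.4119850),
]

for iterations, samples, reported_epochs in history:
    computed_epochs = samples / TRAINING_ROWS
    samples_per_iteration = samples / iterations
    print("iterations=%3d  samples/iteration=%.0f  epochs=%.7f (reported %.7f)"
          % (iterations, samples_per_iteration, computed_epochs, reported_epochs))
```

This also explains why training stops at about 502.4 epochs rather than exactly 500: training ends at the first iteration boundary past the requested number of epochs.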
