Table of Contents
===

<a href="#about">About this notebook</a> <br />
<a href="#setup">Setting things up</a>

Local Experience
1. <a href="#local_preprocessing">Local preprocessing starting from csv files</a>
1. <a href="#local_training">Local training</a>
1. <a href="#local_prediction">Local prediction</a>
1. <a href="#local_batch_prediction">Local batch prediction</a>

Cloud Experience
1. <a href="#cloud_preprocessing">Cloud preprocessing starting from csv files</a>
1. <a href="#cloud_training">Cloud training</a>
1. <a href="#cloud_online_prediction">Cloud prediction</a>
1. <a href="#cloud_batch_prediction">Cloud batch prediction</a>

<a name="about"></a>
About this notebook
======

This notebook uses the datalab structured data package to build and run TensorFlow classification models, both locally and with Google Cloud Platform services. It uses the classic <a href="https://en.wikipedia.org/wiki/Iris_flower_data_set">Iris flower data set</a>.

<a name="setup"></a>
Setting things up
=====

In [1]:
%projects set cloud-ml-dev

In [2]:
%bash
gcloud config set project cloud-ml-dev

Updated property [core/project].


In [3]:
%bash
gcloud config set compute/region us-central1

Updated property [compute/region].


In [4]:
!gcloud beta ml init-project -q

Added serviceAccount:cloud-ml-service@cml-236417448818.iam.gserviceaccount.com as an Editor to project 'cloud-ml-dev'.


In [5]:
!pip install -U tensorflow==0.12.1
#!wget https://storage.googleapis.com/cloud-datalab/deploy/tf/tensorflow-1.0.0rc1-cp27-none-linux_x86_64.whl && \
#   pip install --upgrade-strategy only-if-needed --no-cache-dir tensorflow-1.0.0rc1-cp27-none-linux_x86_64.whl && \
#   rm tensorflow-1.0.0rc1-cp27-none-linux_x86_64.whl
#!pip install --upgrade --force-reinstall /content/pydatalab/solutionbox/structured_data/dist/structured_data-0.0.1.tar.gz


['Collecting tensorflow==1.0.0',
 '  Using cached tensorflow-1.0.0-cp27-cp27mu-manylinux1_x86_64.whl',
 'Requirement already up-to-date: numpy>=1.11.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.0)',
 'Requirement already up-to-date: mock>=2.0.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.0)',
 'Requirement already up-to-date: wheel in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.0)',
 'Requirement already up-to-date: six>=1.10.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.0)',
 'Requirement already up-to-date: protobuf>=3.1.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.0)',
 'Requirement already up-to-date: funcsigs>=1; python_version < "3.3" in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.0.0)',
 'Requirement already up-to-date: pbr>=0.11 in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.0.0)',
 'Requirement already up-to-date: setuptools

In [93]:

import datalab_solutions.structured_data as sd

Let's look at the versions of structured_data and TF we have.

In [7]:
import tensorflow as tf
import os
print('tf ' + str(tf.__version__))
print('sd ' + str(sd.__version__))

tf 0.12.1
sd 0.0.1


This notebook writes files during preprocessing, training, and prediction. Set the root folders you wish to use for local and cloud output.

In [62]:
LOCAL_ROOT = '/content/Downloads/iris_notebook_workspace'
if not os.path.exists(LOCAL_ROOT):
  os.makedirs(LOCAL_ROOT)

In [63]:
CLOUD_ROOT = 'gs://cloud-ml-dev_bdt/iris_notebook_workspace'
from tensorflow.python.lib.io import file_io
if not file_io.file_exists(CLOUD_ROOT):
  file_io.recursive_create_dir(CLOUD_ROOT)

The iris dataset is small, so the data is embedded in this notebook. Write the iris data set into three files: training, eval, and prediction. Note that the target column has to be the first column in the dataset. Also, the prediction dataset does not have target values.

In [64]:
%writefile {LOCAL_ROOT}/train.csv
Iris-setosa,4,4.6,3.1,1.5,0.2
Iris-setosa,20,5.1,3.8,1.5,0.3
Iris-setosa,43,4.4,3.2,1.3,0.2
Iris-versicolor,88,6.3,2.3,4.4,1.3
Iris-versicolor,76,6.6,3,4.4,1.4
Iris-versicolor,63,6,2.2,4,1
Iris-setosa,47,5.1,3.8,1.6,0.2
Iris-virginica,146,6.7,3,5.2,2.3
Iris-versicolor,53,6.9,3.1,4.9,1.5
Iris-versicolor,71,5.9,3.2,4.8,1.8
Iris-virginica,144,6.8,3.2,5.9,2.3
Iris-virginica,124,6.3,2.7,4.9,1.8
Iris-virginica,122,5.6,2.8,4.9,2
Iris-setosa,17,5.4,3.9,1.3,0.4
Iris-setosa,7,4.6,3.4,1.4,0.3
Iris-versicolor,87,6.7,3.1,4.7,1.5
Iris-virginica,131,7.4,2.8,6.1,1.9
Iris-setosa,2,4.9,3,1.4,0.2
Iris-virginica,147,6.3,2.5,5,1.9
Iris-setosa,29,5.2,3.4,1.4,0.2
Iris-versicolor,91,5.5,2.6,4.4,1.2
Iris-virginica,110,7.2,3.6,6.1,2.5
Iris-virginica,121,6.9,3.2,5.7,2.3
Iris-setosa,45,5.1,3.8,1.9,0.4
Iris-setosa,10,4.9,3.1,1.5,0.1
Iris-setosa,36,5,3.2,1.2,0.2
Iris-virginica,112,6.4,2.7,5.3,1.9
Iris-setosa,46,4.8,3,1.4,0.3
Iris-virginica,132,7.9,3.8,6.4,2
Iris-versicolor,77,6.8,2.8,4.8,1.4
Iris-setosa,6,5.4,3.9,1.7,0.4
Iris-versicolor,90,5.5,2.5,4,1.3
Iris-virginica,137,6.3,3.4,5.6,2.4
Iris-setosa,31,4.8,3.1,1.6,0.2
Iris-virginica,120,6,2.2,5,1.5
Iris-virginica,138,6.4,3.1,5.5,1.8
Iris-setosa,24,5.1,3.3,1.7,0.5
Iris-versicolor,96,5.7,3,4.2,1.2
Iris-versicolor,68,5.8,2.7,4.1,1
Iris-virginica,150,5.9,3,5.1,1.8
Iris-setosa,26,5,3,1.6,0.2
Iris-versicolor,98,6.2,2.9,4.3,1.3
Iris-versicolor,80,5.7,2.6,3.5,1
Iris-versicolor,72,6.1,2.8,4,1.3
Iris-versicolor,75,6.4,2.9,4.3,1.3
Iris-setosa,38,4.9,3.1,1.5,0.1
Iris-setosa,35,4.9,3.1,1.5,0.1
Iris-versicolor,89,5.6,3,4.1,1.3
Iris-versicolor,84,6,2.7,5.1,1.6
Iris-versicolor,51,7,3.2,4.7,1.4
Iris-virginica,116,6.4,3.2,5.3,2.3
Iris-versicolor,54,5.5,2.3,4,1.3
Iris-virginica,130,7.2,3,5.8,1.6
Iris-virginica,115,5.8,2.8,5.1,2.4
Iris-setosa,32,5.4,3.4,1.5,0.4
Iris-virginica,104,6.3,2.9,5.6,1.8
Iris-versicolor,64,6.1,2.9,4.7,1.4
Iris-setosa,18,5.1,3.5,1.4,0.3
Iris-versicolor,66,6.7,3.1,4.4,1.4
Iris-setosa,15,5.8,4,1.2,0.2
Iris-versicolor,52,6.4,3.2,4.5,1.5
Iris-virginica,103,7.1,3,5.9,2.1
Iris-setosa,9,4.4,2.9,1.4,0.2
Iris-versicolor,83,5.8,2.7,3.9,1.2
Iris-virginica,135,6.1,2.6,5.6,1.4
Iris-virginica,139,6,3,4.8,1.8
Iris-versicolor,85,5.4,3,4.5,1.5
Iris-virginica,106,7.6,3,6.6,2.1
Iris-setosa,27,5,3.4,1.6,0.4
Iris-virginica,140,6.9,3.1,5.4,2.1
Iris-versicolor,67,5.6,3,4.5,1.5
Iris-setosa,12,4.8,3.4,1.6,0.2
Iris-versicolor,56,5.7,2.8,4.5,1.3
Iris-virginica,113,6.8,3,5.5,2.1
Iris-versicolor,62,5.9,3,4.2,1.5
Iris-virginica,145,6.7,3.3,5.7,2.5
Iris-virginica,111,6.5,3.2,5.1,2
Iris-virginica,141,6.7,3.1,5.6,2.4
Iris-setosa,34,5.5,4.2,1.4,0.2
Iris-versicolor,81,5.5,2.4,3.8,1.1
Iris-setosa,8,5,3.4,1.5,0.2
Iris-virginica,129,6.4,2.8,5.6,2.1
Iris-versicolor,57,6.3,3.3,4.7,1.6
Iris-virginica,128,6.1,3,4.9,1.8
Iris-virginica,119,7.7,2.6,6.9,2.3
Iris-virginica,126,7.2,3.2,6,1.8
Iris-versicolor,58,4.9,2.4,3.3,1
Iris-virginica,117,6.5,3,5.5,1.8
Iris-virginica,127,6.2,2.8,4.8,1.8
Iris-setosa,16,5.7,4.4,1.5,0.4
Iris-setosa,3,4.7,3.2,1.3,0.2
Iris-virginica,108,7.3,2.9,6.3,1.8
Iris-virginica,118,7.7,3.8,6.7,2.2
Iris-setosa,42,4.5,2.3,1.3,0.3
Iris-virginica,142,6.9,3.1,5.1,2.3
Iris-setosa,14,4.3,3,1.1,0.1
Iris-virginica,134,6.3,2.8,5.1,1.5
Iris-versicolor,94,5,2.3,3.3,1
Iris-setosa,19,5.7,3.8,1.7,0.3
Iris-virginica,133,6.4,2.8,5.6,2.2
Iris-virginica,114,5.7,2.5,5,2
Iris-versicolor,86,6,3.4,4.5,1.6
Iris-versicolor,93,5.8,2.6,4,1.2
Iris-versicolor,92,6.1,3,4.6,1.4
Iris-virginica,109,6.7,2.5,5.8,1.8
Iris-virginica,102,5.8,2.7,5.1,1.9
Iris-setosa,41,5,3.5,1.3,0.3
Iris-versicolor,60,5.2,2.7,3.9,1.4
Iris-virginica,105,6.5,3,5.8,2.2
Iris-versicolor,65,5.6,2.9,3.6,1.3
Iris-setosa,28,5.2,3.5,1.5,0.2
Iris-versicolor,82,5.5,2.4,3.7,1
Iris-setosa,25,4.8,3.4,1.9,0.2
Iris-versicolor,79,6,2.9,4.5,1.5
Iris-setosa,1,5.1,3.5,1.4,0.2
Iris-versicolor,61,5,2,3.5,1
Iris-virginica,149,6.2,3.4,5.4,2.3
Iris-setosa,48,4.6,3.2,1.4,0.2
Iris-setosa,22,5.1,3.7,1.5,0.4
Iris-setosa,30,4.7,3.2,1.6,0.2

Overwriting /content/Downloads/iris_notebook_workspace/train.csv


In [65]:
%writefile {LOCAL_ROOT}/eval.csv
Iris-virginica,107,4.9,2.5,4.5,1.7
Iris-versicolor,100,5.7,2.8,4.1,1.3
Iris-versicolor,99,5.1,2.5,3,1.1
Iris-setosa,13,4.8,3,1.4,0.1
Iris-versicolor,70,5.6,2.5,3.9,1.1
Iris-setosa,11,5.4,3.7,1.5,0.2
Iris-setosa,37,5.5,3.5,1.3,0.2
Iris-versicolor,69,6.2,2.2,4.5,1.5
Iris-setosa,40,5.1,3.4,1.5,0.2
Iris-virginica,101,6.3,3.3,6,2.5
Iris-setosa,39,4.4,3,1.3,0.2
Iris-versicolor,74,6.1,2.8,4.7,1.2
Iris-versicolor,97,5.7,2.9,4.2,1.3
Iris-setosa,50,5,3.3,1.4,0.2
Iris-versicolor,95,5.6,2.7,4.2,1.3
Iris-setosa,44,5,3.5,1.6,0.6
Iris-virginica,123,7.7,2.8,6.7,2
Iris-setosa,23,4.6,3.6,1,0.2
Iris-versicolor,59,6.6,2.9,4.6,1.3
Iris-virginica,148,6.5,3,5.2,2
Iris-versicolor,55,6.5,2.8,4.6,1.5
Iris-setosa,49,5.3,3.7,1.5,0.2
Iris-versicolor,78,6.7,3,5,1.7
Iris-versicolor,73,6.3,2.5,4.9,1.5
Iris-virginica,136,7.7,3,6.1,2.3
Iris-setosa,33,5.2,4.1,1.5,0.1
Iris-virginica,125,6.7,3.3,5.7,2.1
Iris-virginica,143,5.8,2.7,5.1,1.9
Iris-setosa,21,5.4,3.4,1.7,0.2
Iris-setosa,5,5,3.6,1.4,0.2

Overwriting /content/Downloads/iris_notebook_workspace/eval.csv


In [66]:
%writefile {LOCAL_ROOT}/predict.csv
107,4.9,2.5,4.5,1.7
100,5.7,2.8,4.1,1.3
99,5.1,2.5,3,1.1
13,4.8,3,1.4,0.1
70,5.6,2.5,3.9,1.1
11,5.4,3.7,1.5,0.2
37,5.5,3.5,1.3,0.2
69,6.2,2.2,4.5,1.5
40,5.1,3.4,1.5,0.2
101,6.3,3.3,6,2.5
39,4.4,3,1.3,0.2
74,6.1,2.8,4.7,1.2
97,5.7,2.9,4.2,1.3
50,5,3.3,1.4,0.2
95,5.6,2.7,4.2,1.3
44,5,3.5,1.6,0.6
123,7.7,2.8,6.7,2
23,4.6,3.6,1,0.2
59,6.6,2.9,4.6,1.3
148,6.5,3,5.2,2
55,6.5,2.8,4.6,1.5
49,5.3,3.7,1.5,0.2
78,6.7,3,5,1.7
73,6.3,2.5,4.9,1.5
136,7.7,3,6.1,2.3
33,5.2,4.1,1.5,0.1
125,6.7,3.3,5.7,2.1
143,5.8,2.7,5.1,1.9
21,5.4,3.4,1.7,0.2
5,5,3.6,1.4,0.2

Overwriting /content/Downloads/iris_notebook_workspace/predict.csv
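
The prediction rows above are the eval rows with the first (target) column removed. As a minimal sketch of that relationship, assuming Python 3 and only the standard library, the helper below (`drop_target_column` is a hypothetical name, not part of the structured data package) strips the target column from CSV text:

```python
import csv
import io


def drop_target_column(eval_csv_text):
    """Return CSV text with the first (target) column removed from every row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in csv.reader(io.StringIO(eval_csv_text)):
        if row:  # skip blank lines
            writer.writerow(row[1:])
    return out.getvalue()


# Two rows copied from eval.csv above.
sample_eval = (
    "Iris-virginica,107,4.9,2.5,4.5,1.7\n"
    "Iris-setosa,13,4.8,3,1.4,0.1\n"
)
predict_text = drop_target_column(sample_eval)
print(predict_text)
```

The remaining columns keep their order, so the key column becomes the first column of the prediction file.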


<a name="local_preprocessing"></a>
Local preprocessing starting from csv files
=====

A schema file is used to describe each column of the csv files. It is assumed that the train, eval, and prediction csv files all have the same schema, except that the prediction file is allowed to omit the target column. The schema file must be a valid BigQuery table schema file, which allows BigQuery to be used later in cloud preprocessing. Only three BigQuery types are supported: STRING (for categorical columns), and INTEGER and FLOAT (for numerical columns).

In [67]:
%writefile {LOCAL_ROOT}/schema.json
[
    {
        "mode": "NULLABLE",
        "name": "flower",
        "type": "STRING"
    },
    {
        "mode": "REQUIRED",
        "name": "key",
        "type": "INTEGER"
    },
    {
        "mode": "NULLABLE",
        "name": "sepal_length",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "sepal_width",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "petal_length",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "petal_width",
        "type": "FLOAT"
    }   
]

Overwriting /content/Downloads/iris_notebook_workspace/schema.json
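
Since only three column types are supported, it can help to check a schema before preprocessing. The following is a small sketch, assuming Python 3 and the standard library only; `check_schema` and `SUPPORTED_TYPES` are illustrative names and are not part of the structured data package:

```python
import json

# The three BigQuery types the text above says are supported.
SUPPORTED_TYPES = {'STRING', 'INTEGER', 'FLOAT'}


def check_schema(schema_text):
    """Parse a BigQuery-style schema and return the column names in order.

    Raises ValueError if a column uses a type outside SUPPORTED_TYPES.
    """
    names = []
    for col in json.loads(schema_text):
        if col['type'].upper() not in SUPPORTED_TYPES:
            raise ValueError(
                'Unsupported type %s for column %s' % (col['type'], col['name']))
        names.append(col['name'])
    return names


# A shortened version of the schema written above.
sample_schema = '''[
  {"mode": "NULLABLE", "name": "flower", "type": "STRING"},
  {"mode": "REQUIRED", "name": "key", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "sepal_length", "type": "FLOAT"}
]'''
print(check_schema(sample_schema))
```

The returned list preserves the schema order, which should match the column order of the csv files.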


In [68]:
!rm -fr {LOCAL_ROOT}/preprocess


In [69]:
sd.local_preprocess(
  input_file_pattern=os.path.join(LOCAL_ROOT, 'train.*'),
  output_dir=os.path.join(LOCAL_ROOT, 'preprocess'),
  schema_file=os.path.join(LOCAL_ROOT, 'schema.json'),
)

Starting local preprocessing.
Local preprocessing done.


The output of preprocessing is a numerical_analysis file that contains statistics for the numerical columns, and a vocab file for each categorical column. The files produced by preprocessing are consumed in training, and you should not have to worry about them. Just for fun, let's look at them.

In [70]:
!ls  {LOCAL_ROOT}/preprocess

numerical_analysis.json  schema.json  vocab_flower.csv


In [71]:
!cat {LOCAL_ROOT}/preprocess/numerical_analysis.json

{
  "sepal_width": {
    "max": 4.4,
    "mean": 3.050833333333332,
    "min": 2.0
  },
  "petal_width": {
    "max": 2.5,
    "mean": 1.2324999999999995,
    "min": 0.1
  },
  "sepal_length": {
    "max": 7.9,
    "mean": 5.867500000000002,
    "min": 4.3
  },
  "key": {
    "max": 150.0,
    "mean": 76.73333333333333,
    "min": 1.0
  },
  "petal_length": {
    "max": 6.9,
    "mean": 3.830833333333335,
    "min": 1.1
  }
}

In [72]:
!cat {LOCAL_ROOT}/preprocess/schema.json

[
    {
        "mode": "NULLABLE",
        "name": "flower",
        "type": "STRING"
    },
    {
        "mode": "REQUIRED",
        "name": "key",
        "type": "INTEGER"
    },
    {
        "mode": "NULLABLE",
        "name": "sepal_length",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "sepal_width",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "petal_length",
        "type": "FLOAT"
    },
    {
        "mode": "NULLABLE",
        "name": "petal_width",
        "type": "FLOAT"
    }   
]

In [73]:
!cat {LOCAL_ROOT}/preprocess/vocab_flower.csv

Iris-virginica
Iris-setosa
Iris-versicolor
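
To make the contents of these files concrete, here is a rough sketch of how min/max/mean statistics and a vocabulary could be computed from csv text. It assumes Python 3 and the standard library, and it is not the package's implementation; `analyze` and its parameters are hypothetical names chosen for illustration:

```python
import csv
import io


def analyze(csv_text, column_names, numeric_columns, categorical_columns):
    """Compute per-column statistics and vocabularies from headerless CSV text."""
    values = {name: [] for name in numeric_columns}
    vocab = {name: set() for name in categorical_columns}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:  # skip blank lines
            continue
        record = dict(zip(column_names, row))
        for name in numeric_columns:
            values[name].append(float(record[name]))
        for name in categorical_columns:
            vocab[name].add(record[name])
    stats = {
        name: {'max': max(v), 'mean': sum(v) / len(v), 'min': min(v)}
        for name, v in values.items()
    }
    return stats, {name: sorted(words) for name, words in vocab.items()}


# Two rows copied from train.csv above.
sample_train = (
    "Iris-setosa,4,4.6,3.1,1.5,0.2\n"
    "Iris-versicolor,88,6.3,2.3,4.4,1.3\n"
)
columns = ['flower', 'key', 'sepal_length', 'sepal_width',
           'petal_length', 'petal_width']
stats, vocab = analyze(sample_train, columns, ['key'], ['flower'])
print(stats)
print(vocab)
```

Note that the key column is analyzed like any other numerical column, which matches the "key" entry in the numerical_analysis.json output above.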

<a name="local_training"></a>
Local Training
===========

The files in the preprocessing output folder are consumed by the trainer. The structured data package automatically picks the transforms to perform on each column of data depending on the problem and the data type. Let's first run the trainer with all the defaults. When using the defaults, the key_column parameter must be set to tell the trainer which column is the key column.

In [74]:
!rm -fr {LOCAL_ROOT}/training

In [75]:
sd.local_train(
  train_file_pattern=os.path.join(LOCAL_ROOT, 'train.*'),
  eval_file_pattern=os.path.join(LOCAL_ROOT, 'eval.*'),
  preprocess_output_dir=os.path.join(LOCAL_ROOT, 'preprocess'),
  output_dir=os.path.join(LOCAL_ROOT, 'training'),
  key_column='key',
  model_type='dnn_classification',
  top_n=3,
  max_steps=250,
  layer_sizes=[10, 8, 5]
)

Starting local training.
INFO:tensorflow:Using config: {'save_summary_steps': 100, '_num_ps_replicas': 0, '_task_type': None, '_environment': 'local', '_is_chief': True, 'save_checkpoints_secs': 600, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fde24057b10>, 'tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_task_id': 0, 'tf_random_seed': None, 'keep_checkpoint_every_n_hours': 10000, '_evaluation_master': '', 'save_checkpoints_steps': None, '_master': '', 'keep_checkpoint_max': 5}

Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.

Instructions for updating:
local_eval_frequency is deprecated as local_run will be renamed to train_and_evaluate. Use min_eval_frequency and call train_and_evaluate instead. Note, however, that the default for min_eval_frequency is 1, meaning models will be evaluated every time a new checkpoint is available. In contrast, the default for local_eval_frequency is None, resulting in evaluation occurring only after training has completed. min_eval_frequency is ignored when calling the deprecated local_run.

Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))


INFO:tensorflow:Summary name dnn/hiddenlayer_0:fraction_of_zero_values is illegal; using dnn/hiddenlayer_0_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_0:activation is illegal; using dnn/hiddenlayer_0_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:fraction_of_zero_values is illegal; using dnn/hiddenlayer_1_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:activation is illegal; using dnn/hiddenlayer_1_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:fraction_of_zero_values is illegal; using dnn/hiddenlayer_2_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:activation is illegal; using dnn/hiddenlayer_2_activation instead.
INFO:tensorflow:Summary name dnn/logits:fraction_of_zero_values is illegal; using dnn/logits_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/logits:activation is illegal; using dnn/logits_activation instead.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:loss = 1.38405, step = 1
INFO:tensorflow:Saving checkpoints for 1 into /content/Downloads/iris_notebook_workspace/training/train/model.ckpt.

INFO:tensorflow:Summary name dnn/hiddenlayer_0:fraction_of_zero_values is illegal; using dnn/hiddenlayer_0_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_0:activation is illegal; using dnn/hiddenlayer_0_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:fraction_of_zero_values is illegal; using dnn/hiddenlayer_1_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:activation is illegal; using dnn/hiddenlayer_1_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:fraction_of_zero_values is illegal; using dnn/hiddenlayer_2_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:activation is illegal; using dnn/hiddenlayer_2_activation instead.
INFO:tensorflow:Summary name dnn/logits:fraction_of_zero_values is illegal; using dnn/logits_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/logits:activation is illegal; using dnn/logits_activation instead.

INFO:tensorflow:/content/Downloads/iris_notebook_workspace/training/intermediate_models/00000001-tmp/export is not in all_model_checkpoint_paths. Manually adding it.
INFO:tensorflow:loss = 0.529453, step = 101
INFO:tensorflow:global_step/sec: 41.0711
INFO:tensorflow:loss = 0.0818374, step = 201
INFO:tensorflow:global_step/sec: 48.5422
INFO:tensorflow:Saving checkpoints for 250 into /content/Downloads/iris_notebook_workspace/training/train/model.ckpt.

INFO:tensorflow:Summary name dnn/hiddenlayer_0:fraction_of_zero_values is illegal; using dnn/hiddenlayer_0_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_0:activation is illegal; using dnn/hiddenlayer_0_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:fraction_of_zero_values is illegal; using dnn/hiddenlayer_1_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:activation is illegal; using dnn/hiddenlayer_1_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:fraction_of_zero_values is illegal; using dnn/hiddenlayer_2_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:activation is illegal; using dnn/hiddenlayer_2_activation instead.
INFO:tensorflow:Summary name dnn/logits:fraction_of_zero_values is illegal; using dnn/logits_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/logits:activation is illegal; using dnn/logits_activation instead.

INFO:tensorflow:/content/Downloads/iris_notebook_workspace/training/intermediate_models/00000250-tmp/export is not in all_model_checkpoint_paths. Manually adding it.
INFO:tensorflow:Loss for final step: 0.0775295.


Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))


INFO:tensorflow:Summary name dnn/hiddenlayer_0:fraction_of_zero_values is illegal; using dnn/hiddenlayer_0_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_0:activation is illegal; using dnn/hiddenlayer_0_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:fraction_of_zero_values is illegal; using dnn/hiddenlayer_1_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:activation is illegal; using dnn/hiddenlayer_1_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:fraction_of_zero_values is illegal; using dnn/hiddenlayer_2_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:activation is illegal; using dnn/hiddenlayer_2_activation instead.
INFO:tensorflow:Summary name dnn/logits:fraction_of_zero_values is illegal; using dnn/logits_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/logits:activation is illegal; using dnn/logits_activation instead.

INFO:tensorflow:Restored model from /content/Downloads/iris_notebook_workspace/training/train
INFO:tensorflow:Eval steps [0,100) for training step 250.
INFO:tensorflow:Results after 10 steps (0.004 sec/batch): accuracy = 0.933, loss = 0.20425.
INFO:tensorflow:Results after 20 steps (0.004 sec/batch): accuracy = 0.9335, loss = 0.202992.
INFO:tensorflow:Results after 30 steps (0.003 sec/batch): accuracy = 0.933333, loss = 0.20353.
INFO:tensorflow:Results after 40 steps (0.003 sec/batch): accuracy = 0.93325, loss = 0.20371.
INFO:tensorflow:Results after 50 steps (0.003 sec/batch): accuracy = 0.9334, loss = 0.203315.
INFO:tensorflow:Results after 60 steps (0.003 sec/batch): accuracy = 0.933333, loss = 0.20353.
INFO:tensorflow:Results after 70 steps (0.002 sec/batch): accuracy = 0.933286, loss = 0.203633.
INFO:tensorflow:Results after 80 steps (0.002 sec/batch): accuracy = 0.933375, loss = 0.203396.
INFO:tensorflow:Results after 90 steps (0.003 sec/batch): accuracy = 0.933333, loss = 0.20353.
INFO:tensorflow:Results after 100 steps (0.003 sec/batch): accuracy = 0.9333, loss = 0.203602.
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.CancelledError'>, Run call was cancelled
INFO:tensorflow:Saving evaluation summary for step 250: accuracy = 0.9333, loss = 0.203602


Local training done.


Hmm, iris is an easy problem.

OK, an accuracy of ~0.93 is not bad. Let's see if we can do better. For this we need to pass some transform configuration in the form of a JSON transforms file. See the doc string of local_train for a description of the supported transforms.

In [76]:
%writefile {LOCAL_ROOT}/transforms.json
{
  "sepal_length": {"transform": "scale"},
  "sepal_width": {"transform": "scale", "value": 4},
  "petal_length": {"transform": "scale", "default": 0},
  "petal_width": {"transform": "scale"},
  "key": {"transform": "key"}
 }

Overwriting /content/Downloads/iris_notebook_workspace/transforms.json
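
The same file could also be produced from plain Python, which may be handy when the column list is generated programmatically. The sketch below is only an illustration: the helper name `write_transforms` and the temporary output path are assumptions, and it writes the plain `scale` transform for every numeric column rather than the per-column options used in the cell above.

```python
import json
import os
import tempfile


def write_transforms(path, numeric_columns, key_column):
    """Write a transforms file that scales every numeric column.

    Hypothetical helper: it only mirrors the JSON layout shown above.
    """
    transforms = {name: {"transform": "scale"} for name in numeric_columns}
    transforms[key_column] = {"transform": "key"}
    with open(path, "w") as f:
        json.dump(transforms, f, indent=2, sort_keys=True)
    return transforms


# Example: write to a temporary directory instead of LOCAL_ROOT.
example_path = os.path.join(tempfile.mkdtemp(), "transforms.json")
written = write_transforms(
    example_path,
    ["sepal_length", "sepal_width", "petal_length", "petal_width"],
    "key",
)
```

Passing the resulting path as `transforms_file` should then work the same way as the file written with `%writefile`.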


In [77]:
!rm -fr {LOCAL_ROOT}/training

In [78]:
sd.local_train(
  train_file_pattern=os.path.join(LOCAL_ROOT, 'train.*'),
  eval_file_pattern=os.path.join(LOCAL_ROOT, 'eval.*'),
  preprocess_output_dir=os.path.join(LOCAL_ROOT, 'preprocess'),
  output_dir=os.path.join(LOCAL_ROOT, 'training'),
  transforms_file=os.path.join(LOCAL_ROOT, 'transforms.json'),
  model_type='dnn_classification',
  top_n=3,
  max_steps=250,
  layer_sizes=[10, 8, 5]
)


Starting local training.
INFO:tensorflow:Using config: {'save_summary_steps': 100, '_num_ps_replicas': 0, '_task_type': None, '_environment': 'local', '_is_chief': True, 'save_checkpoints_secs': 600, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fde1593b6d0>, 'tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_task_id': 0, 'tf_random_seed': None, 'keep_checkpoint_every_n_hours': 10000, '_evaluation_master': '', 'save_checkpoints_steps': None, '_master': '', 'keep_checkpoint_max': 5}

Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.

Instructions for updating:
local_eval_frequency is deprecated as local_run will be renamed to train_and_evaluate. Use min_eval_frequency and call train_and_evaluate instead. Note, however, that the default for min_eval_frequency is 1, meaning models will be evaluated every time a new checkpoint is available. In contrast, the default for local_eval_frequency is None, resulting in evaluation occurring only after training has completed. min_eval_frequency is ignored when calling the deprecated local_run.

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))

INFO:tensorflow:Summary name dnn/hiddenlayer_0:fraction_of_zero_values is illegal; using dnn/hiddenlayer_0_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_0:activation is illegal; using dnn/hiddenlayer_0_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:fraction_of_zero_values is illegal; using dnn/hiddenlayer_1_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_1:activation is illegal; using dnn/hiddenlayer_1_activation instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:fraction_of_zero_values is illegal; using dnn/hiddenlayer_2_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/hiddenlayer_2:activation is illegal; using dnn/hiddenlayer_2_activation instead.
INFO:tensorflow:Summary name dnn/logits:fraction_of_zero_values is illegal; using dnn/logits_fraction_of_zero_values instead.
INFO:tensorflow:Summary name dnn/logits:activation is illegal; using dnn/logits_activation instead.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:loss = 1.13006, step = 1
INFO:tensorflow:Saving checkpoints for 1 into /content/Downloads/iris_notebook_workspace/training/train/model.ckpt.
INFO:tensorflow:/content/Downloads/iris_notebook_workspace/training/intermediate_models/00000001-tmp/export is not in all_model_checkpoint_paths. Manually adding it.

INFO:tensorflow:loss = 0.0290386, step = 101
INFO:tensorflow:global_step/sec: 42.9038
INFO:tensorflow:loss = 0.022508, step = 201
INFO:tensorflow:global_step/sec: 50.6452
INFO:tensorflow:Saving checkpoints for 250 into /content/Downloads/iris_notebook_workspace/training/train/model.ckpt.
INFO:tensorflow:/content/Downloads/iris_notebook_workspace/training/intermediate_models/00000250-tmp/export is not in all_model_checkpoint_paths. Manually adding it.

INFO:tensorflow:Loss for final step: 0.023279.
INFO:tensorflow:Restored model from /content/Downloads/iris_notebook_workspace/training/train

INFO:tensorflow:Eval steps [0,100) for training step 250.
INFO:tensorflow:Results after 10 steps (0.005 sec/batch): accuracy = 0.9, loss = 0.334344.
INFO:tensorflow:Results after 20 steps (0.003 sec/batch): accuracy = 0.9005, loss = 0.331383.
INFO:tensorflow:Results after 30 steps (0.003 sec/batch): accuracy = 0.9, loss = 0.331736.
INFO:tensorflow:Results after 40 steps (0.002 sec/batch): accuracy = 0.9, loss = 0.332388.
INFO:tensorflow:Results after 50 steps (0.002 sec/batch): accuracy = 0.9002, loss = 0.331595.
INFO:tensorflow:Results after 60 steps (0.002 sec/batch): accuracy = 0.9, loss = 0.331736.
INFO:tensorflow:Results after 70 steps (0.004 sec/batch): accuracy = 0.9, loss = 0.332109.
INFO:tensorflow:Results after 80 steps (0.003 sec/batch): accuracy = 0.900125, loss = 0.331648.
INFO:tensorflow:Results after 90 steps (0.002 sec/batch): accuracy = 0.9, loss = 0.331737.
INFO:tensorflow:Results after 100 steps (0.002 sec/batch): accuracy = 0.9, loss = 0.331997.
INFO:tensorflow:Saving evaluation summary for step 250: accuracy = 0.9, loss = 0.331997


Local training done.


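Scaling the numbers changed the result, but not for the better in this run: the evaluation accuracy recorded above is 0.90, compared with ~0.93 for the previous run. The random seed is not fixed (tf_random_seed is None in the config above), so results vary from run to run. Try playing with other transforms. The output of training contains some folders. The final model used for prediction is saved in {LOCAL_ROOT}/training/model.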

In [80]:
!ls {LOCAL_ROOT}/training/*

/content/Downloads/iris_notebook_workspace/training/intermediate_models:
00000001  00000250

/content/Downloads/iris_notebook_workspace/training/model:
checkpoint  export  export.meta  schema.json  transforms.json  vocab_flower.csv

/content/Downloads/iris_notebook_workspace/training/train:
checkpoint				     model.ckpt-1-00000-of-00001
eval					     model.ckpt-1.meta
events.out.tfevents.1487209337.5fd6f503cb2d  model.ckpt-250-00000-of-00001
graph.pbtxt				     model.ckpt-250.meta
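
Before moving on to prediction, it can be useful to confirm that the model folder is complete. The following is a minimal sketch, assuming the six file names in the listing above are what a finished model folder contains; the helper name `missing_model_files` is hypothetical and not part of the structured data package.

```python
import os
import tempfile

# File names taken from the `ls` output above.
EXPECTED_MODEL_FILES = [
    "checkpoint", "export", "export.meta",
    "schema.json", "transforms.json", "vocab_flower.csv",
]


def missing_model_files(model_dir, expected=EXPECTED_MODEL_FILES):
    """Return the expected file names that are absent from model_dir."""
    present = set(os.listdir(model_dir))
    return [name for name in expected if name not in present]


# Demonstration on a throwaway directory that holds only two of the files.
demo_dir = tempfile.mkdtemp()
for name in ("checkpoint", "schema.json"):
    open(os.path.join(demo_dir, name), "w").close()
demo_missing = missing_model_files(demo_dir)
```

In the notebook one would call `missing_model_files(os.path.join(LOCAL_ROOT, 'training/model'))` and expect an empty list.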


<a name="local_prediction"></a>
Local prediction
================

Local predict uses the model produced by training. The input data can be a list of csv strings or a Pandas DataFrame, but the schema must match the data set used for training. The target column is optional. As this was a classification problem, a probability is computed for each target label. As top_n=3 was used, the model returns the 3 labels with the 3 largest probabilities.

In [81]:
sd.local_predict(
  model_dir=os.path.join(LOCAL_ROOT, 'training/model'),
  data=['Iris-virginica,101,6.3,3.3,6,2.5',
        'Iris-virginica,107,4.9,2.5,4.5,1.7',
        'Iris-versicolor,100,5.7,2.8,4.1,1.3',
        'Iris-versicolor,70,5.6,2.5,3.9,1.1',
        'Iris-setosa,13,4.8,3,1.4,0.1',
        'Iris-setosa,11,5.4,3.7,1.5,0.2']
)

Starting local prediction.
Local prediction done.


Unnamed: 0,key_from_input,target_from_input,top_1_label,top_1_score,top_2_label,top_2_score,top_3_label,top_3_score
0,101.0,Iris-virginica,Iris-virginica,0.999946,Iris-versicolor,5.4e-05,Iris-setosa,1.735619e-11
1,107.0,Iris-virginica,Iris-versicolor,0.996429,Iris-virginica,0.003571,Iris-setosa,6.51429e-10
2,100.0,Iris-versicolor,Iris-versicolor,0.999383,Iris-virginica,0.000616,Iris-setosa,7.00445e-07
3,70.0,Iris-versicolor,Iris-versicolor,0.999864,Iris-virginica,0.000136,Iris-setosa,1.569044e-08
4,13.0,Iris-setosa,Iris-setosa,0.999907,Iris-versicolor,9.1e-05,Iris-virginica,2.154754e-06
5,11.0,Iris-setosa,Iris-setosa,0.999855,Iris-versicolor,0.000138,Iris-virginica,6.613328e-06
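
To make the top_n behaviour concrete, here is a small sketch of how the top labels can be picked from a set of per-label probabilities. It is an illustration in plain Python, not the code the model runs; the function name `top_n_labels` is hypothetical and the scores are copied from row 1 of the table above.

```python
def top_n_labels(probabilities, n):
    """Return the n (label, score) pairs with the highest scores."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Scores copied from row 1 of the table above (key 107).
row = {
    "Iris-setosa": 6.51429e-10,
    "Iris-versicolor": 0.996429,
    "Iris-virginica": 0.003571,
}
top = top_n_labels(row, 3)
# The first entry is Iris-versicolor, although the input label was Iris-virginica.
```

Row 1 is a misclassification: the label given in the input only appears in second place.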


In [82]:
import pandas as pd
sd.local_predict(
  model_dir=os.path.join(LOCAL_ROOT, 'training/model'),
  data=pd.DataFrame(
    [[101,6.3,3.3,6,2.5],
     [107,4.9,2.5,4.5,1.7],
     [100,5.7,2.8,4.1,1.3]])
)

Starting local prediction.
Local prediction done.


Unnamed: 0,key_from_input,target_from_input,top_1_label,top_1_score,top_2_label,top_2_score,top_3_label,top_3_score
0,101.0,UNKNOWN,Iris-virginica,0.999946,Iris-versicolor,5.4e-05,Iris-setosa,1.735619e-11
1,107.0,UNKNOWN,Iris-versicolor,0.996429,Iris-virginica,0.003571,Iris-setosa,6.514278e-10
2,100.0,UNKNOWN,Iris-versicolor,0.999383,Iris-virginica,0.000616,Iris-setosa,7.00445e-07


<a name="local_batch_prediction"></a>
Local batch prediction
============

Local batch prediction runs prediction on batched input data. This is ideal if the input dataset is very large or you have limited available main memory. However, for truly large datasets, it is better to run batch prediction using the CloudML services. Two output formats are supported, csv and json. The output may also be sharded.

In [83]:
!rm -fr {LOCAL_ROOT}/predict_out

In [84]:
sd.local_batch_predict(
  model_dir=os.path.join(LOCAL_ROOT, 'training/model'),
  prediction_input_file=os.path.join(LOCAL_ROOT, 'eval.*'),
  output_dir=os.path.join(LOCAL_ROOT, 'predict_out'),
  output_format='csv'
)

Starting local batch prediction.
Local batch prediction done.


In [85]:
!ls {LOCAL_ROOT}/predict_out

csv_header.txt	errors-00000-of-00001.txt  predictions-00000-of-00001.csv


In [86]:
!cat {LOCAL_ROOT}/predict_out/csv_header.txt


In [87]:
!cat {LOCAL_ROOT}/predict_out/errors*

In [88]:
!head {LOCAL_ROOT}/predict_out/predictions-00000*

107.0,Iris-virginica,Iris-versicolor,0.996429145336,Iris-virginica,0.00357083650306,Iris-setosa,6.51429021836e-10
100.0,Iris-versicolor,Iris-versicolor,0.999382972717,Iris-virginica,0.000616324367002,Iris-setosa,7.0044501399e-07
99.0,Iris-versicolor,Iris-versicolor,0.997950255871,Iris-setosa,0.00104523065966,Iris-virginica,0.00100453919731
13.0,Iris-setosa,Iris-setosa,0.999907016754,Iris-versicolor,9.08611036721e-05,Iris-virginica,2.15475415644e-06
70.0,Iris-versicolor,Iris-versicolor,0.999863505363,Iris-virginica,0.000136479648063,Iris-setosa,1.56904427229e-08
11.0,Iris-setosa,Iris-setosa,0.999855399132,Iris-versicolor,0.000137973052915,Iris-virginica,6.6133211476e-06
37.0,Iris-setosa,Iris-setosa,0.99981969595,Iris-versicolor,0.000173512031324,Iris-virginica,6.75060709909e-06
69.0,Iris-versicolor,Iris-versicolor,0.750553905964,Iris-virginica,0.249446064234,Iris-setosa,3.83428640401e-13
40.0,Iris-setosa,Iris-setosa,0.999869585037,Iris-versicolor,0.0001257959957,Iris-virginica,4
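
Since the eval file contains the target column, the csv predictions can be scored directly. The sketch below assumes the column order matches the local prediction table (key, target, top_1_label, ...), which would normally be confirmed from csv_header.txt; the function name `accuracy` is hypothetical. It uses the eight complete rows printed above.

```python
import csv
import io

# The eight complete rows from the `head` output above.
sample = u"""107.0,Iris-virginica,Iris-versicolor,0.996429145336,Iris-virginica,0.00357083650306,Iris-setosa,6.51429021836e-10
100.0,Iris-versicolor,Iris-versicolor,0.999382972717,Iris-virginica,0.000616324367002,Iris-setosa,7.0044501399e-07
99.0,Iris-versicolor,Iris-versicolor,0.997950255871,Iris-setosa,0.00104523065966,Iris-virginica,0.00100453919731
13.0,Iris-setosa,Iris-setosa,0.999907016754,Iris-versicolor,9.08611036721e-05,Iris-virginica,2.15475415644e-06
70.0,Iris-versicolor,Iris-versicolor,0.999863505363,Iris-virginica,0.000136479648063,Iris-setosa,1.56904427229e-08
11.0,Iris-setosa,Iris-setosa,0.999855399132,Iris-versicolor,0.000137973052915,Iris-virginica,6.6133211476e-06
37.0,Iris-setosa,Iris-setosa,0.99981969595,Iris-versicolor,0.000173512031324,Iris-virginica,6.75060709909e-06
69.0,Iris-versicolor,Iris-versicolor,0.750553905964,Iris-virginica,0.249446064234,Iris-setosa,3.83428640401e-13
"""


def accuracy(lines, target_col=1, predicted_col=2):
    """Fraction of rows whose top_1 label equals the target label.

    The column positions are an assumption based on the table layout above.
    """
    rows = [row for row in csv.reader(lines) if row]
    correct = sum(1 for row in rows if row[target_col] == row[predicted_col])
    return correct / float(len(rows))


sample_accuracy = accuracy(io.StringIO(sample))
# 7 of the 8 sample rows match (key 107 is misclassified).
```

For real output, the same function could be applied to each opened `predictions-*.csv` shard in turn.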

In [89]:
!rm -fr {LOCAL_ROOT}/predict_out

In [90]:
sd.local_batch_predict(
  model_dir=os.path.join(LOCAL_ROOT, 'training/model'),
  prediction_input_file=os.path.join(LOCAL_ROOT, 'predict.*'),
  output_dir=os.path.join(LOCAL_ROOT, 'predict_out'),
  output_format='json'
)

Starting local batch prediction.
Local batch prediction done.


In [91]:
!ls {LOCAL_ROOT}/predict_out

errors-00000-of-00001.txt  predictions-00000-of-00001.json


In [92]:
!head {LOCAL_ROOT}/predict_out/predictions-00000*

{"top_2_label": "Iris-virginica","top_3_score": 6.514290218362362e-10,"top_1_label": "Iris-versicolor","top_2_score": 0.003570836503058672,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9964291453361511,"key_from_input": 107.0}
{"top_2_label": "Iris-virginica","top_3_score": 7.004450139902474e-07,"top_1_label": "Iris-versicolor","top_2_score": 0.0006163243670016527,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9993829727172852,"key_from_input": 100.0}
{"top_2_label": "Iris-setosa","top_3_score": 0.0010045391973108053,"top_1_label": "Iris-versicolor","top_2_score": 0.0010452306596562266,"top_3_label": "Iris-virginica","target_from_input": "UNKNOWN","top_1_score": 0.9979502558708191,"key_from_input": 99.0}
{"top_2_label": "Iris-versicolor","top_3_score": 2.1547541564359562e-06,"top_1_label": "Iris-setosa","top_2_score": 9.086110367206857e-05,"top_3_label": "Iris-virginica","target_from_input": "UNKNOWN","top_1_score": 0.9999
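
Each line of the json output is a standalone JSON object, so the files can be read line by line with the standard library. A minimal sketch, using the three complete lines printed above; the helper name `index_by_key` is hypothetical, and it assumes that `key_from_input` is unique per record.

```python
import json

# The three complete lines from the `head` output above.
json_lines = [
    '{"top_2_label": "Iris-virginica","top_3_score": 6.514290218362362e-10,"top_1_label": "Iris-versicolor","top_2_score": 0.003570836503058672,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9964291453361511,"key_from_input": 107.0}',
    '{"top_2_label": "Iris-virginica","top_3_score": 7.004450139902474e-07,"top_1_label": "Iris-versicolor","top_2_score": 0.0006163243670016527,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9993829727172852,"key_from_input": 100.0}',
    '{"top_2_label": "Iris-setosa","top_3_score": 0.0010045391973108053,"top_1_label": "Iris-versicolor","top_2_score": 0.0010452306596562266,"top_3_label": "Iris-virginica","target_from_input": "UNKNOWN","top_1_score": 0.9979502558708191,"key_from_input": 99.0}',
]


def index_by_key(lines, key_field="key_from_input"):
    """Parse newline-delimited JSON and index the records by their key."""
    records = (json.loads(line) for line in lines if line.strip())
    return {int(record[key_field]): record for record in records}


by_key = index_by_key(json_lines)
```

Because predict.csv has no target column, `target_from_input` is "UNKNOWN" in every record, and the key is the only way to join predictions back to the input rows.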

<a name="cloud_preprocessing"></a>
Cloud preprocessing from csv files
====


First, let's move our local files to GCS.

In [38]:
!gsutil cp {LOCAL_ROOT}/*.csv {CLOUD_ROOT}
!gsutil cp {LOCAL_ROOT}/schema.json {CLOUD_ROOT}/schema.json

Copying file:///content/Downloads/iris_notebook_workspace/eval.csv [Content-Type=text/csv]...
Copying file:///content/Downloads/iris_notebook_workspace/predict.csv [Content-Type=text/csv]...
Copying file:///content/Downloads/iris_notebook_workspace/train.csv [Content-Type=text/csv]...
- [3 files][  5.3 KiB/  5.3 KiB]                                                
Operation completed over 3 objects/5.3 KiB.                                      
Copying file:///content/Downloads/iris_notebook_workspace/schema.json [Content-Type=application/json]...
/ [1 files][  573.0 B/  573.0 B]                                                
Operation completed over 1 objects/573.0 B.                                      


In [39]:
!gsutil rm -r {CLOUD_ROOT}/preprocess

Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/numerical_analysis.json#1487199911166963...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/schema.json#1487199915017000...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/vocab_flower.csv#1487199914320299...
/ [3 objects]                                                                   
Operation completed over 3 objects.                                              


In [41]:
sd.cloud_preprocess(
  input_file_pattern=os.path.join(CLOUD_ROOT, 'train.*'),
  output_dir=os.path.join(CLOUD_ROOT, 'preprocess'),
  schema_file=os.path.join(CLOUD_ROOT, 'schema.json'),
)

Starting cloud preprocessing.
Track BigQuery status at
https://bigquery.cloud.google.com/queries/cloud-ml-dev
Running numerical analysis...done.
Running categorical analysis...done.
Cloud preprocessing done.


In [42]:
!gsutil ls {CLOUD_ROOT}/preprocess

gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/numerical_analysis.json
gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/schema.json
gs://cloud-ml-dev_bdt/iris_notebook_workspace/preprocess/vocab_flower.csv


In [43]:
!gsutil cat {CLOUD_ROOT}/preprocess/numerical_analysis.json

{
  "sepal_width": {
    "max": 4.4000000000000004,
    "mean": 3.050833333333332,
    "min": 2.0
  },
  "petal_width": {
    "max": 2.5,
    "mean": 1.2324999999999995,
    "min": 0.10000000000000001
  },
  "sepal_length": {
    "max": 7.9000000000000004,
    "mean": 5.8675000000000024,
    "min": 4.2999999999999998
  },
  "key": {
    "max": 150.0,
    "mean": 76.733333333333334,
    "min": 1.0
  },
  "petal_length": {
    "max": 6.9000000000000004,
    "mean": 3.8308333333333349,
    "min": 1.1000000000000001
  }
}
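
The min and max values recorded here are the kind of statistics a scale transform needs. As an illustration only, and assuming that `scale` maps a column linearly onto [-value, value] with `value` defaulting to 1 (check the doc string of local_train for the actual definition), the arithmetic would look like this. The function name `scale_column` is hypothetical.

```python
def scale_column(x, col_min, col_max, value=1.0):
    """Map x linearly from [col_min, col_max] onto [-value, value].

    Assumed behaviour, not taken from the package source.
    """
    span = col_max - col_min
    if span == 0:
        return 0.0  # constant column: nothing to scale
    return value * (2.0 * (x - col_min) / span - 1.0)


# sepal_width statistics from numerical_analysis.json above: min 2.0, max 4.4.
low = scale_column(2.0, 2.0, 4.4)            # the column minimum
high = scale_column(4.4, 2.0, 4.4, value=4)  # the column maximum, with "value": 4
mid = scale_column(3.2, 2.0, 4.4)            # the midpoint of the range
```

Under this assumption the minimum maps to the lower bound, the maximum to the upper bound, and the midpoint of the range to zero.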

In [44]:
!gsutil cat {CLOUD_ROOT}/preprocess/vocab_flower.csv

Iris-setosa
Iris-versicolor
Iris-virginica


<a name="cloud_training"></a>
Cloud Training
===========

For cloud training, all input files must be placed on GCS. The function cloud_train builds the trainer, uploads it to GCS, and submits a job request to the CloudML service.

In [46]:
!gsutil cp {LOCAL_ROOT}/transforms.json {CLOUD_ROOT}/transforms.json

Copying file:///content/Downloads/iris_notebook_workspace/transforms.json [Content-Type=application/json]...
/ [1 files][  226.0 B/  226.0 B]                                                
Operation completed over 1 objects/226.0 B.                                      


In [47]:
!gsutil rm -r {CLOUD_ROOT}/training

Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/intermediate_models/#1487203863800309...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/intermediate_models/00000000/#1487203867746000...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/intermediate_models/00000000/checkpoint#1487203868102000...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/intermediate_models/00000000/export#1487203868571000...
/ [4 objects]                                                                   
==> NOTE: You are performing a sequence of gsutil operations that may
run significantly faster if you instead use gsutil -m -o ... Please
see the -m section under "gsutil help options" for further information
about when gsutil -m can be advantageous.

Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/intermediate_models/00000000/export.meta#1487203869134000...
Removing gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/model/#14872038

In [49]:
sd.cloud_train(
  train_file_pattern=os.path.join(CLOUD_ROOT, 'train.cs*'),
  eval_file_pattern=os.path.join(CLOUD_ROOT, 'eval.cs*'),
  preprocess_output_dir=os.path.join(CLOUD_ROOT, 'preprocess'),
  output_dir=os.path.join(CLOUD_ROOT, 'training'),
  transforms_file=os.path.join(CLOUD_ROOT, 'transforms.json'),
  model_type='dnn_classification',
  top_n=3,
  max_steps=500,
  layer_sizes=[10, 10, 5],
  region='us-central1',
  scale_tier='BASIC'
)

Building package and uploading to gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/staging/sd.tar.gz


Before running the next steps, follow the above link and wait ~10 minutes for training to finish.

In [51]:
!gsutil ls {CLOUD_ROOT}/training/model

CommandException: One or more URLs matched no objects.


<a name="cloud_online_prediction"></a>
CloudML online prediction
========================

After training a model, it can be deployed and requests can be sent to it. We first have to create the model and its version.


In [52]:
MODEL_NAME = 'irismodeldatalab'
MODEL_VERSION = 'v1'

In [53]:
!gcloud beta ml models create {MODEL_NAME} 
!gcloud beta ml versions create {MODEL_VERSION} --model {MODEL_NAME} --origin {CLOUD_ROOT}/training/model/

Creating version (this might take a few minutes)......done.


In [54]:
sd.cloud_predict(
  model_name=MODEL_NAME,
  model_version=MODEL_VERSION,
  data=['Iris-virginica,101,6.3,3.3,6,2.5',
        'Iris-virginica,107,4.9,2.5,4.5,1.7',
        'Iris-versicolor,100,5.7,2.8,4.1,1.3',
        'Iris-versicolor,70,5.6,2.5,3.9,1.1',
        'Iris-setosa,13,4.8,3,1.4,0.1',
        'Iris-setosa,11,5.4,3.7,1.5,0.2']
)



Unnamed: 0,key_from_input,target_from_input,top_1_label,top_1_score,top_2_label,top_2_score,top_3_label,top_3_score
0,101,Iris-virginica,Iris-virginica,1.0,Iris-versicolor,2.61211e-09,Iris-setosa,4.48302e-23
1,107,Iris-virginica,Iris-versicolor,1.0,Iris-setosa,4.24274e-09,Iris-virginica,4.41757e-10
2,100,Iris-versicolor,Iris-versicolor,0.999991,Iris-setosa,8.67259e-06,Iris-virginica,5.83575e-14
3,70,Iris-versicolor,Iris-versicolor,0.999999,Iris-setosa,1.1382e-06,Iris-virginica,1.26933e-17
4,13,Iris-setosa,Iris-setosa,0.999835,Iris-versicolor,0.000164935,Iris-virginica,0.0
5,11,Iris-setosa,Iris-setosa,0.999937,Iris-versicolor,6.28423e-05,Iris-virginica,0.0
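
Because each request row carried its true label, the response can be checked by comparing `target_from_input` with `top_1_label`. The sketch below copies those two columns, plus the key, from the table above by hand. It makes no assumption about the type of object `sd.cloud_predict` returns, and the variable names are illustrative.

```python
# (key_from_input, target_from_input, top_1_label), copied from the table above.
rows = [
    (101, 'Iris-virginica', 'Iris-virginica'),
    (107, 'Iris-virginica', 'Iris-versicolor'),
    (100, 'Iris-versicolor', 'Iris-versicolor'),
    (70, 'Iris-versicolor', 'Iris-versicolor'),
    (13, 'Iris-setosa', 'Iris-setosa'),
    (11, 'Iris-setosa', 'Iris-setosa'),
]

# Keys of the rows where the top prediction disagrees with the supplied label.
misses = [key for key, target, predicted in rows if target != predicted]

# Fraction of rows predicted correctly; float() keeps this correct on Python 2.
accuracy = (len(rows) - len(misses)) / float(len(rows))

print('misclassified keys: %s' % misses)
print('accuracy: %.3f' % accuracy)
```

For these six rows the only disagreement is key 107, so five of six predictions match the supplied label.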


<a name="cloud_batch_prediction"></a>
CloudML batch prediction
=====

In [55]:
!gsutil rm -fr {CLOUD_ROOT}/predict_out

CommandException: "rm" command does not support "file://" URLs. Did you mean to use a gs:// URL?


In [56]:
sd.cloud_batch_predict(
  model_dir=os.path.join(CLOUD_ROOT, 'training/model'),
  prediction_input_file=os.path.join(CLOUD_ROOT, 'predict.cs*'),
  output_dir=os.path.join(CLOUD_ROOT, 'predict_out'),
  output_format='json'
)

Building package and uploading to gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/staging/sd.tar.gz
['predict.py', '--cloud', '--project_id=cloud-ml-dev', '--predict_data=gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict.cs*', '--trained_model_dir=gs://cloud-ml-dev_bdt/iris_notebook_workspace/training/model', '--output_dir=gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out', '--output_format=json', '--batch_size=1000', '--extra_package=gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/staging/sd.tar.gz']
Starting cloud batch prediction.
Dataflow Job submitted, see Job structured-data-batch-prediction-20170216013432 at https://console.developers.google.com/dataflow?project=cloud-ml-dev



Using fallback coder for typehint: Any.



See above link for job status.


In [58]:
!gsutil ls {CLOUD_ROOT}/predict_out

gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/errors-00000-of-00001.txt
gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/predictions-00000-of-00003.json
gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/predictions-00001-of-00003.json
gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/predictions-00002-of-00003.json
gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/staging/
gs://cloud-ml-dev_bdt/iris_notebook_workspace/predict_out/tmp/
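
With `output_format='json'`, each `predictions-*.json` shard holds one JSON object per line, as the `gsutil cat` output in the next cell shows. A minimal sketch of reading such records follows. `parse_predictions` is a hypothetical helper name, and the two sample lines are copied from this notebook's batch output rather than read from Cloud Storage.

```python
import json


def parse_predictions(lines):
    """Parse newline-delimited JSON prediction records, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# Two records copied verbatim from the batch prediction output in this notebook.
sample_lines = [
    '{"top_2_label": "Iris-virginica","top_3_score": 3.611174692608188e-09,"top_1_label": "Iris-versicolor","top_2_score": 0.008518863469362259,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9914811849594116,"key_from_input": 55.0}',
    '{"top_2_label": "Iris-versicolor","top_3_score": 0.0,"top_1_label": "Iris-setosa","top_2_score": 5.8058707509189844e-05,"top_3_label": "Iris-virginica","target_from_input": "UNKNOWN","top_1_score": 0.9999419450759888,"key_from_input": 49.0}',
]

records = parse_predictions(sample_lines)

# Keys come back as floats (55.0), so cast them to int for display.
summary = [(int(r['key_from_input']), r['top_1_label'], r['top_1_score'])
           for r in records]

for key, label, score in summary:
    print('%d -> %s (%.4f)' % (key, label, score))
```

In these records `target_from_input` is `"UNKNOWN"`, so unlike the online prediction table, this output cannot be scored without joining the keys back to a labeled file.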


In [61]:
!gsutil cat {CLOUD_ROOT}/predict_out/predictions-00000-*

{"top_2_label": "Iris-virginica","top_3_score": 3.611174692608188e-09,"top_1_label": "Iris-versicolor","top_2_score": 0.008518863469362259,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9914811849594116,"key_from_input": 55.0}
{"top_2_label": "Iris-versicolor","top_3_score": 0.0,"top_1_label": "Iris-setosa","top_2_score": 5.8058707509189844e-05,"top_3_label": "Iris-virginica","target_from_input": "UNKNOWN","top_1_score": 0.9999419450759888,"key_from_input": 49.0}
{"top_2_label": "Iris-versicolor","top_3_score": 4.999267059352386e-13,"top_1_label": "Iris-virginica","top_2_score": 0.002931889146566391,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.9970681071281433,"key_from_input": 78.0}
{"top_2_label": "Iris-versicolor","top_3_score": 3.8751211441090394e-12,"top_1_label": "Iris-virginica","top_2_score": 0.25968462228775024,"top_3_label": "Iris-setosa","target_from_input": "UNKNOWN","top_1_score": 0.7403153777122498,"key_from