# CIML Predictions

In this notebook we train and evaluate CIML experiments using the `gather_results` and `tf_trainer` modules of the [ciml project](https://github.com/mtreinish/ciml).
<br>Then we save the predictions of the experiments for a deeper analysis of the metrics of the trained models (see [CIML Metric Report](https://nbviewer.jupyter.org/github/kwulffert/ciml_experiments/blob/master/Metrics%20report.ipynb)).

In [1]:
from ciml import gather_results
from ciml import tf_trainer
import numpy as np
import pandas as pd
import collections
import tensorflow as tf
from tensorflow.python.training import adagrad
from tensorflow.python.util import deprecation
deprecation._PRINT_DEPRECATION_WARNINGS = False

First we define the data path, dataset and experiment to gather the right input dataset and the configuration for the experiment.

In [2]:
data_path = '/Users/kw/ciml_data/cimlodsceu2019seed'

# Dataset and experiment combination for multi-class classification
# dataset = 'usr_1m-1min-node_provider'
# experiment = 'dnn-3x100-500epochs-bs128'

# Dataset and experiment combination for binary classification
dataset = 'usr_1m-1min-status'
experiment = 'dnn-5x100-500epochs-bs128'

In [3]:
experiment_data = gather_results.load_experiment(
        experiment, data_path=data_path)
dataset_data = gather_results.load_model_config(
        dataset, data_path=data_path)

`dataset_data` and `experiment_data` are dictionaries with the following structure:

In [4]:
dataset_data.keys()

dict_keys(['build_name', 'sample_interval', 'features_regex', 'class_label', 'aggregation_functions', 'training_set', 'dev_set', 'test_set', 'normalized_length', 'labels', 'num_columns', 'num_features', 'normalization_params'])

In [5]:
experiment_data

{'estimator': 'tf.estimator.DNNClassifier',
 'params': {},
 'hyper_params': {'steps': 9500,
  'batch_size': '128',
  'epochs': '500',
  'hidden_units': [100, 100, 100, 100, 100],
  'optimizer': 'Adagrad',
  'learning_rate': 0.05}}

We now set up the experiment and configure the estimator.

In [6]:
estimator = experiment_data['estimator']
hyper_params = experiment_data['hyper_params']
params = experiment_data['params']
steps = int(hyper_params['steps'])
num_epochs = int(hyper_params['epochs'])
batch_size = int(hyper_params['batch_size'])
optimizer = hyper_params['optimizer']
learning_rate = float(hyper_params['learning_rate'])
class_label = dataset_data['class_label']
labels = gather_results.load_dataset(dataset, 'labels', data_path=data_path)['labels']
training_data = gather_results.load_dataset(dataset, 'training', data_path=data_path)
test_data = gather_results.load_dataset(dataset, 'test', data_path=data_path)

#label_vocabulary = None
if class_label == 'node_provider':
    label_vocabulary = set(['rax', 'ovh', 'packethost-us-west-1',
                            'vexxhost', 'limestone-regionone',
                            'inap-mtl01', 'fortnebula-regionone'])
elif class_label == 'node_provider_all':
    label_vocabulary = set(['rax-iad', 'ovh-bhs1', 'packethost-us-west-1',
                            'rax-dfw', 'vexxhost-ca-ymq-1', 'ovh-gra1',
                            'limestone-regionone', 'inap-mtl01', 'rax-ord',
                            'vexxhost-sjc1', 'fortnebula-regionone'])
else:
    label_vocabulary = None

model_dir = gather_results.get_model_folder(dataset, experiment, data_path=data_path)


In [7]:
estimator = tf_trainer.get_estimator(
        estimator, hyper_params, params, labels, model_dir,
        optimizer=adagrad.AdagradOptimizer(learning_rate=learning_rate),
        label_vocabulary=label_vocabulary, gpu=False)

INFO:tensorflow:Using config: {'_model_dir': '/Users/kw/ciml_data/cimlodsceu2019seed/usr_1m-1min-status/dnn-5x100-500epochs-bs128', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 300, '_session_config': allow_soft_placement: true
, '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


We train the model.

In [8]:
input_fn = tf_trainer.get_input_fn(
        shuffle=True, batch_size=batch_size, num_epochs=num_epochs,
        labels=labels, **training_data)

In [9]:
# Reuse the input function built in the previous cell.
training_result = tf_trainer.get_training_method(estimator)(
        input_fn=input_fn, steps=steps)

INFO:tensorflow:Calling model_fn.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /Users/kw/ciml_data/cimlodsceu2019seed/usr_1m-1min-status/dnn-5x100-500epochs-bs128/model.ckpt-37144
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 37144...
INFO:tensorflow:Saving checkpoints for 37144 into /Users/kw/ciml_data/cimlodsceu2019seed/usr_1m-1min-status/dnn-5x100-500epochs-bs128/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 37144...
INFO:tensorflow:

INFO:tensorflow:loss = 3.7563398e-06, step = 43744 (3.713 sec)
INFO:tensorflow:global_step/sec: 27.2461
INFO:tensorflow:loss = 1.5482317e-05, step = 43844 (3.673 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 43929...
INFO:tensorflow:Saving checkpoints for 43929 into /Users/kw/ciml_data/cimlodsceu2019seed/usr_1m-1min-status/dnn-5x100-500epochs-bs128/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 43929...
INFO:tensorflow:global_step/sec: 24.8964
INFO:tensorflow:loss = 1.274326e-05, step = 43944 (4.021 sec)
INFO:tensorflow:global_step/sec: 26.3871
INFO:tensorflow:loss = 3.14791e-06, step = 44044 (3.797 sec)
INFO:tensorflow:global_step/sec: 20.4954
INFO:tensorflow:loss = 1.28753745e-05, step = 44144 (4.892 sec)
INFO:tensorflow:global_step/sec: 24.8688
INFO:tensorflow:loss = 1.4783594e-05, step = 44244 (3.995 sec)
INFO:tensorflow:global_step/sec: 24.0669
INFO:tensorflow:loss = 2.0092357e-05, step = 44344 (4.166 sec)
INFO:tensorf

We evaluate the trained model with the test set.

In [10]:
eval_data = gather_results.load_dataset(dataset, 'test', data_path=data_path)

In [11]:
eval_size = len(eval_data['example_ids'])
eval_data.keys()

dict_keys(['examples', 'example_ids', 'classes'])

We analyse the predictions of our trained model.
<br>Info logging is enabled, so TensorFlow reports which checkpoint is restored when the predictions are computed.

In [12]:
prediction = estimator.predict(input_fn=tf_trainer.get_input_fn(
                                batch_size=eval_size, num_epochs=1,
                                labels=labels, **eval_data))

In [13]:
predictions = [x for x in prediction]

INFO:tensorflow:Calling model_fn.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /Users/kw/ciml_data/cimlodsceu2019seed/usr_1m-1min-status/dnn-5x100-500epochs-bs128/model.ckpt-46430
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
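
Note that the log output appears under cell 13 rather than cell 12. `estimator.predict` yields its results lazily, so the model is built and the checkpoint restored only once the predictions are iterated. The plain-Python sketch below illustrates that behaviour; `fake_predict` is a hypothetical stand-in and does not use TensorFlow.

```python
events = []


def fake_predict(examples):
    """Hypothetical stand-in for a generator-based predict method."""
    events.append("model loaded")  # runs only when iteration starts
    for example in examples:
        yield {"class_ids": [example % 2]}


prediction = fake_predict([1, 2, 3])   # creates the generator, runs nothing
events_before = list(events)           # snapshot taken before iterating
predictions = [x for x in prediction]  # iteration triggers the actual work

print(events_before, events, len(predictions))
```

Materialising the generator into a list, as cell 13 does, also makes it possible to loop over the predictions more than once.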


In [14]:
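# Predicted class id for every example in the test set
p_classes = [x["class_ids"][0] for x in predictions]
actual_classes = eval_data["classes"]
# Count how often each (predicted, actual) pair occurs
classes = zip(p_classes, actual_classes)
counter = collections.Counter(classes)
sorted(counter.values(), reverse=True)[:15]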

[744, 37, 7, 6]

In [15]:
counter

Counter({(0, 0): 744, (1, 1): 37, (0, 1): 7, (1, 0): 6})
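
Each key of the counter is a `(predicted, actual)` pair, so the four counts form a confusion matrix. As a rough sketch, the usual summary metrics can be derived from it directly. Treating class `1` as the positive class is an assumption made here for illustration; the variable names are not part of the ciml project.

```python
import collections

# Counts copied from the output above; keys are (predicted, actual).
confusion = collections.Counter(
    {(0, 0): 744, (1, 1): 37, (0, 1): 7, (1, 0): 6})

positive = 1  # assumption: class 1 is the class of interest

total = sum(confusion.values())
correct = sum(n for (pred, actual), n in confusion.items() if pred == actual)
tp = confusion[(positive, positive)]
# Predicted positive but actually something else
fp = sum(n for (pred, actual), n in confusion.items()
         if pred == positive and actual != positive)
# Actually positive but predicted as something else
fn = sum(n for (pred, actual), n in confusion.items()
         if pred != positive and actual == positive)

accuracy = correct / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

Because the two classes are heavily imbalanced (750 vs. 44 examples), accuracy alone is dominated by the majority class, which is why precision and recall for the minority class are worth reporting separately.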

Let's save the predictions of the trained model in a zipped JSON file.

In [16]:
serializable_pred = []
for pred in predictions:
    _classes = pred['classes']
    _all_classes = pred['all_classes']
    pred['classes'] = [x.decode("utf-8") for x in _classes]
    pred['all_classes'] = [x.decode("utf-8") for x in _all_classes]
    serializable_pred.append(pred)

prediction_name = "prediction_" + dataset
pred_data = zip(eval_data['example_ids'], serializable_pred,
                eval_data['classes'])
gather_results.save_data_json(
    dataset, [x for x in pred_data],
    prediction_name, sub_folder=experiment, data_path=data_path)
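
The decoding loop in the cell above is needed because the class labels come back as `bytes`, which the standard `json` module refuses to encode. The sketch below reproduces the issue on a made-up record and round-trips the decoded result through a gzip-compressed JSON file. It is an illustration only: the sample record and the file name are hypothetical, and `gather_results.save_data_json` may store the data differently.

```python
import gzip
import json
import os
import tempfile

# Hypothetical prediction record with byte-string class labels.
pred = {"class_ids": [0], "classes": [b"0"], "all_classes": [b"0", b"1"]}

try:
    json.dumps(pred)
    bytes_rejected = False
except TypeError:
    bytes_rejected = True  # bytes objects are not JSON serializable

# Decode the byte strings, as in the cell above.
pred["classes"] = [x.decode("utf-8") for x in pred["classes"]]
pred["all_classes"] = [x.decode("utf-8") for x in pred["all_classes"]]

# Write and read back a gzip-compressed JSON file (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "prediction_example.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump([pred], f)
with gzip.open(path, "rt", encoding="utf-8") as f:
    restored = json.load(f)

print(bytes_rejected, restored == [pred])
```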