# Anomaly detection using LSTM

This notebook shows how to use Long Short-Term Memory (LSTM) neural networks for anomaly detection.

The data comes from hardware sensors attached to a robot arm performing simple tasks. We use three values: linear acceleration along the X, Y, and Z axes. The goal is to predict a robot malfunction before the arm crashes (and before a person would be able to notice it).

The network graph is built with Keras/TensorFlow, and the training itself is performed with Deep Water.

## Setup

Import the required modules (h2o for Deep Water and pandas for initial data munging).

Bootstrap a single-node H2O instance.

In [1]:
import h2o
from h2o.estimators.deepwater import H2ODeepWaterEstimator
import pandas as pd

h2o.init(port=54321, nthreads=-1)
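# Stop here if this H2O build was started without Deep Water support
if not H2ODeepWaterEstimator.available():
    raise SystemExit("Deep Water is not available on this H2O instance")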

Checking whether there is an H2O instance running at http://localhost:54321..... not found.
Attempting to start a local H2O server...
  Java Version: java version "1.8.0_112"; Java(TM) SE Runtime Environment (build 1.8.0_112-b16); Java HotSpot(TM) 64-Bit Server VM (build 25.112-b16, mixed mode)
  Starting server from /Users/mateusz/anaconda3/envs/tensorflow/lib/python3.5/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/05/vvsycrl162zbzlh1wy8bdslh0000gn/T/tmpq_aml1fu
  JVM stdout: /var/folders/05/vvsycrl162zbzlh1wy8bdslh0000gn/T/tmpq_aml1fu/h2o_mateusz_started_from_python.out
  JVM stderr: /var/folders/05/vvsycrl162zbzlh1wy8bdslh0000gn/T/tmpq_aml1fu/h2o_mateusz_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321... successful.


H2O cluster uptime: 04 secs
H2O cluster version: 3.11.0.99999
H2O cluster version age: 5 months and 24 days !!!
H2O cluster name: H2O_from_python_mateusz_1mv4hy
H2O cluster total nodes: 1
H2O cluster free memory: 3.556 Gb
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster status: accepting new members, healthy
H2O connection url: http://127.0.0.1:54321


## Custom TF LSTM model

Build a custom LSTM network graph using Keras and save it for later use.

In [2]:
import tensorflow as tf
import json
from keras.layers.core import Reshape
from keras.layers import LSTM
from keras import backend as K
from keras.objectives import categorical_crossentropy
from tensorflow.python.framework import ops

def keras_model(h, w):
    classes = 1
    # always create a new graph inside ipython or
    # the default one will be used and can lead to
    # unexpected behavior
    graph = tf.Graph() 
    with graph.as_default():
        size = w * h
        # Flattened input (h * w values per sample) fed via H2O
        inp = tf.placeholder(tf.float32, [None, size])
        # Actual labels used for training fed via H2O
        labels = tf.placeholder(tf.float32, [None, classes])

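        # Keras network: view each flat sample as h time steps of w features
        x = Reshape((h, w))(inp)

        # LSTM returns only its last hidden state, shape (None, 32)
        x = LSTM(32, input_shape=(h, w))(x)
        # Reshape to (None, h, classes); valid only when h * classes == 32
        x = Reshape((h, classes))(x)
        out = x

        predictions = out

        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=out))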
        train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)

        init_op = tf.global_variables_initializer()

        # Metadata required by H2O
        tf.add_to_collection(ops.GraphKeys.INIT_OP, init_op.name)
        tf.add_to_collection(ops.GraphKeys.TRAIN_OP, train_step)
        tf.add_to_collection("logits", out)
        tf.add_to_collection("predictions", predictions)

        meta = json.dumps({
                "inputs": {"batch_image_input": inp.name,
                           "categorical_labels": labels.name},
                "outputs": {"categorical_logits": out.name,
                            "layers": ','.join([m.name for m in tf.get_default_graph().get_operations()])},
                "parameters": {}
            })
        tf.add_to_collection("meta", meta)

        # Save the meta file with the graph
        saver = tf.train.Saver()
        filename = "/tmp/keras_tf_lstm.meta"
        tf.train.export_meta_graph(filename, saver_def=saver.as_saver_def())

        return filename

Using TensorFlow backend.


## Load and prepare data

In [3]:
# Read training data into memory
iot_raw = pd.read_csv('../model-builder/resources/normal_20170202_2229.csv')

# Select training columns; copy so that adding "y" below does not
# trigger pandas' SettingWithCopyWarning
iot_train = iot_raw[[" LinAccX (g)"," LinAccY (g)"," LinAccZ (g)"]].copy()

# Target: row-wise sum of the three acceleration components
iot_train["y"] = iot_train.sum(axis=1)

# Send the data to H2O
iot_train_frame = h2o.H2OFrame(iot_train)


Parse progress: |█████████████████████████████████████████████████████████| 100%
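
The shape of the prepared frame determines the arguments passed to the model in the next step. The following is a minimal sketch of the same preparation on two made-up rows. The values and the ` Timestamp` column are hypothetical, standing in for whatever the real CSV contains besides the three columns used here.

```python
import pandas as pd

# Column names keep the leading space found in the CSV header.
cols = [" LinAccX (g)", " LinAccY (g)", " LinAccZ (g)"]

# Hypothetical sample; the real file has many more rows.
raw = pd.DataFrame({
    " Timestamp": [0, 1],            # hypothetical unused column
    " LinAccX (g)": [0.5, -0.5],
    " LinAccY (g)": [0.25, 0.0],
    " LinAccZ (g)": [1.0, 1.0],
})

# Keep only the three acceleration columns, as an independent copy.
features = raw[cols].copy()

# The target is the row-wise sum of the three components.
features["y"] = features[cols].sum(axis=1)

n_rows, n_cols = features.shape
print(features)
print("rows:", n_rows, "columns:", n_cols, "feature columns:", n_cols - 1)
```

The frame has four columns (three features plus `y`), which is why the number of feature columns is the frame width minus one.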


## Train the model

In [4]:
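batch_size = 32
# w counts the three feature columns plus the target "y"
(h, w) = iot_train.shape
# keras_model(h, w): batch_size is passed as the number of time steps h
model_filename = keras_model(batch_size, w-1)
model = H2ODeepWaterEstimator(epochs=500,
                              network_definition_file=model_filename,
                              backend="tensorflow")
# y=3 is the index of the response column "y"
model.train(x=[" LinAccX (g)"," LinAccY (g)"," LinAccZ (g)"], y=3, training_frame=iot_train_frame)
model.show()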

deepwater Model Build progress: | (failed)


OSError: Job with key $03017f00000132d4ffffffff$_9abc1fac1e1fd656ae2b346aee704b99 failed with an exception: java.lang.IllegalArgumentException: cannot copy Tensor with 3 dimensions into an object with 2
stacktrace: 
java.lang.IllegalArgumentException: cannot copy Tensor with 3 dimensions into an object with 2
	at org.tensorflow.Tensor.throwExceptionIfTypeIsIncompatible(Tensor.java:574)
	at org.tensorflow.Tensor.copyTo(Tensor.java:352)
	at deepwater.backends.tensorflow.models.TensorflowModel.getPredictions(TensorflowModel.java:81)
	at deepwater.backends.tensorflow.TensorflowBackend.predict(TensorflowBackend.java:446)
	at hex.deepwater.DeepWaterModelInfo.predict(DeepWaterModelInfo.java:67)
	at hex.deepwater.DeepWaterModel$DeepWaterBigScore.setupLocal(DeepWaterModel.java:827)
	at water.MRTask.setupLocal0(MRTask.java:550)
	at water.MRTask.dfork(MRTask.java:456)
	at water.MRTask.doAll(MRTask.java:389)
	at water.MRTask.doAll(MRTask.java:385)
	at hex.deepwater.DeepWaterModel.scoreMetrics(DeepWaterModel.java:946)
	at hex.deepwater.DeepWaterModel.doScoring(DeepWaterModel.java:368)
	at hex.deepwater.DeepWater$DeepWaterDriver.trainModel(DeepWater.java:353)
	at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:205)
	at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:118)
	at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:169)
	at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:111)
	at water.H2O$H2OCountedCompleter.compute(H2O.java:1190)
	at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
	at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
	at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
	at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
	at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
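
The exception is consistent with the shape of the tensor that the graph exposes as its predictions. The sketch below follows the shapes through the layers of `keras_model(h, w)` in plain Python. It assumes standard Keras semantics: an LSTM without `return_sequences` returns only its last hidden state, and `Reshape` must preserve the number of elements per sample. `trace_shapes` is a hypothetical helper written for this illustration and is not part of Keras, TensorFlow, or H2O.

```python
def trace_shapes(h, w, units=32, classes=1):
    """Follow tensor shapes through the layers of keras_model(h, w).

    None stands for the batch dimension, as in the placeholders.
    """
    shapes = [("input placeholder", (None, h * w))]
    shapes.append(("Reshape((h, w))", (None, h, w)))
    # Assumption: LSTM without return_sequences emits its last state only.
    shapes.append(("LSTM(units)", (None, units)))
    # Assumption: Reshape must keep the element count per sample.
    if h * classes != units:
        raise ValueError(
            "cannot reshape %d values per sample into (%d, %d)"
            % (units, h, classes))
    shapes.append(("Reshape((h, classes))", (None, h, classes)))
    return shapes


# Same arguments as the training cell: h = batch_size = 32, w = 3.
for name, shape in trace_shapes(32, 3):
    print(name, shape, "rank", len(shape))
```

Under these assumptions the output tensor has rank 3, `(None, 32, 1)`, while the labels placeholder is `[None, classes]` with rank 2, which matches the "3 dimensions into an object with 2" message. The final `Reshape` also works only because `h` happens to equal the number of LSTM units. A head that produces a `[None, classes]` tensor would be one direction to try, but that has not been verified against Deep Water.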
