# Using Orion on Multivariate Input

In this notebook, we demonstrate how to use multivariate time series in Orion. We walk through the process using NASA's dataset; you can find the original data in the [Telemanom](https://github.com/khundman/telemanom) GitHub repository or download it directly from their [S3 bucket](https://s3-us-west-2.amazonaws.com/telemanom/data.zip).

## 1. Load the data

In the first step, we set up the environment and load the CSV that we want to process.

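To do so, we can import the `orion.data.load_signal` function and call it, passing
the path to the CSV file.

The commented-out lines in the first cell show how to load the `S-1.csv` file from inside the `data/multivariate` folder this way. The second cell instead reads a local `swat.csv` file with pandas and keeps the first 7500 rows.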

In [1]:
from orion.data import load_signal

# signal_path = 'multivariate/S-1'

# data = load_signal(signal_path)
# data.head()

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV and keep only the first 7500 rows
data = pd.read_csv('swat.csv')
data = data.iloc[:7500]
data.head(5)

| Unnamed: 0 | timestamp | FIT101 | LIT101 | MV101 | P101 | P102 | AIT201 | AIT202 | AIT203 | FIT201 | ... | P501 | P502 | PIT501 | PIT502 | PIT503 | FIT601 | P601 | P602 | P603 | Normal/Attack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1451296800000000000 | 2.427057 | 522.8467 | 2 | 2 | 1 | 262.0161 | 8.396437 | 328.6337 | 2.445391 | ... | 2 | 1 | 250.8652 | 1.649953 | 189.5988 | 0.000128 | 1 | 1 | 1 | Normal |
| 1 | 1451296801000000000 | 2.446274 | 522.886 | 2 | 2 | 1 | 262.0161 | 8.396437 | 328.6337 | 2.445391 | ... | 2 | 1 | 250.8652 | 1.649953 | 189.6789 | 0.000128 | 1 | 1 | 1 | Normal |
| 2 | 1451296802000000000 | 2.489191 | 522.8467 | 2 | 2 | 1 | 262.0161 | 8.394514 | 328.6337 | 2.442316 | ... | 2 | 1 | 250.8812 | 1.649953 | 189.6789 | 0.000128 | 1 | 1 | 1 | Normal |
| 3 | 1451296803000000000 | 2.53435 | 522.9645 | 2 | 2 | 1 | 262.0161 | 8.394514 | 328.6337 | 2.442316 | ... | 2 | 1 | 250.8812 | 1.649953 | 189.6148 | 0.000128 | 1 | 1 | 1 | Normal |
| 4 | 1451296804000000000 | 2.56926 | 523.4748 | 2 | 2 | 1 | 262.0161 | 8.394514 | 328.6337 | 2.443085 | ... | 2 | 1 | 250.8812 | 1.649953 | 189.5027 | 0.000128 | 1 | 1 | 1 | Normal |


## 2. Detect anomalies using Orion

Once we have the data, let us try to use the LSTM pipeline to analyze it and search for anomalies.

In order to do so, we import the `Orion` class, create an instance with the name of
the pipeline that we want to use, and fit it to the loaded data.

In this case, we will be using the `lstm_dynamic_threshold` pipeline from inside the `orion` folder. 

In addition, we set up the hyperparameters to correctly identify the signal we are trying to predict. In this case, dimension `0` is the signal value, and as such we set `target_column` to `0`. Note that `0` refers to the position of the channel rather than its name.
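Because `target_column` is positional, it can help to look up a channel's position from its name rather than counting columns by hand. The sketch below is a minimal helper and not part of Orion: `channel_index` is a hypothetical name, and it assumes that channels are numbered over the non-timestamp columns in the order they appear, with the time column called `timestamp`.

```python
def channel_index(columns, name, time_column="timestamp"):
    """Return the 0-based position of `name` among the value columns.

    Assumption: channels are counted over every column except the
    time column, in the order they appear in the data.
    """
    channels = [column for column in columns if column != time_column]
    return channels.index(name)  # raises ValueError if `name` is absent


# Hypothetical column layout used only for illustration.
columns = ["timestamp", "0", "1", "2"]
target_column = channel_index(columns, "0")
```

With a DataFrame, `list(data.columns)` can be passed as `columns`.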

In [None]:
from orion import Orion

hyperparameters = {
    "mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        'target_column': 0 
    },
    'keras.Sequential.LSTMTimeSeriesRegressor#1': {
        'epochs': 5,
        'verbose': True
    }
}

orion = Orion(
    pipeline='lstm_dynamic_threshold',
    hyperparameters=hyperparameters
)

orion.fit(data)

Using TensorFlow backend.


The output will be a `pandas.DataFrame` containing a table with the detected anomalies.

In [None]:
orion.detect(data)
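The returned table can be inspected like any other DataFrame. The sketch below assumes the result has `start`, `end` and `severity` columns; the rows are made-up placeholders for illustration, not output from this notebook.

```python
import pandas as pd

# Placeholder rows standing in for the result of `orion.detect(data)`.
anomalies = pd.DataFrame({
    "start": [1000, 5000, 9000],
    "end": [1200, 5050, 9900],
    "severity": [0.2, 0.7, 0.4],
})

# Length of each anomalous interval, in the units of the timestamp column.
anomalies["duration"] = anomalies["end"] - anomalies["start"]

# Most severe intervals first.
ranked = anomalies.sort_values("severity", ascending=False).reset_index(drop=True)
```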

For reconstruction-based pipelines, we need to specify the shape of the **input** and **target** sequences. For example, if we are using the `lstm_autoencoder` pipeline, we set the hyperparameter values as follows:

```python3
hyperparameters = {
    "mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        'window_size': 100,
        'target_column': 0 
    },
    'keras.Sequential.LSTMSeq2Seq#1': {
        'epochs': 5,
        'verbose': True,
        'input_shape': [100, 25],
        'target_shape': [100, 1],
    }
}
```

where the shape of the input is dependent on 

1. `window_size` and 
2. the number of channels in the data.

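Similarly, the shape of the output depends on the `window_size`. Currently, we focus on multivariate input and univariate output; therefore, the target shape should always be `[window_size, 1]`. The second value of `input_shape` must match the number of channels in your data: the example above uses 25, while the cell below uses 51 for the data loaded in this notebook.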
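To make the relationship between `window_size`, the number of channels, and the two shapes concrete, here is a minimal NumPy sketch. It is an illustration only and not the implementation used by `rolling_window_sequences`; `make_windows` is a hypothetical helper and the data is synthetic.

```python
import numpy as np


def make_windows(values, window_size, target_column):
    """Slice a (timesteps, channels) array into overlapping windows."""
    n_windows = len(values) - window_size + 1
    # X has shape (n_windows, window_size, n_channels).
    X = np.stack([values[i:i + window_size] for i in range(n_windows)])
    # Reconstruction target: the same windows restricted to one channel,
    # keeping a trailing axis of length 1.
    y = X[:, :, [target_column]]
    return X, y


# Synthetic stand-in for a signal with 500 timesteps and 25 channels.
values = np.random.default_rng(0).normal(size=(500, 25))
X, y = make_windows(values, window_size=100, target_column=0)

input_shape = list(X.shape[1:])   # [window_size, n_channels]
target_shape = list(y.shape[1:])  # [window_size, 1]
```

Under these assumptions, `input_shape` and `target_shape` come out as the values you would pass to the hyperparameters, so they can be derived from the data instead of being hard-coded.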

In [None]:
hyperparameters = {
    "mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences#1": {
        "window_size": 100,
        "target_column": 0 
    },
    'keras.Sequential.LSTMSeq2Seq#1': {
        'epochs': 5,
        'verbose': True,
        'input_shape': [100, 51],
        'target_shape': [100, 1],
    }
}

orion = Orion(
    pipeline='lstm_autoencoder',
    hyperparameters=hyperparameters
)

orion.fit(data)

TadGAN is also a reconstruction-based pipeline, so we set its `input_shape` to match the multivariate input in the same way.

```python3
hyperparameters = {
    'orion.primitives.tadgan.TadGAN#1': {
        'epochs': 5,
        'verbose': True,
        'input_shape': [100, 25]
    }
}
```

In [None]:
hyperparameters = {
    'orion.primitives.tadgan.TadGAN#1': {
        'epochs': 5,
        'verbose': True,
        'input_shape': [100, 51]
    }
}

orion = Orion(
    pipeline='tadgan',
    hyperparameters=hyperparameters
)

orion.fit(data)