# Process CSV Signals

This demo notebook shows how to use the `lstm_dynamic_threshold.json` pipeline to analyze a collection of signal CSV files and then retrieve the list of detected Events.

## 1. Create an OrionExplorer Instance

The first step is to import the `OrionExplorer` and create an instance.

In this case, we will provide no arguments, since we want to connect to the default database, named `orion`, on `localhost`.

In [1]:
from orion.explorer import OrionExplorer

In [2]:
explorer = OrionExplorer()
explorer.drop_database()  # start from an empty database in case these signals already exist

## 2. Add the pipeline that we will be using

The second step is to register the pipeline that we are going to use.

To do so, we pass the name of the pipeline and the path to the `lstm_dynamic_threshold.json` file.

In [4]:
pipeline = explorer.add_pipeline(
    'lstm_dynamic_threshold',
    '../orion/pipelines/lstm_dynamic_threshold.json'
)

Afterwards, we can list the pipelines to check that it has been properly registered.

In [5]:
explorer.get_pipelines()

| | pipeline_id | insert_time | mlpipeline | name |
|---|---|---|---|---|
| 0 | 5c9bbf556c1cea7f0a5e6a34 | 2019-03-27 18:22:13.097 | {'primitives': ['mlprimitives.custom.timeserie... | lstm_dynamic_threshold |
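
The `mlpipeline` column suggests that the pipeline JSON holds a `primitives` list. Before registering a pipeline, it can be useful to confirm that the file exists and parses. The following is a minimal sketch using only the standard library; `load_pipeline_spec` is a hypothetical helper, and the demo file with its primitive names is a stand-in, not the real `lstm_dynamic_threshold.json`.

```python
import json
import os
import tempfile


def load_pipeline_spec(path):
    """Load a pipeline JSON file and return its contents as a dict."""
    if not os.path.isfile(path):
        raise FileNotFoundError('Pipeline JSON not found: {}'.format(path))
    with open(path) as json_file:
        return json.load(json_file)


# Stand-in file with hypothetical primitive names, for demonstration only.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, 'demo_pipeline.json')
    with open(demo_path, 'w') as json_file:
        json.dump({'primitives': ['example.primitive_a', 'example.primitive_b']}, json_file)
    spec = load_pipeline_spec(demo_path)

print(spec['primitives'])
```

With the real file, the same helper would be called with the path that is passed to `add_pipeline`.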


## 3. Get the list of CSV files

In this example we will use the `os` module to list the CSV files inside the `data` directory that we have created inside this `notebooks` folder.

Alternatively, we could provide an explicit list of filenames.

In [6]:
import os

CSVS_FOLDER = './data'

csvs = os.listdir(CSVS_FOLDER)
csvs

['S-1.csv', 'S-2.csv', 'P-1.csv', 'E-1.csv']
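
Note that `os.listdir` returns every entry in the folder, in arbitrary order, so stray files or notebook checkpoint folders would end up in the list too. As a hedged sketch, a small helper (the name `list_csv_files` is hypothetical) can keep only regular `.csv` files and sort them; the demo runs on a temporary folder rather than on `./data`.

```python
import os
import tempfile


def list_csv_files(folder):
    """Return the sorted names of the regular `.csv` files directly inside `folder`."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith('.csv')
        and os.path.isfile(os.path.join(folder, name))
    )


# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    for name in ('S-1.csv', 'P-1.csv', 'notes.txt'):
        open(os.path.join(folder, name), 'w').close()
    os.mkdir(os.path.join(folder, '.ipynb_checkpoints'))
    found = list_csv_files(folder)

print(found)  # ['P-1.csv', 'S-1.csv']
```

In the notebook, `csvs = list_csv_files(CSVS_FOLDER)` would be a drop-in replacement for the `os.listdir` call.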

## 3. Register the new datasets

We will execute a loop in which, for each CSV file, we will register a new Dataset in the Database.

For each CSV, the name used for both the dataset and the signal will be the name of the file without the `.csv` extension, and we will leave the `satellite_id` blank.

In this case we need no additional arguments, such as `timestamp_column` or `value_column`, but if they were required we would add them to the `add_dataset` call.

We will also capture the output of the `add_dataset` call in a list, so we can use these datasets later on.

In [7]:
datasets = []

for path in csvs:
    # Use the file name without its extension as both dataset and signal name
    name = os.path.splitext(os.path.basename(path))[0]
    location = os.path.join(CSVS_FOLDER, path)
    print('Adding dataset {} for CSV {}'.format(name, location))
    dataset = explorer.add_dataset(
        name,                     # dataset name
        name,                     # signal name
        location=location,
        timestamp_column=None,    # Replace if needed
        value_column=None,        # Replace if needed
    )
    datasets.append(dataset)

Adding dataset S-1 for CSV ./data/S-1.csv
Adding dataset S-2 for CSV ./data/S-2.csv
Adding dataset P-1 for CSV ./data/P-1.csv
Adding dataset E-1 for CSV ./data/E-1.csv
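
To decide whether `timestamp_column` or `value_column` need to be set for a given file, one option is to peek at its header row. The sketch below uses only the `csv` module; the helper names are hypothetical, and the expected column names `timestamp` and `value` are an assumption about the defaults rather than something this notebook confirms. The sample content is made up for illustration.

```python
import csv
import io


def read_header(csv_file):
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(csv_file), [])


def missing_columns(header, expected=('timestamp', 'value')):
    """Return the expected column names that are absent from `header`.

    The default `expected` names are an assumption, not a documented default.
    """
    return [column for column in expected if column not in header]


# Hypothetical CSV content; with a real file, pass open(location, newline='') instead.
sample = io.StringIO('time,reading\n1222819200,0.5\n')
header = read_header(sample)
missing = missing_columns(header)

print(header)   # ['time', 'reading']
print(missing)  # ['timestamp', 'value']
```

For a file like this sample, the `add_dataset` call would presumably need `timestamp_column='time'` and `value_column='reading'`.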


Afterwards, we can check that the datasets were properly registered.

In [8]:
explorer.get_datasets()

| | dataset_id | data_location | insert_time | name | signal_set | start_time | stop_time |
|---|---|---|---|---|---|---|---|
| 0 | 5c9bbf556c1cea7f0a5e6a35 | ./data/S-1.csv | 2019-03-27 18:22:13.330 | S-1 | S-1 | 1222819200 | 1442016000 |
| 1 | 5c9bbf556c1cea7f0a5e6a36 | ./data/S-2.csv | 2019-03-27 18:22:13.431 | S-2 | S-2 | 1222819200 | 1282262400 |
| 2 | 5c9bbf556c1cea7f0a5e6a37 | ./data/P-1.csv | 2019-03-27 18:22:13.457 | P-1 | P-1 | 1222819200 | 1468540800 |
| 3 | 5c9bbf556c1cea7f0a5e6a38 | ./data/E-1.csv | 2019-03-27 18:22:13.469 | E-1 | E-1 | 1222819200 | 1468951200 |
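
The `start_time` and `stop_time` values look like Unix timestamps. Assuming they are seconds since the epoch in UTC (an assumption, since the notebook does not state the unit), they can be made readable with the standard library. `epoch_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format Unix epoch seconds as a UTC date-time string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%d %H:%M:%S')


# start_time and stop_time of dataset S-1, copied from the table above
s1_start = epoch_to_utc(1222819200)
s1_stop = epoch_to_utc(1442016000)

print(s1_start)  # 2008-10-01 00:00:00
print(s1_stop)   # 2015-09-12 00:00:00
```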


## 4. Run the pipeline on the datasets

Once the pipeline and the datasets are registered, we can start the processing loop.

In [13]:
for dataset in datasets:
    print('Analyzing dataset {}'.format(dataset.name))
    explorer.analyze(dataset.name, pipeline.name)

Analyzing dataset S-1


Using TensorFlow backend.


Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.cast instead.
Epoch 1/1


  numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol):


Analyzing dataset S-2
Epoch 1/1
Analyzing dataset P-1
Epoch 1/1


  numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol):


Analyzing dataset E-1
Epoch 1/1


  numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol):
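
The loop above stops at the first exception, which can be inconvenient when many signals are processed in one go. One possible variation, sketched here under the assumption that failures should be collected and reported at the end, wraps each call in a `try` block. `analyze_all` and `fake_analyze` are hypothetical names; in the notebook, the callable would be something like `lambda name: explorer.analyze(name, pipeline.name)`.

```python
def analyze_all(names, analyze):
    """Call `analyze(name)` for each name and return (succeeded, failed).

    `failed` maps each failing name to the exception that was raised.
    """
    succeeded, failed = [], {}
    for name in names:
        print('Analyzing dataset {}'.format(name))
        try:
            analyze(name)
        except Exception as error:  # keep going and report at the end
            failed[name] = error
        else:
            succeeded.append(name)
    return succeeded, failed


def fake_analyze(name):
    """Stand-in for the real analysis call; fails on one dataset on purpose."""
    if name == 'P-1':
        raise ValueError('simulated failure')


succeeded, failed = analyze_all(['S-1', 'S-2', 'P-1', 'E-1'], fake_analyze)

print(succeeded)      # ['S-1', 'S-2', 'E-1']
print(list(failed))   # ['P-1']
```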


## 5. Analyze the results

Once the execution has finished, we can explore the Dataruns and the detected Events.

In [14]:
explorer.get_dataruns()

| | datarun_id | dataset | end_time | events | insert_time | pipeline | start_time | status |
|---|---|---|---|---|---|---|---|---|
| 0 | 5c9bbfc36c1cea7f0a5e6a39 | 5c9bbf556c1cea7f0a5e6a35 | 2019-03-27 18:25:28.173 | 1 | 2019-03-27 18:24:03.281 | 5c9bbf556c1cea7f0a5e6a34 | 2019-03-27 18:24:03.280 | done |
| 1 | 5c9bc0186c1cea7f0a5e6a3b | 5c9bbf556c1cea7f0a5e6a36 | 2019-03-27 18:25:51.752 | 1 | 2019-03-27 18:25:28.300 | 5c9bbf556c1cea7f0a5e6a34 | 2019-03-27 18:25:28.299 | done |
| 2 | 5c9bc02f6c1cea7f0a5e6a3d | 5c9bbf556c1cea7f0a5e6a37 | 2019-03-27 18:27:27.937 | 10 | 2019-03-27 18:25:51.849 | 5c9bbf556c1cea7f0a5e6a34 | 2019-03-27 18:25:51.848 | done |
| 3 | 5c9bc0906c1cea7f0a5e6a48 | 5c9bbf556c1cea7f0a5e6a38 | 2019-03-27 18:29:03.660 | 1 | 2019-03-27 18:27:28.032 | 5c9bbf556c1cea7f0a5e6a34 | 2019-03-27 18:27:28.031 | done |
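
The `start_time` and `end_time` columns make it possible to see how long each datarun took. A minimal sketch, using two rows copied from the table above and a hypothetical helper `elapsed_seconds`:

```python
from datetime import datetime

TIME_FORMAT = '%Y-%m-%d %H:%M:%S.%f'


def elapsed_seconds(start_time, end_time):
    """Return the number of seconds between two timestamp strings."""
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    return (end - start).total_seconds()


# (dataset id, start_time, end_time) for dataruns 0 and 1
dataruns = [
    ('5c9bbf556c1cea7f0a5e6a35', '2019-03-27 18:24:03.280', '2019-03-27 18:25:28.173'),
    ('5c9bbf556c1cea7f0a5e6a36', '2019-03-27 18:25:28.299', '2019-03-27 18:25:51.752'),
]

durations = {
    dataset: elapsed_seconds(start, end)
    for dataset, start, end in dataruns
}

for dataset, seconds in durations.items():
    print('{}: {:.3f} s'.format(dataset, seconds))
```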


In [15]:
explorer.get_events()

| | event_id | datarun | insert_time | score | start_time | stop_time | comments |
|---|---|---|---|---|---|---|---|
| 0 | 5c9bc0186c1cea7f0a5e6a3a | 5c9bbfc36c1cea7f0a5e6a39 | 2019-03-27 18:25:28.121 | 0.074029 | 1398729600 | 1399356000 | 0 |
| 1 | 5c9bc02f6c1cea7f0a5e6a3c | 5c9bc0186c1cea7f0a5e6a3b | 2019-03-27 18:25:51.752 | 5.523098 | 1256990400 | 1257120000 | 0 |
| 2 | 5c9bc08f6c1cea7f0a5e6a3e | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.928 | 0.045721 | 1223661600 | 1223704800 | 0 |
| 3 | 5c9bc08f6c1cea7f0a5e6a3f | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.929 | 0.05634 | 1232280000 | 1232388000 | 0 |
| 4 | 5c9bc08f6c1cea7f0a5e6a40 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.930 | 0.06079 | 1277467200 | 1277618400 | 0 |
| 5 | 5c9bc08f6c1cea7f0a5e6a41 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.931 | 0.032236 | 1285200000 | 1285372800 | 0 |
| 6 | 5c9bc08f6c1cea7f0a5e6a42 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.932 | 0.016799 | 1285437600 | 1285437600 | 0 |
| 7 | 5c9bc08f6c1cea7f0a5e6a43 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.933 | 0.043102 | 1296928800 | 1297036800 | 0 |
| 8 | 5c9bc08f6c1cea7f0a5e6a44 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.934 | 0.001325 | 1344319200 | 1344319200 | 0 |
| 9 | 5c9bc08f6c1cea7f0a5e6a45 | 5c9bc02f6c1cea7f0a5e6a3d | 2019-03-27 18:27:27.935 | 0.046676 | 1351728000 | 1351792800 | 0 |
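
Each event is described by a `start_time` and a `stop_time`, so its length follows directly from the difference. The sketch below assumes both values are in seconds (the notebook does not state the unit) and uses four rows copied from the table above; `event_hours` is a hypothetical helper.

```python
def event_hours(start_time, stop_time):
    """Return the length of an event in hours, assuming times in seconds."""
    return (stop_time - start_time) / 3600


# (row index, start_time, stop_time) for events 0, 1, 6 and 8
events = [
    (0, 1398729600, 1399356000),
    (1, 1256990400, 1257120000),
    (6, 1285437600, 1285437600),
    (8, 1344319200, 1344319200),
]

hours = {index: event_hours(start, stop) for index, start, stop in events}

# Events whose start and stop coincide cover a single timestamp
single_point = [index for index, start, stop in events if start == stop]

print(hours)         # {0: 174.0, 1: 36.0, 6: 0.0, 8: 0.0}
print(single_point)  # [6, 8]
```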
