Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: Apache-2.0

## Table of Contents

1. Specify data folder containing individual CSVs
2. Specify location containing label JSON
3. Train Context OSE Model from NAB library
4. Perform inference on test set

In [1]:
import sys
sys.path.append("../../src/")

In [2]:
from anomaly_detection_spatial_temporal_data.model.time_series import NABAnomalyDetector

In [3]:
import pandas as pd

# Train NAB Context OSE model

In [4]:
model_name = "contextOSE"
model_path = "../../src/anomaly_detection_spatial_temporal_data/model/NAB"
input_dir = "../../data/02_intermediate/iot/ts_data"
label_file = "../../data/02_intermediate/iot/ts_label/labels-test.json"
output_dir = "../../data/07_model_output/iot/ts_result-notebook"
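Before constructing the detector, it can help to confirm that the relative paths above resolve from the notebook's working directory. The following is a minimal sketch, not part of the workshop code: `find_missing_paths` is a hypothetical helper, and `output_dir` is left out on the assumption that it may not exist before the first run.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the names of entries in `paths` whose location does not exist."""
    return [name for name, location in paths.items() if not Path(location).exists()]


# Same values as the configuration cell above.
paths = {
    "model_path": "../../src/anomaly_detection_spatial_temporal_data/model/NAB",
    "input_dir": "../../data/02_intermediate/iot/ts_data",
    "label_file": "../../data/02_intermediate/iot/ts_label/labels-test.json",
}

missing = find_missing_paths(paths)
if missing:
    print("Missing paths:", ", ".join(missing))
else:
    print("All input paths found.")
```
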

In [5]:
nab = NABAnomalyDetector(
    model_name, 
    model_path,
    input_dir,
    label_file,
    output_dir,
)

In [6]:
nab.predict()

Predicting ...
3: Beginning detection with contextOSE for S_PU8.csv
. 0: Beginning detection with contextOSE for P_J306.csv
. 6: Beginning detection with contextOSE for F_PU3.csv
. 9: Beginning detection with contextOSE for P_J269.csv
. . . . 6: Completed processing 2089 records at 2022-08-10 19:17:17.406284
6: Results have been written to /home/ec2-user/SageMaker/external-repos/anomaly-detection-spatial-temporal-data-workshop/data/07_model_output/iot/ts_result-notebook/contextOSE/contextOSE_F_PU3.csv
7: Beginning detection with contextOSE for L_T1.csv
. . . . . 3: Completed processing 2089 records at 2022-08-10 19:17:18.869856
3: Results have been written to /home/ec2-user/SageMaker/external-repos/anomaly-detection-spatial-temporal-data-workshop/data/07_model_output/iot/ts_result-notebook/contextOSE/contextOSE_S_PU8.csv
4: Beginning detection with contextOSE for P_J300.csv
. . 9: Completed processing 2089 records at 2022-08-10 19:17:20.287837
9: Results have been written to /home/ec2-

# Load an inference result
The NAB time series anomaly detectors use the historical context of a single time series to decide whether each new time step is anomalous. Because each series is scored independently, relationships between sensors are ignored, so we do not expect this model to perform well on this dataset.
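To make the "each series is scored independently" point concrete, the sketch below scores two made-up sensor series using only each series' own history. It uses a plain rolling z-score as a stand-in, which is not the Context OSE algorithm; the function name, window size, and sensor values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def rolling_zscore_scores(values, window=24):
    """Score each point by its deviation from the preceding `window` values.

    Stand-in for a context-based detector: only the series' own past is used.
    """
    scores = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 2:
            scores.append(0.0)  # not enough context yet
            continue
        mu, sigma = mean(history), pstdev(history)
        scores.append(abs(x - mu) / sigma if sigma > 0 else 0.0)
    return scores


# Two hypothetical sensors, each scored in isolation.
series = {
    "sensor_a": [1.0, 1.1, 0.9, 1.0, 1.1, 5.0, 1.0],
    "sensor_b": [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 2.0],
}
scores = {name: rolling_zscore_scores(vals, window=5) for name, vals in series.items()}
print({name: round(max(s), 2) for name, s in scores.items()})
```

The spike in `sensor_a` stands out against its own history, but nothing in this scheme can notice that two sensors have stopped moving together, which is the kind of signal a per-series detector cannot see.
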

In [7]:
!ls ../../data/07_model_output/iot/ts_result-notebook/contextOSE/

contextOSE_F_PU10.csv  contextOSE_L_T4.csv    contextOSE_P_J422.csv
contextOSE_F_PU11.csv  contextOSE_L_T5.csv    contextOSE_S_PU10.csv
contextOSE_F_PU1.csv   contextOSE_L_T6.csv    contextOSE_S_PU11.csv
contextOSE_F_PU2.csv   contextOSE_L_T7.csv    contextOSE_S_PU1.csv
contextOSE_F_PU3.csv   contextOSE_P_J14.csv   contextOSE_S_PU2.csv
contextOSE_F_PU4.csv   contextOSE_P_J256.csv  contextOSE_S_PU3.csv
contextOSE_F_PU5.csv   contextOSE_P_J269.csv  contextOSE_S_PU4.csv
contextOSE_F_PU6.csv   contextOSE_P_J280.csv  contextOSE_S_PU5.csv
contextOSE_F_PU7.csv   contextOSE_P_J289.csv  contextOSE_S_PU6.csv
contextOSE_F_PU8.csv   contextOSE_P_J300.csv  contextOSE_S_PU7.csv
contextOSE_F_PU9.csv   contextOSE_P_J302.csv  contextOSE_S_PU8.csv
contextOSE_F_V2.csv    contextOSE_P_J306.csv  contextOSE_S_PU9.csv
contextOSE_L_T1.csv    contextOSE_P_J307.csv  contextOSE_S_V2.csv
contextOSE_L_T2.csv    contextOSE_P_J317.csv
contextOSE_L_T3.csv    contextOSE_P_J415.csv
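The listing suggests that each result file is named `contextOSE_<sensor>.csv`. Assuming that pattern holds, a small sketch like the one below could map sensor names to result files from Python instead of the shell; `sensor_name` is a hypothetical helper.

```python
from pathlib import Path


def sensor_name(result_file, model_name="contextOSE"):
    """Strip the '<model_name>_' prefix and '.csv' suffix from a result filename."""
    stem = Path(result_file).stem
    prefix = f"{model_name}_"
    return stem[len(prefix):] if stem.startswith(prefix) else stem


result_dir = Path("../../data/07_model_output/iot/ts_result-notebook/contextOSE")

# Build {sensor: path}; fall back to an empty dict if the directory is absent.
results = (
    {sensor_name(p): p for p in sorted(result_dir.glob("*.csv"))}
    if result_dir.is_dir()
    else {}
)
print(f"{len(results)} result files found")
```
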


In [8]:
output_csv = "contextOSE_L_T1.csv"

df_out = pd.read_csv(f"{output_dir}/contextOSE/{output_csv}")

In [9]:
df_out.head(20)

Unnamed: 0,timestamp,value,anomaly_score,label
0,2017-01-04 00:00:00,0.73,0.0,0
1,2017-01-04 01:00:00,0.69,0.0,0
2,2017-01-04 02:00:00,0.9,0.0,0
3,2017-01-04 03:00:00,1.11,0.0,0
4,2017-01-04 04:00:00,1.27,0.0,0
5,2017-01-04 05:00:00,1.54,0.0,0
6,2017-01-04 06:00:00,1.85,0.0,0
7,2017-01-04 07:00:00,2.0,0.0,0
8,2017-01-04 08:00:00,2.06,0.0,0
9,2017-01-04 09:00:00,2.04,0.0,0
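The result frame pairs an `anomaly_score` with a ground-truth `label` for every time step, so one simple way to inspect it is to threshold the score and count how often flagged rows coincide with labeled ones. The sketch below is an assumption-laden illustration: `summarize_detections` is a hypothetical helper, the 0.5 threshold is an arbitrary placeholder, and the small frame uses made-up scores and labels rather than real model output.

```python
import pandas as pd


def summarize_detections(df, threshold=0.5):
    """Count rows flagged at `threshold` and how many of them carry label 1."""
    flagged = df["anomaly_score"] >= threshold
    labeled = df["label"] == 1
    return {
        "n_rows": int(len(df)),
        "n_flagged": int(flagged.sum()),
        "n_labeled": int(labeled.sum()),
        "n_flagged_and_labeled": int((flagged & labeled).sum()),
    }


# Tiny made-up frame with the same columns as df_out, for illustration only.
example = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2017-01-04 00:00:00",
                "2017-01-04 01:00:00",
                "2017-01-04 02:00:00",
                "2017-01-04 03:00:00",
            ]
        ),
        "value": [0.73, 0.69, 0.90, 1.11],
        "anomaly_score": [0.0, 0.9, 0.2, 0.7],  # made-up scores
        "label": [0, 1, 0, 0],  # made-up labels
    }
)
print(summarize_detections(example))
```

In the notebook, the same function could be called as `summarize_detections(df_out)` once `df_out` has been loaded.
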


# References
Riccardo Taormina, Stefano Galelli, Nils Ole Tippenhauer, Elad Salomons, Avi Ostfeld, Demetrios G. Eliades, Mohsen Aghashahi, Raanju Sundararajan, Mohsen Pourahmadi, M. Katherine Banks, B. M. Brentan, Enrique Campbell, G. Lima, D. Manzi, D. Ayala-Cabrera, M. Herrera, I. Montalvo, J. Izquierdo, E. Luvizotto, Sarin E. Chandy, Amin Rasekh, Zachary A. Barker, Bruce Campbell, M. Ehsan Shafiee, Marcio Giacomoni, Nikolaos Gatsis, Ahmad Taha, Ahmed A. Abokifa, Kelsey Haddad, Cynthia S. Lo, Pratim Biswas, M. Fayzul K. Pasha, Bijay Kc, Saravanakumar Lakshmanan Somasundaram, Mashor Housh, and Ziv Ohar. 2018. The Battle of the Attack Detection Algorithms: Disclosing Cyber Attacks on Water Distribution Networks. Journal of Water Resources Planning and Management, 144(8), August 2018.

Alexander Lavin and Subutai Ahmad. 2015. Evaluating Real-Time Anomaly Detection Algorithms – The Numenta Anomaly Benchmark.