## Build TCN input mask

This file builds a mask that flags observations to skip, in order to avoid useless computation during model training.

Observations to be skipped:
* Already skipped by default by `TimeseriesGenerator`:
    * first `timeseries_receptive_field-1` observations for inputs (relative to the whole dataset)
    * last observation for inputs (relative to the whole dataset)
* Not skipped by default by `TimeseriesGenerator`:
    * first `timeseries_receptive_field-1` observations for new nodes
    * last observation for each node
    * first `timeseries_receptive_field-1` observations after a hole
    * last observation before a hole
    * first `timeseries_receptive_field-1` observations after an observation having `is_non_rising_anomaly=1`
    * last observation before an observation having `is_non_rising_anomaly=1`
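
As an illustration of the three conditions listed above (new node, hole, non-rising anomaly), here is a minimal sketch that flags the corresponding rows on a toy table. The toy values and the names `toy`, `new_node`, `hole`, `anomaly` and `is_break` are hypothetical; the 5-minute sampling interval is assumed from the threshold used later in this file. In this sketch the very first row is also flagged as a new node.

```python
import pandas as pd

# Toy data (hypothetical values): two nodes, nominally sampled every 5 minutes.
toy = pd.DataFrame({
    "node": ["a", "a", "a", "a", "b", "b"],
    "timestamp": pd.to_datetime([
        "2021-01-01 00:00", "2021-01-01 00:05",
        "2021-01-01 00:20",  # hole: 15 minutes after the previous row
        "2021-01-01 00:25",
        "2021-01-01 00:00", "2021-01-01 00:05",
    ]),
    "is_non_rising_anomaly": [0, 0, 0, 1, 0, 0],
})

# A row starts a new node when its node differs from the previous row's node.
new_node = toy["node"] != toy["node"].shift()

# A hole is a gap longer than 5 minutes between consecutive rows of one node.
# The first row of each node has no predecessor (NaT), which compares as False.
hole = toy.groupby("node")["timestamp"].diff() > pd.Timedelta(minutes=5)

# Rows explicitly marked as non-rising anomalies.
anomaly = toy["is_non_rising_anomaly"] == 1

is_break = new_node | hole | anomaly
print(is_break.tolist())
```

Each flagged row marks a point where a window of `timeseries_receptive_field` consecutive observations would mix unrelated data, which is why the observations around it are skipped.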

In [64]:
import pandas as pd
import numpy as np
from itables import show, options
from datetime import timedelta

options.maxBytes=0

In [10]:
receptive_field = 128

In [187]:
dataset = pd.read_csv('./final_data/final_data_full.csv').set_index(['node', 'timestamp'])

##### Build mask

In [182]:
dataset_to_mask = dataset.reset_index()[['node', 'timestamp', 'is_non_rising_anomaly']]
dataset_to_mask['timestamp'] = pd.to_datetime(dataset_to_mask['timestamp'])

# Initialize mask with 1
mask = np.ones(len(dataset_to_mask))

prev_node, prev_timestamp = dataset_to_mask['node'][0], dataset_to_mask['timestamp'][0]

for idx, (node, timestamp, is_non_rising_anomaly) in dataset_to_mask.iterrows() :
#     print('idx: {}; node: {}; timestamp: {}; is_non_rising_anomaly: {}'\
#           .format(idx, node, timestamp, is_non_rising_anomaly))
    
    # Check for a new node, a hole (gap longer than 5 minutes) or a non-rising anomaly
    if (prev_node != node) or \
       (timestamp - prev_timestamp > timedelta(minutes=5)) or \
       (is_non_rising_anomaly == 1) :
        
        # Hide this observation and the receptive_field - 1 observations that follow it
        if idx+receptive_field <= len(mask) :
            mask[idx:idx+receptive_field] = 0
        else :
            mask[idx:len(mask)] = 0
        # Hide the previous observation (whose label is invalid)
        if idx > 0 :
            mask[idx-1] = 0
     
    prev_node = node
    prev_timestamp = timestamp
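
To make the effect of the masking rule easier to check, the sketch below applies the same two updates (zero out `receptive_field` observations starting at the break, plus the one just before it) to a small hand-made example. The helper name `mask_from_breaks` and the toy `breaks` array are hypothetical, and reading the valid positions with `np.flatnonzero` is only one possible way to consume the mask; how the mask is used downstream is not specified here.

```python
import numpy as np

def mask_from_breaks(is_break, receptive_field):
    """Return a 0/1 mask given a boolean array that flags break rows."""
    is_break = np.asarray(is_break, dtype=bool)
    mask = np.ones(len(is_break))
    for idx in np.flatnonzero(is_break):
        # NumPy clips slices that run past the end of the array,
        # so no explicit bounds check is needed here.
        mask[idx:idx + receptive_field] = 0
        # The observation before the break has an invalid label.
        if idx > 0:
            mask[idx - 1] = 0
    return mask

# Toy example: 10 observations with a single break at position 4.
breaks = np.zeros(10, dtype=bool)
breaks[4] = True

mask = mask_from_breaks(breaks, receptive_field=3)
print(mask)        # [1. 1. 1. 0. 0. 0. 0. 1. 1. 1.]

# Positions that remain usable after masking.
valid_idx = np.flatnonzero(mask)
print(valid_idx)   # [0 1 2 7 8 9]
```

Because out-of-range slices are clipped, a break close to the end of the array simply zeroes the remaining tail.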