# Fault Detection Experiment

- The main goal here is to show the simple interface for running a wide range of experiments, in this case detecting faults in consumer energy utilization data.
- We aim to show modularity with respect to:
    1. Dataset: the underlying data being forecasted
    2. Models: take sensor measurements (mostly time series data) and output predictions from them (here, fault labels)
    3. Tasks: here we define some of the basic configurations of time series experiments such as _horizon_ and _history_.
- Here we use synthetic data from [Smart-DS](https://www.nrel.gov/grid/smart-ds.html), downloaded from [BetterGrids.org](https://db.bettergrids.org/bettergrids/handle/1001/94).
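
To make _history_ and _horizon_ concrete, here is a minimal sketch of how a time series could be cut into sliding windows. The helper `make_windows` is hypothetical and written only for illustration; it is not part of `gridds`, whose task handling may differ.

```python
import numpy as np

def make_windows(series, history, horizon):
    """Cut a 1-D series into (history, horizon) pairs with a sliding window.

    Hypothetical helper for illustration only; not the gridds implementation.
    """
    series = np.asarray(series)
    n_windows = len(series) - history - horizon + 1
    if n_windows <= 0:
        raise ValueError("series is too short for the requested history and horizon")
    # each row of X holds `history` past points; the matching row of y holds
    # the `horizon` points that immediately follow them
    X = np.stack([series[i:i + history] for i in range(n_windows)])
    y = np.stack([series[i + history:i + history + horizon] for i in range(n_windows)])
    return X, y

X, y = make_windows(np.arange(10), history=4, horizon=2)
print(X.shape, y.shape)
```

A longer _history_ gives a model more context per sample but leaves fewer windows for a series of the same length.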


In [1]:
import sys
from gridds.experimenter import Experimenter
from gridds.data import SmartDS 
from gridds.tools.viz import visualize_output
import os
import matplotlib.pyplot as plt
from gridds.methods import VRAE, LSTM
# from gridds.tools.utils import *
import gridds.tools.tasks as tasks
import shutil
import numpy as np

- Run all experiments from root directory

In [2]:
# run experiments from the root dir (two levels up)
os.chdir('../../')

## Instantiate Dataset
- The first thing we do is build the dataset class.
- We choose a train/test percentage and the "size", i.e. the total number of points we want to use.
- Since this is an example we use a fairly small number of data points.
- We give the dataset reader instructions about which part of the dataset to fetch and how to order it. This applies across many types of data; e.g., `sources` might be "transformers", `modalities` might be "[phase angle, voltage]", and `replicates` might be "sites".
    - Each of these entries needs to be a folder.
- `prepare_data` converts these into X and y for machine learning.

In [5]:
dataset = SmartDS('univariate_nrel', sites=1, test_pct=.5, normalize=False, size=300)

reader_instructions = {
    'sources': ['P1U'],
    'modalities': ['load_data'],
    'target': '',  # NREL synthetic data doesn't have faults
    'replicates': ['customers']
}
dataset.prepare_data(reader_instructions)

# add faults
dataset.pull_data_during_shuffle = False
dataset.add_faults(n_faults=30, fault_duration=1)
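
Because the synthetic NREL data has no fault labels, faults are added artificially. As a rough illustration of the idea, the sketch below injects spike faults into a univariate series and records a 0/1 label per time step. The function `inject_faults`, its `magnitude` and `seed` parameters, and the spike shape are all assumptions made for this example; `dataset.add_faults` may generate faults quite differently.

```python
import numpy as np

def inject_faults(series, n_faults, fault_duration, magnitude=5.0, seed=0):
    """Return a copy of `series` with spike faults plus a 0/1 label per step.

    Hypothetical stand-in for illustration; not the gridds implementation.
    """
    rng = np.random.default_rng(seed)
    faulty = np.asarray(series, dtype=float).copy()
    labels = np.zeros(len(faulty), dtype=int)
    # pick distinct start indices so that every fault fits inside the series
    starts = rng.choice(len(faulty) - fault_duration + 1, size=n_faults, replace=False)
    # scale the spike by the spread of the clean signal (fall back to 1 if flat)
    scale = faulty.std() or 1.0
    for s in starts:
        faulty[s:s + fault_duration] += magnitude * scale
        labels[s:s + fault_duration] = 1
    return faulty, labels

clean = np.sin(np.linspace(0, 20, 300))
faulty, labels = inject_faults(clean, n_faults=30, fault_duration=1)
print(labels.sum(), "of", len(labels), "points are labelled as faults")
```

With `fault_duration=1` each fault touches a single time step, so the positive class stays small relative to the 300 points used here.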


## Instantiate Methods
- Methods are modular: each one is instantiated and stored in a plain list.
- We can set some parameters for each method.

In [6]:
methods = []
methods += [VRAE('VRAE', train_iters=50, batch_size=5, learning_rate=.001)]
methods += [LSTM('LSTM', train_iters=50, batch_size=5, learning_rate=.003, layer_dim=1, hidden_size=32, dropout=0)]
    

## Instantiate Task and Run Experiment
- `Experimenter` handles the bulk of running these experiments.
- We choose a task when we run `exp.run_experiment`. Here we choose `tasks.default_fault_pred`.

In [7]:
tasks.default_fault_pred

{'name': 'fault_pred',
 'procedure': ['fit', 'predict'],
 'metrics': [<function gridds.tools.metrics.recall(y_pred, y)>,
  <function gridds.tools.metrics.acc(y_pred, y)>,
  <function gridds.tools.metrics.precision(y_pred, y)>,
  <function gridds.tools.metrics.f1(y_pred, y)>]}
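
The task reports recall, accuracy, precision, and F1, each taking `(y_pred, y)`. The sketch below shows how these four metrics are conventionally computed for binary fault labels. The functions are stand-ins that only mirror the signatures shown above; they are not the code in `gridds.tools.metrics`, and the zero-division handling is an assumption.

```python
import numpy as np

def confusion_counts(y_pred, y):
    """Count true/false positives and negatives for binary labels."""
    y_pred = np.asarray(y_pred).astype(bool)
    y = np.asarray(y).astype(bool)
    tp = int(np.sum(y_pred & y))
    fp = int(np.sum(y_pred & ~y))
    fn = int(np.sum(~y_pred & y))
    tn = int(np.sum(~y_pred & ~y))
    return tp, fp, fn, tn

def recall(y_pred, y):
    tp, fp, fn, tn = confusion_counts(y_pred, y)
    return tp / (tp + fn) if (tp + fn) else 0.0  # assumed: 0 when no true faults

def precision(y_pred, y):
    tp, fp, fn, tn = confusion_counts(y_pred, y)
    return tp / (tp + fp) if (tp + fp) else 0.0  # assumed: 0 when nothing is flagged

def acc(y_pred, y):
    tp, fp, fn, tn = confusion_counts(y_pred, y)
    return (tp + tn) / (tp + fp + fn + tn)

def f1(y_pred, y):
    tp, fp, fn, tn = confusion_counts(y_pred, y)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_hat = [0, 1, 1, 0, 0, 1, 0, 0]
for metric in (recall, acc, precision, f1):
    print(metric.__name__, round(metric(y_hat, y_true), 3))
```

Since faults are rare, accuracy alone can look high even for a model that never flags a fault, which is why recall, precision, and F1 are worth reading alongside it.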

In [9]:
exp = Experimenter('basic_impute', runs=1)
exp.run_experiment(dataset, methods, task=tasks.default_fault_pred, clean=False)

In [None]:
visualize_output(os.path.join('outputs', exp.name))

In [None]:
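# copy this file into the output folder for archiving
# note: `__file__` is defined when this runs as a script; it is normally undefined inside a notebook kernel
curr_filepath = os.path.join(os.getcwd(), 'experiments', __file__)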
shutil.copy(curr_filepath, os.path.join('outputs', exp.name))