# Anomaly Detection Framework Example

This notebook walks through using the anomaly detection framework in a test environment. The test environment was used because UDL's InfluxDB instance was still being set up with SkySpark data during the project. It populates an instance of InfluxDB (created using Docker) with sensor data from `../../data/labelled-skyspark-data/`. The sensor data was manually downloaded from SkySpark and corresponds to the five sensors used in Phase 1 model testing.

In [1]:
import time
import sys

import numpy as np
import pandas as pd

# influxdb_client is used to populate InfluxDB with the csv data
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

Import the model package:

In [2]:
# files are contained in a sibling folder
sys.path.append("..")

import model.clean as cl
import model.model_trainer as mt
import model.model_predict as mp
from model.influx_interact import influx_class

## Step 1 - Create Local InfluxDB Instance

Copy `docker-compose.yml` located in this directory to a local directory. Then run the command `docker-compose up` from this local directory.

Go to `http://localhost:8086/` and log in to the user interface with the user name `MDS2021` and the password `mypassword`.

## Step 2 - Populate InfluxDB with Sensor Data

This step will populate InfluxDB with csv files located in `../../data/labelled-skyspark-data/`. These files correspond with the Phase 1 model testing. The code presented in this section is also available in `populate_influx.py`.

Note that this step is just for creating the data in this test environment.


In [3]:
PATH_TO_CSVS = "../../data/labelled-skyspark-data/"
CSVS_TO_LOAD = [
    "CEC_compiled_data_1b_updated.csv",
    "CEC_compiled_data_2b_updated.csv",
    "CEC_compiled_data_3b_updated.csv",
    "CEC_compiled_data_4b_updated.csv",
    "CEC_compiled_data_5b_updated.csv",
]

Create a look up table for sensors and their manually labeled data sets.

In [4]:
JOIN_MANUAL_ANOMALIES = True
PATH_TO_LABELLED_CSVS = "../../data/labelled-skyspark-data/"
LABELLED_LOOKUP = {
    "Campus Energy Centre Campus HW Main Meter Power" : "CEC_compiled_data_1b_updated.csv",
    "Campus Energy Centre Campus HW Main Meter Entering Water Temperature" : "CEC_compiled_data_2b_updated.csv",
    "Campus Energy Centre Campus HW Main Meter Flow" : "CEC_compiled_data_3b_updated.csv",
    "Campus Energy Centre Boiler B-1 Gas Pressure" : "CEC_compiled_data_4b_updated.csv",
    "Campus Energy Centre Boiler B-1 Exhaust O2" : "CEC_compiled_data_5b_updated.csv",
}

In [6]:
# for data viewing in notebook
pd.set_option('display.expand_frame_repr', False)

Set up InfluxDB connection:

In [5]:
# as setup in docker-compose.yml
token = "mytoken"
org = "UBC"
bucket = "MDS2021"

# setup InfluxDB client
client = InfluxDBClient(url="http://localhost:8086", token=token, timeout=999_000)
write_api = client.write_api(write_options=SYNCHRONOUS)

Read each csv file and write the data to InfluxDB. This sets up the sensor data in the InfluxDB READINGS measurement, mimicking how SkySpark data exists in InfluxDB. Note that only the tags and field required for anomaly detection are populated.

Important note: If the influx write times out, re-run and it should work on the second try.
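The manual re-run described in the note above could also be automated with a small retry wrapper. The following is a minimal sketch, not part of the framework: `write_with_retry`, `attempts`, and `delay_s` are hypothetical names, and the exact exception raised on a timeout is not assumed, so the wrapper catches any exception.

```python
import time


def write_with_retry(write_fn, attempts=3, delay_s=5.0):
    """Call write_fn(); on failure wait delay_s seconds and try again.

    write_fn is any zero-argument callable, for example a lambda that wraps
    write_api.write(...). The last error is re-raised if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return write_fn()
        except Exception as err:  # broad on purpose: the timeout type is not assumed
            if attempt == attempts:
                raise
            print("write failed ({}), retrying in {}s".format(err, delay_s))
            time.sleep(delay_s)
```

In the write loop this would be used as `write_with_retry(lambda: write_api.write(bucket, org, record=df, ...))`. Re-sending the same points should be harmless, since InfluxDB overwrites points that share a measurement, tag set, and timestamp.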

In [11]:
for csv in CSVS_TO_LOAD:

    # load and set up dataframes
    df = pd.read_csv(PATH_TO_CSVS + csv, parse_dates=["Datetime"])
    df.rename(columns={"Value": "val_num"}, inplace=True)
    df.rename(columns={"ID": "uniqueID"}, inplace=True)
    df.rename(columns={"Anomaly": "AH"}, inplace=True)
    df["navName"] = "Energy"
    df["siteRef"] = "Campus Energy Centre"
    df.set_index("Datetime", drop=True, inplace=True)
    df = df.drop(["AH"], axis=1)

    print("writing: {}".format(csv))
    # write values
    write_api.write(
        bucket,
        org,
        record=df,
        data_frame_measurement_name="READINGS",
        data_frame_tag_columns=["uniqueID", "navName", "siteRef"],
    )
    time.sleep(5)

writing: CEC_compiled_data_1b_updated.csv
writing: CEC_compiled_data_2b_updated.csv
writing: CEC_compiled_data_3b_updated.csv
writing: CEC_compiled_data_4b_updated.csv
writing: CEC_compiled_data_5b_updated.csv


Look at the `df` object to see what was written to influx

In [12]:
df.head()

| Datetime | val_num | uniqueID | navName | siteRef |
|---|---|---|---|---|
| 2020-01-01 07:45:00 | 2.9 | Campus Energy Centre Boiler B-1 Exhaust O2 | Energy | Campus Energy Centre |
| 2020-01-01 08:00:00 | 2.9 | Campus Energy Centre Boiler B-1 Exhaust O2 | Energy | Campus Energy Centre |
| 2020-01-01 08:15:00 | 2.9 | Campus Energy Centre Boiler B-1 Exhaust O2 | Energy | Campus Energy Centre |
| 2020-01-01 08:30:00 | 2.9 | Campus Energy Centre Boiler B-1 Exhaust O2 | Energy | Campus Energy Centre |
| 2020-01-01 08:45:00 | 2.9 | Campus Energy Centre Boiler B-1 Exhaust O2 | Energy | Campus Energy Centre |


Sensor data has now been written to the InfluxDB READINGS measurement. A screenshot of what this looks like in InfluxDB is shown below.

![](demo_screenshots/step2.PNG)

## Step 3 - Test Anomaly Detection Model Training

This step tests model training. Training would typically be run on a selected interval (for example every month) to update the anomaly detection models. A script for model training that can be used with UDL's InfluxDB instance is available in `../code/sensor_training.py`. Code that is only applicable to this test environment or differs from what would exist in `../code/sensor_training.py` is noted.

The code presented in this section is also available in `test_env_scheduled_training.py`.  

The `TESTING` flag provides the option to subset the training data for faster testing. Model training can be completed using the entire sensor record by setting it to `False`.

In [7]:
TESTING = True

Provide sensor threshold ratios for anomaly detection. In the training stage, the 99.5th percentile of the loss is calculated and saved.  
On prediction, this percentile value is loaded.  
In each case the `threshold_ratio` is multiplied by the percentile to get the threshold.
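The threshold rule can be illustrated with a short sketch. The losses below are made up, and `anomaly_threshold` is a hypothetical helper rather than a function from the `model` package; it only shows how a saved loss percentile and a per-sensor ratio combine.

```python
import numpy as np


def anomaly_threshold(train_losses, threshold_ratio, percentile=99.5):
    """Return percentile(train_losses) * threshold_ratio."""
    loss_percentile = float(np.percentile(train_losses, percentile))
    return loss_percentile * threshold_ratio


# made-up reconstruction losses, evenly spread between 0 and 1
train_losses = np.linspace(0.0, 1.0, 201)

strict = anomaly_threshold(train_losses, threshold_ratio=0.3)
lenient = anomaly_threshold(train_losses, threshold_ratio=1.8)

# a ratio below 1 flags more points, a ratio above 1 flags fewer
print(int((train_losses > strict).sum()), int((train_losses > lenient).sum()))
```

A ratio of 1 uses the saved percentile itself as the threshold, so roughly the top 0.5% of training losses would sit above it.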

In [8]:
THRESHOLD_RATIOS = {
    "Campus Energy Centre Campus HW Main Meter Power": 1.8,
    "Campus Energy Centre Boiler B-1 Exhaust O2": 1,
    "Campus Energy Centre Boiler B-1 Gas Pressure": 0.23,
    "Campus Energy Centre Campus HW Main Meter Entering Water Temperature": 0.3,
    "Campus Energy Centre Campus HW Main Meter Flow": 1.72,
}

Set up the sequence time step sizes.  
They are currently all the same but can be changed individually.

In [9]:
TIME_STEP_SIZES = {
    "Campus Energy Centre Campus HW Main Meter Power":15,
    "Campus Energy Centre Boiler B-1 Exhaust O2":15,
    "Campus Energy Centre Boiler B-1 Gas Pressure":15,
    "Campus Energy Centre Campus HW Main Meter Entering Water Temperature":15,
    "Campus Energy Centre Campus HW Main Meter Flow":15,
}

Same thing for window sizes.

In [10]:
WINDOW_SIZES = {
    "Campus Energy Centre Campus HW Main Meter Power":15,
    "Campus Energy Centre Boiler B-1 Exhaust O2":15,
    "Campus Energy Centre Boiler B-1 Gas Pressure":15,
    "Campus Energy Centre Campus HW Main Meter Entering Water Temperature":15,
    "Campus Energy Centre Campus HW Main Meter Flow":15,
}

Set the end time for model training so that the data predicted during Step 4 of this test environment is not used in training. In `sensor_training.py` there is no need to set an end time, as the model trains on all available data.

In [11]:
END_TIME = 1613109600

The following code provides data removal of manually labelled anomalous data for "Campus Energy Centre Campus HW Main Meter Entering Water Temperature".

In [12]:
REMOVE_ANOMALOUS = True
REMOVE_ANOMALOUS_DATA = [
    "Campus Energy Centre Campus HW Main Meter Entering Water Temperature"
]

Specify the paths to save the model and standard scaler from the cleaning pipeline and create the InfluxDB client.

In [22]:
model_path = "./test_env_models/"
scaler_path = "./test_env_standardizers/"

# setup InfluxDB client
token = "mytoken"
org = "UBC"
bucket = "MDS2021"
url = "http://localhost:8086"

influx_conn = influx_class(
    org=org,
    url=url,
    bucket=bucket,
    token=token,
)

Read data for model training from the InfluxDB READINGS measurement

In [14]:
influx_read_df = influx_conn.make_query(
    location="Campus Energy Centre",
    measurement="READINGS",
    end=END_TIME,
)

Split the data based on uniqueID into individual sensor dataframes

In [15]:
main_bucket = cl.split_sensors(influx_read_df)

The `main_bucket` object is a dictionary with the name of the sensor as the key and then the value is another dict of data objects

In [16]:
main_bucket.keys()

dict_keys(['Campus Energy Centre Boiler B-1 Exhaust O2', 'Campus Energy Centre Boiler B-1 Gas Pressure', 'Campus Energy Centre Campus HW Main Meter Entering Water Temperature', 'Campus Energy Centre Campus HW Main Meter Flow', 'Campus Energy Centre Campus HW Main Meter Power'])

Read the csvs again to get the manual anomalies. In the live implementation these would be stored in a second measurement, which would be read and then joined.

In [17]:
# attach the manual anomaly labels from the csvs to each sensor dataframe
for key, df in main_bucket.items():
    if JOIN_MANUAL_ANOMALIES:
        csv = LABELLED_LOOKUP[key]
        df_with_manual_anomaly = pd.read_csv(
            PATH_TO_LABELLED_CSVS + csv, parse_dates=["Datetime"]
        )
        df_with_manual_anomaly = df_with_manual_anomaly.drop_duplicates()
        # InfluxDB timestamps are timezone aware, so make the csv timestamps UTC too
        df_with_manual_anomaly["Datetime"] = pd.to_datetime(
            df_with_manual_anomaly["Datetime"], utc=True
        )
        # left join keeps every reading, labelled or not
        df = df.merge(
            df_with_manual_anomaly[["Datetime", "Anomaly"]],
            how="left",
            left_on="DateTime",
            right_on="Datetime",
        )
        df = df.drop(columns=["DateTime"])
        df.rename(columns={"Anomaly": "manual_anomaly"}, inplace=True)
        main_bucket[key] = df
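The effect of the left join can be seen on a toy example. This is a sketch with made-up readings and labels, not data from the csv files. One detail worth noting: after a left join on differently named keys, the right-hand key is `NaT` for readings with no label, so the sketch keeps the left-hand timestamps and renames them. In this test environment every reading comes from the same csv as its label, so the difference should not matter here.

```python
import pandas as pd

# made-up readings, as they might come back from InfluxDB
readings = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2020-01-01 07:45", "2020-01-01 08:00", "2020-01-01 08:15"], utc=True
    ),
    "Value": [2.9, 2.9, 9.7],
})

# made-up manual labels; the 08:00 reading has no label
labels = pd.DataFrame({
    "Datetime": pd.to_datetime(["2020-01-01 07:45", "2020-01-01 08:15"], utc=True),
    "Anomaly": [False, True],
})

merged = readings.merge(labels, how="left", left_on="DateTime", right_on="Datetime")

# keep the left-hand timestamps (always present) under the name used downstream
merged = merged.drop(columns=["Datetime"]).rename(
    columns={"DateTime": "Datetime", "Anomaly": "manual_anomaly"}
)
print(merged)
```

Unlabelled readings end up with a missing `manual_anomaly` value, which downstream code would need to treat as "not labelled" rather than as `False`.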


The following cell provides model training by iterating over each sensor in `main_bucket` and:

1. Removes anomalous data based on manual_anomaly labels available in the TRAINING_ANOMALY measurement
2. Standardizes the values for training and saves the standardizer
3. Subsets the data for faster training if specified in the `TESTING` variable
4. Sequences the values into windows for the LSTM-ED anomaly detection model
5. Fits the LSTM-ED and saves the model 
6. Writes model training anomaly predictions to the TRAINING_ANOMALY Measurement model_anomaly field in InfluxDB

**Note:** 3. only applies to this test environment and would not exist in `sensor_training.py`.
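Step 4 of the list above (sequencing values into windows) can be sketched as follows. `make_windows` is a hypothetical stand-in, not the implementation of `mt.create_sequences`, whose internals are not shown in this notebook. It assumes each window of `time_steps` consecutive values is paired with the value that follows it, which yields `len(values) - time_steps` windows.

```python
import numpy as np


def make_windows(values, time_steps):
    """Slice a 1-D series into overlapping windows shaped for an LSTM.

    Returns x with shape (n_windows, time_steps, 1) and y with the value
    that immediately follows each window.
    """
    values = np.asarray(values, dtype=float)
    n_windows = len(values) - time_steps
    if n_windows <= 0:
        raise ValueError("series must be longer than time_steps")
    x = np.stack([values[i:i + time_steps] for i in range(n_windows)])
    y = values[time_steps:]
    return x.reshape(n_windows, time_steps, 1), y


series = np.arange(20.0)  # stand-in for a standardized sensor series
x, y = make_windows(series, time_steps=15)
print(x.shape, y.shape)
```

The window count in this sketch appears consistent with how the notebook aligns timestamps using `tail(len(df) - x_train.shape[1])`, but that is an inference from the notebook code rather than a documented guarantee.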

In [102]:
for key, df in main_bucket.items():
    print("Training for : {}".format(key))

    import importlib
    importlib.reload(mt)
    importlib.reload(mp)

    # creates standardized column for each sensor in main bucket
    df["Stand_Val"] = cl.std_val_train(
        df[["Value"]],
        main_bucket[key]["ID"].any(),
        scaler_path,
    )

    if TESTING:
        df = df.tail(30000)

    # creates sequences for sliding windows for training
    threshold_ratio = THRESHOLD_RATIOS[key]
    time_steps = TIME_STEP_SIZES[key]
    window_size = WINDOW_SIZES[key]
    x_train, y_train = mt.create_sequences(df["Stand_Val"], df["Stand_Val"], time_steps, window_size)
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    normal_dict = cl.model_parser(df, x_train, y_train)
    mt.fit_models(normal_dict, model_path, threshold_ratio)

    # creates sequences for sliding windows for predicting on the train set
    x_eval, y_eval = mt.create_sequences(df["Stand_Val"], df["Stand_Val"], time_steps, 1)
    x_eval = np.reshape(x_eval, (x_eval.shape[0], x_eval.shape[1], 1))
    timestamps = df["Datetime"].tail(len(df) - x_train.shape[1]).values
    val_nums = df["Value"].tail(len(df) - x_train.shape[1]).values
    manual_anomaly = df["manual_anomaly"].tail(len(df) - x_train.shape[1]).values
    loss_percentile = cl.load_loss_percentile(key, file_path="./test_env_loss_percentiles/")
    threshold = loss_percentile * threshold_ratio



    # predicting and prediction formatting
    ar_df = mp.make_prediction(
        key,
        x_eval,
        timestamps,
        threshold,
        val_nums,
        model_path,
        anomaly_type="model_anomaly",
        manual_anomaly = manual_anomaly
    )
    ar_df = ar_df[["uniqueID", "val_num", "model_anomaly", "manual_anomaly"]]


    influx_conn.write_data(ar_df, "TRAINING_ANOMALY", tags=["uniqueID", "model_anomaly", "manual_anomaly"])

Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
this is the train_mae_loss average
[0.25596018]
8.95409479175198
this is the train_mae_loss average
/Users/mitch/data_labs/DATA599/w2020-data599-capstone-projects-ubc-udl/code/create-test-env
this is the loss percentile
8.95409479175198
this is the loss percentile
Training for : Campus Energy Centre Campus HW Main Meter Entering Water Temperature
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100

The following screenshots show InfluxDB with the TRAINING_ANOMALY measurements with the model_anomaly field written from the above process. Note that as the anomaly labels are tags, it is best to view the data as a scatter plot with the symbol column as uniqueID and the fill column as `model_anomaly` or `manual_anomaly`.

![](demo_screenshots/step3.PNG)

## Step 4 - Test Anomaly Detection Predictions

This step tests anomaly predictions. It includes reading recent data from InfluxDB (including the window of data required to make predictions), loading previously saved anomaly detection models, running these models on the data to provide predictions, and writing the results back to InfluxDB. This would typically be completed at a high-frequency interval (for example every minute or every 5 minutes). A script for anomaly predictions that can be used with UDL's InfluxDB instance is available in `../code/sensor_predict.py`. Code that is only applicable to this test environment or differs from what would exist in `../code/sensor_predict.py` is noted.

The code presented in this section is also available in `test_env_scheduled_predictor.py`.  

First, set up start and end times for the prediction data set in this testing environment. In `sensor_predict.py`, `END_TIME` would be `now()` and `START_TIME` would be `now() - 1d`.

In [97]:
# END TIME FOR TRAINING SET BECOMES PREDICTING'S START TIME
START_TIME = 1613109600
END_TIME = 1613196000

Read data from InfluxDB:

In [100]:
influx_read_df_for_pred = influx_conn.make_query(
    location="Campus Energy Centre",
    measurement="READINGS",
    start=START_TIME,
    end=END_TIME,
)

Split the data based on uniqueID into individual sensor dataframes

In [101]:
main_bucket_for_pred = cl.split_sensors(influx_read_df_for_pred)

The following cell provides predictions by iterating over each sensor in `main_bucket_for_pred` and:

1. Standardizes the values for prediction by loading the saved standardizer
2. Sequences the values into windows for the LSTM-ED and reshapes them for the prediction step
3. Creates predictions for the data and returns the prediction object
4. Shapes the prediction object and writes predictions to the PREDICT_ANOMALY measurement `realtime_anomaly` field in InfluxDB
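The first step above relies on reusing the standardizer fitted during training instead of refitting on the new data. A minimal sketch of that idea, using only the standard library, is given here. The JSON file format and the function names are assumptions for illustration; the framework's own `cl.std_val_train` and `cl.std_val_predict` may store the scaler differently.

```python
import json
import os
import statistics
import tempfile


def fit_and_save_scaler(values, path):
    """Compute mean and standard deviation on training values and save them."""
    params = {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}
    with open(path, "w") as fh:
        json.dump(params, fh)
    return params


def load_and_apply_scaler(values, path):
    """Standardize new values with the saved training-time parameters."""
    with open(path) as fh:
        params = json.load(fh)
    std = params["std"] or 1.0  # avoid dividing by zero for a constant sensor
    return [(v - params["mean"]) / std for v in values]


scaler_file = os.path.join(tempfile.mkdtemp(), "demo_sensor_scaler.json")
fit_and_save_scaler([10.0, 12.0, 14.0], scaler_file)  # training time
scaled = load_and_apply_scaler([12.0, 16.0], scaler_file)  # prediction time
print(scaled)
```

Reusing the training parameters keeps new readings on the scale the model was trained on, so an unusually high reading stays unusually high after scaling.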

In [103]:
for key, df in main_bucket_for_pred.items():
    main_bucket_for_pred[key]["Stand_Val"] = cl.std_val_predict(
        main_bucket_for_pred[key][["Value"]],
        main_bucket_for_pred[key]["ID"].any(),
        scaler_path,
    )

    import importlib
    importlib.reload(mp)

    # creates arrays for sliding windows
    time_steps = TIME_STEP_SIZES[key]
    window_size = 1
    x_train, y_train = mt.create_sequences(df["Stand_Val"], df["Stand_Val"], time_steps, window_size)
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

    # set up lists for passing to predict
    timestamps = df["DateTime"].tail(len(df) - x_train.shape[1]).values
    val_nums = df["Value"].tail(len(df) - x_train.shape[1]).values

    loss_percentile = cl.load_loss_percentile(key, file_path="./test_env_loss_percentiles/")
    threshold = THRESHOLD_RATIOS[key] * loss_percentile


    # predicting and prediction formatting
    pred_df = mp.make_prediction(
        key,
        x_train,
        timestamps,
        threshold,
        val_nums,
        model_path,
        anomaly_type="realtime_anomaly"
    )
    pred_df = pred_df[["uniqueID", "val_num", "realtime_anomaly"]]

    influx_conn.write_data(pred_df, "PREDICT_ANOMALY", tags=["uniqueID", "realtime_anomaly"])

(96, 5)
(132, 5)
(296, 5)
(1386, 5)
(660, 5)


Predictions are now written to InfluxDB and a screenshot of the PREDICT_ANOMALY measurement in InfluxDB is shown below. Note that as the anomaly labels are tags, it is best to view the data as a scatter plot with the symbol column as uniqueID and the fill column as `realtime_anomaly`.

![](demo_screenshots/step4.PNG)

The test environment will now have three measurements:

- READINGS: the raw data  
- TRAINING_ANOMALY: data with the `manual_anomaly` tag (if a user has input this) and `model_anomaly` tag generated from model training step
- PREDICT_ANOMALY: data with the `realtime_anomaly` tag generated from the prediction step

## Step 5 - Dashboard

A template for the dashboard has been provided in this `create-test-env` directory as `cec_boiler_sensors_(test).json`

**To upload the dashboard template:**
1) Navigate to the `Dashboards` tab on the left panel of the InfluxDB user interface  
2) Click `Create Dashboard` in the top right  
3) Click `Import Dashboard` from the drop down  
4) Click to upload `cec_boiler_sensors_(test).json`  
5) Click the new dashboard to view it. You will have to change the start date to see data (try 2020-12-20 to now)  

The dashboard allows the user to change the time period viewed, choose whether the dashboard is continuously updated, and select a value from the `anomaly_type` variable dropdown, which switches the view between `manual` (user-entered `manual_anomaly` tag from the `TRAINING_ANOMALY` measurement), `model` (model training predictions from the `model_anomaly` tag from the `TRAINING_ANOMALY` measurement), and `realtime` (realtime predictions from the `realtime_anomaly` tag from the `PREDICT_ANOMALY` measurement).

Screenshots of the user controls and the full dashboard for the 5 sensors are shown below.

![](demo_screenshots/step5a.PNG)

![](demo_screenshots/step5b.PNG)

There are several challenges with the dashboard interface in InfluxDB including:

- Coloring is not consistent between True/False labels on graphs and it does not appear possible to change this
- It does not appear possible to change the point sizes
- Anomalies were input as tags because plotting boolean field data did not appear possible (there are workarounds to this)

As such, it is recommended to explore Grafana if additional styling or capability is required. It may also be necessary to consider modifying the schema so that anomalies are field values instead of tag values.

## Step 6 - Notifications

Notifications are set up manually within InfluxDB. There may be a way to upload a template, but this was not explored. The notification rule works by filtering for any data in the `PREDICT_ANOMALY` measurement where the `realtime_anomaly` tag is `True`.

The notification functionality was tested only at a very high level in this study, to answer one question: can InfluxDB send notifications on data predicted to be anomalous? The answer is yes, but the notification settings should be investigated further.

The process involves creating three objects:

1. Checks  
2. Notification Endpoints  
3. Notification Rules  

### 1) To Create a Check

1. Navigate to the `Alerts` tab on the left panel of the InfluxDB user interface
2. Click `Create` in the top right  
3. Click `Threshold Check` from the drop down  
4. Define the query to look like (note that prior to creating this, the data explorer must be set on a timeframe that contains data): 

![query](./demo_screenshots/step6_1a.png)

5. Configure Check as follows: 

![check](./demo_screenshots/step6_1b.png)

6. Click the green check box

### 2) To Create an Endpoint
1. Create a new Slack app and copy the incoming webhook URL (see https://api.slack.com/messaging/webhooks#create_a_webhook)  
2. Click `Notification Endpoints` on the middle banner  
3. Click `Create` in the top right  
4. Choose `Slack` from the drop down, name the endpoint, paste the incoming webhook from your Slack app, and click `Create`  

### 3) To Create a Notification Rule  
1. Click `Notification Rules` from the middle banner  
2. Click `Create` in the top right  
3. Configure the Notification Rule to look like:

![rule](./demo_screenshots/step6_2a.png)

4. Click `Create Notification Rule`

## Step 7 - Dashboard/Notification Test

Upload data that has been flagged as anomalous to InfluxDB to test the notification system.

The test data has three timestamps: now, 5 minutes ago, and 10 minutes ago. The notification system only triggers on fresh data.

**NOTE:** It was found during testing that notifications were sometimes inconsistent. Additional testing on the notification system would be required.
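When generating the three nanosecond timestamps, integer arithmetic is the safer choice. Subtracting a float offset such as `3e11` converts the whole timestamp to a float, which cannot represent every nanosecond at this magnitude. The sketch below uses hypothetical names (`NS_PER_MINUTE`, `recent_timestamps_ns`) to show both the integer approach and the precision loss.

```python
import time

NS_PER_MINUTE = 60 * 10**9


def recent_timestamps_ns(now_ns, minutes_ago=(0, 5, 10)):
    """Return integer nanosecond timestamps for the given offsets in minutes."""
    return [now_ns - m * NS_PER_MINUTE for m in minutes_ago]


stamps = recent_timestamps_ns(time.time_ns())

# precision check on a fixed timestamp
fixed_now = 1623740755070619000
exact = fixed_now - 5 * NS_PER_MINUTE
via_float = int(fixed_now - 3e11)
print(exact == via_float)  # False
```

The float version is off by a small number of nanoseconds, which is harmless for a notification test but can matter when points are later matched on exact timestamps.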

In [51]:
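# integer nanosecond timestamps: now, 5 minutes ago, and 10 minutes ago
# (a float offset such as 3e11 would round the timestamp)
now_ns = time.time_ns()
DateTime = [now_ns, now_ns - 300 * 10**9, now_ns - 600 * 10**9]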
val_num = [140.0, -40.0, 40.0]
realtime_anomaly = ["True", "True", "False"]
uniqueID = ["Campus Energy Centre Campus HW Main Meter Power"] * 3

data = {"DateTime": DateTime, "val_num":val_num, "uniqueID":uniqueID, "realtime_anomaly": realtime_anomaly}
test_realtime = pd.DataFrame(data)
test_realtime.set_index("DateTime", drop=True, inplace=True)
test_realtime.index.rename("DateTime", inplace=True)
test_realtime.head()

| DateTime | val_num | uniqueID | realtime_anomaly |
|---|---|---|---|
| 1623740755070619000 | 140.0 | Campus Energy Centre Campus HW Main Meter Power | True |
| 1623740455070618880 | -40.0 | Campus Energy Centre Campus HW Main Meter Power | True |
| 1623740155070621952 | 40.0 | Campus Energy Centre Campus HW Main Meter Power | False |


In [52]:
influx_conn.write_data(test_realtime, "PREDICT_ANOMALY", tags=["uniqueID", "realtime_anomaly"])

A flagged point will appear in the check's history:

![notification](./demo_screenshots/step7a.png)

And a notification will be pushed to slack:

![notification](./demo_screenshots/step7b.png) 

## Step 8 - Re-run Predictions with Different Thresholds

This step re-runs the prediction step with a different threshold ratio.  
It does not retrain the model; it only runs predictions against the new threshold.  
It requires model training (Step 3) to have been run already.
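Why no retraining is needed can be seen from a small sketch: the model's reconstruction losses stay the same, and only the cut-off applied to them moves. The losses and the percentile below are made up, and `flag_anomalies` is a hypothetical helper that assumes a point is anomalous when its loss exceeds `loss_percentile * threshold_ratio`.

```python
def flag_anomalies(losses, loss_percentile, threshold_ratio):
    """Flag each loss that exceeds loss_percentile * threshold_ratio."""
    threshold = loss_percentile * threshold_ratio
    return [loss > threshold for loss in losses]


losses = [0.2, 0.9, 1.6, 2.4]  # made-up per-point reconstruction losses
loss_percentile = 1.0          # made-up saved training percentile

original = flag_anomalies(losses, loss_percentile, threshold_ratio=1.72)
retuned = flag_anomalies(losses, loss_percentile, threshold_ratio=1.5)
print(original)  # [False, False, False, True]
print(retuned)   # [False, False, True, True]
```

Lowering the ratio from 1.72 to 1.5 flags one more point here, without touching the saved model or the saved percentile.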

Choose the time period for the new analysis

In [119]:
START_TIME = 1613109600
END_TIME = 1613196000

Choose a sensor and a new threshold ratio

In [120]:
update_data = {
    "Campus Energy Centre Campus HW Main Meter Flow" : 1.5
}
print(list(update_data.keys()))

['Campus Energy Centre Campus HW Main Meter Flow']


In [121]:
influx_read_df_for_pred = influx_conn.make_query(
    location="Campus Energy Centre",
    measurement="READINGS",
    start=START_TIME,
    end=END_TIME,
    id = list(update_data.keys())
)

In [122]:
main_bucket_for_test = cl.split_sensors(influx_read_df_for_pred)

Name the measurement you want your threshold experiment to be sent to

In [123]:
measurement_name = "TEST_THRESHOLD_METER_FLOW"

In [124]:
for key, df in main_bucket_for_test.items():
    main_bucket_for_test[key]["Stand_Val"] = cl.std_val_predict(
        main_bucket_for_test[key][["Value"]],
        main_bucket_for_test[key]["ID"].any(),
        scaler_path,
    )
    print(key)

    # keeps external packages updated in the notebook
    import importlib
    importlib.reload(mp)

    # sets up sequencing
    time_steps = TIME_STEP_SIZES[key]
    window_size = 1
    x_train, y_train = mt.create_sequences(df["Stand_Val"], df["Stand_Val"], time_steps, window_size)
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

    # set up lists for passing to predict
    timestamps = df["DateTime"].tail(len(df) - x_train.shape[1]).values
    val_nums = df["Value"].tail(len(df) - x_train.shape[1]).values

    # gets training loss percentile and applies the new threshold ratio from update_data
    loss_percentile = cl.load_loss_percentile(key, file_path="./test_env_loss_percentiles/")
    threshold = update_data[key] * loss_percentile


    # predicting and prediction formatting
    pred_df = mp.make_prediction(
        key,
        x_train,
        timestamps,
        threshold,
        val_nums,
        model_path,
        anomaly_type="realtime_anomaly"
    )
    pred_df = pred_df[["uniqueID", "val_num", "realtime_anomaly"]]

    influx_conn.write_data(pred_df, measurement_name, tags=["uniqueID", "realtime_anomaly"])

Campus Energy Centre Campus HW Main Meter Flow
