## Addressing Our Architecture Issues
1. **Our sigmoid function squishes predictions between 0 and 1, but our labels are between 2 and -2. How can we fix this?**

Normalize the alarm labels according to our activation function in the prediction layer. For example normalize labels between -1 and 1 if using tanh, or normalize labels between 0 and 1 if using sigmoid.

2. **If we decide to not use multilabel classification and instead train individual networks, is there a library that makes this easier for us?**

Yes, the MultiOutputClassifier class of scikit learn is compatible with Keras classifiers

3. **What is a better accuracy metric to measure model preformance considering data is sparse?**

We can create custom Keras metrics functions to calculate the accuracy for the model in terms of predicting only alarm states that are on

4. **What loss function should we use now that out labels are not 0 and 1?**

Read notebook for my ideas

In [12]:
import torch
import torch.nn as nn
from torch.autograd import Variable

In [None]:
# Option 1
# 81 alarms each with 5 possible states
# multi-hot encoding of length 405
# Train model using binary cross entropy loss on those labels

model = nn.Linear(20, 5) 
x = torch.randn(1, 20)
y = torch.tensor([[0,1,0,1,0]]).float()

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

for epoch in range(20):
    optimizer.zero_grad()
    output = (model(x))
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
    print('Loss: {:.3f}'.format(loss.item()))

In [None]:
# Option 2
# Normalize alarms to be bewteen 0 and 1
# (-2, -1, 0, 1, 2) --> (0, 1, 2, 3, 4) -- > (0., 0.25, 0.50, 0.75, 1.0)
# Train model using binary cross entropy loss on those labels

import numpy as np
from sklearn.preprocessing import MinMaxScaler
labels = np.array([1,2,3,4,5])
scaler = MinMaxScaler()
normed_labels = scaler.fit_transform(labels.reshape(-1, 1))

print("original labels: ", labels)
print("normalized labels: ", normed_labels.reshape(-1))

model = nn.Linear(20, 5) 
x = torch.randn(1, 20)
y = torch.tensor([[0., 0.25, 0.50, 0.75, 1.0]]).float()

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

for epoch in range(20):
    optimizer.zero_grad()
    output = (model(x))
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
    print('Loss: {:.3f}'.format(loss.item()))

In [None]:
# Option 3
# Create a custom loss function that allows for our original categorical labels
# We could use weighted BCE as a starting point to weight incorrect positive predictions

In [1]:
#Lets check if a single variable LSTM still suffers from low accuracy using CSE Loss

In [2]:
import pandas as pd

In [3]:
sensors = pd.read_csv('sensors_original.csv')
alarms = pd.read_csv('alarms_filtered.csv')

In [4]:
# Take the first test case (number 1), and the first sensor in that test case (XMEAS 1) for training
sensor_id = 'XMEAS 1'
hi = 0.95
lo = 0.05
test_no = 1
features = sensors[sensors['TEST_NO']==test_no][sensor_id]
targets = alarms[alarms['TEST_NO']==test_no][sensor_id]

In [5]:
#What does the data look like?
import plotly.graph_objects as go

In [7]:
#Plot
fig = go.Figure([
    
    go.Scatter(
        name='XMEAS 1',
        y=features,
        mode='markers+lines',
        marker=dict(color='blue', size=2),
        showlegend=True
    ),
    go.Scatter(
        name='Alarm State',
        y=targets,
        mode='lines',
        marker=dict(color="orange"),
        line=dict(width=1),
        showlegend=True
    ),
    go.Scatter(
        name='Hi',
        y=[hi]*len(features),
        mode='lines',
        marker=dict(color="red"),
        line=dict(width=1),
        showlegend=True
    ),
    go.Scatter(
        name='Lo',
        y=[lo]*len(features),
        mode='lines',
        marker=dict(color="green"),
        line=dict(width=1),
        showlegend=True
    )
])
fig.show()

In [47]:
# Lets take the first 10k sequences since this is where the fault occurs
# Lets also use a lookback of 180 (30 minutes)
lookback = 1
batch_size = 64
sequences = list(pd.Series(features)[:10000].rolling(window=lookback))
sequences = list(s.to_numpy() for s in sequences)

In [48]:
# We aim to use cross entropy loss for classification of alarm state
# This requires target to be a probability distribtuion
# Accomplish this by converting target modulus five (five total alarm states)
# Thus 0 maps to 0, 1 to 1, 2 to 2, -2 to 3, and -1 to 4
# Ex. 1 --> [1]
# Ex. -2 --> [3]
encoded_targets = []
for index, target in enumerate(targets):
# We would use this commented out code for BCE Loss (I believe)
#     encoding = np.zeros(5)
#     encoding[target%5] = 1.0
#     encoded_targets.append(encoding)
    encoded_targets.append(target%5)

In [49]:
# Now we have our normalized sensor sequences and encoded targets
# Now we can train model

In [50]:
#This model uses 1 LSTM layer with 2 fully connected layers and a Softmax activation output
class LSTM(nn.Module):
    
    def __init__(self, num_classes, input_size, hidden_size, num_layers, sequence_length):
        super(LSTM, self).__init__()
        self.num_classes = num_classes 
        self.num_layers = num_layers 
        self.input_size = input_size 
        self.hidden_size = hidden_size 
        self.seq_length = sequence_length 
        self.lstm = nn.LSTM(
            input_size=input_size, 
            hidden_size=hidden_size, 
            num_layers=num_layers, 
            batch_first=True
        )
        self.fc1 =  nn.Linear(hidden_size, 36) 
        self.fc2 = nn.Linear(36, num_classes) 
        self.sm = nn.Softmax()
    def forward(self, x):
        h_0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size) #hidden state
        c_0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size) #internal state
        # Propagate input through LSTM
        output, (hn, cn) = self.lstm(x, (h_0, c_0))
        # reshaping the data for Dense layer next
        hn = hn.view(-1, self.hidden_size) 
        out = self.fc1(hn) 
        out = self.fc2(out) 
        out = self.sm(out)
        return out

In [51]:
# Save model to device and initialize
device = 'cuda' if torch.cuda.is_available() else 'cpu'
rnn_model = LSTM(
    num_classes=5,
    num_layers=1,
    input_size=1,
    hidden_size=64,
    sequence_length=lookback
).to(device)

In [52]:
# Use Adam optimization algorithm for updating weights
optim = torch.optim.Adam(rnn_model.parameters(), lr=1e-2)

In [53]:
# Balance dataset with oversampling so we have an even amount of on and off labels for training
# Convert data to torch dataloader object and shuffle
from torch.utils.data import DataLoader
from collections import Counter

seq_target_pairs = list(zip(sequences[lookback:], encoded_targets[lookback:]))
counts = Counter(list(seq_target_pair[1] for seq_target_pair in seq_target_pairs))
on_count = counts[1]
off_count = counts[0]

while (on_count<=off_count):
    add_ons = list(seq_target_pair for seq_target_pair in seq_target_pairs if seq_target_pair[1]==1)
    on_count=on_count+len(add_ons)
    seq_target_pairs=seq_target_pairs+add_ons

dataset = DataLoader(seq_target_pairs, batch_size=batch_size, shuffle=True, drop_last=True)

In [54]:
# TRAIN MODEL
# Also track losses
losses = []
for epoch in range(10):
    print("Epoch: ", epoch)
    for batch in dataset:
        # Reshape data for LSTM 
        preds = rnn_model(batch[0].reshape(batch_size,lookback,1).float())
        # Clear gradients
        optim.zero_grad()
        loss = nn.CrossEntropyLoss()(preds,batch[1])
        # Backward pass
        loss.backward()
        # Update weights
        optim.step()
        # Track loss
        losses.append(loss.item())

Epoch:  0



Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.



Epoch:  1
Epoch:  2
Epoch:  3
Epoch:  4
Epoch:  5
Epoch:  6
Epoch:  7
Epoch:  8
Epoch:  9


In [55]:
#Plot losses
fig = go.Figure([
    
    go.Scatter(
        name='loss',
        y=losses,
        mode='markers+lines',
        marker=dict(color='blue', size=2),
        showlegend=True
    )
])
fig.show()

In [62]:
# Lets check how our model has learned the training data
model_filtering = list(x.index(1) for x in rnn_model(torch.tensor(sequences).reshape(len(sequences),lookback,1).float()).round().tolist())
train_acc = sum(1 for x,y in zip(model_filtering,targets) if x == y) / float(len(model_filtering))
print("train_acc: ", train_acc)

train_acc:  0.9811



Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.



In [63]:
#How does our model predict the training data
fig = go.Figure([
    
    go.Scatter(
        name='alarm state',
        y=model_filtering,
        mode='markers+lines',
        marker=dict(color='blue', size=2),
        showlegend=True
    )
])
fig.show()

In [64]:
#Now lets test the model on some unseen data (test no 2), but with the same sensor we trained (XMEAS 1)
sensor_id = 'XMEAS 1'
test_no = 2
test_features = sensors[sensors['TEST_NO']==test_no][sensor_id]
test_targets = alarms[alarms['TEST_NO']==test_no][sensor_id]

In [65]:
#Again, how does the ground truth data look?
fig = go.Figure([
    
    go.Scatter(
        name='XMEAS 1',
        y=test_features,
        mode='markers+lines',
        marker=dict(color='blue', size=2),
        showlegend=True
    ),
    go.Scatter(
        name='Alarm State',
        y=test_targets,
        mode='lines',
        marker=dict(color="orange"),
        line=dict(width=1),
        showlegend=True
    )
])
fig.show()

In [66]:
# Lets convert the test data to sequences so our model can use it to make predictions
lookback = 1
test_sequences = list(pd.Series(test_features).rolling(window=lookback))
test_sequences = list(s.to_numpy() for s in test_sequences)

In [67]:
# Use the trained model to predict alarm states for testing data
model_filtering = list(x.index(1) for x in rnn_model(torch.tensor(test_sequences).reshape(len(test_sequences),lookback,1).float()).round().tolist())
test_acc = sum(1 for x,y in zip(model_filtering,test_targets) if x == y) / float(len(model_filtering))
print("test_acc: ", test_acc)

test_acc:  0.9935860616503925



Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.



In [69]:
#How does our models predictions look?
fig = go.Figure([
    
    go.Scatter(
        name='model',
        y=model_filtering,
        mode='markers+lines',
        marker=dict(color='blue', size=2),
        showlegend=True
    ),
        go.Scatter(
        name='truth',
        y=test_targets,
        mode='markers+lines',
        marker=dict(color='orange', size=2),
        showlegend=True
    )
])
fig.show()

In [70]:
# Pretty good results. 
# Something to note, this is for a lookback of 1
# Sequences of length, 180, 90, even 20, resulted in the model getting confused
# I hate this dataset :(