This notebook contains the code used to simulate the data for my article about how false positives and label noise can undermine your predictive maintenance projects.  I simulate machine data followed by a “failure.”  Rather than simulate sensor readings and then build features from them, I simulate the features directly.  The machine data is a “stability” measurement: a single value per day that is low when the machine is very stable and high when it is not.  This data is simulated for 21 days (one value per day) and given a label (0 = the machine didn’t have an issue, 1 = the machine had issue A, 2 = the machine had issue B).  Each row of data (which represents a single machine) therefore has 21 feature columns and one label.

Some things to know  
- The machine has a “normal” operation whose stability measurement is centered on one of 2 means.  Why 2 means and not just one?  Take for example a blender, which has multiple speeds, all of which are “normal.”  Many complex machines have more than one mode of normal operation.
- Abnormal data starts with 0 to 19 days of normal data and is labeled either 1 or 2.  The reason for starting with normal data is that a machine may be running normally for a few days, then abnormally for a few days, and then something breaks.  It’s also possible that it breaks with no warning (the data looks normal the whole time before a break), or that all days are abnormal before a break.
- To simulate label noise, the labels are flipped based on some percentage.  For example, if we want 10% noise and we have 100 rows of simulated normal data, about 10 of those rows will have their label flipped to an event (a 1 or 2).  In the case of simulated abnormal data, the labels will be flipped to the other event (e.g. a 1 flips to a 2) or to no event (a 0).
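
The label-flipping rule in the last bullet can be sketched with the standard library alone.  This is a minimal illustration rather than the notebook's implementation: `flip_label` and its `noise` and `rng` parameters are hypothetical names, and it assumes a flipped label is drawn uniformly from the other labels.

```python
import random


def flip_label(true_label, labels, noise, rng):
    """Return true_label, or with probability `noise` a different label.

    Hypothetical helper: a flipped label is chosen uniformly from the
    remaining labels.
    """
    if rng.random() < noise:
        others = [label for label in labels if label != true_label]
        return rng.choice(others)
    return true_label


rng = random.Random(0)  # seeded so the sketch is repeatable
true_labels = [0] * 1000  # 1000 rows of simulated normal data
noisy_labels = [flip_label(t, [0, 1, 2], 0.10, rng) for t in true_labels]
flipped = sum(n != t for n, t in zip(noisy_labels, true_labels))
print(f"{flipped} of {len(true_labels)} labels flipped")
```

With 10% noise, roughly one row in ten ends up with a label that differs from its true label.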

This data goes through a normal train/test machine learning process, and we look at the predictions on a test set.  

In [1]:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split

# from xgboost import XGBClassifier

In [2]:
def gen_normal_data(
    num_days=21,
    include_label=True,
    permute_label=False,
    labels=[0, 1],
    permute_label_probs=[0.9, 0.1],
):
    """Generates Normal Data

    Args:
        num_days (int): The number of values to generate
        include_label (bool): If a label should be generated as the last value
        permute_label (bool): If the label should be changed based on some probability
        labels (list of ints): the labels to use for labeling
        permute_label_probs (list of floats): the probability of each label

    Returns:
        numpy.ndarray: the daily values, followed by the label and a permuted flag if include_label is True
    """

    # Randomly select the operating mode mean, which simulates 2 "normal" operating modes
    mean = np.random.choice([0.1, 0.3])
    data = np.abs(np.random.normal(mean, 0.1, num_days))
    if permute_label:
        label = np.random.choice(labels, p=permute_label_probs)
    else:
        label = 0
    if not include_label:
        output = data
    else:
        # Add a boolean value so we know if this value was permuted
        permuted = int(label != 0)
        output = np.concatenate((data, [label], [permuted]))
    # print(label)
    return output

In [3]:
def gen_abnormal_data(
    mean=0.3,
    stdev=0.2,
    default_label=1,
    labels=[0, 1],
    permute_label=False,
    permute_label_probs=[0.1, 0.9],
    permute_normal_days=False,
):
    """Generates Abnormal Data

    Args:
        mean (float): The mean of the values being generated
        stdev (float): The standard deviation of the values being generated
        default_label: The value of the majority class being generated
        labels (list of ints): the labels to use for labeling
        permute_label (bool): If the label should be changed based on some probability
        permute_label_probs (list of floats): the probability of each label
        permute_normal_days (bool): Should the number of normal days that start the data array be fixed or random

    Returns:
        numpy.ndarray: the daily values, followed by the label and a permuted flag
    """

    if not permute_normal_days:
        num_normal_days = 7
    else:
        num_normal_days = np.random.choice(np.arange(0, 20))
    # Prepend the normal days: a fixed 7, or a random 0 to 19 (np.arange excludes its stop value).
    normal_data = gen_normal_data(
        num_days=num_normal_days, include_label=False, permute_label=permute_normal_days
    )
    abnormal_data = np.abs(np.random.normal(mean, stdev, 21 - num_normal_days))
    if permute_label:
        label = np.random.choice(labels, p=permute_label_probs)
    else:
        label = default_label
    # Add a boolean value so we know if this value was permuted
    permuted = int(label != default_label)
    # print(permuted)
    return np.concatenate((normal_data, abnormal_data, [label], [permuted]))

In [4]:
def gen_data(
    means=[0.1, 0.4, 0.7],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=300,
    labels=[0, 1, 2],
    major_label_prob=0.8,
    permute_label=False,
):
    """Generates a dataset

    Args:
        means (list): The means of the data for every label being generated
        stdevs (list): The standard deviations of the data for every label being generated
        num_samples: The number of rows of data for every label being generated
        labels (list of ints): the labels to use for labeling
        major_label_prob (float): The probability of the current label being generated
        permute_label (bool): Should labels be randomly changed

    Returns:
        pandas.DataFrame: one row per simulated machine, with 21 feature columns, a Label column, and a Permuted column
    """

    # Calculate the probabilities of the other labels based on the number of labels
    minor_label_prob = np.round((1 - major_label_prob) / (len(labels) - 1), 3)
    # loop through the list of labels and generate the data for each one
    # If labels are being permuted, permute them based on the major (the current label selected in the loop) label
    # and the minor label (the other labels).
    for i in labels:
        if i == 0:
            normdata_label_probs = [minor_label_prob] * len(labels)
            normdata_label_probs[i] = major_label_prob
            output_data = [
                gen_normal_data(
                    permute_label=permute_label,
                    permute_label_probs=normdata_label_probs,
                    labels=labels,
                )
                for _ in np.arange(num_samples)
            ]
            # output_data['Permuted'] = False
        else:
            permute_label_probs = [minor_label_prob] * len(labels)
            permute_label_probs[i] = major_label_prob
            # print(permute_label_probs)
            abnormal_data = [
                gen_abnormal_data(
                    mean=means[i],
                    stdev=stdevs[i],
                    default_label=i,
                    labels=labels,
                    permute_label=permute_label,
                    permute_label_probs=permute_label_probs,
                    permute_normal_days=True,
                )
                for _ in np.arange(num_samples // 1)
            ]
            # abnormal_data['Permuted'] = abnormal_data['Label'] != i
            # print(abnormal_data)
            output_data = output_data + abnormal_data

    output = pd.DataFrame(output_data)
    output.columns = list(np.arange(21)) + ["Label"] + ["Permuted"]

    return output
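
The probability split that `gen_data` uses can be worked through on its own.  The sketch below is illustrative only: `label_probs` is a hypothetical helper, and unlike the cell above it skips the `np.round(..., 3)` step.  That choice is an assumption on my part, made because rounded shares may no longer sum to 1 for some label counts (for example, 4 labels with a major probability of 0.9 gives 0.9 + 3 × 0.033 = 0.999), and `np.random.choice` expects its `p` values to sum to 1.

```python
import math


def label_probs(labels, major_label, major_label_prob):
    """Give major_label the major probability and split the rest evenly.

    Hypothetical helper mirroring the logic in gen_data, without rounding.
    """
    minor_label_prob = (1 - major_label_prob) / (len(labels) - 1)
    return [
        major_label_prob if label == major_label else minor_label_prob
        for label in labels
    ]


probs = label_probs(labels=[0, 1, 2], major_label=1, major_label_prob=0.8)
print(probs)
print(math.isclose(sum(probs), 1.0))  # checks that the shares sum to 1
```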

In [5]:
def train_model(
    means=[0.1, 0.3, 0.6],
    stdevs=[0.1, 0.25, 0.35],
    labels=[0, 1, 2],
    num_samples=500,
    permute_label=True,
    major_label_prob=0.8,
):
    """Generate data, train a model, and evaluate the output on a test set

    Args:
        means (list): The means of the data for every label being generated
        stdevs (list): The standard deviations of the data for every label being generated
        labels (list of ints): the labels to use for labeling
        num_samples: The number of rows of data for every label being generated
        permute_label (bool): Should labels be randomly changed
        major_label_prob (float): The probability of the current label being generated

    Returns:
        dict: the per-run counts (as lists) plus the data, X_test, and y_test from the last run
    """
    # Lists with all the results.
    unnessary_work = []
    incorrect_pred = []
    incorrect_work = []
    correct_alarms = []
    actual_issues = []
    predicted_alarms = []

    # Run this multiple times to get something like a pseudo Monte Carlo simulation
    for i in range(100):
        data = gen_data(
            means=means,
            stdevs=stdevs,
            num_samples=num_samples,
            labels=labels,
            major_label_prob=major_label_prob,
            permute_label=permute_label,
        )
        # Remove the X and y data that was permuted.
        # Columns 0-20 hold the 21 daily features; column 21 is the label.
        X = data[data["Permuted"] != 1].iloc[:, 0:21]
        y = data[data["Permuted"] != 1].iloc[:, 21]
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.3, random_state=42
        )
        # Add the permuted data back to the training sets.
        # We don't add any permuted data to the test set because we want to evaluate how good the model is
        # on actual results, not the permuted results.  This helps to not artificially increase or decrease
        # the performance of the model on the test set.
        X_train = pd.concat([X_train, data[data["Permuted"] == 1].iloc[:, 0:21]])
        y_train = pd.concat([y_train, data[data["Permuted"] == 1].iloc[:, 21]])
        # print("Split and Join",data.shape, X_train.shape, y_train.shape)
        clf = RandomForestClassifier(random_state=0)
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        actual_issues.append(np.sum(y_test != 0))
        predicted_alarms.append(np.sum(y_pred != 0))
        unnessary_work.append(np.sum((y_test == 0) & (y_pred != 0)))
        incorrect_pred.append(np.sum((y_test != y_pred)))
        incorrect_work.append(
            np.sum(((y_test != y_pred) & (y_test != 0) & (y_pred != 0)))
        )
        correct_alarms.append(np.sum((y_test != 0) & (y_test == y_pred)))

    return {
        "actual_issues": actual_issues,
        "predicted_alarms": predicted_alarms,
        "incorrect_predictions": incorrect_pred,
        "correct_alarms": correct_alarms,
        "unnessary_work": unnessary_work,
        "incorrect_work": incorrect_work,
        "data": data,
        "X_test": X_test,
        "y_test": y_test,
    }
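
To make the bookkeeping in `train_model` concrete, the sketch below applies the same comparisons to a small hand-made example.  The `y_test` and `y_pred` lists are invented for illustration, and plain Python lists stand in for the pandas and NumPy objects used above.

```python
# Invented example: true labels and predictions for eight machines.
y_test = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 0, 1, 2, 2, 0, 2]

pairs = list(zip(y_test, y_pred))

# Alarm raised on a machine that had no issue.
unnecessary_work = sum(t == 0 and p != 0 for t, p in pairs)
# Alarm raised on a machine with an issue, but for the wrong issue.
incorrect_work = sum(t != 0 and p != 0 and t != p for t, p in pairs)
# Alarm raised for the right issue.
correct_alarms = sum(t != 0 and t == p for t, p in pairs)
# Every alarm falls into exactly one of the three groups above.
predicted_alarms = sum(p != 0 for p in y_pred)

print(unnecessary_work, incorrect_work, correct_alarms, predicted_alarms)
# 1 1 3 5
```

The missed issue (a true 2 predicted as 0) counts toward incorrect predictions but not toward any alarm category, which is why the three alarm groups add up to the number of predicted alarms.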

In [6]:
def print_results(model_output):
    """Generate metrics on the predictions and scale them to 100 Alarms to
    make it easier to understand the results (e.g. 15 false alarms out of 100 is easier to understand
    than 12 out of 80).  This function prints the results.

    Args:
        model_output: the dictionary from the train_model function.

    Returns:
        dict: the 2 scaled metrics used in the article where the results are presented
    """
    scaling_factor = 100 / np.mean(model_output["predicted_alarms"])
    incorrect_work = np.mean(model_output["incorrect_work"])
    unncessary_work = np.mean(model_output["unnessary_work"])
    incorrect_work_scaled = np.round(
        np.mean(model_output["incorrect_work"]) * scaling_factor, 2
    )
    unncessary_work_scaled = np.round(
        np.mean(model_output["unnessary_work"]) * scaling_factor, 2
    )
    correct_alarms_scaled = np.round(
        np.mean(model_output["correct_alarms"]) * scaling_factor, 2
    )

    print(
        f"""
    Actual Issues: {np.mean(model_output['actual_issues'])}, Alarms:{np.mean(model_output['predicted_alarms'])}, 
    Correct Alarms:{np.mean(model_output['correct_alarms'])}, Incorrect Predictions:{np.mean(model_output['incorrect_predictions'])}, 
    Incorrect Work:{np.mean(model_output['incorrect_work'])}, Unnessary Work:{np.mean(model_output['unnessary_work'])},
    Incorrect Work Scaled:{incorrect_work_scaled}, Unnessary Work Scaled:{unncessary_work_scaled},
    Correct Alarms Scaled:{correct_alarms_scaled}
    """
    )
    return {"i_w_s": incorrect_work_scaled, "u_w_s": unncessary_work_scaled}
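
The scaling step is simple arithmetic, so a worked example may help.  The sketch below reuses the averages printed for the binary run without label noise in this notebook (297.3 alarms, 1.13 unnecessary, 296.17 correct); `scale_to_100` is a hypothetical helper name.

```python
def scale_to_100(count, predicted_alarms):
    """Express count as a rate per 100 predicted alarms, rounded to 2 places."""
    return round(count * 100 / predicted_alarms, 2)


print(scale_to_100(1.13, 297.3))    # unnecessary work per 100 alarms
print(scale_to_100(296.17, 297.3))  # correct alarms per 100 alarms
```

These come out to 0.38 and 99.62, matching the scaled values reported for that run.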

In [7]:
def cost_calc(
    alarms=100,
    proactive_value=250,
    unnecessary_work=0,
    unnecessary_work_value=-500,
    incorrect_work=0,
    incorrect_work_value=-1000,
):
    """Calculate the cost savings or expenditures based on the true or false positives

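    Args:
        alarms (int): The number of alarms raised
        proactive_value (float): The savings from each alarm that is acted on proactively
        unnecessary_work (float): The number of alarms raised when there was no issue
        unnecessary_work_value (float): The cost (a negative value) of each unnecessary alarm
        incorrect_work (float): The number of alarms that predicted the wrong issue
        incorrect_work_value (float): The cost (a negative value) of each incorrect alarm

    Returns:
        None.  Just prints the results.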
    """
    potential_savings = alarms * proactive_value
    unnecessary_costs = unnecessary_work * unnecessary_work_value
    incorrect_work_costs = incorrect_work * incorrect_work_value
    actual_savings = potential_savings + unnecessary_costs + incorrect_work_costs
    print(
        f"""
    Theoretical Savings: {potential_savings}, Actual Savings: {actual_savings},
    Costs Due to Unnecessary Work:{unnecessary_costs}, Costs Due to Incorrect Work:{incorrect_work_costs}"""
    )
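
As a check on the arithmetic, the sketch below recomputes the savings for 7.57 incorrect and 4.72 unnecessary alarms per 100, the values reported for the 3-label run without label noise.  `cost_breakdown` is a hypothetical variant of `cost_calc` that returns the numbers instead of printing them, which can be convenient when comparing runs.

```python
def cost_breakdown(
    alarms=100,
    proactive_value=250,
    unnecessary_work=0,
    unnecessary_work_value=-500,
    incorrect_work=0,
    incorrect_work_value=-1000,
):
    """Return the same quantities cost_calc prints, as a dictionary."""
    potential_savings = alarms * proactive_value
    unnecessary_costs = unnecessary_work * unnecessary_work_value
    incorrect_work_costs = incorrect_work * incorrect_work_value
    return {
        "potential_savings": potential_savings,
        "unnecessary_costs": unnecessary_costs,
        "incorrect_work_costs": incorrect_work_costs,
        "actual_savings": potential_savings
        + unnecessary_costs
        + incorrect_work_costs,
    }


result = cost_breakdown(incorrect_work=7.57, unnecessary_work=4.72)
print(result)
```

The theoretical 25,000 in savings drops to about 15,070 once the 2,360 in unnecessary work and 7,570 in incorrect work are subtracted.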

# Generate Data, Train Models, and Print Results

# Binary Classification

## Non Permuted Labels

In [8]:
model_output = train_model(
    means=[0.1, 0.8],
    stdevs=[0.1, 0.2],
    num_samples=1000,
    labels=[0, 1],
    permute_label=False,
)
metrics = print_results(model_output)


    Actual Issues: 298.0, Alarms:297.3, 
    Correct Alarms:296.17, Incorrect Predictions:2.96, 
    Incorrect Work:0.0, Unnessary Work:1.13,
    Incorrect Work Scaled:0.0, Unnessary Work Scaled:0.38,
    Correct Alarms Scaled:99.62
    


In [9]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 24810.0,
    Costs Due to Unnecessary Work:-190.0, Costs Due to Incorrect Work:-0.0


## Permuted Labels, 5% Label Noise

In [10]:
model_output = train_model(
    means=[0.1, 0.4],
    stdevs=[0.1, 0.2],
    num_samples=1000,
    labels=[0, 1],
    permute_label=True,
    major_label_prob=0.95,
)
metrics = print_results(model_output)


    Actual Issues: 286.87, Alarms:290.03, 
    Correct Alarms:263.42, Incorrect Predictions:50.06, 
    Incorrect Work:0.0, Unnessary Work:26.61,
    Incorrect Work Scaled:0.0, Unnessary Work Scaled:9.17,
    Correct Alarms Scaled:90.83
    


In [11]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 20415.0,
    Costs Due to Unnecessary Work:-4585.0, Costs Due to Incorrect Work:-0.0


## Permuted Labels, 10% Label Noise

In [12]:
model_output = train_model(
    means=[0.1, 0.4],
    stdevs=[0.1, 0.2],
    num_samples=1000,
    labels=[0, 1],
    permute_label=True,
    major_label_prob=0.9,
)
metrics = print_results(model_output)


    Actual Issues: 274.72, Alarms:274.59, 
    Correct Alarms:250.24, Incorrect Predictions:48.83, 
    Incorrect Work:0.0, Unnessary Work:24.35,
    Incorrect Work Scaled:0.0, Unnessary Work Scaled:8.87,
    Correct Alarms Scaled:91.13
    


In [13]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 20565.0,
    Costs Due to Unnecessary Work:-4435.0, Costs Due to Incorrect Work:-0.0


## Permuted Labels, 20% Label Noise

In [14]:
model_output = train_model(
    means=[0.1, 0.4],
    stdevs=[0.1, 0.2],
    num_samples=1000,
    labels=[0, 1],
    permute_label=True,
    major_label_prob=0.8,
)
metrics = print_results(model_output)


    Actual Issues: 233.71, Alarms:235.48, 
    Correct Alarms:206.59, Incorrect Predictions:56.01, 
    Incorrect Work:0.0, Unnessary Work:28.89,
    Incorrect Work Scaled:0.0, Unnessary Work Scaled:12.27,
    Correct Alarms Scaled:87.73
    


In [15]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 18865.0,
    Costs Due to Unnecessary Work:-6135.0, Costs Due to Incorrect Work:-0.0


## Permuted Labels, 20% Label Noise (Repeat Run)

In [16]:
model_output = train_model(
    means=[0.1, 0.4],
    stdevs=[0.1, 0.2],
    num_samples=1000,
    labels=[0, 1],
    permute_label=True,
    major_label_prob=0.8,
)
metrics = print_results(model_output)


    Actual Issues: 233.59, Alarms:235.38, 
    Correct Alarms:206.83, Incorrect Predictions:55.31, 
    Incorrect Work:0.0, Unnessary Work:28.55,
    Incorrect Work Scaled:0.0, Unnessary Work Scaled:12.13,
    Correct Alarms Scaled:87.87
    


In [17]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 18935.0,
    Costs Due to Unnecessary Work:-6065.0, Costs Due to Incorrect Work:-0.0


# 3 Label Classification

## Non Permuted Labels

In [18]:
model_output = train_model(
    means=[0.1, 0.4, 0.7],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=1000,
    labels=[0, 1, 2],
    permute_label=False,
)
metrics = print_results(model_output)


    Actual Issues: 587.0, Alarms:588.92, 
    Correct Alarms:516.54, Incorrect Predictions:98.27, 
    Incorrect Work:44.57, Unnessary Work:27.81,
    Incorrect Work Scaled:7.57, Unnessary Work Scaled:4.72,
    Correct Alarms Scaled:87.71
    


In [19]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 15070.0,
    Costs Due to Unnecessary Work:-2360.0, Costs Due to Incorrect Work:-7570.0


In [20]:
mydatatest = model_output["data"].copy()
mydatatest["Permuted_Copy"] = mydatatest["Permuted"]
pd.pivot_table(
    mydatatest,
    index="Label",
    columns="Permuted",
    values="Permuted_Copy",
    aggfunc="count",
)

| Label | Permuted = 0.0 |
| --- | --- |
| 0.0 | 1000 |
| 1.0 | 1000 |
| 2.0 | 1000 |


## Permuted Labels, 5% Label Noise

In [21]:
model_output = train_model(
    means=[0.1, 0.4, 0.7],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=1000,
    labels=[0, 1, 2],
    permute_label=True,
    major_label_prob=0.95,
)
metrics = print_results(model_output)


    Actual Issues: 565.39, Alarms:564.59, 
    Correct Alarms:494.48, Incorrect Predictions:97.42, 
    Incorrect Work:43.6, Unnessary Work:26.51,
    Incorrect Work Scaled:7.72, Unnessary Work Scaled:4.7,
    Correct Alarms Scaled:87.58
    


In [22]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 14930.0,
    Costs Due to Unnecessary Work:-2350.0, Costs Due to Incorrect Work:-7720.0


## Permuted Labels, 10% Label Noise

In [23]:
model_output = train_model(
    means=[0.1, 0.4, 0.6],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=1000,
    labels=[0, 1, 2],
    permute_label=True,
    major_label_prob=0.9,
)
metrics = print_results(model_output)


    Actual Issues: 536.98, Alarms:533.57, 
    Correct Alarms:435.15, Incorrect Predictions:126.64, 
    Incorrect Work:73.61, Unnessary Work:24.81,
    Incorrect Work Scaled:13.8, Unnessary Work Scaled:4.65,
    Correct Alarms Scaled:81.55
    


In [24]:
# mydatatest = model_output['data'].copy()
# mydatatest['Permuted_Copy'] = mydatatest['Permuted']
# pd.pivot_table(mydatatest.iloc[1500:,:], index='Label',columns='Permuted', values='Permuted_Copy', aggfunc='count')

In [25]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 8875.0,
    Costs Due to Unnecessary Work:-2325.0, Costs Due to Incorrect Work:-13800.0


## Permuted Labels, 20% Label Noise

In [26]:
model_output = train_model(
    means=[0.1, 0.4, 0.6],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=1000,
    labels=[0, 1, 2],
    permute_label=True,
    major_label_prob=0.8,
)
metrics = print_results(model_output)


    Actual Issues: 468.03, Alarms:466.28, 
    Correct Alarms:371.76, Incorrect Predictions:121.51, 
    Incorrect Work:69.28, Unnessary Work:25.24,
    Incorrect Work Scaled:14.86, Unnessary Work Scaled:5.41,
    Correct Alarms Scaled:79.73
    


In [27]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 7435.0,
    Costs Due to Unnecessary Work:-2705.0, Costs Due to Incorrect Work:-14860.0


## Permuted Labels, 30% Label Noise

In [29]:
model_output = train_model(
    means=[0.1, 0.4, 0.6],
    stdevs=[0.1, 0.2, 0.3],
    num_samples=1000,
    labels=[0, 1, 2],
    permute_label=True,
    major_label_prob=0.7,
)
metrics = print_results(model_output)


    Actual Issues: 406.11, Alarms:408.25, 
    Correct Alarms:307.42, Incorrect Predictions:128.45, 
    Incorrect Work:71.07, Unnessary Work:29.76,
    Incorrect Work Scaled:17.41, Unnessary Work Scaled:7.29,
    Correct Alarms Scaled:75.3
    


# Scratch Work Area

In [30]:
cost_calc(incorrect_work=metrics["i_w_s"], unnecessary_work=metrics["u_w_s"])


    Theoretical Savings: 25000, Actual Savings: 3945.0,
    Costs Due to Unnecessary Work:-3645.0, Costs Due to Incorrect Work:-17410.0


In [35]:
cost_calc(
    incorrect_work=15,
    incorrect_work_value=-1000,
    unnecessary_work=6,
    unnecessary_work_value=-500,
)


    Theoretical Savings: 25000, Actual Savings: 7000,
    Costs Due to Unnecessary Work:-3000, Costs Due to Incorrect Work:-15000
