# Predictive AI for AC Motor Diagnostics - Model Selection

**Purpose:** Experiment with various models and compare different machine learning algorithms, architectures, and configurations to determine which one performs best for our maintenance project.


**Automated ML (AutoML):** Provides tools that automatically discover well-performing models for a dataset with little user intervention.

Key Features:

- Data Preprocessing: Automatically handles tasks like cleaning, feature selection, and feature engineering.

- Model Selection: Chooses the best algorithm for your data, whether the task is classification, regression, clustering, etc.

- Hyperparameter Optimization: Tunes the parameters of the selected model to achieve optimal performance.

- Pipeline Automation: Creates end-to-end workflows, including data preprocessing, model training, and evaluation.

- Scalability: Some AutoML tools are designed to handle large datasets and integrate with distributed systems.

This project combines two ML models as a baseline: Logistic Regression predicts failure occurrence, and a Decision Tree predicts the failure type.

In addition, H2O AutoML will be used for comparison against the baseline.

Finally, a multi-class classifier neural network will be trained and compared with the previous models to select the best model for this project.
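The baseline chains two models: one decides whether a failure occurs, the other names the failure type. The following is a minimal sketch of how the two predictions could be combined at inference time, assuming both models expose a scikit-learn style `predict` method. The helper `predict_two_stage`, the stand-in classes `ThresholdOccurrence` and `ConstantType`, and the 60.0 threshold are hypothetical and exist only so the sketch runs without the dataset.

```python
import numpy as np


def predict_two_stage(occurrence_model, failure_type_model, X,
                      no_failure_label="No Failure"):
    """Stage 1 flags failures; stage 2 names the type for flagged rows only."""
    X = np.asarray(X)
    occurred = np.asarray(occurrence_model.predict(X)).astype(bool)
    labels = np.full(len(X), no_failure_label, dtype=object)
    if occurred.any():
        labels[occurred] = failure_type_model.predict(X[occurred])
    return labels


# Hypothetical stand-ins for the trained models (not the notebook's pipelines).
class ThresholdOccurrence:
    def predict(self, X):
        # Flag a failure when the single feature exceeds a made-up limit.
        return (X[:, 0] > 60.0).astype(int)


class ConstantType:
    def predict(self, X):
        return np.array(["Overstrain Failure"] * len(X), dtype=object)


demo = np.array([[40.0], [65.0], [70.0]])
result = predict_two_stage(ThresholdOccurrence(), ConstantType(), demo)
print(result.tolist())
```

With the notebook's own objects, the two fitted pipelines would presumably take the place of the stand-ins, since both accept the same feature columns.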

# Logistic Regression & Decision Tree Models

In [30]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder, LabelEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.metrics import classification_report, roc_auc_score, confusion_matrix, accuracy_score
import joblib

# Load dataset
df = pd.read_csv("00-AI4I 2020 Predictive Maintenance Dataset.csv")

# Drop irrelevant columns
df = df.drop(columns=["UDI", "Product ID"])

# Define categorical & numerical columns
categorical_columns = ["Type"]
numerical_columns = ["Process temperature [K]", "Torque [Nm]", "Tool wear [min]", "Air temperature [K]", "Rotational speed [rpm]"]

# Preprocessing pipeline
preprocessor = ColumnTransformer(transformers=[("num", StandardScaler(), numerical_columns), ("cat", OneHotEncoder(drop="first"), categorical_columns)])

In [31]:
# Drop 'Failure Type' to prevent data leakage
X_occurrence = df.drop(columns=["Target", "Failure Type"])
y_occurrence = df["Target"]

# Split data
X_train_occ, X_test_occ, y_train_occ, y_test_occ = train_test_split(X_occurrence, y_occurrence, test_size=0.25, random_state=42)

# Logistic Regression
from sklearn.linear_model import LogisticRegression

occurrence_pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression(class_weight="balanced", random_state=42))])

# Train and evaluate
occurrence_pipeline.fit(X_train_occ, y_train_occ)

# Predict first, then score (accuracy_score needs y_pred_occ to exist)
y_pred_occ = occurrence_pipeline.predict(X_test_occ)
accuracy = accuracy_score(y_test_occ, y_pred_occ)
cm = confusion_matrix(y_test_occ, y_pred_occ)

print(f"Accuracy: {accuracy:.4f}\n")
print("Occurrence Model Performance:\n", classification_report(y_test_occ, y_pred_occ))
print("ROC AUC Score:", roc_auc_score(y_test_occ, occurrence_pipeline.predict_proba(X_test_occ)[:, 1]), "\n")
print("Confusion Matrix:\n", cm)

Accuracy: 0.8151

Occurrence Model Performance:
               precision    recall  f1-score   support

           0       0.99      0.81      0.89      2408
           1       0.14      0.82      0.24        90

    accuracy                           0.82      2498
   macro avg       0.57      0.82      0.57      2498
weighted avg       0.96      0.82      0.87      2498

ROC AUC Score: 0.9107927279438908 

Confusion Matrix:
 [[1962  446]
 [  16   74]]
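The headline numbers can be re-derived by hand from the confusion matrix printed above, which makes the trade-off between recall and precision explicit. This is a small arithmetic sketch; it assumes the scikit-learn convention that rows are actual classes and columns are predicted classes.

```python
# Confusion matrix from the output above: rows = actual, columns = predicted
tn, fp = 1962, 446   # actual 0: correctly passed, falsely flagged
fn, tp = 16, 74      # actual 1: missed failures, detected failures

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision_fail = tp / (tp + fp)   # share of flagged rows that really failed
recall_fail = tp / (tp + fn)      # share of real failures that were flagged
f1_fail = 2 * precision_fail * recall_fail / (precision_fail + recall_fail)

print(f"accuracy={accuracy:.4f}")
print(f"precision={precision_fail:.2f} recall={recall_fail:.2f} f1={f1_fail:.2f}")
```

The values agree with the classification report: most failures are caught, but only about one in seven flagged rows is an actual failure.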


In [32]:
# Extract failure cases
failure_subset = df[df["Target"] == 1].copy()

# Drop 'Target' to prevent leakage
X_failure = failure_subset.drop(columns=["Target", "Failure Type"])
y_failure = failure_subset["Failure Type"]

# Encode failure type labels
label_encoder = LabelEncoder()
y_failure_encoded = label_encoder.fit_transform(y_failure)

# Split data
X_train_fail, X_test_fail, y_train_fail, y_test_fail = train_test_split(X_failure, y_failure_encoded, test_size=0.25, random_state=42)

# Decision Tree
from sklearn.tree import DecisionTreeClassifier

failure_type_pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("classifier", DecisionTreeClassifier(max_depth=5, class_weight="balanced", random_state=42))])

# Train and evaluate
failure_type_pipeline.fit(X_train_fail, y_train_fail)

y_pred_fail = failure_type_pipeline.predict(X_test_fail)
accuracy_fail = accuracy_score(y_test_fail, y_pred_fail)

print(f"Accuracy: {accuracy_fail:.4f}\n")
print("Failure Type Model Performance:\n", classification_report(y_test_fail, y_pred_fail, target_names=label_encoder.classes_))

Accuracy: 0.8675

Failure Type Model Performance:
                           precision    recall  f1-score   support

Heat Dissipation Failure       0.88      0.88      0.88        32
      Overstrain Failure       0.84      0.91      0.88        23
           Power Failure       0.81      0.81      0.81        16
       Tool Wear Failure       1.00      0.83      0.91        12

                accuracy                           0.87        83
               macro avg       0.88      0.86      0.87        83
            weighted avg       0.87      0.87      0.87        83
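The `macro avg` and `weighted avg` rows of the report differ only in how the per-class scores are averaged. The sketch below reproduces both for recall. The per-class correct counts are not printed in the report; they are inferred here as the only integers consistent with the rounded recall and support columns, so treat them as an assumption.

```python
# (support, correctly classified) per class; the correct counts are inferred
per_class = {
    "Heat Dissipation Failure": (32, 28),
    "Overstrain Failure": (23, 21),
    "Power Failure": (16, 13),
    "Tool Wear Failure": (12, 10),
}

recalls = {name: hit / n for name, (n, hit) in per_class.items()}
total_rows = sum(n for n, _ in per_class.values())

# Macro average: every class counts equally, regardless of size
macro_recall = sum(recalls.values()) / len(recalls)

# Weighted average: each class is weighted by its support, which for
# recall is the same as overall accuracy
weighted_recall = sum(recalls[name] * n for name, (n, _) in per_class.items()) / total_rows

print(f"macro recall={macro_recall:.2f} weighted recall={weighted_recall:.4f}")
```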



# H2O AutoML Models

In [None]:
!pip install h2o
import h2o
from h2o.automl import H2OAutoML
h2o.init()

In [37]:
df = h2o.import_file("00-AI4I 2020 Predictive Maintenance Dataset.csv")

# Drop irrelevant columns
df = df.drop(["\ufeffUDI", "Product ID"], axis=1)

# Convert categorical variables
df["Type"] = df["Type"].asfactor()
df["Failure Type"] = df["Failure Type"].asfactor()
df["Target"] = df["Target"].asfactor()

# Split dataset
train, test = df.split_frame(ratios=[0.75], seed=42)

Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%


In [38]:
# Drop 'Failure Type' (to prevent leakage)
X_occurrence = train.drop(["Target", "Failure Type"])
y_occurrence = train["Target"]

# Run H2O AutoML
aml_occ = H2OAutoML(max_models=20, seed=42)
aml_occ.train(x=X_occurrence.columns, y="Target", training_frame=train)

# View leaderboard
aml_occ.leaderboard

AutoML progress: |███████████████████████████████████████████████████████████████| (done) 100%


model_id,auc,logloss,aucpr,mean_per_class_error,rmse,mse
StackedEnsemble_BestOfFamily_1_AutoML_1_20250502_35213,0.990494,0.0365959,0.875379,0.121717,0.0969116,0.00939185
StackedEnsemble_AllModels_1_AutoML_1_20250502_35213,0.990378,0.0360048,0.876592,0.104176,0.0956268,0.00914448
GBM_grid_1_AutoML_1_20250502_35213_model_2,0.989617,0.0392271,0.868859,0.134031,0.0987427,0.00975012
XGBoost_grid_1_AutoML_1_20250502_35213_model_3,0.989173,0.0405006,0.847647,0.11711,0.103841,0.0107829
GBM_grid_1_AutoML_1_20250502_35213_model_1,0.988535,0.0411452,0.851772,0.116835,0.101067,0.0102146
XGBoost_grid_1_AutoML_1_20250502_35213_model_1,0.987619,0.043358,0.859366,0.122268,0.102287,0.0104626
XGBoost_3_AutoML_1_20250502_35213,0.987179,0.042918,0.830371,0.120253,0.106602,0.0113641
XGBoost_grid_1_AutoML_1_20250502_35213_model_2,0.984965,0.0478057,0.830905,0.144924,0.107681,0.0115953
GBM_5_AutoML_1_20250502_35213,0.984911,0.0462905,0.822868,0.108559,0.108982,0.0118771
GBM_2_AutoML_1_20250502_35213,0.984351,0.0445754,0.834103,0.12759,0.106861,0.0114193


In [44]:
# Get the best model from AutoML
best_model = aml_occ.leader

# Print model performance (no test_data is passed, so these are training metrics)
best_model.model_performance().show()

Actual/Predicted,0,1,Error,Rate
0,7250.0,8.0,0.0011,(8.0/7258.0)
1,2.0,243.0,0.0082,(2.0/245.0)
Total,7252.0,251.0,0.0013,(10.0/7503.0)

metric,threshold,value,idx
max f1,0.3392489,0.9798387,154.0
max f2,0.3276167,0.9878543,158.0
max f0point5,0.5856758,0.9862188,133.0
max accuracy,0.348285,0.9986672,152.0
max precision,0.9985905,1.0,0.0
max recall,0.2152248,1.0,174.0
max specificity,0.9985905,1.0,0.0
max absolute_mcc,0.3392489,0.9792263,154.0
max min_per_class_accuracy,0.2152248,0.9960044,174.0
max mean_per_class_accuracy,0.2152248,0.9980022,174.0

group,cumulative_data_fraction,lower_threshold,lift,cumulative_lift,response_rate,score,cumulative_response_rate,cumulative_score,capture_rate,cumulative_capture_rate,gain,cumulative_gain,kolmogorov_smirnov
1,0.0101293,0.9701377,30.6244898,30.6244898,1.0,0.9854233,1.0,0.9854233,0.3102041,0.3102041,2962.4489796,2962.4489796,0.3102041
2,0.0201253,0.9001471,30.6244898,30.6244898,1.0,0.9433413,1.0,0.9645216,0.3061224,0.6163265,2962.4489796,2962.4489796,0.6163265
3,0.0301213,0.5903866,30.6244898,30.6244898,1.0,0.8074028,1.0,0.9123804,0.3061224,0.922449,2962.4489796,2962.4489796,0.922449
4,0.0401173,0.1523257,7.7582041,24.9269103,0.2533333,0.2926038,0.8139535,0.757951,0.077551,1.0,675.8204082,2392.6910299,0.9922844
5,0.0501133,0.0818197,0.0,19.9547872,0.0,0.1107732,0.6515957,0.6288597,0.0,1.0,-100.0,1895.4787234,0.981951
6,0.1000933,0.0111646,0.0,9.9906791,0.0,0.0352184,0.3262317,0.3324343,0.0,1.0,-100.0,899.0679095,0.9302838
7,0.1500733,0.0025825,0.0,6.6634103,0.0,0.00587,0.2175844,0.2236762,0.0,1.0,-100.0,566.3410302,0.8786167
8,0.2000533,0.0011875,0.0,4.9986676,0.0,0.0016574,0.1632245,0.1682085,0.0,1.0,-100.0,399.8667555,0.8269496
9,0.3000133,0.0006629,0.0,3.3331853,0.0,0.0008494,0.1088405,0.1124469,0.0,1.0,-100.0,233.3185251,0.7236153
10,0.3999733,0.0004972,0.0,2.5001666,0.0,0.0005722,0.0816395,0.0844876,0.0,1.0,-100.0,150.0166611,0.6202811
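The `lift` column in the gains table above can be read as the group's response rate divided by the overall failure rate. A short arithmetic check, using only figures from the confusion matrix and the table, shows where the value 30.62 comes from:

```python
positives, rows = 245, 7503        # failure rows and total rows in the confusion matrix
base_rate = positives / rows       # overall failure rate in the scored frame

# Groups 1 to 3 contain only failures (response_rate = 1.0)
lift_group_1 = 1.0 / base_rate
# Group 4 has response_rate = 0.2533333
lift_group_4 = 0.2533333 / base_rate

print(f"base rate={base_rate:.4f}")
print(f"lift group 1={lift_group_1:.4f} lift group 4={lift_group_4:.4f}")
```

A lift of about 30 in the top groups means that the highest-scored 3% of rows contain failures roughly 30 times more often than a random sample would.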


In [54]:
# Generate predictions on the test set
pred_occurrence = aml_occ.leader.predict(test)

# Convert predictions to a Pandas dataframe for easier analysis
pred_occurrence_df = pred_occurrence.as_data_frame()
test_df = test.as_data_frame()

# Merge predictions with actual values
results_occ = test_df[["Target"]].copy()
results_occ["Predicted_Target"] = pred_occurrence_df["predict"]
print(results_occ.head(10))

stackedensemble prediction progress: |███████████████████████████████████████████| (done) 100%
   Target  Predicted_Target
0       0                 0
1       0                 0
2       0                 0
3       0                 0
4       0                 0
5       0                 0
6       0                 0
7       0                 0
8       0                 0
9       0                 0






In [45]:
# Keep only failure cases
failure_subset = train[train["Target"] == "1"]

# Drop 'Target' (to prevent leakage)
X_failure = failure_subset.drop(["Target", "Failure Type"])
y_failure = failure_subset["Failure Type"]

# Run H2O AutoML
aml_fail = H2OAutoML(max_models=20, seed=42)
aml_fail.train(x=X_failure.columns, y="Failure Type", training_frame=failure_subset)

# View leaderboard
aml_fail.leaderboard

AutoML progress: |██
04:13:55.783: GBM_1_AutoML_2_20250502_41346 [GBM def_5] failed: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for GBM model: GBM_1_AutoML_2_20250502_41346.  Details: ERRR on field: _min_rows: The dataset size is too small to split for min_rows=100.0: must have at least 200.0 (weighted) rows, but have only 196.0.


███████████████████████████████████████

model_id,mean_per_class_error,logloss,rmse,mse
StackedEnsemble_BestOfFamily_1_AutoML_2_20250502_41346,0.0364361,0.160144,0.220327,0.048544
DeepLearning_grid_2_AutoML_2_20250502_41346_model_1,0.0461392,0.210546,0.245451,0.0602463
StackedEnsemble_AllModels_1_AutoML_2_20250502_41346,0.046664,0.170375,0.231208,0.0534571
GBM_grid_1_AutoML_2_20250502_41346_model_2,0.0511187,0.257031,0.258977,0.0670691
DeepLearning_grid_3_AutoML_2_20250502_41346_model_1,0.0550764,0.337899,0.281971,0.0795076
XGBoost_grid_1_AutoML_2_20250502_41346_model_3,0.0567778,0.25475,0.270817,0.0733416
GBM_3_AutoML_2_20250502_41346,0.059521,0.261552,0.269288,0.0725162
GBM_4_AutoML_2_20250502_41346,0.0599524,0.249503,0.264977,0.0702127
DeepLearning_grid_1_AutoML_2_20250502_41346_model_1,0.0633201,0.255994,0.272006,0.0739873
XGBoost_1_AutoML_2_20250502_41346,0.0640241,0.399785,0.355488,0.126372


In [46]:
# Get the best model from AutoML
best_model = aml_fail.leader

# Print detailed model performance (no test_data is passed, so these are training metrics)
best_model.model_performance().show()

Actual/Predicted,Heat Dissipation Failure,No Failure,Overstrain Failure,Power Failure,Random Failures,Tool Wear Failure,Error,Rate
Heat Dissipation Failure,84.0,0.0,0.0,0.0,0.0,0.0,0.0,0 / 84
No Failure,0.0,0.0,0.0,0.0,0.0,0.0,,0 / 0
Overstrain Failure,0.0,0.0,62.0,0.0,0.0,0.0,0.0,0 / 62
Power Failure,0.0,0.0,0.0,69.0,0.0,0.0,0.0,0 / 69
Random Failures,0.0,0.0,0.0,0.0,0.0,0.0,,0 / 0
Tool Wear Failure,0.0,0.0,0.0,0.0,0.0,30.0,0.0,0 / 30
Total,84.0,0.0,62.0,69.0,0.0,30.0,0.0,0 / 245

k,hit_ratio
1,1.0
2,1.0
3,1.0
4,1.0
5,1.0
6,1.0


In [53]:
# Filter test data where failure occurred (Target = 1)
failure_test_subset = test[test["Target"] == "1"]

# Generate predictions for failure type classification
pred_failure_type = aml_fail.leader.predict(failure_test_subset)

# Convert predictions to a Pandas dataframe
pred_failure_df = pred_failure_type.as_data_frame()
failure_test_df = failure_test_subset.as_data_frame()

# Merge predictions with actual failure types
results_failure = failure_test_df[["Failure Type"]].copy()
results_failure["Predicted_Failure_Type"] = pred_failure_df["predict"]
print(results_failure.head(10))

stackedensemble prediction progress: |███████████████████████████████████████████| (done) 100%
         Failure Type Predicted_Failure_Type
0       Power Failure          Power Failure
1       Power Failure          Power Failure
2       Power Failure          Power Failure
3  Overstrain Failure     Overstrain Failure
4  Overstrain Failure     Overstrain Failure
5  Overstrain Failure     Overstrain Failure
6       Power Failure          Power Failure
7   Tool Wear Failure      Tool Wear Failure
8       Power Failure          Power Failure
9  Overstrain Failure     Overstrain Failure






# Deep Learning NN Model

In [60]:
import tensorflow as tf
from tensorflow import keras
from keras.api.layers import Input, Dense, Dropout
from keras.api.models import Model
from keras.api.utils import to_categorical
from sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
import numpy as np

# Load dataset
df = pd.read_csv("00-AI4I 2020 Predictive Maintenance Dataset.csv")

# Drop irrelevant columns
df = df.drop(columns=["UDI", "Product ID"])

# Encode categorical features
df["Type"] = df["Type"].astype("category").cat.codes

# Encode target variables
label_encoder = LabelEncoder()
df["Failure Type"] = label_encoder.fit_transform(df["Failure Type"])
df["Target"] = df["Target"].astype("int")

# Separate features for both models
X = df.drop(columns=["Target", "Failure Type"])
y_occurrence = df["Target"]
y_failure = df[df["Target"] == 1]["Failure Type"]

# Standardize numerical features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# One-hot encode failure type labels
y_failure_encoded = to_categorical(y_failure)

# Split data
X_train_occ, X_test_occ, y_train_occ, y_test_occ = train_test_split(X_scaled, y_occurrence, test_size=0.25, random_state=42)
X_train_fail, X_test_fail, y_train_fail, y_test_fail = train_test_split(X_scaled[y_occurrence == 1], y_failure_encoded, test_size=0.25, random_state=42)

In [65]:
import tensorflow as tf

# Define input shape
input_shape = X_train_occ.shape[1]

# Shared Input Layer
inputs = Input(shape=(input_shape,))
x = Dense(128, activation="relu")(inputs)
x = Dropout(0.2)(x)
x = Dense(64, activation="relu")(x)

# Output 1: Binary Classification (Failure Occurrence)
output_occ = Dense(1, activation="sigmoid", name="occurrence_output")(x)

# Output 2: Multi-Class Classification (Failure Type)
output_fail = Dense(y_failure_encoded.shape[1], activation="softmax", name="failure_type_output")(x)

# Define Model
model = Model(inputs=inputs, outputs=[output_occ, output_fail])

# Compile Model
model.compile(optimizer="adam",
              loss={"occurrence_output": "binary_crossentropy", "failure_type_output": "categorical_crossentropy"},
              metrics={"occurrence_output": "accuracy", "failure_type_output": "accuracy"})

# Print Model Summary
model.summary()

In [67]:
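# Train both outputs on failure rows only (rows where Target == 1)
X_train_occ_filtered = X_train_occ[y_train_occ == 1]

# Truncate so the row count does not exceed len(y_train_fail).
# Caveat: y_train_fail comes from a separate train_test_split call, so these
# rows and labels are not guaranteed to refer to the same samples.
X_train_occ_filtered = X_train_occ_filtered[:len(y_train_fail)]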

# Now train the model
history = model.fit(X_train_occ_filtered,  # Use filtered X_train_occ
                    {"occurrence_output": y_train_occ[y_train_occ == 1][:len(y_train_fail)],
                     "failure_type_output": y_train_fail},
                    epochs=50,
                    batch_size=32,
                    validation_split=0.2)

Epoch 1/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 165ms/step - failure_type_output_accuracy: 0.1903 - failure_type_output_loss: 1.4503 - loss: 2.0281 - occurrence_output_accuracy: 0.8203 - occurrence_output_loss: 0.5778 - val_failure_type_output_accuracy: 0.2708 - val_failure_type_output_loss: 1.3882 - val_loss: 1.9172 - val_occurrence_output_accuracy: 0.9792 - val_occurrence_output_loss: 0.5003
Epoch 2/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 1s 48ms/step - failure_type_output_accuracy: 0.2239 - failure_type_output_loss: 1.4495 - loss: 1.9164 - occurrence_output_accuracy: 0.9691 - occurrence_output_loss: 0.4669 - val_failure_type_output_accuracy: 0.2708 - val_failure_type_output_loss: 1.3613 - val_loss: 1.7802 - val_occurrence_output_accuracy: 1.0000 - val_occurrence_output_loss: 0.3848
Epoch 3/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - failure_type_output_accuracy: 0.2784 - failure_type_output_loss: 1.4429

In [68]:
# Generate predictions for failure occurrence
pred_occurrence = model.predict(X_test_occ)

# Extract binary predictions
pred_occ_binary = (pred_occurrence[0] > 0.5).astype(int)

# Compare predictions with actual values
results_occ = pd.DataFrame({"Actual_Target": y_test_occ, "Predicted_Target": pred_occ_binary.flatten()})

# Print sample predictions
print(results_occ.head())

79/79 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step
      Actual_Target  Predicted_Target
4023              0                 1
7346              0                 1
487               0                 1
39                0                 1
6797              0                 1


In [69]:
# Generate predictions for failure type classification (only for Target = 1)
pred_failure_type = model.predict(X_test_fail)

# Convert softmax outputs to class labels
pred_fail_labels = np.argmax(pred_failure_type[1], axis=1)

# Compare predictions with actual failure types
results_fail = pd.DataFrame({
    "Actual_Failure_Type": np.argmax(y_test_fail, axis=1),
    "Predicted_Failure_Type": pred_fail_labels
})

# Decode failure type labels back to original categories
results_fail["Actual_Failure_Type"] = label_encoder.inverse_transform(results_fail["Actual_Failure_Type"])
results_fail["Predicted_Failure_Type"] = label_encoder.inverse_transform(results_fail["Predicted_Failure_Type"])

# Print sample predictions
print(results_fail.head())

3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
   Actual_Failure_Type  Predicted_Failure_Type
0                    2                       3
1                    0                       3
2                    0                       3
3                    3                       3
4                    5                       3


In [72]:
from sklearn.metrics import classification_report, accuracy_score

# Accuracy of failure occurrence prediction
acc_occ = accuracy_score(y_test_occ, pred_occ_binary)
print(f"Accuracy: {acc_occ:.4f}\n")

# Classification report for failure type prediction
print("Performance:\n", classification_report(results_fail["Actual_Failure_Type"], results_fail["Predicted_Failure_Type"]))

Accuracy: 0.0360

Performance:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00        32
           2       0.00      0.00      0.00        23
           3       0.19      1.00      0.32        16
           5       0.00      0.00      0.00        12

    accuracy                           0.19        83
   macro avg       0.05      0.25      0.08        83
weighted avg       0.04      0.19      0.06        83
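The report above is what a classifier produces when it assigns the same class to every row. The sketch below reproduces the numbers from the support column alone. It assumes that all 83 rows were predicted as encoded label 3 (the only class with non-zero recall) and that the occurrence test split holds the same 90 failures in 2,498 rows as the logistic regression split.

```python
support = {0: 32, 2: 23, 3: 16, 5: 12}   # test rows per encoded failure type
predicted_class = 3                       # assumed constant prediction

n = sum(support.values())
accuracy = support[predicted_class] / n   # only the class-3 rows are right
precision = support[predicted_class] / n  # all n rows are predicted as class 3
recall = 1.0                              # every class-3 row is "found"
f1 = 2 * precision * recall / (precision + recall)
macro_f1 = f1 / len(support)              # the other three classes score 0

# Occurrence output: predicting "failure" for every row scores the failure rate
occ_accuracy = 90 / 2498

print(f"type accuracy={accuracy:.2f} f1={f1:.2f} macro f1={macro_f1:.2f}")
print(f"occurrence accuracy={occ_accuracy:.4f}")
```

Both outputs therefore behave like constant predictors rather than models that have learned from the features.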





# Results:

**Logistic Regression & Decision Trees:**

- Accuracy is 81.51% for failure occurrence and 86.75% for failure type.

- High ROC AUC of 0.91, indicating strong separation between the failure and no-failure classes.

- Good recall of 82% for detecting failures, at the cost of low precision (14%): 446 of the 2,408 healthy test rows are flagged as failures.

- Failure type classification handles multiple categories fairly well.

- Possible Improvements: Hyperparameter tuning and using a random forest will most likely yield better results while keeping the solution fairly simple and accurate.

**H2O AutoML (Stacked Ensemble) Models:**

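- Very high AUC at 99.99% on the training data; the cross-validated leaderboard AUC of the leading model is 0.990.

- Very good classification confidence, with a LogLoss of 0.01 on the training data (0.037 on the leaderboard).

- Near-zero misclassification of failure types: 0 of 245 training rows are misclassified, and the leaderboard mean per-class error is 0.036.

- Possible Improvements: Better handling of class imbalance and an assessment of feature importance.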

**Neural Network:**

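- The failure-type output predicts a single class (encoded label 3, Power Failure) for every test row, and the occurrence output reaches only 3.6% accuracy, which is consistent with predicting a failure for every row.

- This model requires major fixes and a significant architecture change to be viable: as trained here, the occurrence output only ever sees failure rows (Target = 1), and the failure-type labels come from a different split than the feature rows they are paired with.

- Possible Improvements: The complexity of the required changes outweighs the benefit of using this model, so it is not a viable solution for this project.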
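One of the fixes the neural network would need is a single, shared split. In the training cell, the failure-type labels come from a second `train_test_split` call, so they are paired with feature rows from a different shuffle. The sketch below shows one way to keep rows and labels aligned by splitting row indices once and deriving both training sets from them. It uses NumPy only; `split_once` and the tiny synthetic arrays are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np


def split_once(n_rows, test_fraction=0.25, seed=42):
    """Shuffle row indices once and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(np.ceil(test_fraction * n_rows))
    return order[n_test:], order[:n_test]


# Tiny synthetic stand-in for the dataset (made-up values).
X = np.arange(20, dtype=float).reshape(10, 2)
y_occurrence = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0, 1])
# Failure-type codes are only meaningful where y_occurrence == 1.
y_failure_type = np.array([0, 2, 0, 0, 3, 2, 0, 5, 0, 3])

train_idx, test_idx = split_once(len(X))

# Occurrence output: all training rows, so it sees both classes.
X_train_occ, y_train_occ = X[train_idx], y_occurrence[train_idx]

# Failure-type output: failure rows taken from the SAME training indices,
# so every feature row keeps its own label.
fail_idx = train_idx[y_occurrence[train_idx] == 1]
X_train_fail, y_train_fail = X[fail_idx], y_failure_type[fail_idx]

print(len(X_train_occ), len(X_train_fail), len(y_train_fail))
```

Because both training sets are indexed from `train_idx`, no truncation step is needed to make the array lengths match.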

**Model Selection:**
The models selected for the Predictive AI for AC Motor Diagnostics project are the Stacked Ensemble models built with the H2O AutoML library. The backup models are Logistic Regression in combination with a Decision Tree. A neural network solution is not viable for this project.
