# Machine Predictive Maintenance Classification
## Dataset to predict machine failure (binary) and type (multiclass)

Machine Predictive Maintenance Classification Dataset
Since real predictive maintenance datasets are generally difficult to obtain and in particular difficult to publish, we present and provide a synthetic dataset that reflects real predictive maintenance encountered in the industry to the best of our knowledge.

The dataset consists of 10,000 data points stored as rows, with features in columns. The original description lists 14 features; the CSV loaded in this notebook has 10 columns.

UDI: unique identifier ranging from 1 to 10000
productID: consisting of a letter L, M, or H for low (50% of all products), medium (30%), and high (20%) as product quality variants and a variant-specific serial number
air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K
process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K.
rotational speed [rpm]: calculated from a power of 2860 W, overlaid with normally distributed noise
torque [Nm]: torque values are normally distributed around 40 Nm with σ = 10 Nm and no negative values.
tool wear [min]: The quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process.
and a 'machine failure' label that indicates whether the machine has failed at this particular data point due to any of its failure modes.

Important: there are two targets. Do not use either of them as a feature, as that leads to label leakage.
Target: failure or not (binary)
Failure Type: type of failure (multiclass)
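
A minimal sketch of keeping both targets out of the feature matrix. The frame below is hand-made for illustration and only borrows the dataset's column names; its values are not real rows.

```python
import pandas as pd

# Toy frame with the dataset's column names; values are illustrative only.
toy = pd.DataFrame({
    "Torque [Nm]": [42.8, 65.0, 4.0],
    "Tool wear [min]": [0, 191, 143],
    "Target": [0, 1, 1],
    "Failure Type": ["No Failure", "Overstrain Failure", "Power Failure"],
})

target_cols = ["Target", "Failure Type"]

# Features must exclude BOTH labels, whichever one is being predicted.
X = toy.drop(columns=target_cols)
y_binary = toy["Target"]
y_type = toy["Failure Type"]

leaked = set(target_cols) & set(X.columns)
print("leaked label columns:", leaked)
```

If `leaked` is ever non-empty, the model can read the answer directly from its inputs and the evaluation scores stop being meaningful.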

In [1]:
import pandas as pd 
import numpy as np

In [2]:
df=pd.read_csv(r'C:\Users\Hazem Moustafa\Desktop\Data Analysis\predictive maintence project\predictive_maintenance.csv')

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 10 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   UDI                      10000 non-null  int64  
 1   Product ID               10000 non-null  object 
 2   Type                     10000 non-null  object 
 3   Air temperature [K]      10000 non-null  float64
 4   Process temperature [K]  10000 non-null  float64
 5   Rotational speed [rpm]   10000 non-null  int64  
 6   Torque [Nm]              10000 non-null  float64
 7   Tool wear [min]          10000 non-null  int64  
 8   Target                   10000 non-null  int64  
 9   Failure Type             10000 non-null  object 
dtypes: float64(3), int64(4), object(3)
memory usage: 781.4+ KB


In [4]:
df['Failure Type'].nunique()

6

In [5]:
df['Target'].unique()

array([0, 1])

In [6]:
df.isna().sum()

UDI                        0
Product ID                 0
Type                       0
Air temperature [K]        0
Process temperature [K]    0
Rotational speed [rpm]     0
Torque [Nm]                0
Tool wear [min]            0
Target                     0
Failure Type               0
dtype: int64

In [7]:
df.describe()

| | UDI | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target |
|---|---|---|---|---|---|---|---|
| count | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 |
| mean | 5000.5 | 300.00493 | 310.00556 | 1538.7761 | 39.98691 | 107.951 | 0.0339 |
| std | 2886.89568 | 2.000259 | 1.483734 | 179.284096 | 9.968934 | 63.654147 | 0.180981 |
| min | 1.0 | 295.3 | 305.7 | 1168.0 | 3.8 | 0.0 | 0.0 |
| 25% | 2500.75 | 298.3 | 308.8 | 1423.0 | 33.2 | 53.0 | 0.0 |
| 50% | 5000.5 | 300.1 | 310.1 | 1503.0 | 40.1 | 108.0 | 0.0 |
| 75% | 7500.25 | 301.5 | 311.1 | 1612.0 | 46.8 | 162.0 | 0.0 |
| max | 10000.0 | 304.5 | 313.8 | 2886.0 | 76.6 | 253.0 | 1.0 |


In [8]:
df.head()

| | UDI | Product ID | Type | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target | Failure Type |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | M14860 | M | 298.1 | 308.6 | 1551 | 42.8 | 0 | 0 | No Failure |
| 1 | 2 | L47181 | L | 298.2 | 308.7 | 1408 | 46.3 | 3 | 0 | No Failure |
| 2 | 3 | L47182 | L | 298.1 | 308.5 | 1498 | 49.4 | 5 | 0 | No Failure |
| 3 | 4 | L47183 | L | 298.2 | 308.6 | 1433 | 39.5 | 7 | 0 | No Failure |
| 4 | 5 | L47184 | L | 298.2 | 308.7 | 1408 | 40.0 | 9 | 0 | No Failure |


In [9]:
df.duplicated().sum()

np.int64(0)

In [10]:
df.shape

(10000, 10)

## Predict Failure (Binary Classification) and Predict Failure Type (Multi-class Classification)

In [11]:
x=df.drop(columns=['Target','Failure Type','UDI','Product ID'])

In [12]:
x=pd.get_dummies(x,columns=['Type'],drop_first=True) ## drop one category after one-hot encoding Type to avoid perfect multicollinearity

In [13]:
df['Type'].unique()

array(['M', 'L', 'H'], dtype=object)
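
To make the effect of `drop_first=True` concrete, here is a small sketch on a toy `Type` column (the four values are illustrative). It shows which category becomes the implicit baseline.

```python
import pandas as pd

toy = pd.DataFrame({"Type": ["M", "L", "H", "L"]})

# Categories are sorted (H, L, M); drop_first removes the first one, H.
encoded = pd.get_dummies(toy, columns=["Type"], drop_first=True).astype(int)
print(encoded)

# A row with Type_L == 0 and Type_M == 0 therefore means Type == 'H'.
```

Only `Type_L` and `Type_M` remain, so `H` is encoded as both flags being zero.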

In [14]:
y=df['Target']

In [15]:
x.astype(int)  # display only: the result is not assigned, so x itself is unchanged

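| | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Type_L | Type_M |
|---|---|---|---|---|---|---|---|
| 0 | 298 | 308 | 1551 | 42 | 0 | 0 | 1 |
| 1 | 298 | 308 | 1408 | 46 | 3 | 1 | 0 |
| 2 | 298 | 308 | 1498 | 49 | 5 | 1 | 0 |
| 3 | 298 | 308 | 1433 | 39 | 7 | 1 | 0 |
| 4 | 298 | 308 | 1408 | 40 | 9 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 298 | 308 | 1604 | 29 | 14 | 0 | 1 |
| 9996 | 298 | 308 | 1632 | 31 | 17 | 0 | 0 |
| 9997 | 299 | 308 | 1645 | 33 | 22 | 0 | 1 |
| 9998 | 299 | 308 | 1408 | 48 | 25 | 0 | 0 |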


In [16]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=42, stratify=y) ## keep the same proportion of failures in both train and test
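
A quick way to check what `stratify` does is to compare failure rates before and after the split. The sketch below uses synthetic labels with the same class counts as `Target` and a placeholder feature, so it runs without the CSV.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels with the same class counts as the notebook's Target column.
y_demo = np.array([0] * 9661 + [1] * 339)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("failure rate, full :", y_demo.mean())
print("failure rate, train:", y_tr.mean())
print("failure rate, test :", y_te.mean())
```

All three rates should agree to within a fraction of a percent, which is what keeps the test-set recall comparable to the training distribution.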

In [17]:
from sklearn.linear_model import LogisticRegression

binary=LogisticRegression(max_iter=10000)
binary.fit(x_train,y_train)

LogisticRegression parameters: penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=10000


In [18]:
y_pred=binary.predict(x_test)

In [19]:
y_proba=binary.predict_proba(x_test)[:,1] ## failure probability (risk) for each prediction
threshold=0.25  # lower than the default of 0.5

y_pred_custom=(y_proba >=threshold ).astype(int)


In [20]:
print(y_proba)


[0.63293091 0.02330991 0.04520726 ... 0.00716183 0.03245584 0.00086428]


In [21]:
from sklearn.metrics import classification_report ,confusion_matrix, roc_auc_score 

In [22]:
print('Confusion Matrix:',confusion_matrix(y_test,y_pred))

print('Classification Report :',classification_report(y_test,y_pred))

print('ROC_AUC:',roc_auc_score(y_test,y_proba))

print('Confusion Matrix-custom:',confusion_matrix(y_test,y_pred_custom))

print('Classification Report_custom:',classification_report(y_test,y_pred_custom))


# Confusion matrix layout (rows = actual, columns = predicted):
#            predicted 0   predicted 1
# actual 0       TN            FP
# actual 1       FN            TP

Confusion Matrix: [[1927    5]
 [  58   10]]
Classification Report :               precision    recall  f1-score   support

           0       0.97      1.00      0.98      1932
           1       0.67      0.15      0.24        68

    accuracy                           0.97      2000
   macro avg       0.82      0.57      0.61      2000
weighted avg       0.96      0.97      0.96      2000

ROC_AUC: 0.8994489099987821
Confusion Matrix-custom: [[1903   29]
 [  40   28]]
Classification Report_custom:               precision    recall  f1-score   support

           0       0.98      0.98      0.98      1932
           1       0.49      0.41      0.45        68

    accuracy                           0.97      2000
   macro avg       0.74      0.70      0.72      2000
weighted avg       0.96      0.97      0.96      2000



### The logistic regression model separates failure from no failure with a ROC AUC of about 0.90, but at the default threshold it catches only 10 of 68 failures (recall 0.15).
### It can be improved by raising failure recall through class balancing or stronger algorithms.
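
One way to read the ROC AUC figure: it is the probability that a randomly chosen failure receives a higher score than a randomly chosen non-failure. The sketch below checks this on synthetic scores (not the notebook's predictions) by counting correctly ordered pairs and comparing with `roc_auc_score`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic scores: positives tend to score higher than negatives.
y_true = np.array([0] * 200 + [1] * 20)
scores = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.5, 1.0, 20)])

pos = scores[y_true == 1]
neg = scores[y_true == 0]

# Fraction of (positive, negative) pairs ranked correctly; ties count as half.
diff = pos[:, None] - neg[None, :]
pairwise_auc = (diff > 0).mean() + 0.5 * (diff == 0).mean()

print("pairwise estimate:", pairwise_auc)
print("roc_auc_score    :", roc_auc_score(y_true, scores))
```

Because the metric only depends on ranking, a high ROC AUC can coexist with low recall at a fixed threshold, which is the situation in the report above.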

In [23]:
y.value_counts()

Target
0    9661
1     339
Name: count, dtype: int64
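
The counts above mean roughly 3.4% of rows are failures. As a sketch of what `class_weight='balanced'` (used in the Random Forest cells of this notebook) does with such counts, the weights can be computed directly. The label array is rebuilt from the counts, so no CSV is needed.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Rebuild a label vector with the class counts shown above.
y_demo = np.array([0] * 9661 + [1] * 339)

# 'balanced' weight for class c = n_samples / (n_classes * count_c)
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_demo)
print(dict(zip([0, 1], weights.round(3))))
```

Each failure row ends up weighted about 28 times more heavily than a non-failure row (9661 / 339), which pushes the model toward catching the rare class.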

## Lowering the classification threshold from 0.3 to 0.25 further reduced false negatives (missed failures) from 45 to 40, improving failure recall.
This improvement came at the cost of a moderate increase in false alarms (from 23 to 29), which is considered acceptable in a predictive maintenance context, where missing a failure is more costly than an additional inspection. Relative to the default 0.5 threshold in the output above, the 0.25 threshold cuts missed failures from 58 to 40 (recall 0.15 to 0.41) while false alarms rise from 5 to 29.
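
The trade-off described above can be tabulated for any set of thresholds. This is a sketch on synthetic imbalanced data from `make_classification`, standing in for the maintenance dataset; the numbers it prints are not the notebook's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the maintenance dataset.
X_demo, y_demo = make_classification(
    n_samples=4000, n_features=6, weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)
proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

rows = []
for threshold in (0.5, 0.3, 0.25):
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_te, pred, labels=[0, 1]).ravel()
    rows.append((threshold, fn, fp))
    print(f"threshold={threshold:.2f}  missed failures={fn}  false alarms={fp}")
```

Lowering the threshold can only move predictions from 0 to 1, so missed failures never increase and false alarms never decrease; the choice of threshold is a cost decision rather than a modelling one.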

In [24]:
from sklearn.ensemble import RandomForestClassifier 

In [25]:
binary_model=RandomForestClassifier(n_estimators=100,random_state=42,class_weight='balanced',n_jobs=-1)
binary_model.fit(x_train,y_train)

y_pred=binary_model.predict(x_test)
y_proba=binary_model.predict_proba(x_test)[:,1]  # failure probability

print('ROC_AUC :',roc_auc_score(y_test,y_proba))

print('confusion_matrix for 0.5 threshold :', confusion_matrix(y_test,y_pred))

print('classification report for 0.5 threshold:' ,classification_report(y_test,y_pred))


y_pred_proba3=(y_proba>=0.3).astype(int)
y_pred_proba25=(y_proba>=0.25).astype(int)

print(' confusion matrix for 0.3 threshold:',confusion_matrix(y_test,y_pred_proba3))

print('classification report for 0.3 threshold:', classification_report(y_test,y_pred_proba3))

print(' confusion matrix for 0.25 threshold:',confusion_matrix(y_test,y_pred_proba25))

print('classification report for 0.25 threshold:', classification_report(y_test,y_pred_proba25))


# Confusion matrix layout (rows = actual, columns = predicted):
#            predicted 0   predicted 1
# actual 0       TN            FP
# actual 1       FN            TP


ROC_AUC : 0.9632276823772989
confusion_matrix for 0.5 threshold : [[1930    2]
 [  36   32]]
classification report for 0.5 threshold:               precision    recall  f1-score   support

           0       0.98      1.00      0.99      1932
           1       0.94      0.47      0.63        68

    accuracy                           0.98      2000
   macro avg       0.96      0.73      0.81      2000
weighted avg       0.98      0.98      0.98      2000

 confusion matrix for 0.3 threshold: [[1919   13]
 [  20   48]]
classification report for 0.3 threshold:               precision    recall  f1-score   support

           0       0.99      0.99      0.99      1932
           1       0.79      0.71      0.74        68

    accuracy                           0.98      2000
   macro avg       0.89      0.85      0.87      2000
weighted avg       0.98      0.98      0.98      2000

 confusion matrix for 0.25 threshold: [[1908   24]
 [  16   52]]
classification report for 0.25 threshold

In [26]:
rf=RandomForestClassifier(n_estimators=500,random_state=42,class_weight='balanced',n_jobs=-1)
rf.fit(x_train,y_train)

y_pred=rf.predict(x_test)
y_proba=rf.predict_proba(x_test)[:,1]  # failure probability

print('ROC_AUC :',roc_auc_score(y_test,y_proba))

print('confusion_matrix for 0.5 threshold :', confusion_matrix(y_test,y_pred))

print('classification report for 0.5 threshold:' ,classification_report(y_test,y_pred))


y_pred_proba3=(y_proba>=0.3).astype(int)
y_pred_proba25=(y_proba>=0.25).astype(int)

print(' confusion matrix for 0.3 threshold:',confusion_matrix(y_test,y_pred_proba3))

print('classification report for 0.3 threshold:', classification_report(y_test,y_pred_proba3))

print(' confusion matrix for 0.25 threshold:',confusion_matrix(y_test,y_pred_proba25))

print('classification report for 0.25 threshold:', classification_report(y_test,y_pred_proba25))


# Confusion matrix layout (rows = actual, columns = predicted):
#            predicted 0   predicted 1
# actual 0       TN            FP
# actual 1       FN            TP

ROC_AUC : 0.9586834733893559
confusion_matrix for 0.5 threshold : [[1929    3]
 [  33   35]]
classification report for 0.5 threshold:               precision    recall  f1-score   support

           0       0.98      1.00      0.99      1932
           1       0.92      0.51      0.66        68

    accuracy                           0.98      2000
   macro avg       0.95      0.76      0.83      2000
weighted avg       0.98      0.98      0.98      2000

 confusion matrix for 0.3 threshold: [[1918   14]
 [  19   49]]
classification report for 0.3 threshold:               precision    recall  f1-score   support

           0       0.99      0.99      0.99      1932
           1       0.78      0.72      0.75        68

    accuracy                           0.98      2000
   macro avg       0.88      0.86      0.87      2000
weighted avg       0.98      0.98      0.98      2000

 confusion matrix for 0.25 threshold: [[1909   23]
 [  17   51]]
classification report for 0.25 threshold

## A Random Forest classifier with 100 estimators and a tuned decision threshold (0.25) was selected to maximize failure recall while maintaining acceptable false alarm rates. The model achieved a failure recall of 76% with only 16 missed failures.
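
The 76% recall and 16 missed failures quoted above follow directly from the confusion matrix printed for the 100-tree model at threshold 0.25. A short worked check, using only those four numbers:

```python
import numpy as np

# Confusion matrix printed above for 100 trees at threshold 0.25.
# Rows are actual classes, columns are predicted classes.
cm = np.array([[1908, 24],
               [16, 52]])
tn, fp, fn, tp = cm.ravel()

recall = tp / (tp + fn)            # share of real failures that were caught
precision = tp / (tp + fp)         # share of alarms that were real failures
false_alarm_rate = fp / (fp + tn)  # share of healthy rows flagged as failures

print(f"recall={recall:.3f} precision={precision:.3f} false_alarm_rate={false_alarm_rate:.4f}")
```

So 52 of 68 failures are caught, and about 1.2% of healthy data points trigger an unnecessary inspection.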

## A second-stage multi-class model is proposed to identify the failure type for predictive maintenance planning.
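
A minimal sketch of how the two stages could be chained at prediction time. `predict_two_stage` is a hypothetical helper, not part of the notebook, and the data and the `Type A` / `Type B` labels are synthetic placeholders. The idea is that the type model only sees rows the binary model flags.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def predict_two_stage(binary_clf, type_clf, X, threshold=0.25):
    """Hypothetical helper: run the type model only on rows flagged as failures."""
    failure_proba = binary_clf.predict_proba(X)[:, 1]
    flagged = failure_proba >= threshold
    result = np.full(len(X), "No Failure", dtype=object)
    if flagged.any():
        result[flagged] = type_clf.predict(X[flagged])
    return result, failure_proba

# Synthetic stand-in data; a real run would pass the notebook's fitted models.
X_demo, y_bin = make_classification(
    n_samples=600, n_features=6, weights=[0.8, 0.2], random_state=0
)
rng = np.random.default_rng(0)
type_names = np.array(["Type A", "Type B"], dtype=object)  # placeholder labels
y_type = type_names[rng.integers(0, 2, size=len(y_bin))]

stage1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_demo, y_bin)
stage2 = RandomForestClassifier(n_estimators=50, random_state=0).fit(
    X_demo[y_bin == 1], y_type[y_bin == 1]
)

labels, proba = predict_two_stage(stage1, stage2, X_demo[:20])
print(labels)
```

This mirrors how the second-stage model is trained on failure rows only: it has never seen a healthy row, so it should not be asked to classify one.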

In [27]:
df.head()

| | UDI | Product ID | Type | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target | Failure Type |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | M14860 | M | 298.1 | 308.6 | 1551 | 42.8 | 0 | 0 | No Failure |
| 1 | 2 | L47181 | L | 298.2 | 308.7 | 1408 | 46.3 | 3 | 0 | No Failure |
| 2 | 3 | L47182 | L | 298.1 | 308.5 | 1498 | 49.4 | 5 | 0 | No Failure |
| 3 | 4 | L47183 | L | 298.2 | 308.6 | 1433 | 39.5 | 7 | 0 | No Failure |
| 4 | 5 | L47184 | L | 298.2 | 308.7 | 1408 | 40.0 | 9 | 0 | No Failure |


In [28]:
z=df.drop(columns=['Target'])

In [29]:
z.shape

(10000, 9)

In [30]:
z=z[z['Failure Type']!='No Failure']

In [31]:
z.shape

(348, 9)

In [32]:
y=z['Failure Type']

In [33]:
y.unique()

array(['Power Failure', 'Tool Wear Failure', 'Overstrain Failure',
       'Random Failures', 'Heat Dissipation Failure'], dtype=object)

In [34]:
z=z.drop(columns=['Failure Type','UDI','Product ID'],errors='ignore')

In [35]:
z=pd.get_dummies(z,columns=['Type'],drop_first=True)

In [36]:
z.astype(int)  # display only: the result is not assigned, so z itself is unchanged

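| | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Type_L | Type_M |
|---|---|---|---|---|---|---|---|
| 50 | 298 | 309 | 2861 | 4 | 143 | 1 | 0 |
| 69 | 298 | 309 | 1410 | 65 | 191 | 1 | 0 |
| 77 | 298 | 308 | 1455 | 41 | 208 | 1 | 0 |
| 160 | 298 | 308 | 1282 | 60 | 216 | 1 | 0 |
| 161 | 298 | 308 | 1412 | 52 | 218 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 9758 | 298 | 309 | 2271 | 16 | 218 | 1 | 0 |
| 9764 | 298 | 309 | 1294 | 66 | 12 | 1 | 0 |
| 9822 | 298 | 309 | 1360 | 60 | 187 | 1 | 0 |
| 9830 | 298 | 309 | 1337 | 56 | 206 | 1 | 0 |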


In [37]:
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
y_enc=le.fit_transform(y)

list(enumerate(le.classes_))


[(0, 'Heat Dissipation Failure'),
 (1, 'Overstrain Failure'),
 (2, 'Power Failure'),
 (3, 'Random Failures'),
 (4, 'Tool Wear Failure')]
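
Since the reports further down show only integer codes, it helps to map them back to names. A small sketch with a fresh encoder (`le_demo` is an illustrative name) fitted on the five failure types listed above:

```python
from sklearn.preprocessing import LabelEncoder

failure_types = [
    "Power Failure", "Tool Wear Failure", "Overstrain Failure",
    "Random Failures", "Heat Dissipation Failure",
]
le_demo = LabelEncoder().fit(failure_types)

# Classes are stored in sorted order, which fixes the integer codes.
codes = le_demo.transform(["Random Failures", "Tool Wear Failure"])
names = le_demo.inverse_transform([0, 2])
print(codes, names)
```

In the notebook itself, `le.inverse_transform(y_pred_test)` would turn predicted codes back into readable failure types in the same way.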

In [38]:
from sklearn.model_selection import train_test_split
z_train,z_test,y_train,y_test=train_test_split(z,y_enc,test_size=0.2,random_state=42,stratify=y)

In [39]:
from sklearn.ensemble import RandomForestClassifier
rf=RandomForestClassifier(n_estimators=500,random_state=42,class_weight='balanced',max_depth=8,min_samples_leaf=10,min_samples_split=10)
rf.fit(z_train,y_train)

y_pred_test=rf.predict(z_test)
y_pred_train=rf.predict(z_train)
y_proba=rf.predict_proba(z_test)


print('ROC_AUC:',roc_auc_score(y_test,y_proba,multi_class='ovr',average='weighted'))

#print('confusion matrix for threshold 0.5::',confusion_matrix(y_test,y_pred_test))
print('classification Report for threshold 0.5::',classification_report(y_test,y_pred_test))


cm=confusion_matrix(y_test,y_pred_test,labels=rf.classes_)
cm_df=pd.DataFrame(cm,index=rf.classes_,columns=rf.classes_)
print(cm_df)

ROC_AUC: 0.983086240788748
classification Report for threshold 0.5::               precision    recall  f1-score   support

           0       0.85      1.00      0.92        22
           1       0.85      0.69      0.76        16
           2       0.89      0.84      0.86        19
           3       0.67      0.50      0.57         4
           4       0.80      0.89      0.84         9

    accuracy                           0.84        70
   macro avg       0.81      0.78      0.79        70
weighted avg       0.84      0.84      0.84        70

    0   1   2  3  4
0  22   0   0  0  0
1   2  11   2  0  1
2   1   1  16  1  0
3   1   0   0  2  1
4   0   1   0  0  8


In [40]:
from sklearn.metrics import accuracy_score


train_acc=accuracy_score(y_train,y_pred_train)
test_acc= accuracy_score(y_test,y_pred_test)
print('train_acc =',train_acc)
print('test_acc =',test_acc)

train_acc = 0.9136690647482014
test_acc = 0.8428571428571429


In [41]:
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

In [42]:

model=Pipeline([('smote',SMOTE(random_state=42)),
                ('rf',RandomForestClassifier(n_estimators=500,n_jobs=-1,random_state=42,class_weight='balanced',
                                             max_depth=8,min_samples_leaf=10,min_samples_split=10))])
model.fit(z_train,y_train)



Pipeline parameters: steps=[('smote', ...), ('rf', ...)], transform_input=None, memory=None, verbose=False
SMOTE parameters: sampling_strategy='auto', random_state=42, k_neighbors=5
RandomForestClassifier parameters: n_estimators=500, criterion='gini', max_depth=8, min_samples_split=10, min_samples_leaf=10, min_weight_fraction_leaf=0.0, max_features='sqrt', max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True


In [43]:
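# NOTE: y_proba still holds the plain Random Forest probabilities from In [39], so this ROC_AUC
# repeats the earlier value; use model.predict_proba(z_test) to score the SMOTE pipeline instead.
print('ROC_AUC:',roc_auc_score(y_test,y_proba,multi_class='ovr',average='weighted'))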
y_pred_test=model.predict(z_test)
#print('confusion matrix for threshold 0.5::',confusion_matrix(y_test,y_pred_test))
print('classification Report for threshold 0.5::',classification_report(y_test,y_pred_test))

cm=confusion_matrix(y_test,y_pred_test,labels=rf.classes_)
cm_df=pd.DataFrame(cm,index=rf.classes_,columns=rf.classes_)
print(cm_df)


ROC_AUC: 0.983086240788748
classification Report for threshold 0.5::               precision    recall  f1-score   support

           0       0.85      1.00      0.92        22
           1       0.85      0.69      0.76        16
           2       0.89      0.84      0.86        19
           3       0.75      0.75      0.75         4
           4       0.89      0.89      0.89         9

    accuracy                           0.86        70
   macro avg       0.84      0.83      0.84        70
weighted avg       0.86      0.86      0.85        70

    0   1   2  3  4
0  22   0   0  0  0
1   2  11   2  0  1
2   1   1  16  1  0
3   1   0   0  3  0
4   0   1   0  0  8


from sklearn.multiclass import OneVsRestClassifier

model=OneVsRestClassifier(estimator=model,n_jobs=-1)


In [44]:
from sklearn.model_selection import StratifiedKFold, cross_val_predict

skf=StratifiedKFold(n_splits=5,shuffle=True,random_state=42)

y_pred_cv=cross_val_predict(model,z_train,y_train,cv=skf)


print('classification Report for threshold 0.5::',classification_report(y_train,y_pred_cv))
      
cm=confusion_matrix(y_train,y_pred_cv,labels=model.classes_)
cm_df=pd.DataFrame(cm,index=model.classes_,columns=model.classes_) 
print(cm_df)


y_proba_cv=cross_val_predict(model,z_train,y_train,cv=skf,method='predict_proba')
ROC_AUC=roc_auc_score(y_train,y_proba_cv,multi_class='ovr',average='weighted')
print('ROC_AUC for :', ROC_AUC)

classification Report for threshold 0.5::               precision    recall  f1-score   support

           0       0.90      0.86      0.88        90
           1       0.81      0.95      0.87        62
           2       0.92      0.74      0.82        76
           3       0.71      0.71      0.71        14
           4       0.77      0.94      0.85        36

    accuracy                           0.85       278
   macro avg       0.82      0.84      0.83       278
weighted avg       0.86      0.85      0.85       278

    0   1   2   3   4
0  77   7   3   0   3
1   0  59   1   0   2
2   7   6  56   3   4
3   2   0   1  10   1
4   0   1   0   1  34
ROC_AUC for : 0.976456466017771


In [45]:
### There is overfitting (the model memorizes the training data instead of generalizing: train accuracy 0.91 vs. test accuracy 0.84), so max_depth will be reduced.

### The model achieves good overall performance for frequent failure types; however, it struggles with rare failure categories due to class imbalance.
Failure types with very low sample counts show weak recall and F1-score, indicating missed detections. This highlights the need for additional data collection, data enrichment, or balancing techniques to improve the prediction of rare failures. I therefore used imblearn.over_sampling.SMOTE to generate new samples for the minority classes, and reduced max_depth and increased min_samples_leaf and min_samples_split to limit overfitting.

### After comparing the plain RandomForestClassifier with the SMOTE + RandomForestClassifier pipeline, I chose the SMOTE pipeline: on the test set it improved the rare classes (Random Failures recall 0.50 to 0.75, Tool Wear Failure F1 0.84 to 0.89) and left the other failure types unchanged.
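
For intuition about what SMOTE adds to the rare classes, here is a simplified sketch of its interpolation step written with NumPy and scikit-learn. It is not imblearn's implementation; `smote_like_samples` is a hypothetical helper and the data is random. Each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_like_samples(X_minority, n_new, k=3, seed=0):
    """Simplified illustration of SMOTE's interpolation step (not imblearn's code)."""
    rng = np.random.default_rng(seed)
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)
    neighbour_idx = nn.kneighbors(X_minority, return_distance=False)[:, 1:]

    base = rng.integers(0, len(X_minority), size=n_new)
    chosen = neighbour_idx[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    # New point lies on the segment between a sample and one of its neighbours.
    return X_minority[base] + gap * (X_minority[chosen] - X_minority[base])

rng = np.random.default_rng(1)
X_rare = rng.normal(size=(14, 2))   # stand-in for a class with only 14 training rows
X_new = smote_like_samples(X_rare, n_new=20)
print(X_new.shape)
```

Because every synthetic point is a convex combination of two real ones, oversampling fills in the region the rare class already occupies rather than inventing new regions, which is also why it cannot replace collecting more real failures.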

In [46]:
import joblib
joblib.dump(binary_model,'binary_model.pkl')
joblib.dump(model,'failure_type_model.pkl')
print('saved')

saved


In [47]:
# loaded_model=joblib.load('failure_type_model.pkl')  # file name saved in In [46]
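
Before relying on a saved model, it is worth confirming that it reloads and predicts identically. A minimal round-trip sketch with a small throwaway classifier and a temporary directory (the file name `demo_model.pkl` is hypothetical):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = (X_demo[:, 0] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_model.pkl")  # hypothetical file name
    joblib.dump(clf, path)
    reloaded = joblib.load(path)

same = np.array_equal(clf.predict(X_demo), reloaded.predict(X_demo))
print("predictions identical after reload:", same)
```

Pickled estimators are generally only safe to load with the same library versions that saved them, so recording the scikit-learn and imbalanced-learn versions next to the `.pkl` files is a sensible habit.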

### Built a multi-class classification model to predict failure type from machine sensor/features.
Observed class imbalance: the model performed well on common failure types but had low recall/F1 on rare failures (it missed them).
Improved minority-class performance by using SMOTE (to synthetically oversample rare classes) and tuning Random Forest to reduce overfitting (e.g., limiting max_depth, increasing min_samples_leaf / min_samples_split).
Compared Random Forest vs Random Forest + SMOTE and selected the RF+SMOTE pipeline because it improved recall and F1-score, especially for rare failure types.


### Next step (App – brief)
Build a small user-friendly app that lets a maintenance user:
Enter sensor values (or upload a CSV row),
Click Predict,
See Predicted Failure Type + Probability/Confidence,
Optionally show a simple “Action hint” per failure type (recommended check/maintenance step).
Most practical tools: Streamlit (fast, simple) or Gradio (very easy UI).
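
Whichever UI tool is chosen, the app has to turn a user's inputs into a row with exactly the columns the models were trained on, including the `Type_L` / `Type_M` dummies. A sketch of that step follows; `build_feature_row` and `FEATURE_COLUMNS` are hypothetical names, and the column order is taken from the preprocessing output earlier in this notebook.

```python
import pandas as pd

# Column order produced by the notebook's preprocessing (see the x.astype(int) output).
FEATURE_COLUMNS = [
    "Air temperature [K]", "Process temperature [K]", "Rotational speed [rpm]",
    "Torque [Nm]", "Tool wear [min]", "Type_L", "Type_M",
]

def build_feature_row(air_temp, process_temp, speed, torque, tool_wear, product_type):
    """Hypothetical helper: turn form inputs into a one-row frame the models accept."""
    if product_type not in ("L", "M", "H"):
        raise ValueError("product_type must be 'L', 'M' or 'H'")
    row = {
        "Air temperature [K]": air_temp,
        "Process temperature [K]": process_temp,
        "Rotational speed [rpm]": speed,
        "Torque [Nm]": torque,
        "Tool wear [min]": tool_wear,
        # 'H' is the dropped baseline category, so both flags stay False for it.
        "Type_L": product_type == "L",
        "Type_M": product_type == "M",
    }
    return pd.DataFrame([row], columns=FEATURE_COLUMNS)

# Inputs taken from the first row of df.head().
row = build_feature_row(298.1, 308.6, 1551, 42.8, 0, "M")
print(row)
```

Building the dummies by hand avoids a common pitfall: calling `pd.get_dummies` on a single uploaded row produces only the columns for the categories present in that row, which would not match the training layout.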