### Predictive Maintanence

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

In [2]:
dataset=pd.read_csv('ai4i2020.csv')
dataset.head(3)

Unnamed: 0,UDI,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
0,1,M14860,M,298.1,308.6,1551,42.8,0,0,0,0,0,0,0
1,2,L47181,L,298.2,308.7,1408,46.3,3,0,0,0,0,0,0
2,3,L47182,L,298.1,308.5,1498,49.4,5,0,0,0,0,0,0


In [3]:
dataset.columns

Index(['UDI', 'Product ID', 'Type', 'Air temperature [K]',
       'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]',
       'Tool wear [min]', 'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF',
       'RNF'],
      dtype='object')

In [4]:
dataset.drop(['UDI', 'Product ID'] , axis=1, inplace=True)

In [5]:
dataset['Machine failure'].unique()

array([0, 1], dtype=int64)

In [6]:
Machine_failure= dataset[['Type', 'Air temperature [K]','Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]','Tool wear [min]', 'Machine failure']]

In [7]:
Machine_failure.fillna(0)

Unnamed: 0,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure
0,M,298.1,308.6,1551,42.8,0,0
1,L,298.2,308.7,1408,46.3,3,0
2,L,298.1,308.5,1498,49.4,5,0
3,L,298.2,308.6,1433,39.5,7,0
4,L,298.2,308.7,1408,40.0,9,0
...,...,...,...,...,...,...,...
9995,M,298.8,308.4,1604,29.5,14,0
9996,H,298.9,308.4,1632,31.8,17,0
9997,M,299.0,308.6,1645,33.4,22,0
9998,H,299.0,308.7,1408,48.5,25,0


The dataset consists of 10 000 data points stored as rows with 14 features in columns
UID: unique identifier ranging from 1 to 10000
product ID: consisting of a letter L, M, or H for low (50% of all products), medium (30%) and high (20%) as product quality variants and a variant-specific serial number
air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K
process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K.
rotational speed [rpm]: calculated from a power of 2860 W, overlaid with a normally distributed noise
torque [Nm]: torque values are normally distributed around 40 Nm with a Ïƒ = 10 Nm and no negative values.
tool wear [min]: The quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process. and a
'machine failure' label that indicates, whether the machine has failed in this particular datapoint for any of the following failure modes are true.

The machine failure consists of five independent failure modes
tool wear failure (TWF): the tool will be replaced of fail at a randomly selected tool wear time between 200 â€“ 240 mins (120 times in our dataset). At this point in time, the tool is replaced 69 times, and fails 51 times (randomly assigned).
heat dissipation failure (HDF): heat dissipation causes a process failure, if the difference between air- and process temperature is below 8.6 K and the toolâ€™s rotational speed is below 1380 rpm. This is the case for 115 data points.
power failure (PWF): the product of torque and rotational speed (in rad/s) equals the power required for the process. If this power is below 3500 W or above 9000 W, the process fails, which is the case 95 times in our dataset.
overstrain failure (OSF): if the product of tool wear and torque exceeds 11,000 minNm for the L product variant (12,000 M, 13,000 H), the process fails due to overstrain. This is true for 98 datapoints.
random failures (RNF): each process has a chance of 0,1 % to fail regardless of its process parameters. This is the case for only 5 datapoints, less than could be expected for 10,000 datapoints in our dataset.

If at least one of the above failure modes is true, the process fails and the 'machine failure' label is set to 1. It is therefore not transparent to the machine learning method, which of the failure modes has caused the process to fail

In [8]:
from sklearn.model_selection import train_test_split

In [9]:
from sklearn import preprocessing
label_encoder = preprocessing.LabelEncoder()

In [10]:
Machine_failure['Type']=label_encoder.fit_transform(Machine_failure['Type'])

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  Machine_failure['Type']=label_encoder.fit_transform(Machine_failure['Type'])


In [11]:
X= Machine_failure.drop('Machine failure' ,axis=1)

In [12]:
Y=Machine_failure['Machine failure']

In [13]:
x_train, x_test, y_train, y_test = train_test_split(X,Y, test_size=0.7, random_state=40)

#### Linear regression

In [14]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

In [15]:
model_linear=LinearRegression()

In [16]:
model_linear_fit=model_linear.fit(x_train,y_train)

In [17]:
# coefficeints of the trained model
print('\nCoefficient of model :', model_linear_fit.coef_)

# intercept of the model
print('\nIntercept of model',model_linear_fit.intercept_)



Coefficient of model : [-0.00411981  0.02413001 -0.02273087  0.00058221  0.01223167  0.00027603]

Intercept of model -1.568901210570032


In [18]:
# predict the target on the test dataset
predict_train = model_linear_fit.predict(x_train)
print('\nCoeffiecient of Linear Regression model of trainning dataset',predict_train) 

# Root Mean Squared Error on training dataset
rmse_train = mean_squared_error(y_train,predict_train)**(0.5)
print('\nRMSE of Linear Regression model of trainning dataset : ', rmse_train)


Coeffiecient of Linear Regression model of trainning dataset [ 0.12861153  0.02065238  0.06706082 ...  0.07710044  0.0643638
 -0.10266547]

RMSE of Linear Regression model of trainning dataset :  0.168155184138138


In [19]:
# predict the target on the testing dataset
predict_test = model_linear_fit.predict(x_test)
print('\nPrediction of linear Regression on test dataset',predict_test) 

# Root Mean Squared Error on testing dataset
rmse_test = mean_squared_error(y_test,predict_test)**(0.5)
print('\nRMSE on test dataset of Linear Regression : ', rmse_test)


Prediction of linear Regression on test dataset [ 0.00083623 -0.00616276 -0.02909271 ...  0.07104901  0.03738231
 -0.05149118]

RMSE on test dataset of Linear Regression :  0.17017998823693256


#### Logistics Regression

In [20]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

In [21]:
model_log=LogisticRegression()
model_log_fit= model_log.fit(x_train,y_train)


In [22]:
#coeff of logistc regression

print("\n Coefficient of Logistic Regression" , model_log_fit.coef_)
print("\n Intercept of Logistic Regression" , model_log_fit.intercept_)


 Coefficient of Logistic Regression [[-0.06296257  1.12654322 -1.21480807  0.01284425  0.30290127  0.01468607]]

 Intercept of Logistic Regression [-0.00906959]


In [23]:
##training accuracy
predict_log_train=model_log_fit.predict(x_train)
Accuracy_logistic_train= accuracy_score(y_train,predict_log_train)
print("\n Accuracy of  Logistic Regression on trainning dataset",Accuracy_logistic_train*100,"%" )


 Accuracy of  Logistic Regression on trainning dataset 97.1 %


In [24]:
## testing accuracy

predict_log_test=model_log_fit.predict(x_test)
Accuracy_logistic_test= accuracy_score(y_test,predict_log_test)
print("\n Accuracy of Logistic Regression on test dataset" , Accuracy_logistic_test*100,"%" )


 Accuracy of Logistic Regression on test dataset 96.85714285714285 %


In [82]:
model_log_fit.evaluate(x_test,y_test)

AttributeError: 'LogisticRegression' object has no attribute 'evaluate'

### SVM

In [25]:
from sklearn.svm import SVC
model_svm = SVC(kernel='linear', random_state=0)  
model_svm_fit=model_svm.fit(x_train,y_train)

In [26]:
##training accuracy
predict_svm_train=model_svm_fit.predict(x_train)
Accuracy_svm_train= accuracy_score(y_train,predict_svm_train)
print("\n Accuracy of SVM on trainning dataset",Accuracy_svm_train*100,"%" )


 Accuracy of SVM on trainning dataset 97.23333333333333 %


In [27]:
##testing accuracy

predict_svm_test=model_svm_fit.predict(x_test)
Accuracy_svm_test= accuracy_score(y_test,predict_svm_test)
print("\n Accuracy of  SVM on trainning dataset",Accuracy_svm_test*100,"%" )


 Accuracy of  SVM on trainning dataset 97.02857142857142 %


## KNN

In [28]:
from sklearn.neighbors import KNeighborsClassifier  
model_knn= KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2 )  
model_knn_fit=model_knn.fit(x_train, y_train)  

In [97]:
##training accuracy
predict_knn_train=model_knn_fit.predict(x_train)
Accuracy_knn_train= accuracy_score(y_train,predict_svm_train)
print("\n Accuracy of KNN on trainning dataset",Accuracy_knn_train*100,"%" )

##testing accuracy

predict_knn_test=model_knn_fit.predict(x_test)
Accuracy_knn_test= accuracy_score(y_test,predict_knn_test)
print("\n Accuracy of  KNN on test dataset",Accuracy_knn_test*100,"%" )


 Accuracy of KNN on trainning dataset 97.23333333333333 %

 Accuracy of  KNN on test dataset 96.87142857142858 %


## Decision Tree Classifier

In [30]:
from sklearn.tree import DecisionTreeClassifier  
model_dtc= DecisionTreeClassifier(criterion='entropy', random_state=0)  
model_dtc_fit=model_dtc.fit(x_train, y_train)  

In [95]:
##training accuracy
predict_dtc_train=model_dtc_fit.predict(x_train)
Accuracy_dtc_train= accuracy_score(y_train,predict_dtc_train)
print("\n Accuracy of Decision Tree Classifier on trainning dataset",Accuracy_dtc_train*100,"%" )

##testing accuracy

predict_dtc_test=model_dtc_fit.predict(x_test)
Accuracy_dtc_test= accuracy_score(y_test,predict_dtc_test)
print("\n Accuracy of  Decision Tree Classifier on test dataset",Accuracy_dtc_test*100,"%" )


 Accuracy of Decision Tree Classifier on trainning dataset 100.0 %

 Accuracy of  Decision Tree Classifier on test dataset 97.68571428571428 %


## XGBoost

In [99]:
from xgboost import XGBClassifier
model_xgb= DecisionTreeClassifier(criterion='entropy', random_state=0)  
model_xgb_fit=model_xgb.fit(x_train, y_train)

In [100]:
#training accuracy
predict_xgb_train=model_xgb_fit.predict(x_train)
Accuracy_xgb_train= accuracy_score(y_train,predict_xgb_train)
print("\n Accuracy of XGBoost Classifier on trainning dataset",Accuracy_xgb_train*100,"%" )

##testing accuracy

predict_xgb_test=model_xgb_fit.predict(x_test)
Accuracy_xgb_test= accuracy_score(y_test,predict_xgb_test)
print("\n Accuracy of  XGBoost Classifier on test dataset",Accuracy_xgb_test*100,"%" )


 Accuracy of XGBoost Classifier on trainning dataset 100.0 %

 Accuracy of  XGBoost Classifier on test dataset 97.68571428571428 %


## AdaBoostClassifier

In [36]:
from sklearn.ensemble import AdaBoostClassifier

In [41]:
model_ada= AdaBoostClassifier(n_estimators=50,
                         learning_rate=1)  
model_ada_fit=model_ada.fit(x_train, y_train)
#training accuracy
predict_ada_train=model_ada_fit.predict(x_train)
Accuracy_ada_train= accuracy_score(y_train,predict_ada_train)
print("\n Accuracy of AdaBoost Classifier on trainning dataset",Accuracy_ada_train*100,"%" )

##testing accuracy

predict_ada_test=model_ada_fit.predict(x_test)
Accuracy_ada_test= accuracy_score(y_test,predict_ada_test)
print("\n Accuracy of  AdaBoost Classifier on test dataset",Accuracy_ada_test*100,"%" )


 Accuracy of AdaBoost Classifier on trainning dataset 98.2 %

 Accuracy of  AdaBoost Classifier on test dataset 97.32857142857144 %


## Deep learning

## ANN

In [50]:
import tensorflow as tf
from tensorflow import keras


model=keras.Sequential([
    keras.layers.Dense(6, input_shape=(6,) , activation="relu"),
    keras.layers.Dense(6, activation="relu"),
    keras.layers.Dense(1,activation="sigmoid")
])

model.compile(optimizer="adam" , loss="binary_crossentropy" , metrics=['accuracy'])
model.fit(x_train,y_train,epochs=200)

Epoch 1/200
Epoch 2/200
Epoch 3/200
Epoch 4/200
Epoch 5/200
Epoch 6/200
Epoch 7/200
Epoch 8/200
Epoch 9/200
Epoch 10/200
Epoch 11/200
Epoch 12/200
Epoch 13/200
Epoch 14/200
Epoch 15/200
Epoch 16/200
Epoch 17/200
Epoch 18/200
Epoch 19/200
Epoch 20/200
Epoch 21/200
Epoch 22/200
Epoch 23/200
Epoch 24/200
Epoch 25/200
Epoch 26/200
Epoch 27/200
Epoch 28/200
Epoch 29/200
Epoch 30/200
Epoch 31/200
Epoch 32/200
Epoch 33/200
Epoch 34/200
Epoch 35/200
Epoch 36/200
Epoch 37/200
Epoch 38/200
Epoch 39/200
Epoch 40/200
Epoch 41/200
Epoch 42/200
Epoch 43/200
Epoch 44/200
Epoch 45/200
Epoch 46/200
Epoch 47/200
Epoch 48/200
Epoch 49/200
Epoch 50/200
Epoch 51/200
Epoch 52/200
Epoch 53/200
Epoch 54/200
Epoch 55/200
Epoch 56/200
Epoch 57/200
Epoch 58/200
Epoch 59/200
Epoch 60/200
Epoch 61/200
Epoch 62/200
Epoch 63/200
Epoch 64/200
Epoch 65/200
Epoch 66/200
Epoch 67/200
Epoch 68/200
Epoch 69/200
Epoch 70/200
Epoch 71/200
Epoch 72/200
Epoch 73/200
Epoch 74/200
Epoch 75/200
Epoch 76/200
Epoch 77/200
Epoch 78

Epoch 82/200
Epoch 83/200
Epoch 84/200
Epoch 85/200
Epoch 86/200
Epoch 87/200
Epoch 88/200
Epoch 89/200
Epoch 90/200
Epoch 91/200
Epoch 92/200
Epoch 93/200
Epoch 94/200
Epoch 95/200
Epoch 96/200
Epoch 97/200
Epoch 98/200
Epoch 99/200
Epoch 100/200
Epoch 101/200
Epoch 102/200
Epoch 103/200
Epoch 104/200
Epoch 105/200
Epoch 106/200
Epoch 107/200
Epoch 108/200
Epoch 109/200
Epoch 110/200
Epoch 111/200
Epoch 112/200
Epoch 113/200
Epoch 114/200
Epoch 115/200
Epoch 116/200
Epoch 117/200
Epoch 118/200
Epoch 119/200
Epoch 120/200
Epoch 121/200
Epoch 122/200
Epoch 123/200
Epoch 124/200
Epoch 125/200
Epoch 126/200
Epoch 127/200
Epoch 128/200
Epoch 129/200
Epoch 130/200
Epoch 131/200
Epoch 132/200
Epoch 133/200
Epoch 134/200
Epoch 135/200
Epoch 136/200
Epoch 137/200
Epoch 138/200
Epoch 139/200
Epoch 140/200
Epoch 141/200
Epoch 142/200
Epoch 143/200
Epoch 144/200
Epoch 145/200
Epoch 146/200
Epoch 147/200
Epoch 148/200
Epoch 149/200
Epoch 150/200
Epoch 151/200
Epoch 152/200
Epoch 153/200
Epoch 154/

Epoch 162/200
Epoch 163/200
Epoch 164/200
Epoch 165/200
Epoch 166/200
Epoch 167/200
Epoch 168/200
Epoch 169/200
Epoch 170/200
Epoch 171/200
Epoch 172/200
Epoch 173/200
Epoch 174/200
Epoch 175/200
Epoch 176/200
Epoch 177/200
Epoch 178/200
Epoch 179/200
Epoch 180/200
Epoch 181/200
Epoch 182/200
Epoch 183/200
Epoch 184/200
Epoch 185/200
Epoch 186/200
Epoch 187/200
Epoch 188/200
Epoch 189/200
Epoch 190/200
Epoch 191/200
Epoch 192/200
Epoch 193/200
Epoch 194/200
Epoch 195/200
Epoch 196/200
Epoch 197/200
Epoch 198/200
Epoch 199/200
Epoch 200/200


<keras.callbacks.History at 0x2dfd7526eb0>

In [93]:
model.evaluate(x_test,y_test)



[0.10245774686336517, 0.9712857007980347]

In [53]:
y_pred=model.predict(x_test)

In [62]:
y_pred

prediction=[]

for i in y_pred:
    for j in i:
        if j>0.5 :
            prediction.append(1)
        else :
            prediction.append(0)
        
prediction

[0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 1,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 1,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 1,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,


In [63]:
y_test

293     0
1244    0
7353    0
5145    0
1618    0
       ..
5049    0
4329    0
1315    0
843     0
7547    0
Name: Machine failure, Length: 7000, dtype: int64

In [64]:
from sklearn.metrics import classification_report , confusion_matrix

print(classification_report(y_test,prediction))

              precision    recall  f1-score   support

           0       0.97      1.00      0.99      6763
           1       0.76      0.22      0.34       237

    accuracy                           0.97      7000
   macro avg       0.87      0.61      0.66      7000
weighted avg       0.97      0.97      0.96      7000



## multilayer perceptron (MLP) 

In [73]:
from sklearn.neural_network import MLPClassifier
mlp=MLPClassifier(max_iter=500, activation='logistic')
mlp

MLPClassifier(activation='logistic', max_iter=500)

In [74]:
model_mlp_fit=mlp.fit(x_train,y_train)

In [75]:
pred_mlp=model_mlp_fit.predict(x_test)

In [76]:
accuracy_score(y_test,pred_mlp)

0.9677142857142857

In [78]:
print(classification_report(y_test,pred_mlp))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       1.00      0.05      0.09       237

    accuracy                           0.97      7000
   macro avg       0.98      0.52      0.54      7000
weighted avg       0.97      0.97      0.95      7000



Linear regression
Logistics Regression
Support vector machine (SVM) 
K nearest neighbors classifier(KNN)
Decision Tree Classifier
XGBoost Classifier
AdaBoostClassifier
ANN
Multilayer Perceptron Classifier

In [84]:
## Logistics Regression

print(classification_report(y_test,predict_log_test))

              precision    recall  f1-score   support

           0       0.97      0.99      0.98      6763
           1       0.59      0.23      0.33       237

    accuracy                           0.97      7000
   macro avg       0.78      0.61      0.66      7000
weighted avg       0.96      0.97      0.96      7000



In [85]:
## Support vector machine (SVM)

print(classification_report(y_test,predict_svm_test))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       0.73      0.19      0.31       237

    accuracy                           0.97      7000
   macro avg       0.85      0.60      0.65      7000
weighted avg       0.96      0.97      0.96      7000



In [86]:
## K nearest neighbors classifier(KNN)

print(classification_report(y_test,predict_knn_test))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       0.73      0.19      0.31       237

    accuracy                           0.97      7000
   macro avg       0.85      0.60      0.65      7000
weighted avg       0.96      0.97      0.96      7000



In [87]:
##Decision Tree Classifier

print(classification_report(y_test,predict_dtc_test))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       0.73      0.19      0.31       237

    accuracy                           0.97      7000
   macro avg       0.85      0.60      0.65      7000
weighted avg       0.96      0.97      0.96      7000



In [88]:
##XGBoost Classifier

print(classification_report(y_test,predict_xgb_test))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       0.73      0.19      0.31       237

    accuracy                           0.97      7000
   macro avg       0.85      0.60      0.65      7000
weighted avg       0.96      0.97      0.96      7000



In [89]:
##AdaBoostClassifier

print(classification_report(y_test,predict_ada_test))

              precision    recall  f1-score   support

           0       0.98      0.99      0.99      6763
           1       0.65      0.47      0.54       237

    accuracy                           0.97      7000
   macro avg       0.81      0.73      0.76      7000
weighted avg       0.97      0.97      0.97      7000



In [90]:
##ann

print(classification_report(y_test,prediction))



              precision    recall  f1-score   support

           0       0.97      1.00      0.99      6763
           1       0.76      0.22      0.34       237

    accuracy                           0.97      7000
   macro avg       0.87      0.61      0.66      7000
weighted avg       0.97      0.97      0.96      7000



In [91]:
## multilayer perceptron


print(classification_report(y_test,pred_mlp))

              precision    recall  f1-score   support

           0       0.97      1.00      0.98      6763
           1       1.00      0.05      0.09       237

    accuracy                           0.97      7000
   macro avg       0.98      0.52      0.54      7000
weighted avg       0.97      0.97      0.95      7000

