# Predictive Maintenance for Manufacturing Equipment

## Life Cycle of the Machine Learning Project

1. Understanding the Problem Statement<br>
2. Data Collection<br>
3. Data Checks to Perform<br>
4. Exploratory Data Analysis<br>
5. Data Pre-Processing<br>
6. Model Training<br>
7. Choosing the Best Model<br>



## 1. Problem Statement

The goal is to develop a predictive maintenance model that can predict equipment
failures before they occur. The dataset includes sensor readings and maintenance
logs from a variety of machines.

## 2. Data Collection

### 2.1 Importing the Libraries

In [12]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [13]:
df=pd.read_csv('Csv_Files/Final_Project_dataset.csv')

### 2.2 Showing the First Five Records

In [14]:
df.head()

| | UDI | Product ID | Type | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target | Failure Type |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | M14860 | M | 298.1 | 308.6 | 1551 | 42.8 | 0 | 0 | No Failure |
| 1 | 2 | L47181 | L | 298.2 | 308.7 | 1408 | 46.3 | 3 | 0 | No Failure |
| 2 | 3 | L47182 | L | 298.1 | 308.5 | 1498 | 49.4 | 5 | 0 | No Failure |
| 3 | 4 | L47183 | L | 298.2 | 308.6 | 1433 | 39.5 | 7 | 0 | No Failure |
| 4 | 5 | L47184 | L | 298.2 | 308.7 | 1408 | 40.0 | 9 | 0 | No Failure |


### 2.3 Shape of the Dataset

In [17]:
print("Total no. of rows in the DataFrame:", df.shape[0])
print("Total no. of columns in the DataFrame:", df.shape[1])

Total no. of rows in the DataFrame: 10000
Total no. of columns in the DataFrame: 10


### 2.4 Columns of the Dataset

In [19]:
print("Columns of the DataFrame:", df.columns)

Columns of the DataFrame: Index(['UDI', 'Product ID', 'Type', 'Air temperature [K]',
       'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]',
       'Tool wear [min]', 'Target', 'Failure Type'],
      dtype='object')


## 3. Data Checks to Perform

1. Check missing values<br>
2. Check duplicates<br>
3. Check datatypes<br>
4. Check the number of unique values in each column<br>
5. Check the statistics of the dataset<br>
6. Check the categories present in each categorical column<br>
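
The first four checks in this list can also be gathered into a single summary table. The following is a minimal sketch rather than the notebook's own code: `summarize_checks` and the `sample` frame are hypothetical, and the values are made up so that the example runs without the project CSV.

```python
import pandas as pd


def summarize_checks(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, missing count, and unique count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isnull().sum(),
        "unique": frame.nunique(),
    })


# Tiny illustrative frame (made-up values, not the project dataset).
sample = pd.DataFrame({
    "Type": ["M", "L", "L", None],
    "Torque [Nm]": [42.8, 46.3, 46.3, 39.5],
    "Target": [0, 0, 0, 1],
})

summary = summarize_checks(sample)
# duplicated() flags every row that repeats an earlier row exactly.
duplicate_rows = int(sample.duplicated().sum())

print(summary)
print("Duplicate rows:", duplicate_rows)
```

On the real data the same helper could be called as `summarize_checks(df)`.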

### 3.1 Check Missing Values

In [40]:
for i in df.columns:
    print('Total no. of missing values in the "{}" column: {}'.format(i, df[i].isnull().sum()))

Total no. of missing values in the "UDI" column: 0
Total no. of missing values in the "Product ID" column: 0
Total no. of missing values in the "Type" column: 0
Total no. of missing values in the "Air temperature [K]" column: 0
Total no. of missing values in the "Process temperature [K]" column: 0
Total no. of missing values in the "Rotational speed [rpm]" column: 0
Total no. of missing values in the "Torque [Nm]" column: 0
Total no. of missing values in the "Tool wear [min]" column: 0
Total no. of missing values in the "Target" column: 0
Total no. of missing values in the "Failure Type" column: 0


### 3.2 Check Duplicates

In [33]:
print("Total no. of duplicate rows in the DataFrame: {}".format(df.duplicated().sum()))

Total no. of duplicate rows in the DataFrame: 0
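
Because `UDI` and `Product ID` are unique for every row (see section 3.4), a full-row duplicate check is expected to return 0 regardless of the sensor values. A possible complement, sketched here on made-up rows, is to repeat the check with the identifier columns dropped. The `sample` frame and the `id_cols` list are assumptions for illustration only.

```python
import pandas as pd

# Made-up rows; only the column names mirror the dataset.
sample = pd.DataFrame({
    "UDI": [1, 2, 3],
    "Product ID": ["M14860", "L47181", "L47182"],
    "Type": ["M", "L", "L"],
    "Torque [Nm]": [42.8, 46.3, 46.3],
})

# Identifier columns (assumed to carry no sensor information).
id_cols = ["UDI", "Product ID"]

# Full-row check: the unique identifiers make every row distinct.
full_row_dupes = int(sample.duplicated().sum())

# Feature-only check: rows 2 and 3 share the same Type and Torque.
feature_dupes = int(sample.drop(columns=id_cols).duplicated().sum())

print("Full-row duplicates:", full_row_dupes)
print("Duplicates ignoring identifiers:", feature_dupes)
```

Repeated sensor readings are not necessarily errors, so a non-zero count here is a prompt for inspection rather than a reason to drop rows.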


### 3.3 Check Datatypes

In [39]:
for i in df.columns:
    print('Datatype of "{}" column: {}'.format(i, df[i].dtype))

Datatype of "UDI" column: int64
Datatype of "Product ID" column: object
Datatype of "Type" column: object
Datatype of "Air temperature [K]" column: float64
Datatype of "Process temperature [K]" column: float64
Datatype of "Rotational speed [rpm]" column: int64
Datatype of "Torque [Nm]" column: float64
Datatype of "Tool wear [min]" column: int64
Datatype of "Target" column: int64
Datatype of "Failure Type" column: object


### 3.4 Check the Number of Unique Values in Each Column

In [44]:
for i in df.columns:
    print('No. of unique values in the "{}" column: {}'.format(i, df[i].nunique()))

No. of unique values in the "UDI" column: 10000
No. of unique values in the "Product ID" column: 10000
No. of unique values in the "Type" column: 3
No. of unique values in the "Air temperature [K]" column: 93
No. of unique values in the "Process temperature [K]" column: 82
No. of unique values in the "Rotational speed [rpm]" column: 941
No. of unique values in the "Torque [Nm]" column: 577
No. of unique values in the "Tool wear [min]" column: 246
No. of unique values in the "Target" column: 2
No. of unique values in the "Failure Type" column: 6


### 3.5 Check the Statistics of the Dataset

In [45]:
df.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| UDI | 10000.0 | 5000.5 | 2886.89568 | 1.0 | 2500.75 | 5000.5 | 7500.25 | 10000.0 |
| Air temperature [K] | 10000.0 | 300.00493 | 2.000259 | 295.3 | 298.3 | 300.1 | 301.5 | 304.5 |
| Process temperature [K] | 10000.0 | 310.00556 | 1.483734 | 305.7 | 308.8 | 310.1 | 311.1 | 313.8 |
| Rotational speed [rpm] | 10000.0 | 1538.7761 | 179.284096 | 1168.0 | 1423.0 | 1503.0 | 1612.0 | 2886.0 |
| Torque [Nm] | 10000.0 | 39.98691 | 9.968934 | 3.8 | 33.2 | 40.1 | 46.8 | 76.6 |
| Tool wear [min] | 10000.0 | 107.951 | 63.654147 | 0.0 | 53.0 | 108.0 | 162.0 | 253.0 |
| Target | 10000.0 | 0.0339 | 0.180981 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
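
The `Target` row reports a mean of 0.0339 over 10000 rows. Since `Target` only takes the values 0 and 1, that mean is the failure rate: 339 failure records, or about 3.4% of the data. The sketch below reproduces this arithmetic on a synthetic series built from those two numbers alone. The variable names are illustrative, and the "always predict no failure" baseline is an added point of comparison, not something the notebook computes.

```python
import pandas as pd

n_rows = 10000          # from df.shape
target_mean = 0.0339    # from df.describe()

# For a 0/1 column, mean * count gives the number of ones.
n_failures = round(n_rows * target_mean)
n_ok = n_rows - n_failures

# Synthetic stand-in with the same class balance as the Target column.
target = pd.Series([1] * n_failures + [0] * n_ok, name="Target")

counts = target.value_counts()
share = target.value_counts(normalize=True)

# Accuracy of a model that always predicts the majority class (0).
baseline_accuracy = share.max()

print(counts)
print("Majority-class baseline accuracy:", baseline_accuracy)
```

With this much imbalance, plain accuracy can look high even for a model that never predicts a failure, so metrics such as recall or precision on the failure class are worth considering at the model-selection stage.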


### 3.7 Exploring the Data by Count

In [64]:
Categorical_Cols=[Feature for Feature in df.columns if df[Feature].dtypes=='O']
print('Categorical_Cols',Categorical_Cols)

numerical_Cat_Cols=[Feature for Feature in df.columns if df[Feature].dtypes!='O' and df[Feature].nunique()<25]
print('numerical_Cat_Cols',numerical_Cat_Cols)

numerical_Cols=[Feature for Feature in df.columns if df[Feature].dtypes!='O' and Feature not in numerical_Cat_Cols]
print('numerical_Cols',numerical_Cols)

Categorical_Cols ['Product ID', 'Type', 'Failure Type']
numerical_Cat_Cols ['Target']
numerical_Cols ['UDI', 'Air temperature [K]', 'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]']
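
An alternative way to make the same three-way split is `DataFrame.select_dtypes`, which avoids comparing dtypes against the `'O'` code by hand. This is a sketch on a made-up frame. The `max_levels` threshold is set to 3 only because the sample is tiny (the notebook uses 25), and the variable names are illustrative.

```python
import pandas as pd

# Made-up rows; only the column names mirror the dataset.
sample = pd.DataFrame({
    "Type": ["M", "L", "H", "L"],
    "Torque [Nm]": [42.8, 46.3, 49.4, 39.5],
    "Target": [0, 0, 0, 1],
})

max_levels = 3  # hypothetical threshold for this tiny frame

# Split by dtype family first.
numeric = sample.select_dtypes(include="number").columns.tolist()
categorical = sample.select_dtypes(exclude="number").columns.tolist()

# Numeric columns with few distinct values behave like categories.
numeric_categorical = [c for c in numeric if sample[c].nunique() < max_levels]
numeric_continuous = [c for c in numeric if c not in numeric_categorical]

print("categorical:", categorical)
print("numeric_categorical:", numeric_categorical)
print("numeric_continuous:", numeric_continuous)
```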


In [68]:
for i in numerical_Cat_Cols:
    print('Unique values of the {} column: {}'.format(i, df[i].unique()))

Unique values of the Target column: [0 1]


In [69]:
for i in Categorical_Cols:
    print('Unique values of the {} column: {}'.format(i, df[i].unique()))

Unique values of the Product ID column: ['M14860' 'L47181' 'L47182' ... 'M24857' 'H39412' 'M24859']
Unique values of the Type column: ['M' 'L' 'H']
Unique values of the Failure Type column: ['No Failure' 'Power Failure' 'Tool Wear Failure' 'Overstrain Failure'
 'Random Failures' 'Heat Dissipation Failure']
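
The dataset carries two labels, the binary `Target` and the six-level `Failure Type`. The outputs so far do not show whether the two always agree, so a cross-tabulation may be a useful extra check. The sketch below uses made-up rows that deliberately include two disagreements. The `sample` frame, the `mismatch` name, and the rule "any label other than 'No Failure' should have Target 1" are all assumptions for illustration.

```python
import pandas as pd

# Made-up rows; the last two are deliberately inconsistent.
sample = pd.DataFrame({
    "Target": [0, 0, 1, 1, 0, 1],
    "Failure Type": [
        "No Failure", "No Failure", "Power Failure",
        "Tool Wear Failure", "Random Failures", "No Failure",
    ],
})

# Count how often each Failure Type occurs with each Target value.
table = pd.crosstab(sample["Failure Type"], sample["Target"])

# Assumed rule: a named failure should coincide with Target == 1.
is_failure_label = sample["Failure Type"] != "No Failure"
mismatch = sample[is_failure_label != (sample["Target"] == 1)]

print(table)
print("Rows where the two labels disagree:", len(mismatch))
```

On the real data the same two statements can be run with `df` in place of `sample`. How to treat any disagreeing rows is a modelling decision outside the scope of this check.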


### 3.8 Exploring the Data by Visualization

In [71]:
# Plot types and what each one is used for:
# 1. Bar plot   ---> comparison
# 2. Box plot   ---> outliers
# 3. Line plot  ---> relationship
# 4. Histplot   ---> distribution
# 5. Countplot  ---> counting values in categories

# Reference: https://github.com/krishnaik06/mlproject/blob/main/notebook/1%20.%20EDA%20STUDENT%20PERFORMANCE%20.ipynb