# CNC Mill Tool Wear


## Table of contents<a name=top></a>
1. [Data description](#des)<br> 
1. [Import libraries](#import)<br> 
1. [Load Datasets](#load)<br>
1. [Pre-processing](#pre)<br>
    * [Missing values](#miss)<br>
    * [Merge Data](#merge)<br>


## Data description<a name=des></a>
In this project, a CNC milling machine was used to perform machining experiments. The machine recorded data for different settings of tool condition, feed rate, and clamping pressure. More details about the dataset can be found [here](https://www.kaggle.com/datasets/shasun/tool-wear-detection-in-cnc-mill/data).<br>
The machine has four motors (X, Y, and Z axes plus the spindle), whose signals were recorded as time series at 100 ms intervals across 18 machining experiments. Each experiment is labelled with the tool condition (worn or unworn) and the result of a visual inspection. The dataset can be used to detect tool wear or inadequate clamping.
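Because the rows are sampled at a fixed 100 ms interval, an elapsed-time axis can be rebuilt from the row position within each experiment. The following is a minimal sketch on toy data; the `exp_number` identifier column, the `elapsed_s` column name, and the sample values are assumptions for illustration, not part of the raw files.

```python
import pandas as pd

SAMPLE_INTERVAL_S = 0.1  # 100 ms between consecutive rows, as stated in the description

# Toy stand-in for two experiments; the values are illustrative only.
toy = pd.DataFrame({
    "exp_number": [1, 1, 1, 2, 2],
    "X1_ActualPosition": [198.0, 198.0, 197.9, 198.0, 197.8],
})

# Row position within each experiment, converted to elapsed seconds.
toy["elapsed_s"] = toy.groupby("exp_number").cumcount() * SAMPLE_INTERVAL_S
print(toy)
```

Counting rows per experiment, rather than over the whole table, restarts the clock at zero for every experiment.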

## Import libraries<a name=import></a>

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Load datasets<a name=load></a>

In [2]:
train_data=pd.read_csv(r'E:/udemy/proj/train.csv')
print(f'shape of train_data : {train_data.shape}')
train_data.head(2)

shape of train_data : (18, 7)


Unnamed: 0,No,material,feedrate,clamp_pressure,tool_condition,machining_finalized,passed_visual_inspection
0,1,wax,6,4.0,unworn,yes,yes
1,2,wax,20,4.0,unworn,yes,yes


In [3]:
train_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18 entries, 0 to 17
Data columns (total 7 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   No                        18 non-null     int64  
 1   material                  18 non-null     object 
 2   feedrate                  18 non-null     int64  
 3   clamp_pressure            18 non-null     float64
 4   tool_condition            18 non-null     object 
 5   machining_finalized       18 non-null     object 
 6   passed_visual_inspection  14 non-null     object 
dtypes: float64(1), int64(2), object(4)
memory usage: 1.1+ KB


In [4]:
experiment_01 = pd.read_csv(r"E:/udemy/proj/experiment_01.csv")
print(f'shape of experiment_01 : {experiment_01.shape}')
experiment_01.head(2)

shape of experiment_01 : (1055, 48)


Unnamed: 0,X1_ActualPosition,X1_ActualVelocity,X1_ActualAcceleration,X1_CommandPosition,X1_CommandVelocity,X1_CommandAcceleration,X1_CurrentFeedback,X1_DCBusVoltage,X1_OutputCurrent,X1_OutputVoltage,...,S1_CurrentFeedback,S1_DCBusVoltage,S1_OutputCurrent,S1_OutputVoltage,S1_OutputPower,S1_SystemInertia,M1_CURRENT_PROGRAM_NUMBER,M1_sequence_number,M1_CURRENT_FEEDRATE,Machining_Process
0,198.0,0.0,0.0,198.0,0.0,0.0,0.18,0.0207,329.0,2.77,...,0.524,2.74e-19,329.0,0.0,6.96e-07,12.0,1.0,0.0,50.0,Starting
1,198.0,-10.8,-350.0,198.0,-13.6,-358.0,-10.9,0.186,328.0,23.3,...,-0.288,2.74e-19,328.0,0.0,-5.27e-07,12.0,1.0,4.0,50.0,Prep


In [6]:
experiment_02 = pd.read_csv(r"E:/udemy/proj/experiment_02.csv")
print(f'shape of experiment_02 : {experiment_02.shape}')
experiment_02.head(2)

shape of experiment_02 : (1668, 48)


Unnamed: 0,X1_ActualPosition,X1_ActualVelocity,X1_ActualAcceleration,X1_CommandPosition,X1_CommandVelocity,X1_CommandAcceleration,X1_CurrentFeedback,X1_DCBusVoltage,X1_OutputCurrent,X1_OutputVoltage,...,S1_CurrentFeedback,S1_DCBusVoltage,S1_OutputCurrent,S1_OutputVoltage,S1_OutputPower,S1_SystemInertia,M1_CURRENT_PROGRAM_NUMBER,M1_sequence_number,M1_CURRENT_FEEDRATE,Machining_Process
0,198.0,0.0,0.0,198.0,0.0,0.0,-0.284,2.7899999999999997e-19,329.0,0.0,...,-1.86,0.0,332.0,0.0,0.0,12.0,1.0,2.0,50.0,Prep
1,198.0,0.0,0.0,198.0,0.0,0.0,-0.284,2.7899999999999997e-19,329.0,0.0,...,-1.86,0.0,332.0,0.0,0.0,12.0,0.0,0.0,50.0,Prep


## Pre-processing<a name=pre></a>
### Missing values<a name=miss></a>
As `train_data.info()` shows, the `passed_visual_inspection` column of `train_data` has missing values (only 14 of its 18 entries are non-null). In this dataset a missing entry stands for "no", i.e. the part did not pass the visual inspection. We can therefore replace the missing values with "no".

In [7]:
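# Assign the result back; inplace=True on a column selection may not update train_data
train_data['passed_visual_inspection'] = train_data['passed_visual_inspection'].fillna('no')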
train_data.head()

Unnamed: 0,No,material,feedrate,clamp_pressure,tool_condition,machining_finalized,passed_visual_inspection
0,1,wax,6,4.0,unworn,yes,yes
1,2,wax,20,4.0,unworn,yes,yes
2,3,wax,6,3.0,unworn,yes,yes
3,4,wax,6,2.5,unworn,no,no
4,5,wax,20,3.0,unworn,no,no
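Before filling, it can help to look at which rows are actually missing. The sketch below uses a small toy frame (its values are illustrative and not taken from `train.csv`) to count missing entries per column, list the affected rows, and then fill the column by assigning the result back.

```python
import pandas as pd

# Toy stand-in for train_data; the values are illustrative only.
toy = pd.DataFrame({
    "No": [1, 2, 3, 4],
    "machining_finalized": ["yes", "yes", "no", "no"],
    "passed_visual_inspection": ["yes", "yes", None, None],
})

# Count missing values per column before changing anything.
missing_before = toy.isna().sum()

# Rows where the inspection result is missing, for a manual sanity check.
rows_missing = toy[toy["passed_visual_inspection"].isna()]

# Fill and assign back so the change is applied to the frame itself.
toy["passed_visual_inspection"] = toy["passed_visual_inspection"].fillna("no")
missing_after = toy.isna().sum()

print(rows_missing)
print(missing_after)
```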


### Merge data<a name=merge></a>
We can concatenate all experiments into a single table and attach the matching row of `train_data` to every row of each experiment. The `material` column contains only one value, 'wax', so it carries no information and can be dropped.

In [8]:
train_data['material'].unique()

array(['wax'], dtype=object)
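The same check can be generalised to every column: any column with a single distinct value is a candidate for dropping. This is a minimal sketch on a toy frame; the names `constant_cols` and `reduced` are introduced here for illustration.

```python
import pandas as pd

# Toy stand-in for train_data; the values are illustrative only.
toy = pd.DataFrame({
    "No": [1, 2, 3],
    "material": ["wax", "wax", "wax"],
    "feedrate": [6, 20, 6],
})

# Columns with exactly one distinct value (counting NaN as a value).
constant_cols = [col for col in toy.columns if toy[col].nunique(dropna=False) == 1]

# Drop them and keep the informative columns.
reduced = toy.drop(columns=constant_cols)
print(constant_cols)
print(reduced.columns.tolist())
```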

In [10]:
Merge_frame = []
for i in range(1, 19):
    # File names are zero-padded: experiment_01.csv ... experiment_18.csv
    exp_data = pd.read_csv(f"E:/udemy/proj/experiment_{i:02d}.csv")

    # Row of train_data holding the settings and labels of experiment i
    train_data_row = train_data[train_data['No'] == i].iloc[0]
    exp_data['exp_number'] = i
    # Add each column of train_data (except 'No' and 'material') to the experiment
    for col in ['feedrate', 'clamp_pressure', 'tool_condition',
                'machining_finalized', 'passed_visual_inspection']:
        exp_data[col] = train_data_row[col]

    Merge_frame.append(exp_data)

merged_data = pd.concat(Merge_frame, ignore_index = True)
merged_data.head(2)

Unnamed: 0,X1_ActualPosition,X1_ActualVelocity,X1_ActualAcceleration,X1_CommandPosition,X1_CommandVelocity,X1_CommandAcceleration,X1_CurrentFeedback,X1_DCBusVoltage,X1_OutputCurrent,X1_OutputVoltage,...,M1_CURRENT_PROGRAM_NUMBER,M1_sequence_number,M1_CURRENT_FEEDRATE,Machining_Process,exp_number,feedrate,clamp_pressure,tool_condition,machining_finalized,passed_visual_inspection
0,198.0,0.0,0.0,198.0,0.0,0.0,0.18,0.0207,329.0,2.77,...,1.0,0.0,50.0,Starting,1,6,4.0,unworn,yes,yes
1,198.0,-10.8,-350.0,198.0,-13.6,-358.0,-10.9,0.186,328.0,23.3,...,1.0,4.0,50.0,Prep,1,6,4.0,unworn,yes,yes
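An alternative to copying the columns one experiment at a time is to concatenate the experiments first and then join the settings with `DataFrame.merge` on the experiment number. The sketch below is a hedged illustration on toy frames, not the method used in this notebook; the frame names `experiments` and `settings` and their values are assumptions.

```python
import pandas as pd

# Toy stand-ins: two experiments with two sensor rows each.
experiments = pd.DataFrame({
    "exp_number": [1, 1, 2, 2],
    "X1_ActualPosition": [198.0, 198.0, 198.0, 197.5],
})
settings = pd.DataFrame({
    "No": [1, 2],
    "feedrate": [6, 20],
    "tool_condition": ["unworn", "unworn"],
})

# Left join: every sensor row receives the settings of its experiment.
# validate="many_to_one" raises if an experiment number is duplicated in settings.
merged = experiments.merge(
    settings.rename(columns={"No": "exp_number"}),
    on="exp_number",
    how="left",
    validate="many_to_one",
)
print(merged)
```

The `validate` argument is a cheap guard: if the settings table ever contained two rows for the same experiment, the join would fail loudly instead of silently duplicating sensor rows.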


In [11]:
merged_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25286 entries, 0 to 25285
Data columns (total 54 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   X1_ActualPosition          25286 non-null  float64
 1   X1_ActualVelocity          25286 non-null  float64
 2   X1_ActualAcceleration      25286 non-null  float64
 3   X1_CommandPosition         25286 non-null  float64
 4   X1_CommandVelocity         25286 non-null  float64
 5   X1_CommandAcceleration     25286 non-null  float64
 6   X1_CurrentFeedback         25286 non-null  float64
 7   X1_DCBusVoltage            25286 non-null  float64
 8   X1_OutputCurrent           25286 non-null  float64
 9   X1_OutputVoltage           25286 non-null  float64
 10  X1_OutputPower             25286 non-null  float64
 11  Y1_ActualPosition          25286 non-null  float64
 12  Y1_ActualVelocity          25286 non-null  float64
 13  Y1_ActualAcceleration      25286 non-null  flo

In [12]:
merged_data['Machining_Process'].unique()

array(['Starting', 'Prep', 'Layer 1 Up', 'Layer 1 Down', 'Repositioning',
       'Layer 2 Up', 'Layer 2 Down', 'Layer 3 Up', 'Layer 3 Down', 'end',
       'End'], dtype=object)

In [13]:
merged_data['Machining_Process'].value_counts().sort_values()

Machining_Process
Starting            1
end                 8
Prep             1795
Layer 3 Down     2354
Layer 2 Down     2528
End              2585
Layer 1 Down     2655
Layer 3 Up       2794
Layer 2 Up       3104
Repositioning    3377
Layer 1 Up       4085
Name: count, dtype: int64

I assume 'End' and 'end' denote the same stage in the `Machining_Process` column, so we keep only 'End'. In addition, only one row has the value 'Starting', so it can be folded into the 'Prep' category.

In [16]:
merged_data.replace({'Machining_Process':{'end':'End','Starting':'Prep'}},inplace=True)
merged_data['Machining_Process'].unique()

array(['Prep', 'Layer 1 Up', 'Layer 1 Down', 'Repositioning',
       'Layer 2 Up', 'Layer 2 Down', 'Layer 3 Up', 'Layer 3 Down', 'End'],
      dtype=object)
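Labels that differ only by letter case, such as 'end' and 'End', can also be found programmatically rather than by eye. The following is a minimal sketch on a toy series; the variable names and the sample labels are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for the Machining_Process column.
labels = pd.Series(["Starting", "Prep", "Layer 1 Up", "end", "End", "End"])

# Group the distinct labels by their lower-cased form; a group with more
# than one member is a case-only duplicate.
unique_labels = pd.Series(labels.unique())
groups = unique_labels.groupby(unique_labels.str.lower()).agg(list)
case_duplicates = groups[groups.map(len) > 1]

# Apply the same explicit mapping as in the cell above.
cleaned = labels.replace({"end": "End", "Starting": "Prep"})

print(case_duplicates)
print(cleaned.unique())
```

An explicit mapping is kept for the actual replacement, since blanket case normalisation would also alter labels such as 'Layer 1 Up'.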