# DSE Course 2, Session 7: Casting Defect Detection Case Study

**Instructor**: Wesley Beckner

**Contact**: wesleybeckner@gmail.com

<br>

---

<br>

# Overview

This dataset contains data from machining experiments. Machining data was collected from a CNC machine for variations of tool condition, feed rate, and clamping pressure.

## Definitions

* feed rate
    * relative velocity of the cutting tool along the workpiece (mm/s)
* clamping pressure
    * pressure used to hold the workpiece in the vise (bar)

In 18 machining experiments, time series data was collected from the 4 motors in the CNC machine (X, Y, and Z axes and the spindle) at a sampling interval of 100 ms.
The output of each experiment includes the tool condition (unworn or worn) and whether or not the tool passed visual inspection.

## Objectives

* tool wear detection
* detection of inadequate clamping

[more info](https://github.com/wesleybeckner/ds_for_engineers/tree/main/data/mill_tool_wear)

<br>

---

<br>

<a name='top'></a>

# Contents

* 7.0 [Preparing Environment and Importing Data](#x.0)
  * 7.0.1 [Import Packages](#x.0.1)
  * 7.0.2 [Load Dataset](#x.0.2)
* 7.1 [Exploratory Data Analysis](#x.1)
  * 7.1.1 [First Look: Shape, Nulls, Description](#x.1.1)
  * 7.1.2 [Descriptive Statistics](#x.1.2)
    * 7.1.2.1 [Univariate Analysis](#x.1.2.1)
    * 7.1.2.2 [Multivariate Analysis](#x.1.2.2)
    * 7.1.2.3 [Frequency Analysis](#x.1.2.3)
* 7.2 [Feature Engineering](#x.2)
  * 7.2.1 [Feature Skewness](#x.2.1)
  * 7.2.2 [Feature Colinearity](#x.2.2)
  * 7.2.3 [Feature Normalization](#x.2.3)
  * 7.2.4 [Feature Selection](#x.2.4)
  * 7.2.5 [Fast Fourier Transform](#x.2.5)
* 7.3 [Modeling](#x.3)
  * 7.3.1 [Tool Condition](#x.3.1)
  * 7.3.2 [Machining Finalized](#x.3.2)
  * 7.3.3 [Passed Visual Inspection](#x.3.3)
  

<br>

---

<a name='x.0'></a>

## 7.0 Preparing Environment and Importing Data

[back to top](#top)

<a name='x.0.1'></a>

### 7.0.1 Import Packages

[back to top](#top)

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import plotly.express as px
from plotly.subplots import make_subplots
from ipywidgets import interact, ToggleButtons, SelectMultiple
import seaborn as sns
from scipy import signal

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.metrics import mean_squared_error, r2_score
from xgboost import XGBClassifier
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix

<a name='x.0.2'></a>

### 7.0.2 Load Dataset

[back to top](#top)

In [2]:
status = pd.read_csv("https://raw.githubusercontent.com/wesleybeckner/"\
                     "ds_for_engineers/main/data/mill_tool_wear/train.csv")
status

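| | No | material | feedrate | clamp_pressure | tool_condition | machining_finalized | passed_visual_inspection |
|---|---|---|---|---|---|---|---|
| 0 | 1 | wax | 6 | 4.0 | unworn | yes | yes |
| 1 | 2 | wax | 20 | 4.0 | unworn | yes | yes |
| 2 | 3 | wax | 6 | 3.0 | unworn | yes | yes |
| 3 | 4 | wax | 6 | 2.5 | unworn | no | |
| 4 | 5 | wax | 20 | 3.0 | unworn | no | |
| 5 | 6 | wax | 6 | 4.0 | worn | yes | no |
| 6 | 7 | wax | 20 | 4.0 | worn | no | |
| 7 | 8 | wax | 20 | 4.0 | worn | yes | no |
| 8 | 9 | wax | 15 | 4.0 | worn | yes | no |
| 9 | 10 | wax | 12 | 4.0 | worn | yes | no |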


In [3]:
frames = []

for i in range(1,19):
  # experiment files are zero-padded: experiment_01.csv ... experiment_18.csv
  df = pd.read_csv("https://raw.githubusercontent.com/wesleybeckner/"\
                   "ds_for_engineers/main/data/mill_tool_wear/"\
                   "experiment_{:02d}.csv".format(i))
  # attach the experiment-level settings and label from the status table
  df["feedrate"] = status.loc[i-1,"feedrate"]
  df["clamp_pressure"] = status.loc[i-1,"clamp_pressure"]
  df["tool_condition"] = status.loc[i-1,"tool_condition"]
  df["experiment"] = i
  frames.append(df)

# concatenate once rather than growing a DataFrame inside the loop
tests = pd.concat(frames, ignore_index=True)

<a name='x.1'></a>

## 7.1 Exploratory Data Analysis

[back to top](#top)

<a name='x.1.1'></a>

### 7.1.1 First Look: Shape, Nulls, Description

[back to top](#top)

In [4]:
tests.shape

(25286, 52)

In [5]:
tests.loc[tests.isnull().any(axis=1)]

Unnamed: 0,X1_ActualPosition,X1_ActualVelocity,X1_ActualAcceleration,X1_CommandPosition,X1_CommandVelocity,X1_CommandAcceleration,X1_CurrentFeedback,X1_DCBusVoltage,X1_OutputCurrent,X1_OutputVoltage,X1_OutputPower,Y1_ActualPosition,Y1_ActualVelocity,Y1_ActualAcceleration,Y1_CommandPosition,Y1_CommandVelocity,Y1_CommandAcceleration,Y1_CurrentFeedback,Y1_DCBusVoltage,Y1_OutputCurrent,Y1_OutputVoltage,Y1_OutputPower,Z1_ActualPosition,Z1_ActualVelocity,Z1_ActualAcceleration,Z1_CommandPosition,Z1_CommandVelocity,Z1_CommandAcceleration,Z1_CurrentFeedback,Z1_DCBusVoltage,Z1_OutputCurrent,Z1_OutputVoltage,S1_ActualPosition,S1_ActualVelocity,S1_ActualAcceleration,S1_CommandPosition,S1_CommandVelocity,S1_CommandAcceleration,S1_CurrentFeedback,S1_DCBusVoltage,S1_OutputCurrent,S1_OutputVoltage,S1_OutputPower,S1_SystemInertia,M1_CURRENT_PROGRAM_NUMBER,M1_sequence_number,M1_CURRENT_FEEDRATE,Machining_Process,feedrate,clamp_pressure,tool_condition,experiment


In [6]:
tests.dtypes

X1_ActualPosition            float64
X1_ActualVelocity            float64
X1_ActualAcceleration        float64
X1_CommandPosition           float64
X1_CommandVelocity           float64
X1_CommandAcceleration       float64
X1_CurrentFeedback           float64
X1_DCBusVoltage              float64
X1_OutputCurrent             float64
X1_OutputVoltage             float64
X1_OutputPower               float64
Y1_ActualPosition            float64
Y1_ActualVelocity            float64
Y1_ActualAcceleration        float64
Y1_CommandPosition           float64
Y1_CommandVelocity           float64
Y1_CommandAcceleration       float64
Y1_CurrentFeedback           float64
Y1_DCBusVoltage              float64
Y1_OutputCurrent             float64
Y1_OutputVoltage             float64
Y1_OutputPower               float64
Z1_ActualPosition            float64
Z1_ActualVelocity            float64
Z1_ActualAcceleration        float64
Z1_CommandPosition           float64
Z1_CommandVelocity           float64
Z

In [7]:
tests.describe()

Unnamed: 0,X1_ActualPosition,X1_ActualVelocity,X1_ActualAcceleration,X1_CommandPosition,X1_CommandVelocity,X1_CommandAcceleration,X1_CurrentFeedback,X1_DCBusVoltage,X1_OutputCurrent,X1_OutputVoltage,X1_OutputPower,Y1_ActualPosition,Y1_ActualVelocity,Y1_ActualAcceleration,Y1_CommandPosition,Y1_CommandVelocity,Y1_CommandAcceleration,Y1_CurrentFeedback,Y1_DCBusVoltage,Y1_OutputCurrent,Y1_OutputVoltage,Y1_OutputPower,Z1_ActualPosition,Z1_ActualVelocity,Z1_ActualAcceleration,Z1_CommandPosition,Z1_CommandVelocity,Z1_CommandAcceleration,Z1_CurrentFeedback,Z1_DCBusVoltage,Z1_OutputCurrent,Z1_OutputVoltage,S1_ActualPosition,S1_ActualVelocity,S1_ActualAcceleration,S1_CommandPosition,S1_CommandVelocity,S1_CommandAcceleration,S1_CurrentFeedback,S1_DCBusVoltage,S1_OutputCurrent,S1_OutputVoltage,S1_OutputPower,S1_SystemInertia,M1_CURRENT_PROGRAM_NUMBER,M1_sequence_number,M1_CURRENT_FEEDRATE,feedrate,clamp_pressure,experiment
count,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0,25286.0
mean,159.052045,-0.288657,0.094264,159.0507,-0.283076,0.253215,-0.469714,0.06603073,326.945859,7.986942,0.00061,99.230064,-0.422932,0.928832,99.226271,-0.412075,1.484158,-0.061681,0.06398315,325.862058,7.068352,0.000637,47.780638,-0.328759,-0.103658,47.778031,-0.328464,0.134118,0.0,0.0,0.0,0.0,-115.373082,38.986424,0.248602,-115.051536,38.650012,0.3855889,15.243948,0.6692938,322.784505,85.479195,0.127405,12.0,1.003441,47.345013,16.542039,7.123942,3.368168,10.738235
std,19.330873,5.65826,93.877623,19.331144,5.664309,72.594951,4.22075,0.03700384,1.459937,7.710379,0.001565,29.24488,6.006439,85.07458,29.242802,6.004721,81.358073,4.469548,0.04777806,1.804164,8.601484,0.002098,34.25565,7.635223,66.442671,34.252517,7.637045,61.988683,0.0,0.0,0.0,0.0,1212.730873,23.491267,32.204079,1212.766136,23.75809,6.189738,10.222419,0.4332413,4.293571,52.531863,0.080753,0.0,0.349055,43.826214,19.620219,6.167036,0.615639,5.213285
min,141.0,-20.4,-1280.0,141.0,-20.0,-1000.0,-23.4,2.78e-19,320.0,0.0,-0.00606,72.4,-32.8,-1260.0,72.4,-32.4,-1000.0,-27.8,2.68e-19,319.0,0.0,-0.00492,27.5,-51.5,-1260.0,27.5,-50.0,-1000.0,0.0,0.0,0.0,0.0,-2150.0,-0.069,-150.0,-2150.0,0.0,-9.54e-07,-8.28,0.0,290.0,0.0,-0.00296,12.0,0.0,0.0,3.0,3.0,2.5,1.0
25%,145.0,-2.05,-31.3,145.0,-2.05,0.0,-3.93,0.0415,326.0,2.59,0.0,77.5,-0.075,-18.8,77.5,0.0,0.0,-3.09,0.0219,325.0,1.81,0.0,28.5,0.0,-6.25,28.5,0.0,0.0,0.0,0.0,0.0,0.0,-1160.0,0.002,-15.9,-1160.0,0.0,0.0,0.821,2.7899999999999997e-19,320.0,0.0,5e-06,12.0,1.0,2.0,3.0,3.0,3.0,6.0
50%,153.0,0.0,0.0,153.0,0.0,0.0,-0.666,0.0668,327.0,7.14,0.000174,90.0,0.0,0.0,90.0,0.0,0.0,0.146,0.0578,326.0,4.93,4e-06,29.5,0.0,0.0,29.5,0.0,0.0,0.0,0.0,0.0,0.0,-112.5,53.3,0.0,-112.0,53.3,0.0,18.8,0.858,322.0,117.0,0.164,12.0,1.0,39.0,6.0,3.0,3.0,12.0
75%,162.0,0.2,25.0,162.0,0.0,0.0,3.14,0.0913,327.0,10.2,0.000585,105.0,0.1,18.8,105.0,0.0,0.0,2.9,0.095575,326.0,9.61,0.000506,55.5,0.0,6.25,55.5,0.0,0.0,0.0,0.0,0.0,0.0,860.75,53.4,17.5,861.75,53.3,0.0,22.3,0.952,327.0,119.0,0.183,12.0,1.0,85.0,20.0,6.0,4.0,15.0
max,198.0,50.7,1440.0,198.0,50.0,1000.0,27.1,0.38,331.0,75.4,0.0388,158.0,50.4,1460.0,158.0,50.0,1000.0,30.7,0.43,333.0,76.8,0.0424,119.0,50.9,1270.0,119.0,50.0,1000.0,0.0,0.0,0.0,0.0,2150.0,53.8,150.0,2150.0,53.3,100.0,75.4,3.16,332.0,130.0,0.569,12.0,4.0,135.0,50.0,20.0,4.0,18.0


<a name='x.1.2'></a>

### 7.1.2 Descriptive Statistics

[back to top](#top)

<a name='x.1.2.1'></a>

#### 7.1.2.1 Univariate Analysis

[back to top](#top)

In [8]:
tests.skew(numeric_only=True)

X1_ActualPosition             1.219721
X1_ActualVelocity             1.727296
X1_ActualAcceleration         0.912499
X1_CommandPosition            1.219765
X1_CommandVelocity            1.715114
X1_CommandAcceleration       -0.027659
X1_CurrentFeedback            0.269644
X1_DCBusVoltage               0.830362
X1_OutputCurrent             -0.037263
X1_OutputVoltage              2.844746
X1_OutputPower                7.594701
Y1_ActualPosition             1.199451
Y1_ActualVelocity            -1.448097
Y1_ActualAcceleration         0.841916
Y1_CommandPosition            1.199722
Y1_CommandVelocity           -1.446608
Y1_CommandAcceleration        1.953117
Y1_CurrentFeedback           -0.058733
Y1_DCBusVoltage               0.963220
Y1_OutputCurrent              1.084588
Y1_OutputVoltage              3.334789
Y1_OutputPower                6.154120
Z1_ActualPosition             1.477857
Z1_ActualVelocity             0.806090
Z1_ActualAcceleration        -1.073989
Z1_CommandPosition       

In [9]:
tests.kurtosis(numeric_only=True)

X1_ActualPosition              0.077315
X1_ActualVelocity             21.078788
X1_ActualAcceleration         69.964697
X1_CommandPosition             0.077357
X1_CommandVelocity            20.806996
X1_CommandAcceleration       143.523032
X1_CurrentFeedback             0.218157
X1_DCBusVoltage                4.465559
X1_OutputCurrent               0.873106
X1_OutputVoltage              14.430564
X1_OutputPower                88.354746
Y1_ActualPosition              0.032372
Y1_ActualVelocity             22.919821
Y1_ActualAcceleration         84.394353
Y1_CommandPosition             0.033187
Y1_CommandVelocity            23.067341
Y1_CommandAcceleration       111.688216
Y1_CurrentFeedback             0.886280
Y1_DCBusVoltage                1.698791
Y1_OutputCurrent               2.819351
Y1_OutputVoltage              14.875838
Y1_OutputPower                48.488226
Z1_ActualPosition              0.380075
Z1_ActualVelocity             30.490088
Z1_ActualAcceleration        220.670604


<a name='x.1.2.2'></a>

#### 7.1.2.2 Multivariate Analysis

[back to top](#top)

<a name='x.1.2.3'></a>

#### 7.1.2.3 Frequency Analysis

[back to top](#top)

In [10]:
dropdown = []
# the experiment subset does not depend on the feature, so select it once
df = tests.loc[tests['experiment'] == 1]
for feature in tests.columns:
  s = df[feature]
  try:
    F = np.fft.fft(s)
    magnitude = np.abs(F)  # same as sqrt(F.real**2 + F.imag**2)
    skew = pd.Series(magnitude).skew()
    kurt = pd.Series(magnitude).kurtosis()
    dropdown.append([feature, skew, kurt])
  except Exception:
    # non-numeric columns cannot be transformed; report and skip them
    print(feature)

Machining_Process
tool_condition


In [11]:
pd.DataFrame(dropdown).sort_values(2)

| | 0 | 1 | 2 |
|---|---|---|---|
| 24 | Z1_ActualAcceleration | 0.569778 | -0.470645 |
| 13 | Y1_ActualAcceleration | 0.425063 | -0.427739 |
| 37 | S1_CommandAcceleration | 1.116458 | -0.193898 |
| 27 | Z1_CommandAcceleration | 0.518229 | -0.11321 |
| 5 | X1_CommandAcceleration | 0.419607 | -0.044612 |
| 2 | X1_ActualAcceleration | 0.525094 | -0.01809 |
| 31 | Z1_OutputVoltage | 0.0 | 0.0 |
| 30 | Z1_OutputCurrent | 0.0 | 0.0 |
| 29 | Z1_DCBusVoltage | 0.0 | 0.0 |
| 28 | Z1_CurrentFeedback | 0.0 | 0.0 |


In [12]:
# sort by kurtosis of the FFT magnitude (column 2), highest first;
# the first 5 are not useful, so skip them
features = pd.DataFrame(dropdown).sort_values(2, ascending=False)\
  .reset_index(drop=True)[0].values[5:]

In [13]:
def fft(exp=np.arange(1,19), feature=features):
  df = tests.loc[tests['experiment'] == exp]
  t = df.index
  s = df[feature]
  F = np.fft.fft(s)
  # no sample spacing is passed, so freq is in cycles per sample
  freq = np.fft.fftfreq(t.shape[-1])
  fig, ax = plt.subplots(1,2,figsize=(10,5))
  ax[0].plot(t,s)
  # use positional .iloc[0]: label-based [0] only exists for experiment 1,
  # because the concatenated index keeps counting across experiments
  ax[0].set_title("{}".format(df['tool_condition'].iloc[0]))
  magnitude = np.abs(F)  # same as sqrt(F.real**2 + F.imag**2)
  skew = pd.Series(magnitude).skew()
  kurt = pd.Series(magnitude).kurtosis()
  ax[1].plot(freq, magnitude)
  ax[1].set_title("skew: {:.2f}, kurtosis: {:.2f}".format(skew, kurt))

In [14]:
interact(fft)

interactive(children=(Dropdown(description='exp', options=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …

<function __main__.fft>
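
The cells above call `np.fft.fftfreq` without a sample spacing, so the frequency axis is in cycles per sample. Since the overview states a 100 ms sampling interval, passing `d=0.1` should express the axis in Hz, with a Nyquist limit of 5 Hz. The following is a minimal sketch on a synthetic 2 Hz sine wave (the signal, its length `n`, and the variable names are assumptions for illustration, not part of the dataset):

```python
import numpy as np

dt = 0.1                          # sampling interval in seconds (100 ms, per the overview)
n = 200                           # length of the synthetic signal (assumption)
t = np.arange(n) * dt
s = np.sin(2 * np.pi * 2.0 * t)   # synthetic 2 Hz sine standing in for a sensor channel

F = np.fft.rfft(s)                    # one-sided FFT, sufficient for real-valued input
freq_hz = np.fft.rfftfreq(n, d=dt)    # frequency axis in Hz
magnitude = np.abs(F)

# frequency of the largest spectral peak
peak_hz = freq_hz[np.argmax(magnitude)]
print(peak_hz, freq_hz[-1])
```

With real sensor channels the mean (zero-frequency) component often dominates the spectrum, so subtracting the mean before transforming may make the remaining peaks easier to read.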

<a name='x.2'></a>

## 7.2 Feature Engineering

[back to top](#top)

<a name='x.2.3'></a>

### 7.2.3 Feature Normalization

[back to top](#top)

If we were using a linear model, or wanted to compare features by their coefficients, we would want to scale the features first.
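
As a rough illustration of what that scaling could look like, the sketch below standardizes two synthetic columns with very different magnitudes using `StandardScaler`. The data is made up for the example; the point is that the scaler is fit on the training split only and then reused on the test split:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# synthetic stand-ins for two sensor columns on very different scales (assumption)
X = np.column_stack([rng.normal(325, 2, 500), rng.normal(0, 0.002, 500)])
X_train, X_test = train_test_split(X, train_size=0.8, random_state=42)

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)   # learn mean and std from training rows only
X_test_std = scaler.transform(X_test)         # apply the training statistics to test rows

print(X_train_std.mean(axis=0), X_train_std.std(axis=0))
```

The tree-based models used later in this session are insensitive to monotonic rescaling of individual features, which is why the notebook can skip this step.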

<a name='x.2.4'></a>

### 7.2.4 Feature Selection

[back to top](#top)

An interesting improvement could be a lag or window based treatment of the time series data.
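
One possible shape for such a treatment is sketched below on a toy frame that stands in for `tests` (the column choice, the lag of 1, and the window of 3 are arbitrary assumptions). Grouping by `experiment` keeps lags and windows from reaching across experiment boundaries:

```python
import numpy as np
import pandas as pd

# toy frame standing in for `tests`: two experiments, one sensor column
toy = pd.DataFrame({
    "experiment": [1] * 5 + [2] * 5,
    "X1_OutputPower": np.arange(10, dtype=float),
})

# lag feature: the previous sample within the same experiment
toy["X1_OutputPower_lag1"] = (
    toy.groupby("experiment")["X1_OutputPower"].shift(1)
)

# window feature: mean of the last 3 samples, computed per experiment
toy["X1_OutputPower_roll3"] = (
    toy.groupby("experiment")["X1_OutputPower"]
       .transform(lambda s: s.rolling(window=3, min_periods=1).mean())
)

print(toy)
```

The first row of each experiment has no previous sample, so its lag value is `NaN` and would need to be dropped or imputed before modeling.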

<a name='x.2.5'></a>

### 7.2.5 Fast Fourier Transform

[back to top](#top)

Here I've created some summary features for each experiment; they proved not very useful.

In [18]:
# named fft_rows so the list does not shadow the fft() function defined earlier
fft_rows = []
dicc = {}
for exp in range(1,19):
  dic = {}
  # select the experiment once; it does not depend on the feature
  df = tests.loc[tests['experiment'] == exp]
  t = df.index
  for feature in tests.columns:
    s = df[feature]
    try:
      F = np.fft.fft(s)
      freq = np.fft.fftfreq(t.shape[-1])
      magnitude = np.abs(F)  # same as sqrt(F.real**2 + F.imag**2)
      skew = pd.Series(magnitude).skew()
      kurt = pd.Series(magnitude).kurtosis()
      high_amp = np.max(magnitude)
      # frequency of the largest peak: take argmax over the whole spectrum
      # (argmax of the scalar maximum would always return index 0)
      high_freq = freq[np.argmax(magnitude)]
      fft_rows.append([feature,skew,kurt,high_amp])
      dic[f'{feature}_skew'] = skew
      dic[f'{feature}_kurt'] = kurt
      dic[f'{feature}_max'] = high_amp
    except Exception:
      # skip non-numeric columns
      pass
  dicc[exp-1] = dic

In [19]:
pd.DataFrame(dicc).T.join(status['tool_condition'])

Unnamed: 0,X1_ActualPosition_skew,X1_ActualPosition_kurt,X1_ActualPosition_max,X1_ActualVelocity_skew,X1_ActualVelocity_kurt,X1_ActualVelocity_max,X1_ActualAcceleration_skew,X1_ActualAcceleration_kurt,X1_ActualAcceleration_max,X1_CommandPosition_skew,X1_CommandPosition_kurt,X1_CommandPosition_max,X1_CommandVelocity_skew,X1_CommandVelocity_kurt,X1_CommandVelocity_max,X1_CommandAcceleration_skew,X1_CommandAcceleration_kurt,X1_CommandAcceleration_max,X1_CurrentFeedback_skew,X1_CurrentFeedback_kurt,X1_CurrentFeedback_max,X1_DCBusVoltage_skew,X1_DCBusVoltage_kurt,X1_DCBusVoltage_max,X1_OutputCurrent_skew,X1_OutputCurrent_kurt,X1_OutputCurrent_max,X1_OutputVoltage_skew,X1_OutputVoltage_kurt,X1_OutputVoltage_max,X1_OutputPower_skew,X1_OutputPower_kurt,X1_OutputPower_max,Y1_ActualPosition_skew,Y1_ActualPosition_kurt,Y1_ActualPosition_max,Y1_ActualVelocity_skew,Y1_ActualVelocity_kurt,Y1_ActualVelocity_max,Y1_ActualAcceleration_skew,...,S1_CommandAcceleration_skew,S1_CommandAcceleration_kurt,S1_CommandAcceleration_max,S1_CurrentFeedback_skew,S1_CurrentFeedback_kurt,S1_CurrentFeedback_max,S1_DCBusVoltage_skew,S1_DCBusVoltage_kurt,S1_DCBusVoltage_max,S1_OutputCurrent_skew,S1_OutputCurrent_kurt,S1_OutputCurrent_max,S1_OutputVoltage_skew,S1_OutputVoltage_kurt,S1_OutputVoltage_max,S1_OutputPower_skew,S1_OutputPower_kurt,S1_OutputPower_max,S1_SystemInertia_skew,S1_SystemInertia_kurt,S1_SystemInertia_max,M1_CURRENT_PROGRAM_NUMBER_skew,M1_CURRENT_PROGRAM_NUMBER_kurt,M1_CURRENT_PROGRAM_NUMBER_max,M1_sequence_number_skew,M1_sequence_number_kurt,M1_sequence_number_max,M1_CURRENT_FEEDRATE_skew,M1_CURRENT_FEEDRATE_kurt,M1_CURRENT_FEEDRATE_max,feedrate_skew,feedrate_kurt,feedrate_max,clamp_pressure_skew,clamp_pressure_kurt,clamp_pressure_max,experiment_skew,experiment_kurt,experiment_max,tool_condition
0,32.358518,1049.654223,160358.0,7.241241,78.608798,2267.403657,0.525094,-0.01809,8459.839213,32.35871,1049.662474,160359.0,7.110567,76.197827,2267.00598,0.419607,-0.044612,6331.522668,10.23555,129.194311,1995.362517,29.839524,940.07642,86.0951,32.480693,1054.996955,343799.0,25.071505,736.581021,11347.382,14.088407,321.947449,0.891809,31.551351,1013.552642,92779.9,4.8086,29.366344,1748.729717,0.425063,...,1.116458,-0.193898,499.999981,31.104582,995.349836,21636.81659,31.216578,1000.145261,951.3057,32.479049,1054.925646,337797.0,31.243408,1000.903649,120785.08,31.256235,1001.668769,181.344193,32.480764,1055.0,12660.0,32.480764,1055.0,1055.0,23.399266,646.973129,67672.0,11.986209,227.32026,9146.0,32.480764,1055.0,6330.0,32.480764,1055.0,4220.0,32.480764,1055.0,1055.0,unworn
1,40.295421,1638.194077,308372.0,2.626479,8.069969,1277.580811,0.562289,0.293123,13088.655455,40.293702,1638.099747,308391.0,2.654007,8.140925,1305.110204,0.599825,0.158215,11321.882159,2.558626,8.482396,635.715798,9.304873,187.732198,42.9562,40.840674,1667.973734,545153.0,7.483621,126.746626,7919.041,5.813683,84.460021,1.12478,38.686557,1550.555694,230642.1,2.461187,7.211775,1262.065195,0.588225,...,1.115913,-0.195628,499.99999,6.136065,77.727803,6692.53163,7.865156,129.686369,336.8754,40.835828,1667.70961,544911.0,7.986397,133.040693,41963.82,8.070305,135.662615,63.424294,40.841156,1668.0,20016.0,23.09924,770.143865,1755.0,6.328647,85.642701,24093.0,37.206297,1470.640823,71406.0,40.841156,1668.0,33360.0,40.841156,1668.0,6672.0,40.841156,1668.0,3336.0,unworn
2,38.342509,1486.687728,244091.0,2.627967,8.000897,918.946257,0.746675,0.622838,10160.781483,38.342372,1486.680555,244088.0,2.578902,7.689968,926.934184,0.655832,0.350907,8184.940463,2.786419,9.814904,940.782235,32.003336,1161.77097,101.4818,38.999256,1520.9613,497611.0,25.169811,831.559094,12362.5752,10.040535,221.253074,0.908206,35.462183,1337.016074,154388.3,2.649535,7.248651,976.465955,0.66869,...,1.382662,0.509055,599.999963,25.890431,862.017634,21802.92985,25.356344,836.422543,948.54,38.992268,1520.597414,491131.0,25.407972,838.276907,120622.7,25.657187,850.311335,180.472302,39.0,1521.0,18252.0,39.0,1521.0,1521.0,19.595185,579.097965,67675.0,14.181628,352.446583,18366.0,39.0,1521.0,9126.0,39.0,1521.0,4563.0,39.0,1521.0,4563.0,unworn
3,22.67919,520.04218,95688.0,2.795115,10.012261,976.013595,0.241357,-0.59825,3234.492248,22.679365,520.047602,95689.0,2.804278,10.035086,972.122878,-0.064151,-1.017145,2308.806979,3.823341,21.60623,551.314246,13.851766,257.038493,26.0729,23.064751,531.988463,174454.0,5.997672,64.956182,3520.5037,2.61917,9.602323,0.427223,21.665755,488.597829,70344.8,2.992856,10.311706,1150.438371,0.283774,...,1.333828,0.958082,600.000026,6.677844,76.410107,3636.55792,6.283067,68.083869,147.608901,23.059758,531.834469,173165.0,6.295723,65.435701,17081.98,6.532402,71.084245,26.905088,23.065125,532.0,6384.0,23.065125,532.0,532.0,6.53015,74.619407,3230.0,13.172083,235.9651,13928.0,23.065125,532.0,3192.0,23.065125,532.0,1330.0,23.065125,532.0,2128.0,unworn
4,21.267138,455.427574,86812.0,2.323399,5.974312,865.302235,0.420449,-0.377731,5908.05642,21.266683,455.414119,86811.0,2.305478,5.833795,869.067327,0.452086,-0.394649,4902.914294,2.013011,5.268584,296.3536,11.957849,200.562904,19.6616,21.493849,461.99034,150947.0,4.483334,39.147225,2975.7265,2.337403,9.472902,0.439088,20.714257,439.350703,66345.6,2.682114,8.62856,1257.039244,0.454239,...,1.050849,0.24123,599.999998,3.248347,14.101336,1303.4182,3.29075,14.040133,57.0754,21.492507,461.951753,150651.0,3.685108,16.080379,5543.764,3.512596,14.940117,9.847975,21.494185,462.0,5544.0,21.494185,462.0,462.0,4.085149,21.957074,1200.0,11.223774,177.355255,10176.0,21.494185,462.0,9240.0,21.494185,462.0,1386.0,21.494185,462.0,2310.0,unworn
5,35.541095,1273.882318,203387.0,2.911551,10.787677,1138.466919,0.629469,0.039104,8564.569162,35.541602,1273.906807,203388.0,2.905976,10.6686,1150.659089,0.435329,-0.329166,5471.068162,3.120726,12.380866,1051.810182,31.638955,1088.195879,95.2215,35.999428,1295.972525,421678.0,25.434652,805.830043,11829.706,12.205181,285.923847,0.908058,33.349953,1168.588977,124255.9,2.822583,8.480196,1031.569664,0.528214,...,0.877896,-0.182554,500.000002,27.752641,906.559065,21369.3106,28.039129,919.769568,954.735601,35.993695,1295.696948,414840.0,27.958334,916.047916,121022.45,28.125104,923.501238,181.644351,36.0,1296.0,15552.0,36.0,1296.0,1296.0,21.965179,654.66331,67432.0,14.609206,355.807979,21196.0,36.0,1296.0,7776.0,36.0,1296.0,5184.0,36.0,1296.0,7776.0,worn
6,23.305378,550.127184,95918.0,2.231105,5.192789,1181.327089,0.487058,-0.078552,8284.329202,23.30462,550.102316,95907.0,2.21508,5.118822,1174.436129,0.628331,0.126429,9951.235397,1.947248,4.244931,434.221025,15.93589,323.258555,37.8803,23.769159,564.981909,184236.0,11.077615,190.118536,7567.3915,6.986812,93.264689,1.15841,21.689849,498.32012,65131.4,2.443734,6.16337,1356.211648,0.694692,...,1.117752,-0.189792,500.0,11.540475,197.995383,6699.14628,11.221013,189.12394,288.776441,23.765015,564.85022,182427.0,11.40437,194.843996,33244.44,11.555731,198.420958,54.150083,23.769729,565.0,6780.0,23.769729,565.0,565.0,9.358907,143.174422,15690.0,19.570173,432.060109,16670.0,23.769729,565.0,11300.0,23.769729,565.0,2260.0,23.769729,565.0,3955.0,worn
7,24.138704,589.838743,97562.0,2.899477,10.878884,1527.4319,0.460642,-0.205885,9937.59871,24.138606,589.835916,97550.0,2.872997,10.404449,1518.578079,0.343247,-0.457908,8971.497798,2.622927,10.13896,621.078845,18.557659,410.374555,44.6089,24.596026,604.976293,196878.0,13.089162,247.200283,7844.1775,9.930246,167.260568,1.098253,22.236398,526.678406,62170.6,2.796032,9.326426,1294.632403,0.565688,...,1.117571,-0.190376,500.000011,14.173756,273.149996,7851.40023,13.868722,263.206976,351.934801,24.591888,604.840142,194766.0,13.990806,265.816105,42279.6,14.100471,269.535049,66.205118,24.596748,605.0,7260.0,24.596748,605.0,605.0,11.028675,184.294174,23924.0,20.182599,460.300308,16420.0,24.596748,605.0,12100.0,24.596748,605.0,2420.0,24.596748,605.0,4840.0,worn
8,26.704015,721.786941,119786.0,2.472651,7.927795,1364.58869,0.717137,1.19725,14421.992044,26.704123,721.790775,119778.0,2.502836,8.125744,1380.253173,0.654779,0.149254,8555.170734,2.372296,7.423005,549.895518,21.124002,523.079681,51.8607,27.202169,739.971966,240660.0,15.181948,325.964385,8636.352,10.782144,200.083263,1.046503,24.732143,650.0022,76670.5,2.286455,5.379217,983.744927,0.648449,...,1.148529,0.299595,600.000027,16.481185,364.549419,9904.9502,16.260828,357.317215,435.373,27.197068,739.7865,237820.0,16.179764,353.773833,52957.8,16.451404,363.045937,82.150805,27.202941,740.0,8880.0,27.202941,740.0,740.0,12.525949,241.352163,29964.0,20.480647,498.570939,22370.0,27.202941,740.0,11100.0,27.202941,740.0,2960.0,27.202941,740.0,6660.0,worn
9,35.339513,1265.762431,206102.0,2.496072,7.198852,1238.925575,0.507769,-0.23729,10570.767974,35.339778,1265.775265,206099.0,2.468288,7.022851,1224.101307,0.588073,-0.101175,7635.02017,4.245874,34.494322,1440.05144,28.518478,944.068905,80.0038,36.068493,1300.957406,423988.0,16.926394,446.835862,10365.366,8.550345,161.791812,1.013957,32.425584,1125.976125,129979.9,3.029868,12.152268,1440.610867,0.62186,...,1.048179,0.230945,600.000026,14.154719,340.135202,11800.7894,14.120004,338.733364,511.536036,36.064591,1300.7695,421462.0,13.830704,325.436943,63899.64,14.167657,338.295685,96.350724,36.069378,1301.0,15612.0,36.069378,1301.0,1301.0,10.79538,222.650233,36158.0,27.850396,912.474795,45594.0,36.069378,1301.0,15612.0,36.069378,1301.0,5204.0,36.069378,1301.0,13010.0,worn


<a name='x.3'></a>

## 7.3 Modeling

[back to top](#top)

<a name='x.3.1'></a>

### 7.3.1 Tool Condition

[back to top](#top)

In [17]:
X = tests.drop(columns=['tool_condition', 'Machining_Process', 'experiment'])
y = np.array(tests['tool_condition'])
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    train_size=0.8,
                                                    random_state=42)
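
Note that this split shuffles individual time steps, so rows from every experiment end up in both the training and test sets, and columns such as `feedrate` and `clamp_pressure` are constant within an experiment. A stricter evaluation could hold out whole experiments instead. The sketch below shows one way to do that with scikit-learn's `GroupShuffleSplit` on synthetic data (the array shapes and the toy labeling are assumptions, not the notebook's data):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
groups = np.repeat(np.arange(1, 19), 50)       # 18 experiments, 50 rows each (toy sizes)
X = rng.normal(size=(groups.size, 4))          # synthetic features
y = np.where(groups > 8, "worn", "unworn")     # arbitrary toy labels, constant per experiment

# hold out roughly 20% of the experiments rather than 20% of the rows
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# no experiment should appear on both sides of the split
shared = set(groups[train_idx]) & set(groups[test_idx])
print(shared)
```

Scores from a grouped split are usually lower than those from a row-level split, but they say more about how a model might behave on an experiment it has never seen.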

#### 7.3.1.1 Random Forest

In [None]:
model = RandomForestClassifier(n_estimators=500, min_samples_leaf=4)
model.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=4, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=500,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [None]:
y_pred = model.predict(X_test)
print(f'{accuracy_score(y_test, y_pred):.4f}')
cnf_matrix = confusion_matrix(y_test, y_pred)
cnf_matrix

0.9949


array([[2357,   21],
       [   5, 2675]])
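
The accuracy can be recovered from the confusion matrix directly, along with per-class recall and precision. The sketch below hard-codes the matrix printed above and assumes scikit-learn's default layout: rows are true labels, columns are predictions, and classes are in sorted order (`'unworn'`, then `'worn'`):

```python
import numpy as np

# confusion matrix reported above; rows = true labels, columns = predictions,
# classes in sorted order: 'unworn', 'worn'
cnf_matrix = np.array([[2357, 21],
                       [5, 2675]])

# correct predictions sit on the diagonal
accuracy = np.trace(cnf_matrix) / cnf_matrix.sum()
recall = np.diag(cnf_matrix) / cnf_matrix.sum(axis=1)      # per true class
precision = np.diag(cnf_matrix) / cnf_matrix.sum(axis=0)   # per predicted class

print(f"accuracy: {accuracy:.4f}")
print("recall (unworn, worn):", recall)
print("precision (unworn, worn):", precision)
```

Most of the errors here are unworn samples predicted as worn (21) rather than worn samples predicted as unworn (5).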

In [None]:
# grab feature importances
imp = model.feature_importances_

# their std across the individual trees
std = np.std([tree.feature_importances_ for tree in model.estimators_], axis=0)

# create new dataframe, sorted by importance
feat = pd.DataFrame([X.columns, imp, std]).T
feat.columns = ['feature', 'importance', 'std']
feat = feat.sort_values('importance', ascending=False)
feat = feat.reset_index(drop=True)

In [None]:
px.bar(feat, x='feature', y='importance', error_y='std')

#### 7.3.1.2 Gradient Boosting

In [None]:
model = GradientBoostingClassifier(n_estimators=500, min_samples_leaf=4)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(f'{accuracy_score(y_test, y_pred):.4f}')
cnf_matrix = confusion_matrix(y_test, y_pred)
cnf_matrix

0.9931


array([[2350,   28],
       [   7, 2673]])

In [None]:
# grab feature importances
imp = model.feature_importances_

# their std
# std = np.std([tree.feature_importances_ for tree in model.estimators_], axis=0)

# create new dataframe, sorted by importance
feat = pd.DataFrame([X.columns, imp]).T
feat.columns = ['feature', 'importance']
feat = feat.sort_values('importance', ascending=False)
feat = feat.reset_index(drop=True)

In [None]:
px.bar(feat, x='feature', y='importance')

#### 7.3.1.3 Extreme Gradient Boosting

In [None]:
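# min_samples_leaf is a scikit-learn parameter, not an XGBoost one, so it is dropped
model = XGBClassifier(n_estimators=500)
# recent xgboost releases expect integer class labels, so encode
# 'unworn'/'worn' for fitting and decode the predictions afterwards
le = LabelEncoder()
model.fit(X_train, le.fit_transform(y_train))
y_pred = le.inverse_transform(model.predict(X_test))
print(f'{accuracy_score(y_test, y_pred):.4f}')
cnf_matrix = confusion_matrix(y_test, y_pred)
cnf_matrix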

0.9955


array([[2361,   17],
       [   6, 2674]])

In [None]:
# grab feature importances
imp = model.feature_importances_

# their std
# std = np.std([tree.feature_importances_ for tree in model.estimators_], axis=0)

# create new dataframe, sorted by importance
feat = pd.DataFrame([X.columns, imp]).T
feat.columns = ['feature', 'importance']
feat = feat.sort_values('importance', ascending=False)
feat = feat.reset_index(drop=True)

In [None]:
px.bar(feat, x='feature', y='importance')

<a name='x.3.2'></a>

### 7.3.2 Machining Finalized

[back to top](#top)

<a name='x.3.3'></a>

### 7.3.3 Passed Visual Inspection

[back to top](#top)