We are going to build a model that automatically classifies a part based on the dimensions it is fed.

For high-dimensional data with discrete class labels, a Naive Bayes classifier works well.

Also, the variation in part production tends to follow a normal distribution, so a Gaussian assumption should prove acceptable.
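
For reference, the standard Gaussian Naive Bayes decision rule can be written as below. The symbols are introduced here for illustration and do not appear in the notebook: x_i is the i-th measured dimension, y is a candidate nominal size, and mu and sigma^2 are the per-class mean and variance of each dimension estimated from the training data.

```latex
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_{y,i}^{2}}}
\exp\!\left( -\frac{(x_i - \mu_{y,i})^{2}}{2\sigma_{y,i}^{2}} \right)
```

The "naive" part is the product: each dimension is treated as independent of the others once the nominal size is known.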

In [75]:
import numpy as np
import pandas as pd

First we need to import the data. I have a history of measurements, taken from our inspection records, for ANSI B18.6.3 Pan Head Machine Screws that passed inspection. This will be the training data for the model.

You can also create training data yourself by drawing random values from a Gaussian distribution within the ANSI limits.
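
A minimal sketch of generating such synthetic training data is shown here. The `LIMITS` table and the `simulate_parts` helper are hypothetical names, the limit values are placeholders rather than figures from the ANSI tables, and the choice to treat the limits as roughly plus or minus three standard deviations around the midpoint is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical (min, max) limits in inches per nominal size.
# Placeholder values for illustration only, NOT taken from the ANSI tables.
LIMITS = {
    "0": {"Head Diameter": (0.104, 0.116), "Head Height": (0.031, 0.039)},
    "4": {"Head Diameter": (0.205, 0.219), "Head Height": (0.053, 0.062)},
}

def simulate_parts(limits, n_per_size=200, seed=0):
    """Draw Gaussian samples for each dimension, kept inside its limits."""
    rng = np.random.default_rng(seed)
    frames = []
    for size, dims in limits.items():
        data = {"Nominal Size": [size] * n_per_size}
        for name, (low, high) in dims.items():
            mean = (low + high) / 2
            std = (high - low) / 6  # assumption: limits sit at about +/- 3 sigma
            samples = rng.normal(mean, std, n_per_size)
            # Clip the rare out-of-limit draw so every part "passes inspection"
            data[name] = np.clip(samples, low, high)
        frames.append(pd.DataFrame(data))
    return pd.concat(frames, ignore_index=True)

synthetic_data = simulate_parts(LIMITS)
print(synthetic_data.groupby("Nominal Size").mean())
```

Replacing the placeholder limits with the real values from the standard would give a frame with the same layout as the inspection history used below.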

In [76]:
ls ../data

'ANSI_B18.6.3 History.xlsx'	  sales_data_sample.csv
 BlackFriday.csv*		  sales_data_sample.xlsx
 customer_data.xlsx		  sales_data_sample_no_customer.xlsx
 government_purchase_orders.csv   sales_encryption.csv
 inventory_list.csv		  screw_corpus.xml
 part_usage_trailing_12.xlsx	  wiki/
 purchases_by_vendor.xlsx


In [77]:
inspection_data = pd.read_excel('../data/ANSI_B18.6.3 History.xlsx')

In [78]:
inspection_data.head()

Unnamed: 0,Nominal Size,A,H,J,T,M,G,N
0,0,0.106,0.039,0.019,0.019,0.06,0.051,0.013014
1,0,0.109,0.033,0.016,0.021,0.063,0.046,0.013947
2,0,0.11,0.038,0.018,0.016,0.066,0.039,0.013164
3,0,0.105,0.034,0.021,0.02,0.063,0.044,0.013133
4,0,0.108,0.039,0.018,0.017,0.058,0.045,0.013916


To make the dimensions easier to understand, I'll replace the single-letter headers with descriptive dimension names.

In [79]:
inspection_data.columns = ["Nominal Size", "Head Diameter","Head Height", "Slot Width", "Slot Depth", "Recess Diameter", "Recess Depth", "Recess Width"]

In [80]:
inspection_data.head()

Unnamed: 0,Nominal Size,Head Diameter,Head Height,Slot Width,Slot Depth,Recess Diameter,Recess Depth,Recess Width
0,0,0.106,0.039,0.019,0.019,0.06,0.051,0.013014
1,0,0.109,0.033,0.016,0.021,0.063,0.046,0.013947
2,0,0.11,0.038,0.018,0.016,0.066,0.039,0.013164
3,0,0.105,0.034,0.021,0.02,0.063,0.044,0.013133
4,0,0.108,0.039,0.018,0.017,0.058,0.045,0.013916


Now it is time to build a model from the data.

The variable we want to predict is the Nominal Size, so that is 'y'. The remaining columns are the measurements we will use to identify the part, so they are 'X'.

In [102]:
feature_columns = ['Head Diameter', 'Head Height', 'Slot Width', 'Slot Depth',
                   'Recess Diameter', 'Recess Depth', 'Recess Width']
X = inspection_data[feature_columns].values
# Convert the sizes to strings so each nominal size is treated as a class label
y = inspection_data['Nominal Size'].astype(str).values
X

array([[1.06000000e-01, 3.90000000e-02, 1.90000000e-02, ...,
        6.00000000e-02, 5.10000000e-02, 1.30142340e-02],
       [1.09000000e-01, 3.30000000e-02, 1.60000000e-02, ...,
        6.30000000e-02, 4.60000000e-02, 1.39468246e-02],
       [1.10000000e-01, 3.80000000e-02, 1.80000000e-02, ...,
        6.60000000e-02, 3.90000000e-02, 1.31643785e-02],
       ...,
       [6.09000000e-01, 1.62000000e-01, 7.50000000e-02, ...,
        3.44000000e-01, 2.70000000e-01, 4.66518190e-04],
       [5.95000000e-01, 1.76000000e-01, 7.70000000e-02, ...,
        3.45000000e-01, 2.75000000e-01, 6.37955378e-04],
       [5.99000000e-01, 1.66000000e-01, 8.10000000e-02, ...,
        3.40000000e-01, 3.32000000e-01, 1.20945544e-05]])

In [104]:
from sklearn.naive_bayes import GaussianNB
model = GaussianNB()
model.fit(X,y)

GaussianNB(priors=None, var_smoothing=1e-09)

We now have a model!
Let's test it.

In [108]:
# Create a new measurement.
# This is a Nominal Size 4 screw I just measured. The values must be in the
# same column order as X (Head Diameter first, Recess Width last).
new_part = [[0.218, 0.062, 0.037, 0.04, 0.113, 0.107, 0.019]]
prediction = model.predict(new_part)

In [109]:
prediction[0]

'4'

As you can see, the model assigned the screw to Nominal Size 4, which matches the part I measured.
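
A single prediction is a limited check, so it can help to hold back part of the data and score the model on it. The sketch below is one way to do that. It uses made-up stand-in data (`X_demo`, `y_demo`, with two sizes and two dimensions) so that it runs on its own; in the notebook you would pass the real `X` and `y` instead.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Stand-in data with placeholder means, NOT real inspection records.
X_demo = np.vstack([
    rng.normal([0.110, 0.035], 0.002, size=(100, 2)),  # labeled "0"
    rng.normal([0.212, 0.058], 0.002, size=(100, 2)),  # labeled "4"
])
y_demo = np.array(["0"] * 100 + ["4"] * 100)

# Hold back 25% of the rows, keeping the class balance the same in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

demo_model = GaussianNB().fit(X_train, y_train)
accuracy = accuracy_score(y_test, demo_model.predict(X_test))

# predict_proba returns one probability per class, ordered as in classes_
probabilities = demo_model.predict_proba(X_test[:1])[0]
print(f"held-out accuracy: {accuracy:.3f}")
print(dict(zip(demo_model.classes_, probabilities.round(3))))
```

The class probabilities are useful in practice: a part whose highest probability is low may be worth routing to manual inspection rather than accepting the predicted size.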

This approach can be extended by adding the specifications for other part types and standards, so that the model identifies not only the size but also the standard, the type of screw, and so on. This is just a simple example of using a Gaussian Naive Bayes model for classification.
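
One possible way to sketch that extension is to fold the part type and the size into a single composite class label. Everything in the example below is hypothetical: the `families` table, the "Flat Head" family, and the mean dimensions are made-up placeholders, not values from any standard.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)

# Hypothetical part families with made-up (head diameter, head height) means.
families = {
    ("Pan Head", "4"): [0.212, 0.058],
    ("Pan Head", "6"): [0.262, 0.072],
    ("Flat Head", "4"): [0.212, 0.040],
}

X_parts, labels = [], []
for (part_type, size), means in families.items():
    X_parts.append(rng.normal(means, 0.002, size=(50, 2)))
    # One composite class per (type, size) pair
    labels += [f"{part_type}|{size}"] * 50

combined_model = GaussianNB().fit(np.vstack(X_parts), labels)

# Split the predicted composite label back into its two parts
part_type, size = combined_model.predict([[0.211, 0.041]])[0].split("|")
print(part_type, size)
```

A composite label keeps the workflow identical to the single-standard case. Training one classifier per attribute is an alternative if the number of combinations grows large.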