## Overall Approach:
1) Create unacceptable samples from the acceptable ones by rotating each chip image, since orientation is the only difference between the two classes

2) Load all the samples (unacceptable and acceptable) into a list using cv2, so that each image is represented as an array of pixel values

3) Load all the labels (Unacceptable - 0 / Acceptable - 1) into a separate list

4) Build a simple convolutional neural network with 2 stages of convolution and pooling (pooling reduces computation time)

5) Evaluate the overall accuracy of the model on the held-out test data

Importing necessary libraries

In [2]:
import cv2
import numpy as np
import sys
import glob
import tensorflow as tf
import skimage

from sklearn.model_selection import train_test_split


Parameters for training

In [3]:
TEST_SIZE = 0.2
EPOCHS = 5

Data Handling, Sample Creation

In [75]:
# Set these paths to wherever the Acceptable and Faulty folders are located

base = r'C:\Users\De Yuan\Downloads\NUS\Intern\XRVision\Acceptable'
Faulty = r'C:\Users\De Yuan\Downloads\NUS\Intern\XRVision\Faulty'

'C:\\Users\\De Yuan\\Downloads\\NUS\\Intern\\XRVision\\Acceptable\\chips (145).jpg'

In [18]:
# Rotate each picture to create more faulty samples
i = 0

# Rotation step, left disabled (it was commented out in the original cell):
# for img in glob.glob(base + "/*.jpg"):
#     image = cv2.imread(img)
#     image_rotcw1 = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)
#     image_rotcw2 = cv2.rotate(image, cv2.ROTATE_180)
#     image_rotcw3 = cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)
#     cv2.imwrite("Faulty/chip_%04i_rotcw1.jpg" % i, image_rotcw1)
#     cv2.imwrite("Faulty/chip_%04i_rotcw2.jpg" % i, image_rotcw2)
#     cv2.imwrite("Faulty/chip_%04i_rotcw3.jpg" % i, image_rotcw3)
#     i += 1

# Augment the Acceptable dataset with two blurred copies of each image
for img in glob.glob(base + "/*.jpg"):
    image = cv2.imread(img)
    image_blur1 = cv2.GaussianBlur(image,(3,3),0)
    image_blur2 = cv2.GaussianBlur(image,(5,5),0)
    cv2.imwrite("Acceptable/chip_%04i_blur1.jpg" %i,image_blur1)
    cv2.imwrite("Acceptable/chip_%04i_blur2.jpg" %i,image_blur2)
    i+=1
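
As a minimal sketch of why rotation turns an acceptable sample into an unacceptable one, the snippet below rotates a small synthetic array instead of a real chip photo. It uses `np.rot90` rather than `cv2.rotate` only so that it runs without any image files; the `chip` array and the `rotations` list are illustrative assumptions, not part of the original data.

```python
import numpy as np

# Hypothetical stand-in for a 55x55 grayscale chip image: a bright mark in
# one corner makes the orientation visible.
chip = np.zeros((55, 55), dtype=np.uint8)
chip[:10, :10] = 255

# k = 1, 2, 3 are successive quarter turns (counter-clockwise), which
# together cover the same three orientations as the cv2.rotate calls above.
rotations = [np.rot90(chip, k) for k in (1, 2, 3)]

for k, rotated in zip((1, 2, 3), rotations):
    # The shape is unchanged for a square image, but the pixel layout is not.
    print(k, rotated.shape, np.array_equal(rotated, chip))
```

Every rotated copy keeps the 55x55 shape but no longer matches the original layout, which is the property the faulty class relies on.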

In [40]:
Faulty = r'C:\Users\De Yuan\Downloads\NUS\Intern\XRVision\Faulty'

images = []
labels = []

for img in glob.glob(base + "/*.jpg"):
    image = cv2.imread(img, 0) # Read in Grayscale
    images.append(image)
    labels.append(1)
    
# 1 represents acceptable, 0 represents faulty

for img in glob.glob(Faulty+"/*.jpg"):
    image = cv2.imread(img, 0) # Read in Grayscale
    images.append(image)
    labels.append(0)

In [88]:
# Raw strings keep the backslash in '\c' from being read as an escape sequence
acceptable = cv2.imread(base + r'\chips (145).jpg', 0)
cv2.imshow('Example of acceptable chip orientation', acceptable)

faulty1 = cv2.imread(Faulty + r'\chip_0001_rotcw1.jpg', 0)
faulty2 = cv2.imread(Faulty + r'\chip_0001_rotcw2.jpg', 0)
faulty3 = cv2.imread(Faulty + r'\chip_0001_rotcw3.jpg', 0)
allfaulty = np.concatenate((faulty1,faulty2,faulty3),axis = 1)

cv2.imshow('Example of unacceptable chip orientation',allfaulty)

cv2.waitKey(0)
cv2.destroyAllWindows()

In [70]:
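images = np.array(images)
# Grayscale images load as (55, 55); Conv2D expects a trailing channel axis
if images.ndim == 3:
    images = images[..., np.newaxis]
images.shape[1:]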

(55, 55, 1)

Model Building and splitting of training and test data

In [71]:
def get_model():
    model = tf.keras.models.Sequential([
        # First convolution layer
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=images.shape[1:]),
        # Pooling layer, to reduce the size of the feature maps
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        # Repeat convolution and pooling
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        # Add three hidden layers, each followed by dropout
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        # Add output layer: the outcome is binary, so sigmoid is chosen
        tf.keras.layers.Dense(1, activation='sigmoid')])
    
    model.compile(
        optimizer = 'adam',
        loss='binary_crossentropy',
        metrics=['accuracy'])
    return model

In [68]:
x_train, x_test, y_train, y_test = train_test_split(np.array(images), np.array(labels), test_size=TEST_SIZE)


Final Training and Testing of Model (~100% test accuracy is reached within 5 epochs)

In [72]:
# Get a compiled neural network
model = get_model()

# Fit model on training data
model.fit(x_train, y_train, epochs=EPOCHS)

# Evaluate neural network performance
model.evaluate(x_test,  y_test, verbose=2)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
23/23 - 1s - loss: 2.8082e-07 - accuracy: 1.0000


[2.8081541358915274e-07, 1.0]
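
Because the two classes can be unevenly sized, a single accuracy figure may hide per-class errors. The sketch below counts the four outcomes for binary labels; the `confusion_counts` helper and the example label lists are hypothetical and are not results from the model above.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1      # acceptable, predicted acceptable
        elif truth == 0 and pred == 0:
            counts["tn"] += 1      # faulty, predicted faulty
        elif truth == 0 and pred == 1:
            counts["fp"] += 1      # faulty, predicted acceptable
        else:
            counts["fn"] += 1      # acceptable, predicted faulty
    return counts

# Made-up labels for illustration only (1 = acceptable, 0 = faulty).
example_true = [1, 1, 0, 0, 0, 0]
example_pred = [1, 0, 0, 0, 0, 1]
counts = confusion_counts(example_true, example_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(example_true)
print(counts, accuracy)
```

With real data, `y_pred` would come from thresholding the output of `model.predict(x_test)`.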

In [73]:
model.summary()

Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_30 (Conv2D)           (None, 53, 53, 32)        320       
_________________________________________________________________
max_pooling2d_30 (MaxPooling (None, 26, 26, 32)        0         
_________________________________________________________________
conv2d_31 (Conv2D)           (None, 24, 24, 32)        9248      
_________________________________________________________________
max_pooling2d_31 (MaxPooling (None, 12, 12, 32)        0         
_________________________________________________________________
flatten_15 (Flatten)         (None, 4608)              0         
_________________________________________________________________
dense_60 (Dense)             (None, 128)               589952    
_________________________________________________________________
dropout_45 (Dropout)         (None, 128)             
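
The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below assumes the Keras defaults used in `get_model` (unpadded 3x3 convolutions with stride 1, and 2x2 pooling with stride 2); the helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of max pooling whose stride equals the pool size."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

side = 55
side = pool_out(conv_out(side))    # 55 -> 53 -> 26
side = pool_out(conv_out(side))    # 26 -> 24 -> 12
flat = side * side * 32            # 12 * 12 * 32 values after Flatten
dense_params = flat * 128 + 128    # weights plus biases of the first Dense layer

print(side, flat, conv_params(1, 32), conv_params(32, 32), dense_params)
# prints: 12 4608 320 9248 589952
```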

In [87]:
model.predict(np.array([images[0]]))

array([[1.]], dtype=float32)
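
The sigmoid output is a score between 0 and 1 rather than a class label. A minimal sketch of converting such scores to the 0/1 labels used above follows; the `to_labels` helper, the 0.5 threshold, and every example score other than the `1.` shown above are assumptions for illustration.

```python
import numpy as np

def to_labels(probabilities, threshold=0.5):
    """Map sigmoid scores to 1 (acceptable) or 0 (faulty)."""
    probabilities = np.asarray(probabilities)
    return (probabilities >= threshold).astype(int).ravel()

# Shaped like model.predict output: one row per image, one score per row.
example_output = np.array([[1.0], [0.03], [0.72]], dtype=np.float32)
print(to_labels(example_output))
```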

# Conclusion
- This was a relatively simple CNN in terms of model building; the bulk of the problem lay in the data augmentation, as only the acceptable samples were given.
- Rotating the correct samples produced a 3:1 ratio of incorrect to correct samples, which skewed the initial model results.
- Gaussian blur was applied to the correct samples to generate additional correct examples and reduce this imbalance, after which the model performed very well.
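
The class ratios described above follow from simple counting. The sketch below assumes that each original image yields three rotated (faulty) copies and two blurred (acceptable) copies, and that only the originals are rotated; the `class_counts` helper and the value of `n` are hypothetical.

```python
def class_counts(n_original, rotations_per_image=3, blurs_per_image=2):
    """Return (acceptable, faulty) sample counts after augmentation."""
    faulty = n_original * rotations_per_image
    acceptable = n_original * (1 + blurs_per_image)  # originals plus blurred copies
    return acceptable, faulty

n = 100  # hypothetical number of original acceptable images

before = class_counts(n, blurs_per_image=0)  # rotation only
after = class_counts(n)                      # rotation plus two blurs

print(before)  # (100, 300): 3:1 faulty to acceptable
print(after)   # (300, 300): balanced
```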