# An Elementary Image Classifier for Solid State Drive Production Error

**Author:** *Saransh Rakshak* 

**Github:**  [srak71](https://github.com/srak71) (Click to open)

**LinkedIn:** [srak71](https://www.linkedin.com/in/srak71/) (Click to open)

**Date:** *May 16, 2025*

In [1]:
import tensorflow as tf 
import os

The following is my attempt to show the Western Digital hiring team that I would make a perfect candidate for the *Summer 2025 Intern, Failure Analysis Engineering* role. This basic project addresses the *"Essential Duties And Responsibilities"* posted in the job description. I hope you enjoy!

The NEU-CLS dataset has 1800 grayscale images of steel surface defects (200×200 pixels) in six classes: rolled-in scale, patches, crazing, pitted surface, inclusion, and scratches.

The dataset is organized into subdirectories per class in separate train/images and validation/images directories. Example: training images might be in .../train/images/scratches/ and similarly for the other classes. 
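
Before training, it can help to confirm that the folder layout described above is what the loader will actually see. The following is a minimal sketch, not part of the original notebook: `count_images_per_class` is a hypothetical helper name, and the list of file extensions is an assumption about how the images are stored.

```python
import os

def count_images_per_class(images_dir, extensions=(".bmp", ".jpg", ".jpeg", ".png")):
    """Return {class_name: image_count} for a directory with one subfolder per class."""
    counts = {}
    for entry in sorted(os.listdir(images_dir)):
        class_dir = os.path.join(images_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files sitting next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts
```

Calling `count_images_per_class(train_dir)` once the paths are defined should list six class folders; a missing or empty folder would show up here before it causes a confusing training error.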

The steps for training the classification model are as follows:

**Part 1: Data Processing**

    1a. Proper loading of NEU data
    1b. Augment training set and normalize both sets for unified comparison
  
**Part 2: Model Creation and Training Parameters**

    2a. Defining Convolution Neural Net Model
    2b. Establishing Callbacks to stop training when optimization met.
  
**Part 3: Training and Selecting Optimal Model**

    3a. Training CNN with established parameters, and saving best model into 'SmartSSD/best_model.keras'.

# Part 1: Data Processing

## 1a. Proper loading of NEU data.

For reproducibility, I use Python's **os** package to get the user's current working directory, then join it with the relative locations of my training and validation data.

In [None]:
# user dir
current_dir= os.getcwd()

# known filepaths
train_dir= os.path.join(current_dir, "NEU-CLS", "train", "images")
val_dir= os.path.join(current_dir, "NEU-CLS", "validation", "images")

# unified variables
image_size= (128, 128)
batch_size= 32 # number of images to process before updating model weights
epochs= 10 # maximum number of full passes over the training set
num_classes= 6
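
To make the relationship between `batch_size` and `epochs` concrete, here is a small sketch of how many weight updates one epoch involves. The training-set size of 1440 is my assumption for illustration (an 80/20 split of the 1800 images); the real count depends on how the train and validation folders were populated.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches (weight updates) needed to see every image once."""
    return math.ceil(num_images / batch_size)

# 1440 is a hypothetical training-set size, not a value read from the dataset
print(steps_per_epoch(1440, 32))  # 45
```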

## 1b. Augment training set and normalize both sets for unified comparison

To load my images and labels I use Keras to read directories of images. My code uses *ImageDataGenerator()* instead of the basic *image_dataset_from_directory()* because it integrates normalization (*rescale=1. / 255*) and random augmentation (applied to the training data only). Augmentation shows the model randomly altered versions of each image on every epoch, which effectively expands the dataset without having to gather any new data.

- Normalization: All pixel values are rescaled to the range [0, 1] by dividing by 255, and every image is resized to 128x128 resolution. *color_mode='grayscale'* is specified so each image has one channel.

- Augmentation: My transformations to the training data are random rotation (up to 20 degrees), horizontal reflection (flip), and a brightness change of at most 20% (factors between 0.8 and 1.2). Keeping these ranges small ensures the image remains usable. No augmentation is applied to the validation data, so validation metrics are measured on unaltered images.
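
The two operations above are simple enough to sketch by hand. This is only an illustration in plain Python on a made-up 2x3 "image", not how Keras implements them internally; `rescale` and `horizontal_flip` are hypothetical helper names.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 to roughly 0.0..1.0."""
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]

# A made-up 2x3 grayscale image with 8-bit pixel values
toy = [[0, 51, 255],
       [102, 204, 153]]

normalized = rescale(toy)       # every value now lies between 0 and 1
flipped = horizontal_flip(toy)  # [[255, 51, 0], [153, 204, 102]]
```

Note that the flip changes where a defect appears but not which class it belongs to, which is why it is a safe augmentation for this kind of data.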

In [None]:
# Augmenting training images by rotating, flipping, and altering brightness
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255,
                                                                rotation_range=20, # random rotations of up to 20 degrees
                                                                horizontal_flip=True, # random horizontal flips
                                                                brightness_range=(0.8, 1.2)) # random brightness factor, 80% to 120%

# just normalizing in the case of val images
val_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255) 

# creating the batch generators that read images from disk using the settings above
train_gen = train_datagen.flow_from_directory(train_dir,
                                              target_size=image_size,
                                              color_mode='grayscale',
                                              batch_size=batch_size,
                                              class_mode='categorical')

val_gen = val_datagen.flow_from_directory(val_dir,
                                          target_size=image_size,
                                          color_mode='grayscale',
                                          batch_size=batch_size,
                                          class_mode='categorical')

# Part 2: Model Creation and Training Parameters

## 2a. Defining Convolution Neural Net Model

I will create a basic **Convolutional Neural Network (CNN)** consisting of three blocks and a classification head.

Block 1, 2, 3: 

> Layer 1. Convolution layer with **ReLU** activation function. *Conv2D()*
>
> Layer 2. Downsampling of the feature maps by keeping the maximum of each 2x2 window (*pool_size*). *MaxPooling2D()*
>
> Layer 3. *Dropout()* layer that limits overfitting by randomly zeroing a fraction of activations during training.

Classification Head:

> Layer 1: *Flatten()* layer to flatten the feature maps into a single vector.
>
> Layer 2: *Dense()* layer using **ReLU** as my activation.
>
> Layer 3: *Dropout()* layer to limit overfitting.
>
> Layer 4: *Dense()* layer with one unit per defect type (class), using **softmax** activation to output class probabilities.

In [None]:
model = tf.keras.models.Sequential([
    # Block 1
    tf.keras.layers.Conv2D(32, (3, 3), activation= 'relu', input_shape= (128, 128, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.25),
    
    # Block 2
    tf.keras.layers.Conv2D(64, (3, 3), activation= 'relu'),
    tf.keras.layers.MaxPooling2D((2,2)),
    tf.keras.layers.Dropout(0.25),
    
    # Block 3
    tf.keras.layers.Conv2D(128, (3, 3), activation= 'relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.25),
    
    # Classification head
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation= 'relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(num_classes, activation= 'softmax')])
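
To sanity-check the architecture above, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults that the model relies on (stride 1 and 'valid' padding for *Conv2D()*, non-overlapping windows for *MaxPooling2D()*); the helper names are mine.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

size, channels = 128, 1  # 128x128 grayscale input
total_params = 0
for filters in (32, 64, 128):
    total_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after block with {filters} filters: {size}x{size}x{channels}")
# after block with 32 filters: 63x63x32
# after block with 64 filters: 30x30x64
# after block with 128 filters: 14x14x128

flat = size * size * channels       # length of the Flatten() output
total_params += flat * 128 + 128    # Dense(128)
total_params += 128 * 6 + 6         # Dense(num_classes) with 6 classes
print(flat, total_params)           # 25088 3304838
```

Almost all of the parameters sit in the first *Dense()* layer, which is worth knowing if the model ever needs to be made smaller.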

Finally, I compile the CNN. Since each image belongs to exactly one of several classes and the labels are one-hot encoded (*class_mode='categorical'*), I compile the model with **Adam** as my optimizer, categorical cross-entropy as the loss, and accuracy as the reported metric.

In [None]:
model.compile(optimizer='adam', 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])

model.summary()
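
For intuition about the loss chosen above, here is a minimal plain-Python sketch of softmax followed by categorical cross-entropy for a single image. The input scores are made up, and the small `eps` clamp is my own guard against `log(0)`; Keras handles this internally in its own way.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Loss for one sample: -sum(y_true * log(y_pred))."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))

probs = softmax([2.0, 0.5, 0.1, 0.1, 0.1, 0.1])  # made-up scores for 6 classes
label = [1, 0, 0, 0, 0, 0]                        # one-hot: true class is index 0
loss = categorical_crossentropy(label, probs)
```

Because the label is one-hot, only the predicted probability of the true class contributes: the loss is simply `-log(probs[0])`, which shrinks toward 0 as the model becomes more confident in the correct class.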

## 2b. Establishing Callbacks to stop training when optimization met.

I will train my model for at most 10 passes over the training set (*epochs=10*). While fitting the model using *model.fit()* in Part 3, Keras prints training and validation accuracy/loss for every epoch. I will pass 'EarlyStopping' and 'ModelCheckpoint' through *callbacks=* in *model.fit()* to accomplish the following:

- EarlyStopping: While monitoring validation loss (*val_loss*), I will stop training if it does not improve for 3 consecutive epochs. For instance, if epoch 2 has val_loss of 0.1 (hypothetically), and epochs 3, 4, 5 have val_loss greater than 0.1, training should stop after epoch 5, and *restore_best_weights=True* reloads the model weights from epoch 2.

- ModelCheckpoint: I save the model to file **best_model.keras** whenever validation loss improves, by setting *save_best_only=True*.
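
The patience rule can be checked with a tiny simulation. This is my own simplified sketch of the behavior described above (strict improvement, no minimum delta), not the Keras source; `simulate_early_stopping` is a hypothetical helper and the loss values are invented to match the example.

```python
def simulate_early_stopping(val_losses, patience=3):
    """Return (stopped_epoch, best_epoch), both 1-based.

    Training stops once `patience` epochs in a row fail to beat
    the best val_loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch  # ran out of epochs without triggering

# Epoch 2 is the best; epochs 3, 4, 5 are worse, so training stops after epoch 5
print(simulate_early_stopping([0.5, 0.1, 0.2, 0.3, 0.15, 0.05]))  # (5, 2)
```

The last value (0.05) is never reached, which shows the trade-off of a small patience: training can stop just before a later improvement.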

In [None]:
earlyStop = tf.keras.callbacks.EarlyStopping(monitor= 'val_loss',
                                             patience= 3, # stop if val_loss has not improved for 3 epochs in a row
                                             restore_best_weights= True,
                                             verbose= 1)

# save the model to disk each time val_loss reaches a new minimum
modelCheckpoint = tf.keras.callbacks.ModelCheckpoint('best_model.keras',
                                                     monitor= 'val_loss',
                                                     save_best_only= True,
                                                     verbose= 1)

callbacks = [earlyStop, modelCheckpoint]

# Part 3: Training and Selecting Optimal Model

## 3a. Training and saving my best performing model to 'SmartSSD/best_model.keras'.

To use the trained model for classification on new data, the *ModelCheckpoint* callback saves it to file **best_model.keras** in the working directory during training.

In [None]:
history = model.fit(train_gen,
                    epochs= epochs,
                    validation_data= val_gen,
                    callbacks= callbacks,
                    verbose= 1)
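
Once the model is trained, its softmax output for a new image still has to be turned back into a defect name. A minimal sketch of that last step is below; `decode_prediction` is a hypothetical helper, the folder names in `example_indices` are my guess at the class directory names, and the probability vector is invented. In the notebook the real mapping would come from `train_gen.class_indices`.

```python
def decode_prediction(probs, class_indices):
    """Map a softmax vector to (class_name, probability).

    `class_indices` is a {class_name: index} dict like the one
    exposed by `train_gen.class_indices`.
    """
    index_to_class = {index: name for name, index in class_indices.items()}
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return index_to_class[best_index], probs[best_index]

# Hypothetical folder names and a made-up softmax output
example_indices = {"crazing": 0, "inclusion": 1, "patches": 2,
                   "pitted_surface": 3, "rolled-in_scale": 4, "scratches": 5}
print(decode_prediction([0.02, 0.03, 0.05, 0.1, 0.1, 0.7], example_indices))
# ('scratches', 0.7)
```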

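Now I am changing our callbacks (raising *patience* from 3 to 5) to allow for more training before picking the optimal model weights. Calling *model.fit()* again continues from the model's current weights rather than starting over.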

In [None]:
earlyStop = tf.keras.callbacks.EarlyStopping(monitor= 'val_loss',
                                             patience= 5, # stop if val_loss has not improved for 5 epochs in a row
                                             restore_best_weights= True,
                                             verbose= 1)

# save the model to disk each time val_loss reaches a new minimum
modelCheckpoint = tf.keras.callbacks.ModelCheckpoint('best_model.keras',
                                                     monitor= 'val_loss',
                                                     save_best_only= True,
                                                     verbose= 1)

callbacks = [earlyStop, modelCheckpoint]

history = model.fit(train_gen,
                    epochs= epochs,
                    validation_data= val_gen,
                    callbacks= callbacks,
                    verbose= 1)
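
After training, the `history` object returned by *model.fit()* exposes a `history.history` dictionary with one list of values per metric. The sketch below shows one way to find the best epoch from it. The numbers are placeholders for illustration only, not results from this notebook, and `best_epoch` is a hypothetical helper.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch, value) where `metric` was lowest."""
    values = history_dict[metric]
    index = min(range(len(values)), key=lambda i: values[i])
    return index + 1, values[index]

# Placeholder numbers, not real training results
fake_history = {"loss": [1.6, 1.1, 0.8, 0.6],
                "val_loss": [1.4, 0.9, 1.0, 0.95]}
print(best_epoch(fake_history))  # (2, 0.9)
```

With the real run, `best_epoch(history.history)` would identify the epoch whose model was last written to **best_model.keras**.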

**End of Notebook**