**Importing all required libraries**

In [22]:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd

**Data Preprocessing**

#### Training image preprocessing

In [23]:
training_set = tf.keras.utils.image_dataset_from_directory(
    'train',
    # "inferred" takes the labels from the subdirectory names:
    # each subdirectory of 'train' is treated as one class.
    labels="inferred",
    label_mode="categorical",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(128, 128),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False,
    pad_to_aspect_ratio=False,
    verbose=True,
)

Found 70295 files belonging to 38 classes.


#### Validation image preprocessing


In [24]:
validation_set = tf.keras.utils.image_dataset_from_directory(
    'valid',
    labels="inferred",
    label_mode="categorical",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(128, 128),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False,
    pad_to_aspect_ratio=False,
    verbose=True,
)

Found 17572 files belonging to 38 classes.


In [25]:
training_set

<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 38), dtype=tf.float32, name=None))>
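The element spec above shows image batches of shape `(None, 128, 128, 3)` paired with one-hot labels of width 38. As a quick sanity check, the number of batches per epoch can be worked out from the file counts reported by the loader and `batch_size=32`. The sketch below assumes the final partial batch is kept rather than dropped; the helper name `batches_per_epoch` is made up for illustration.

```python
import math

BATCH_SIZE = 32
train_files = 70295   # from "Found 70295 files belonging to 38 classes."
valid_files = 17572   # from "Found 17572 files belonging to 38 classes."


def batches_per_epoch(num_files: int, batch_size: int) -> int:
    """Number of batches when the final partial batch is kept (assumption)."""
    return math.ceil(num_files / batch_size)


train_batches = batches_per_epoch(train_files, BATCH_SIZE)
valid_batches = batches_per_epoch(valid_files, BATCH_SIZE)
print(train_batches, valid_batches)
```

The training figure matches the `2197/2197` step counter in the training log further down.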

A Convolutional Neural Network (CNN) is a type of deep learning algorithm designed to automatically and adaptively learn patterns from image data, though it’s now also used for video, audio, and even text (in some NLP tasks).

Unlike fully connected networks (where every neuron is connected to every neuron in the next layer), CNNs use convolutional layers that apply filters (also called kernels) to the input data. This helps them detect spatial hierarchies: from edges to textures to object parts and whole objects.
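To make the idea of a filter concrete, here is a minimal NumPy sketch of a single filter sliding over a tiny grayscale image. It is an illustration only, not how Keras implements `Conv2D`: it uses no padding and stride 1, and the 3×3 vertical-edge filter is written by hand, whereas a CNN learns its filter values during training. The function name `conv2d_valid` and the toy image are assumptions made for this example.

```python
import numpy as np


def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map


# Toy 5x5 grayscale image: dark on the left, bright on the right.
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Hand-written vertical-edge filter.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The feature map is zero where the window covers only dark pixels and positive where it straddles the dark-to-bright edge.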

1. **Input Layer**
You feed the network with a 2D image (for grayscale) or a 3D image (for RGB: height × width × channels).

2. **Convolution Layer**
Applies a small filter (e.g., 3×3 or 5×5) that slides over the image.

Produces a feature map showing where certain patterns (edges, corners) occur.

Multiple filters → multiple feature maps.

Think of this like applying an edge detection or blur filter in Photoshop.

3. **ReLU Activation (Rectified Linear Unit)**
Adds non-linearity.

Converts negative values in feature maps to zero: f(x) = max(0, x).

4. **Pooling Layer** (e.g., Max Pooling)
Reduces the size (downsampling) while keeping important features.

E.g., from 28×28 to 14×14 by selecting the max value in a 2×2 window.

5. **Stacking Multiple Conv + Pool Layers**
Deeper layers learn more complex patterns (e.g., face, object parts).

6. **Flattening**
Converts the final pooled feature map into a 1D vector.

7. **Fully Connected Layer**
Like traditional neural networks. Learns final classification boundaries.

Ends with a Softmax layer for multi-class classification or a Sigmoid layer for binary classification.
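The later steps in the list (ReLU, max pooling, flattening, softmax) can be sketched in a few lines of NumPy. This is a toy walk-through on a made-up 4×4 feature map, not the Keras implementation. In a real network a fully connected layer sits between the flattened vector and the softmax; it is left out here to keep the example short.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)


def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2; assumes even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def softmax(z: np.ndarray) -> np.ndarray:
    """Turn a score vector into probabilities that sum to 1."""
    exp = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return exp / exp.sum()


# Made-up feature map with positive and negative responses.
feature_map = np.array([[ 1., -2.,  3.,  0.],
                        [-1.,  5., -4.,  2.],
                        [ 0., -3.,  2., -1.],
                        [ 4.,  1., -6.,  7.]])

activated = relu(feature_map)      # negatives become 0
pooled = max_pool_2x2(activated)   # 4x4 -> 2x2
flat = pooled.flatten()            # 2x2 -> vector of length 4
probs = softmax(flat)              # illustrative only, see note above

print(pooled)
print(flat)
print(probs.sum())
```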

**To Avoid Overshooting:**

1. Choose a smaller learning rate. The default is 0.001; we can use 0.0001.
2. There may be a chance of underfitting, so we can increase the number of neurons.
3. Add more convolutional layers to extract more features from the images. The model may be missing relevant features, or getting confused because it has too few of them, so feed it more.
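Overshooting is easiest to see on a toy problem. The sketch below runs plain gradient descent on f(x) = x², which is not the Adam optimizer used in this notebook, and the two learning rates are chosen only to make the effect obvious. In Keras, a smaller rate can be requested by passing `tf.keras.optimizers.Adam(learning_rate=0.0001)` to `compile` instead of the string `'adam'`.

```python
def gradient_descent(learning_rate: float, steps: int = 10, start: float = 1.0) -> float:
    """Minimise f(x) = x**2 with plain gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        grad = 2 * x                  # derivative of x**2
        x = x - learning_rate * grad  # one update step
    return x


small_step = gradient_descent(learning_rate=0.1)  # each step shrinks x towards 0
large_step = gradient_descent(learning_rate=1.1)  # each step jumps past 0 and lands further away

print(abs(small_step), abs(large_step))
```

With the small rate the distance to the minimum shrinks on every step; with the large rate it grows, which is the overshooting the list above is trying to avoid.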

**Building the Model**

In [26]:
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten
from tensorflow.keras.models import Sequential

In [27]:
model = Sequential()

In [28]:
# Building the convolutional layers

model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu', input_shape=(128, 128, 3)))
model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=2, strides=2))



In [29]:
# input_shape is only needed on the first layer of the model
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=2, strides=2))

In [30]:
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=2, strides=2))

In [31]:
model.add(Conv2D(filters=256, kernel_size=(3,3), padding='same', activation='relu'))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=2, strides=2))

Now we have completed the convolutional and max pooling layers.

Next we need to flatten the output and feed it to a fully connected layer of the neural network.

In [32]:
model.add(Flatten())
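It can help to trace what `Flatten` receives. The sketch below follows the feature-map shape through the four Conv, Conv, MaxPool blocks defined above, assuming `padding='same'` leaves height and width unchanged and each `MaxPool2D(pool_size=2, strides=2)` halves them. The helper `trace_shapes` is hypothetical and does not call Keras.

```python
def trace_shapes(height: int, width: int, filters_per_block: list) -> list:
    """Shape after each Conv(same) -> Conv(same) -> MaxPool(2, 2) block."""
    shapes = []
    for filters in filters_per_block:
        # 'same' padding keeps height and width; the convs only change the channel count.
        # The 2x2 max pool with stride 2 then halves height and width.
        height, width = height // 2, width // 2
        shapes.append((height, width, filters))
    return shapes


shapes = trace_shapes(128, 128, [32, 64, 128, 256])
final_h, final_w, final_c = shapes[-1]
flattened_size = final_h * final_w * final_c

print(shapes)
print(flattened_size)
```

Under these assumptions the flattened vector has 8 × 8 × 256 = 16384 values.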

In [33]:
model.add(Dense(units=1024, activation='relu'))

In [34]:
# Output Layer
model.add(Dense(units=38, activation='softmax'))
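With the output layer in place, the number of trainable parameters can be tallied by hand as a cross-check against `model.summary()`. The sketch below uses the standard formulas for a convolution with bias (kernel height × kernel width × input channels × filters + filters) and a dense layer with bias (inputs × units + units). It assumes the flattened size is 8 × 8 × 256, which follows from pooling a 128×128 input four times; the helper names are made up.

```python
def conv_params(kernel_h: int, kernel_w: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs: int, units: int) -> int:
    """Weights plus one bias per unit."""
    return inputs * units + units


conv_total = 0
in_channels = 3  # RGB input
for filters in [32, 64, 128, 256]:
    conv_total += conv_params(3, 3, in_channels, filters)  # first conv of the block
    conv_total += conv_params(3, 3, filters, filters)      # second conv of the block
    in_channels = filters

flattened = (128 // 2 ** 4) ** 2 * 256       # 8 * 8 * 256, assumed from four 2x2 pools
hidden_total = dense_params(flattened, 1024)
output_total = dense_params(1024, 38)
total = conv_total + hidden_total + output_total

print(conv_total, hidden_total, output_total, total)
```

Most of the parameters sit in the first dense layer, since it connects all 16384 flattened values to 1024 units.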

**Compiling the model**

In [35]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
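The `categorical_crossentropy` loss compares a one-hot label with the predicted probabilities; for a single example it is the negative log of the probability assigned to the true class. The sketch below is a plain-Python illustration, not Keras's implementation (which, among other things, averages over the batch). The class index 5 and the probability values are made up.

```python
import math


def categorical_crossentropy(one_hot: list, probs: list) -> float:
    """Loss for one example: -sum(y_i * log(p_i))."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


NUM_CLASSES = 38
TRUE_CLASS = 5  # arbitrary index, for illustration only

one_hot = [0.0] * NUM_CLASSES
one_hot[TRUE_CLASS] = 1.0

# A model that guesses uniformly over all 38 classes.
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
chance_loss = categorical_crossentropy(one_hot, uniform)

# A model that puts 90% of its probability on the correct class.
confident = [0.1 / (NUM_CLASSES - 1)] * NUM_CLASSES
confident[TRUE_CLASS] = 0.9
confident_loss = categorical_crossentropy(one_hot, confident)

print(round(chance_loss, 4), round(confident_loss, 4))
```

Uniform guessing over 38 classes gives a loss of ln 38 ≈ 3.64, which is a useful baseline when reading the loss values in the training log.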

**Model Training**

In [36]:
training_history = model.fit(x=training_set, validation_data=validation_set, epochs=10)

Epoch 1/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2836s 1s/step - accuracy: 0.4144 - loss: 2.2471 - val_accuracy: 0.8441 - val_loss: 0.4837
Epoch 2/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2439s 1s/step - accuracy: 0.8628 - loss: 0.4191 - val_accuracy: 0.8915 - val_loss: 0.3280
Epoch 3/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2435s 1s/step - accuracy: 0.9105 - loss: 0.2705 - val_accuracy: 0.9006 - val_loss: 0.3138
Epoch 4/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2435s 1s/step - accuracy: 0.9328 - loss: 0.2032 - val_accuracy: 0.9018 - val_loss: 0.3356
Epoch 5/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2430s 1s/step - accuracy: 0.9422 - loss: 0.1767 - val_accuracy: 0.9114 - val_loss: 0.3181
Epoch 6/10
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 2435s 1s/step - accuracy: 0.9488 - loss: 0.1600 - val_accuracy: 0.9219 - val_loss: 0.2866
Epoc