
1 CNN for image classification

1. Set up your environment: Ensure that you have installed all the required software packages.


In [1]:
!git clone https://github.com/lakshithagnk/CNN-image-classification.git

Cloning into 'CNN-image-classification'...
remote: Enumerating objects: 13843, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 13843 (delta 0), reused 0 (delta 0), pack-reused 13840 (from 1)
Receiving objects: 100% (13843/13843), 802.58 MiB | 15.81 MiB/s, done.
Resolving deltas: 100% (5/5), done.
Updating files: 100% (13870/13870), done.


In [2]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, datasets, models
from sklearn.metrics import confusion_matrix, precision_score, recall_score
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import numpy as np
import pathlib

2. Prepare your dataset: Choose a dataset from the UCI Machine Learning Repository that is appropriate for classification. Download the selected dataset.

3. Split the dataset into training, validation, and testing subsets using a ratio of 60% for training and 20% each for validation and testing sets.

In [None]:
# Main directory where the images are stored
# data_dir = pathlib.Path('/content/drive/MyDrive/RealWaste')
data_dir = pathlib.Path('CNN-image-classification/RealWaste')

# seed for reproducibility
seed = 42

# training dataset
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.4,  # Use 40% for validation + testing
    subset="training",
    seed=seed,
    image_size=(128, 128),
    batch_size=64
)

# validation dataset
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.4,
    image_size=(128, 128),
    subset="validation",
    seed=seed,
    batch_size=64
)

# Further split validation into validation and test (20% each)
val_batches = tf.data.experimental.cardinality(val_ds)
test_ds = val_ds.take(val_batches // 2)
val_ds = val_ds.skip(val_batches // 2)

In [4]:
test_ds

<_TakeDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>

In [29]:
print(f"Training set size: {tf.data.experimental.cardinality(train_ds).numpy()}")
print(f"Validation set size: {tf.data.experimental.cardinality(val_ds).numpy()}")
print(f"Test set size: {tf.data.experimental.cardinality(test_ds).numpy()}")

Training set size: 23
Validation set size: 8
Test set size: 7
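
The counts printed above are numbers of batches, not images, because `cardinality` is applied to batched datasets. They follow from the `take`/`skip` arithmetic: the 40% split produced 7 + 8 = 15 batches, and integer division sends 7 of them to the test set and the remaining 8 to validation. A minimal sketch with a plain Python list standing in for the batches (the list and the helper name `split_in_half` are illustrative assumptions, not part of the notebook):

```python
def split_in_half(batches):
    """Mimic ds.take(n // 2) and ds.skip(n // 2) on a list of batches."""
    half = len(batches) // 2
    taken = batches[:half]      # plays the role of val_ds.take(half)
    skipped = batches[half:]    # plays the role of val_ds.skip(half)
    return taken, skipped

# 15 stand-in batches, matching the 7 test + 8 validation batches reported above
held_out = list(range(15))
test_batches, val_batches = split_in_half(held_out)
print(len(test_batches), len(val_batches))
```

Because the split is made on whole batches and the final batch may be smaller than the others, the two halves are only approximately 20% each.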


In [30]:
# Normalize the input data
def normalize(image, label):
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

train_ds = train_ds.map(normalize)
val_ds = val_ds.map(normalize)
test_ds = test_ds.map(normalize)
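
As a quick illustration of what `normalize` does to pixel values, the following NumPy sketch applies the same cast-and-divide step to a tiny hypothetical array instead of a TensorFlow tensor (the pixel values are made up for illustration):

```python
import numpy as np

# Hypothetical 1x2 RGB image with 8-bit pixel values
pixels = np.array([[[0, 128, 255], [64, 32, 16]]], dtype=np.uint8)

# Same operation as normalize(): cast to float32, then scale to [0, 1]
scaled = pixels.astype(np.float32) / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```

The datasets above already yield `float32` images (see the `element_spec` printed for `test_ds`), so in the notebook the cast changes nothing and only the division by 255 has an effect.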

4. Build the CNN model: A common CNN design consists of interleaving convolutional and max-pooling layers, ending with a linear classification layer.

In [31]:
model = models.Sequential([
    # Convolutional layer
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Flatten
    layers.Flatten(),

    # Fully connected layer
    layers.Dense(64, activation='relu'),
    layers.Dense(162, activation='relu'),
    layers.Dropout(0.5),

    # Output layer (9 classes)
    layers.Dense(9, activation='softmax')
])

model.summary()
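
To make the layer sizes reported by `model.summary()` easier to check by hand, the sketch below recomputes the feature-map sizes and parameter counts in plain Python. It assumes the Keras defaults used in the cell above: `'valid'` padding and stride 1 for `Conv2D`, and a pooling stride equal to the pool size for `MaxPooling2D`. The helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

size, channels = 128, 3
params = {}

# Two Conv2D(32, 3x3) + MaxPooling2D(2x2) stages
for name, filters in [("conv1", 32), ("conv2", 32)]:
    params[name] = (3 * 3 * channels + 1) * filters  # weights + one bias per filter
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels  # length of the vector produced by Flatten

# Dense layers: (inputs + 1 bias) * units; Dropout adds no parameters
units_in = flat
for name, units in [("dense64", 64), ("dense162", 162), ("dense9", 9)]:
    params[name] = (units_in + 1) * units
    units_in = units

print(size, flat, params, sum(params.values()))
```

Under these assumptions the spatial size shrinks from 128 to 30, `Flatten` produces 28,800 values, and the first dense layer accounts for 1,843,264 of the 1,865,405 parameters.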

7. Train the model: Train the model using the training data for 20 epochs and plot
the training and validation loss with respect to epoch. For the optimizer you may
use Adam, with sparse categorical cross-entropy as the loss function. Set
a suitable learning rate.

In [32]:
learning_rate = 0.00005

optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
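
Sparse categorical cross-entropy takes integer class labels and averages the negative log of the probability the model assigns to the true class. A minimal NumPy sketch (not the Keras implementation; the helper name and the example labels are assumptions) shows the value this loss takes for a model that spreads its probability evenly over the 9 classes:

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(probability assigned to the true class)."""
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=np.float64)
    picked = probs[np.arange(len(labels)), labels]  # probability of each true class
    return float(-np.log(picked).mean())

# A classifier that assigns equal probability to each of the 9 classes
uniform = np.full((4, 9), 1.0 / 9.0)
labels = [0, 3, 5, 8]  # hypothetical integer labels
chance_loss = sparse_categorical_crossentropy(labels, uniform)
print(round(chance_loss, 4))  # ln(9), about 2.1972
```

A training loss near ln 9 ≈ 2.197 therefore indicates predictions that are close to uniform. The first epochs in the training log that follows report losses of about 2.19, close to this reference value.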

In [None]:
history = model.fit(train_ds, epochs=20, validation_data=val_ds, verbose=1)

Epoch 1/20
23/23 ━━━━━━━━━━━━━━━━━━━━ 78s 3s/step - accuracy: 0.1479 - loss: 2.1968 - val_accuracy: 0.1803 - val_loss: 2.2196
Epoch 2/20
23/23 ━━━━━━━━━━━━━━━━━━━━ 87s 4s/step - accuracy: 0.1759 - loss: 2.1937 - val_accuracy: 0.1663 - val_loss: 3.2541
Epoch 3/20
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 3s/step - accuracy: 0.1620 - loss: 2.1860

In [None]:
# Plot training and validation loss VS epochs
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Over Epochs')
plt.legend()
plt.show()


10. Evaluate the Model: After training, evaluate the model’s performance on the testing
dataset. Record the train/test accuracy, confusion matrix, precision and recall.

In [None]:
# Evaluate on the test dataset
test_loss, test_accuracy = model.evaluate(test_ds)
print(f"Test Accuracy: {test_accuracy:.4f}")
print(f"Test Loss: {test_loss:.4f}")

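# Predict class labels on the test dataset.
# Labels and predictions are collected in the same pass so that they stay
# aligned in case the dataset is reshuffled between iterations.
y_true, y_pred = [], []
for images, labels in test_ds:
    y_pred.append(np.argmax(model.predict(images, verbose=0), axis=1))
    y_true.append(labels.numpy())
y_pred = np.concatenate(y_pred, axis=0)
y_true = np.concatenate(y_true, axis=0)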

# Confusion matrix
conf_matrix = confusion_matrix(y_true, y_pred)
print("Confusion Matrix:\n", conf_matrix)

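# Calculate support-weighted precision and recall.
# zero_division=0 scores classes that are never predicted as 0 without a warning.
precision = precision_score(y_true, y_pred, average='weighted', zero_division=0)
recall = recall_score(y_true, y_pred, average='weighted', zero_division=0)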
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
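
To show what the `average='weighted'` scores measure, the sketch below rebuilds a confusion matrix and the support-weighted precision and recall with NumPy only. It is an illustration on hypothetical 3-class labels, not the scikit-learn implementation, and the helper names are assumptions.

```python
import numpy as np

def confusion(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def weighted_precision_recall(cm):
    """Per-class precision/recall averaged with class support as weights."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    support = cm.sum(axis=1)     # row sums: true examples per class
    # Classes that are never predicted get precision 0 instead of a 0/0 error
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    weights = support / support.sum()
    return float(precision @ weights), float(recall @ weights)

# Hypothetical labels for a 3-class problem
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 0, 0]
cm_demo = confusion(y_true_demo, y_pred_demo, 3)
precision_demo, recall_demo = weighted_precision_recall(cm_demo)
print(cm_demo)
print(precision_demo, recall_demo)
```

Support-weighted recall always equals overall accuracy, since each class's recall is weighted by its share of the examples, so the recall printed by the evaluation cell should match the test accuracy up to rounding.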


11. Plot the training and validation loss with respect to epoch for different learning rates
such as 0.0001, 0.001, 0.01, and 0.1. Comment on your results and select a learning
rate with a justification.
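
The notebook does not include code for this step. As a rough intuition for what such a comparison tends to show, the sketch below runs plain gradient descent on a one-dimensional quadratic loss for the four learning rates listed. This is a toy stand-in, not the CNN and not the Adam optimizer; the curvature value and the function name `descend` are assumptions chosen for illustration.

```python
def descend(learning_rate, steps=20, curvature=25.0, w0=1.0):
    """Gradient descent on loss(w) = 0.5 * curvature * w**2; returns the loss per step."""
    w = w0
    losses = []
    for _ in range(steps):
        losses.append(0.5 * curvature * w * w)
        grad = curvature * w          # derivative of the loss with respect to w
        w -= learning_rate * grad     # one gradient-descent update
    losses.append(0.5 * curvature * w * w)
    return losses

curves = {lr: descend(lr) for lr in (0.0001, 0.001, 0.01, 0.1)}
for lr, losses in curves.items():
    print(f"lr={lr}: start={losses[0]:.3f}, end={losses[-1]:.3g}")
```

In this toy setting the two smallest rates barely reduce the loss in 20 steps, 0.01 converges quickly, and 0.1 overshoots and diverges. For the actual experiment, the model would need to be rebuilt and recompiled for each rate so that every run starts from fresh weights, with `history.history['loss']` and `history.history['val_loss']` plotted for each run as in the plotting cell above.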