# Practice Exercise on Convolutional Neural Networks (CNN)

Welcome to the Practice Exercise on Convolutional Neural Networks (CNN). In this exercise, we will focus on an image classification task where the goal is to predict whether an image contains a cat or a dog. We will work with a dataset of labeled images and build, train, and evaluate a CNN model. This practice will allow you to apply your understanding of CNNs to achieve high accuracy in image classification.

---

## Dataset Overview

### **Dataset Name:** Cats and Dogs Image Dataset

### **Description:**  
The dataset contains images of cats and dogs labeled for classification purposes. Each image belongs to one of the two classes: 'Cat' or 'Dog'. The goal is to classify the images correctly based on the content (i.e., whether the image is of a cat or a dog). The dataset is often used to test image classification models.

### **Features:**
There are two main folders which are:
- `Cat`: Images labeled as containing a cat.
- `Dog`: Images labeled as containing a dog.

### **Target Variable:**
- The goal is to predict whether an image contains a cat or a dog.


Dataset: https://www.kaggle.com/datasets/samuelcortinhas/cats-and-dogs-image-classification/data

## Data Loading and Preprocessing

In [92]:
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, Dropout,Rescaling
import tensorflow as tf

from tensorflow.keras.utils import image_dataset_from_directory
from tensorflow.keras.preprocessing.image import ImageDataGenerator


import numpy as np

In [93]:
train = image_dataset_from_directory(
    'Datasets/archive/train',
    validation_split=0.15,
    # shuffle=True,
    subset="training",
    seed=123,
    image_size=(64, 64)
)

val = image_dataset_from_directory(
    'Datasets/archive/train',
    validation_split=0.15,
    # shuffle=True,
    subset="validation",
    seed=123,
    image_size=(64, 64)
)

test = image_dataset_from_directory(
    'Datasets/archive/test',
    seed=123,
    image_size=(64, 64)
)

Found 557 files belonging to 2 classes.
Using 474 files for training.
Found 557 files belonging to 2 classes.
Using 83 files for validation.
Found 140 files belonging to 2 classes.


In [94]:
# train_datagen = ImageDataGenerator(
#     rescale=1./255,   
#     validation_split=0.15, 
#     )

# test_datagen = ImageDataGenerator(
#     rescale=1./255,
#     )

# train = train_datagen.flow_from_directory(
#     'Datasets/archive/train',
#     target_size=(64, 64),
#     class_mode='binary',
#     subset='training',
#     seed=123
# )

# val = train_datagen.flow_from_directory(
#     'Datasets/archive/train',
#     target_size=(64, 64),
#     class_mode='binary',
#     subset='validation',
#     seed=123
# )

# test = test_datagen.flow_from_directory(
#     'Datasets/archive/test',
#     target_size=(64, 64),
#     class_mode='binary',
#     subset='training',
#     seed=123
# )


We will start by loading the dataset and preprocessing the images. This includes:
- Resizing images .
- Normalizing pixel values.

Add more if needed!


## Data Splitting
In this section, we will split our dataset into three parts:

* Training set (70%): This portion of the dataset is used to train the CNN model.
* Validation set (15%): This portion is used to validate the model during training, helping us tune hyperparameters and avoid overfitting.
* Test set (15%): This portion is used to evaluate the model after training, to check its generalization to unseen data.

## Building the CNN Model


Now, we will define our CNN architecture using `tensorflow.keras`. The architecture will consist of:
- Convolutional layers followed by max-pooling layers
- Flatten layer
- Dense layers
- Output layer


In [95]:
model = Sequential()
model.add(Rescaling(1./255, input_shape=(64, 64, 3)))

# # Hidding layer
model.add(Conv2D(12,(3,3), activation='relu'))
model.add(MaxPooling2D())
# Dropout(0.25)

model.add(Conv2D(20,(3,3), activation='relu'))
model.add(MaxPooling2D())

# Dropout(0.25)


model.add(Flatten())
model.add(Dense(units=6, activation='relu'))
# Dropout(0.25)

model.add(Dense(units=12, activation='relu'))
# Dropout(0.25)


# Output layer

# class_labels_numbers <= 2 -> dense layer units = 1    ===== activation='sigmoid'
# class_labels_numbers > 2 -> dense layer units = class_labels_numbers ====== activation='softmax'
model.add(Dense(units=1, activation='sigmoid'))

  super().__init__(**kwargs)


In [96]:
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

## Training the Model


Train the CNN model using the `fit` function. We will use the training and validation we created earlier.

Fill in the code to train the model for a specified number of epochs.


In [97]:
# from keras.callbacks import EarlyStopping

# early_stopping = EarlyStopping(monitor='val_loss',patience=3)

history = model.fit(train,epochs=10,validation_data=val)

Epoch 1/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 65ms/step - accuracy: 0.4978 - loss: 0.6976 - val_accuracy: 0.5060 - val_loss: 0.6948
Epoch 2/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 52ms/step - accuracy: 0.4951 - loss: 0.6954 - val_accuracy: 0.4819 - val_loss: 0.6937
Epoch 3/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 75ms/step - accuracy: 0.5263 - loss: 0.6922 - val_accuracy: 0.4819 - val_loss: 0.6939
Epoch 4/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 53ms/step - accuracy: 0.4927 - loss: 0.6933 - val_accuracy: 0.4819 - val_loss: 0.6937
Epoch 5/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 45ms/step - accuracy: 0.5330 - loss: 0.6907 - val_accuracy: 0.5181 - val_loss: 0.6940
Epoch 6/10
[1m15/15[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 44ms/step - accuracy: 0.5405 - loss: 0.6893 - val_accuracy: 0.5060 - val_loss: 0.6934
Epoch 7/10
[1m15/15[0m [32m━━━━

## Evaluating the Model


After training, evaluate the model on the validation data to check its performance.


In [98]:
model.evaluate(val)

[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 27ms/step - accuracy: 0.4416 - loss: 0.6913


[0.6871176958084106, 0.4457831382751465]

## Testing with New Images

Finally, let's test the model with some new images. Preprocess the images and use the trained model to predict whether the image is of a cat or a dog.


In [99]:
from sklearn.metrics import accuracy_score

prediction = model.predict(test) > 0.5

print(accuracy_score(np.concatenate([y for x, y in test], axis=0),prediction))

[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
0.4642857142857143


2024-08-17 22:32:27.005378: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
