# CNN Example 1
In this example, we have images of cars and flowers that have been divided into training and test sets, and we build a CNN that classifies each image as either a car or a flower.

### Step 1: Import the numpy library and the necessary Keras libraries and classes

In [57]:
# Import the Libraries
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPool2D
from keras.layers import Flatten
from keras.layers import Dense
import numpy as np
from tensorflow import random

### Step 2: Now, set a seed and initialize the model with the `Sequential` class

In [58]:
#set a seed
seed = 1
np.random.seed(seed)
random.set_seed(seed)

# Initialising the CNN
classifier = Sequential()

### Step 3: Add the first layer of the CNN, set the input shape to (64, 64, 3), the dimensions of each image, and use ReLU as the activation function:

In [59]:
input_shape = (64, 64, 3) # height, width, channels; 3 channels, e.g. RGB
# 32 filters, each with a 3x3 kernel; strides and padding keep their defaults
classifier.add(Conv2D(filters = 32, kernel_size = (3,3), input_shape= input_shape, activation = 'relu'))
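
To make the effect of this layer concrete, the arithmetic below works out its output shape and its number of trainable weights. It is a minimal sketch that assumes the `Conv2D` defaults the call above does not override (`strides=(1, 1)`, `padding='valid'`, `use_bias=True`); the helper name `conv2d_output` is hypothetical and not part of Keras.

```python
def conv2d_output(input_shape, filters, kernel_size, stride=1):
    """Output shape and parameter count of a 'valid'-padded Conv2D layer."""
    height, width, channels = input_shape
    k_h, k_w = kernel_size
    # With no padding, the kernel fits (size - kernel) // stride + 1 times per axis
    out_h = (height - k_h) // stride + 1
    out_w = (width - k_w) // stride + 1
    # Each filter has k_h * k_w * channels weights plus one bias
    params = (k_h * k_w * channels + 1) * filters
    return (out_h, out_w, filters), params

conv_shape, conv_params = conv2d_output((64, 64, 3), filters=32, kernel_size=(3, 3))
print(conv_shape, conv_params)  # (62, 62, 32) 896
```

Under these assumptions, each 64x64 RGB image leaves the layer as 32 feature maps of size 62x62.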



### Step 4: Now, add the pooling layer with a pool size of 2x2

In [60]:
classifier.add(MaxPool2D(pool_size = (2, 2)))
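
As an illustration of what 2x2 max pooling does, the sketch below applies it to a small made-up 4x4 feature map using plain Python lists. It assumes non-overlapping windows, which is what `MaxPool2D` does when `strides` is left at its default (equal to `pool_size`); the function name `max_pool_2x2` and the toy values are hypothetical.

```python
def max_pool_2x2(feature_map):
    """Apply non-overlapping 2x2 max pooling to a 2D list with even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            # Collect the four values in the current 2x2 window
            window = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled

toy_map = [[1, 3, 2, 0],
           [4, 2, 1, 5],
           [0, 1, 7, 2],
           [3, 6, 2, 8]]
pooled_map = max_pool_2x2(toy_map)
print(pooled_map)  # [[4, 5], [6, 8]]
```

Each spatial dimension is halved, so in this model the 62x62 feature maps coming out of the convolution would become 31x31.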

### Step 5: Flatten the output of the pooling layer by adding a flattening layer to the CNN model:

In [61]:
classifier.add(Flatten())

### Step 6: Add the first Dense layer of the MLP. 
Here, 128 is the number of nodes (output units) in the layer, and the activation is ReLU. As a good practice, choose a power of two for the number of nodes; 128 is a reasonable starting point.

In [62]:
classifier.add(Dense(128, activation = 'relu'))

### Step 7: Add the output layer of the MLP.
This is a binary classification problem, so the size is 1 and the activation is `sigmoid`:

In [63]:
classifier.add(Dense(1, activation = 'sigmoid'))
classifier.summary()
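
As a cross-check on the summary, the layer shapes and parameter counts can be worked out by hand. This is a minimal sketch that assumes the default settings of each layer (no padding, stride 1 for the convolution, bias terms enabled); apart from the leading batch dimension, the numbers should line up with what `classifier.summary()` reports.

```python
# (layer name, output shape without the batch dimension, trainable parameters)
layers = [
    ("Conv2D", (62, 62, 32), (3 * 3 * 3 + 1) * 32),   # 3x3x3 weights + 1 bias per filter
    ("MaxPool2D", (31, 31, 32), 0),                    # pooling has no weights
    ("Flatten", (31 * 31 * 32,), 0),                   # reshaping has no weights
    ("Dense(128)", (128,), (31 * 31 * 32 + 1) * 128),  # one weight per input + 1 bias per node
    ("Dense(1)", (1,), (128 + 1) * 1),
]

total_params = sum(params for _, _, params in layers)
for name, shape, params in layers:
    print(f"{name:<12}{str(shape):<16}{params:>10,}")
print("Total:", total_params)  # Total: 3937409
```

Almost all of the weights sit in the first Dense layer, because the flattened feature vector has 30,752 entries.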

### Step 8: Compile the network
Use the Adam optimizer with binary cross-entropy as the loss, and track accuracy during training.

In [64]:
classifier.compile(optimizer= 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
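
To show what the loss and metric measure, here is a small pure-Python sketch of binary cross-entropy and accuracy. The labels and probabilities are made up for illustration, the function names are hypothetical, and the `0.5` decision threshold is an assumption about how the sigmoid output is turned into a class; Keras computes these quantities internally on batches of tensors.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == bool(y))
    return hits / len(y_true)

labels = [1, 0, 1, 0]          # made-up ground truth
probs = [0.9, 0.2, 0.4, 0.6]   # made-up sigmoid outputs

loss_value = binary_crossentropy(labels, probs)
acc_value = accuracy(labels, probs)
print(round(loss_value, 4), acc_value)  # 0.5403 0.5
```

The two confident, correct predictions contribute little to the loss, while the two predictions on the wrong side of `0.5` dominate it.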

### Step 9: Create training and test data generators. 
- Rescale the training and test images by `1/255` so that all pixel values are between `0` and `1`.
- Set these augmentation parameters for the training data generator only:
  - `shear_range=0.2`, `zoom_range=0.2`, and `horizontal_flip=True`
- Reference: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html


In [65]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

# For test data generator, only use rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
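
The sketch below illustrates two of these transformations on a tiny made-up grayscale image stored as nested lists: rescaling by `1/255` and a horizontal flip. It is a deterministic toy version only; `ImageDataGenerator` applies the flip, shear, and zoom randomly to each image as batches are drawn. The function names and pixel values are hypothetical.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 into 0..1."""
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]

toy_image = [[0, 128, 255],
             [64, 32, 16]]  # made-up 2x3 image with 8-bit pixel values

scaled = rescale(toy_image)
flipped = horizontal_flip(toy_image)
print(flipped)  # [[255, 128, 0], [16, 32, 64]]
```

Only the training generator augments its images; the test generator just rescales, so evaluation sees unmodified images.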


### Step 10: Create a training set from the training set folder.
`car_flower_train` is the folder that holds the training data; the first cell below creates it, together with `car_flower_test`, by splitting the images in `car_flower_small` 80/20. Our CNN model expects `64x64` images, so the same size should be passed as `target_size`. `batch_size` is the number of images in a single batch, which is `32`. `class_mode` is set to `binary` since we are training a binary classifier.

In [66]:
import os
import shutil
import random  # standard-library random; this rebinds the name imported from tensorflow in Step 1

# Define paths
base_dir = "car_flower_small"
train_dir = "car_flower_train"
test_dir = "car_flower_test"
split_ratio = 0.8  # manual split: 80% of each class for training, 20% for testing

# Create directories for train and test sets
for category in ['car', 'flower']:
    os.makedirs(os.path.join(train_dir, category), exist_ok=True)
    os.makedirs(os.path.join(test_dir, category), exist_ok=True)

# group files by label class
all_files = os.listdir(base_dir)
car_files = [f for f in all_files if f.startswith('car')]
flower_files = [f for f in all_files if f.startswith('flower')]

# Function to split and copy files
def split_and_copy(files, category):
    random.shuffle(files)
    split_point = int(len(files) * split_ratio)
    train_files = files[:split_point]
    test_files = files[split_point:]

    for file in train_files:
        shutil.copy(os.path.join(base_dir, file), os.path.join(train_dir, category, file))

    for file in test_files:
        shutil.copy(os.path.join(base_dir, file), os.path.join(test_dir, category, file))

# Split and copy the data. Run this only once: rerunning reshuffles the files and
# copies them again, so the same image can end up in both the train and test folders.

split_and_copy(car_files, 'car')
split_and_copy(flower_files, 'flower')
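
The splitting logic can be checked in isolation, without touching the file system. The sketch below is a hypothetical helper, `split_files`, that mirrors the shuffle-and-slice step of `split_and_copy` on a list of made-up file names. Unlike the cell above, it takes an explicit `seed` (an addition for reproducibility, not something the original code does) and shuffles a copy rather than the caller's list.

```python
import random

def split_files(files, split_ratio=0.8, seed=None):
    """Return (train, test) lists after a seeded shuffle; nothing is copied on disk."""
    shuffled = list(files)                 # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)  # local generator, independent of global state
    split_point = int(len(shuffled) * split_ratio)
    return shuffled[:split_point], shuffled[split_point:]

names = [f"car_{i}.jpg" for i in range(10)]  # made-up file names
train_files, test_files = split_files(names, seed=1)
print(len(train_files), len(test_files))  # 8 2
```

Because every file lands in exactly one of the two lists, a single call never leaks a test image into the training set.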

In [67]:
batch_size = 32

# paths to the train and test directories
train_data_dir = "car_flower_train"
test_data_dir = "car_flower_test"

train_generator = train_datagen.flow_from_directory(
        train_data_dir,  # this is the target directory
        target_size=(64, 64),  # see the instructions above
        batch_size=batch_size,
        class_mode='binary')

Found 1597 images belonging to 2 classes.


### Step 11: Repeat step 10 for the test set 
while setting the folder to the location of the test images, that is, `car_flower_test` (stored in `test_data_dir`)

In [68]:
test_generator = test_datagen.flow_from_directory( # test_datagen only does the rescaling
        test_data_dir,  # this is the target directory
        target_size=(64, 64),  # see the instructions above
        batch_size=batch_size,
        class_mode='binary')

Found 400 images belonging to 2 classes.


The split between training and test data looks good: 1597 training images and 400 test images, which is close to the intended 80/20 ratio.

### Step 12: Finally, fit the data. 
Set `steps_per_epoch` to `STEP_SIZE_TRAIN` and `validation_steps` to `STEP_SIZE_TEST`. 

Why do we need `steps_per_epoch`?

Keep in mind that a Keras data generator is meant to loop indefinitely: it keeps yielding batches and never returns or exits.

Because the generator never signals that it has run out of data, Keras cannot tell from the generator alone where one epoch ends and the next one begins.

Therefore, we compute the `steps_per_epoch` value as the total number of training images divided by the batch size, using integer division. Once Keras has drawn that many batches, it treats the epoch as complete and starts a new one.
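
The arithmetic for this dataset can be worked through directly. The sketch below uses the 1597 training images reported by `flow_from_directory` above and the batch size of 32; the variable names are chosen for illustration.

```python
import math

num_train_images = 1597  # count reported by flow_from_directory above
batch_size = 32

full_batches = num_train_images // batch_size            # floor division: complete batches only
leftover_images = num_train_images % batch_size          # images that do not fill a full batch
all_batches = math.ceil(num_train_images / batch_size)   # batches needed to see every image

print(full_batches, leftover_images, all_batches)  # 49 29 50
```

With floor division, each epoch runs 49 steps, so the final partial batch of 29 images is not counted in the step total; rounding up instead would give 50 steps.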

In [70]:
STEP_SIZE_TRAIN = 1597 // batch_size  # 1597 training images -> 49 full batches
STEP_SIZE_TEST = 400 // batch_size    # 400 test images -> 12 full batches

classifier.fit(
    train_generator,
    steps_per_epoch = STEP_SIZE_TRAIN,
    epochs = 50,
    validation_data = test_generator,
    validation_steps = STEP_SIZE_TEST
)

Epoch 1/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 4s 74ms/step - accuracy: 0.7773 - loss: 0.4623 - val_accuracy: 0.7656 - val_loss: 0.5422
Epoch 2/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8750 - loss: 0.4156 - val_accuracy: 0.7630 - val_loss: 0.5267
Epoch 3/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 4s 79ms/step - accuracy: 0.8135 - loss: 0.4173 - val_accuracy: 0.7630 - val_loss: 0.5318
Epoch 4/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - accuracy: 0.7812 - loss: 0.4711 - val_accuracy: 0.7578 - val_loss: 0.5159
Epoch 5/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 4s 89ms/step - accuracy: 0.8090 - loss: 0.4186 - val_accuracy: 0.7656 - val_loss: 0.4956
Epoch 6/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.8750 - loss: 0.3188 - val_accuracy: 0.7839 - val_loss: 0.4754
Epoch 7/50
49/49 ━━━━━━━━━━━━━━━━━━━━ 5s 107ms/step - accuracy: 0.8322 - loss: 0.3795 - val_accuracy: 0.7708 - val_loss: 0.4934
Epoch 8/50
49/49 ━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x2c8fd1e1730>
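
The `History` object returned by `fit` stores the per-epoch metrics in its `history` attribute, a dictionary of lists keyed by metric name. The sketch below shows one way to pick the epoch with the best validation accuracy. It assumes the return value had been stored, for example as `history = classifier.fit(...)` (the cell above does not do this), and it uses a plain list with the seven validation accuracies visible in the log above in place of `history.history["val_accuracy"]`.

```python
# Validation accuracy for epochs 1-7, copied from the training log above.
# In a real run this list would come from history.history["val_accuracy"],
# assuming the return value of fit() was stored as `history`.
val_accuracy = [0.7656, 0.7630, 0.7630, 0.7578, 0.7656, 0.7839, 0.7708]

# Index of the highest value; epochs are numbered from 1 in the log
best_index = max(range(len(val_accuracy)), key=val_accuracy.__getitem__)
best_epoch = best_index + 1

print(best_epoch, val_accuracy[best_index])  # 6 0.7839
```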