# CNN Example 1
In this example, we have images of cars and flowers, which are split into training and test sets. The goal is to build a CNN that classifies each image as a car or a flower.

### Step 1: Import the numpy library and the necessary Keras libraries and classes

In [1]:
# Import the Libraries
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPool2D
from keras.layers import Flatten
from keras.layers import Dense
import numpy as np
import keras
from tensorflow import random

### Step 2: Set a seed so that results are reproducible

In [2]:
#set a seed
seed = 1
np.random.seed(seed)
random.set_seed(seed)
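
To see why seeding matters, here is a minimal sketch that uses the standard-library `random` module as a stand-in for the NumPy and TensorFlow generators seeded above. The helper `draw` and the alias `py_random` are illustrative names, not part of the notebook.

```python
import random as py_random  # standard-library RNG, used here only as a stand-in


def draw(seed, n=3):
    """Return n pseudo-random floats generated after seeding with `seed`."""
    py_random.seed(seed)
    return [py_random.random() for _ in range(n)]


first = draw(1)
second = draw(1)
other = draw(2)

print(first == second)  # same seed, same sequence
print(first == other)   # different seed, different sequence
```

The same principle applies to weight initialisation: with fixed seeds, re-running the notebook should start from the same initial weights.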

### Step 3: Initialise the model with the `Sequential` class and add the first convolutional layer, with the input shape set to `(64, 64, 3)` (the dimensions of each image) and ReLU as the activation function:

In [8]:
# Initialising the CNN
classifier = Sequential(name='CNN1')

classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu', name='Conv2D_1'))

classifier.summary()

Model: "CNN1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Conv2D_1 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________
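
The output shape and parameter count in the summary above can be checked by hand. This is a small sketch assuming the Keras defaults for `Conv2D` (stride 1, `'valid'` padding); the helper names are illustrative.

```python
def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


out_side = conv2d_output_size(64, 3)          # 64 - 3 + 1 = 62
conv_params = conv2d_param_count(3, 3, 3, 32)  # (3*3*3 + 1) * 32 = 896

print((out_side, out_side, 32), conv_params)
```

Both numbers match the `(None, 62, 62, 32)` shape and the 896 parameters reported for `Conv2D_1`.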


### Step 4: Add a max-pooling layer with a pool size of 2x2

In [9]:
classifier.add(MaxPool2D(pool_size=(2, 2), name='MaxPool2D_1'))
classifier.summary()

Model: "CNN1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Conv2D_1 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 MaxPool2D_1 (MaxPooling2D)  (None, 31, 31, 32)        0         
                                                                 
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________
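
To make the pooling step concrete, here is a pure-Python sketch of 2x2 max pooling with stride 2 on a toy 4x4 feature map. The function `max_pool_2x2` and the toy values are made up for illustration; Keras does the same thing per channel on tensors.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2-D list of numbers.

    Trailing rows or columns that do not fill a full window are dropped.
    """
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


toy_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [7, 1, 3, 8],
]
pooled = max_pool_2x2(toy_map)
print(pooled)  # [[6, 2], [7, 9]]

pooled_side = 62 // 2  # each 62x62 feature map becomes 31x31
print(pooled_side)
```

Pooling has no trainable weights, which is why the summary shows 0 parameters for `MaxPool2D_1`.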


### Step 5: Flatten the output of the pooling layer by adding a flattening layer to the CNN model:

In [10]:
classifier.add(Flatten(name='Flatten_1'))
classifier.summary()

Model: "CNN1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Conv2D_1 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 MaxPool2D_1 (MaxPooling2D)  (None, 31, 31, 32)        0         
                                                                 
 Flatten_1 (Flatten)         (None, 30752)             0         
                                                                 
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________


### Step 6: Add the first Dense layer of the MLP.
Here, `units=128` sets the number of nodes in the layer, and the activation is `relu`. 128 is a reasonable starting point; as a rule of thumb, layer sizes are usually chosen as powers of two.

In [11]:
classifier.add(Dense(units=128, activation='relu', name='Dense_1'))
classifier.summary()

Model: "CNN1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Conv2D_1 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 MaxPool2D_1 (MaxPooling2D)  (None, 31, 31, 32)        0         
                                                                 
 Flatten_1 (Flatten)         (None, 30752)             0         
                                                                 
 Dense_1 (Dense)             (None, 128)               3936384   
                                                                 
Total params: 3,937,280
Trainable params: 3,937,280
Non-trainable params: 0
_________________________________________________________________
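
The large parameter count of `Dense_1` follows directly from the flattened length. A short arithmetic check, using only the shapes shown in the summaries above:

```python
flatten_length = 31 * 31 * 32  # pooled height * width * channels
units = 128

# One weight per (input, unit) pair, plus one bias per unit.
dense_params = flatten_length * units + units
total_params = 896 + dense_params  # Conv2D_1 + Dense_1

print(flatten_length, dense_params, total_params)
```

This reproduces the 30752 inputs, the 3,936,384 parameters of `Dense_1`, and the total of 3,937,280.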


### Step 7: Add the output layer of the MLP.
This is a binary classification problem, so the output layer has a single unit with a `sigmoid` activation:

In [12]:
classifier.add(Dense(units=1, activation='sigmoid', name='Output'))
classifier.summary()

Model: "CNN1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Conv2D_1 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 MaxPool2D_1 (MaxPooling2D)  (None, 31, 31, 32)        0         
                                                                 
 Flatten_1 (Flatten)         (None, 30752)             0         
                                                                 
 Dense_1 (Dense)             (None, 128)               3936384   
                                                                 
 Output (Dense)              (None, 1)                 129       
                                                                 
Total params: 3,937,409
Trainable params: 3,937,409
Non-trainable params: 0
_________________________________________________________________
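
As a rough sketch of what the single sigmoid unit produces, the snippet below maps a raw score to a probability and then to a class name. The `CLASS_NAMES` order is an assumption: `flow_from_directory` assigns indices to subfolders in alphanumeric order, so `Car` should be 0 and `Flower` 1, but it is worth confirming with the iterator's `class_indices` attribute. The helper names are illustrative.

```python
import math

CLASS_NAMES = ["Car", "Flower"]  # assumed index order; verify with class_indices


def sigmoid(z):
    """Squash a real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(score, threshold=0.5):
    """Return (class name, probability of class 1) for a raw score."""
    probability = sigmoid(score)
    return CLASS_NAMES[int(probability > threshold)], probability


print(predict_label(2.0))   # high score: class 1
print(predict_label(-1.0))  # low score: class 0

output_params = 128 * 1 + 1  # 128 weights + 1 bias = 129
print(output_params)
```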


### Step 8: Compile the network
Use the `adam` optimizer with the `binary_crossentropy` loss, and track accuracy during training.

In [13]:
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
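
For intuition about what `binary_crossentropy` and `accuracy` measure, here is a plain-Python sketch on four made-up predictions. It is a simplified illustration, not the Keras implementation; the clipping constant `eps` and the sample values are assumptions.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions that land on the correct side of the threshold."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_pred))
    return hits / len(y_true)


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]

loss = binary_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(round(loss, 4), acc)
```

Confident correct predictions (0.9 for a 1, 0.2 for a 0) add little to the loss, while the two wrong ones dominate it.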

### Step 9: Create training and test data generators.
- Rescale the training and test images by `1/255` so that all pixel values are between `0` and `1`.
- Set these augmentation parameters for the training data generator only:
  - `shear_range=0.2`, `zoom_range=0.2`, and `horizontal_flip=True`
- Reference: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html


In [24]:
from keras.preprocessing.image import ImageDataGenerator
import scipy  # not used directly; Keras needs SciPy installed for the shear/zoom transforms

train_datagen = ImageDataGenerator(rescale=1./255,  shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
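
Two of these transformations are easy to picture on a tiny example. The sketch below applies rescaling and a horizontal flip to a made-up 2x3 grayscale "image" stored as nested lists. Note that `ImageDataGenerator` applies the flip at random per image, whereas this sketch always flips.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 to roughly 0..1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [list(reversed(row)) for row in image]


toy_image = [
    [0, 128, 255],
    [64, 32, 16],
]

scaled = rescale(toy_image)
flipped = horizontal_flip(toy_image)

print(scaled)
print(flipped)  # [[255, 128, 0], [16, 32, 64]]
```

Augmentation is applied to the training generator only, so the test images are evaluated unchanged apart from rescaling.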

### Step 10: Create a training set from the training set folder.
`train` is the folder where the training images are placed. Our CNN expects `64x64` images, so the same size is passed as `target_size`. `batch_size` is the number of images in a single batch, here `32`. `class_mode` is set to `binary` since this is a binary classification problem.

In [18]:
# Create separate folders for the two classes and move the images into them.
import os
import shutil
import random  # NOTE: rebinds the name `random`, which cell [1] imported from tensorflow

DATA_ROOT = r'D:\Study 2018 and later\Mignimind Bootcamp\Code\P5-WarmUp\Data for all Projects'

car_flower_small = os.path.join(DATA_ROOT, 'car_flower_small')
car = os.path.join(DATA_ROOT, 'car_flower_small/Car')
flower = os.path.join(DATA_ROOT, 'car_flower_small/Flower')
train = os.path.join(DATA_ROOT, 'car_flower_small/train')
test = os.path.join(DATA_ROOT, 'car_flower_small/test')
train_car = os.path.join(DATA_ROOT, 'car_flower_small/train/Car')
train_flower = os.path.join(DATA_ROOT, 'car_flower_small/train/Flower')
test_car = os.path.join(DATA_ROOT, 'car_flower_small/test/Car')
test_flower = os.path.join(DATA_ROOT, 'car_flower_small/test/Flower')

# Create the class, train, and test folders. The paths above already include
# DATA_ROOT, and exist_ok=True lets the cell be re-run without skipping folders.
for folder in (car, flower, train_car, train_flower, test_car, test_flower):
    os.makedirs(folder, exist_ok=True)


try:
    # move all files starting with 'car' to the car folder
    for file in os.listdir(car_flower_small):
        if file.startswith('car'):
            shutil.move(os.path.join(car_flower_small, file), car)
    # move all files starting with 'flower' to the flower folder
    for file in os.listdir(car_flower_small):
        if file.startswith('flower'):
            shutil.move(os.path.join(car_flower_small, file), flower)
except FileNotFoundError:
    print('Files already moved')
    pass


try:
    # move 80 of the images from Car folder to test/Car folder
    for file in random.sample(os.listdir(car), 80):
        shutil.move(os.path.join(car,file), test_car)

    # move 80 of the images from Flower folder to test/Flower folder
    for file in random.sample(os.listdir(flower), 80):
        shutil.move(os.path.join(flower,file), test_flower)

    # move remaining images from Car folder to train/Car folder
    for file in os.listdir(car):
        shutil.move( os.path.join(car,file) , train_car)

    # move remaining images from Flower folder to train/Flower folder
    for file in os.listdir(flower):
        shutil.move( os.path.join(flower,file), train_flower)
except (FileNotFoundError, ValueError):
    # ValueError: random.sample was asked for more files than remain in the folder
    print('Files already moved')
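
The split logic can be tried safely on throwaway files before touching the real data. This is a minimal sketch, assuming a hypothetical helper `split_class_folder` and placeholder file names; it runs entirely inside a temporary directory.

```python
import os
import random as py_random  # aliased so it cannot clash with tensorflow's `random`
import shutil
import tempfile


def split_class_folder(source_dir, train_dir, test_dir, n_test, seed=1):
    """Move n_test randomly chosen files to test_dir and the rest to train_dir."""
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    files = sorted(os.listdir(source_dir))  # sorted so the seed gives a stable split
    rng = py_random.Random(seed)
    test_files = set(rng.sample(files, n_test))
    for name in files:
        target = test_dir if name in test_files else train_dir
        shutil.move(os.path.join(source_dir, name), os.path.join(target, name))


# Demonstrate on ten placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "Car")
    os.makedirs(source)
    for i in range(10):
        with open(os.path.join(source, f"car.{i}.jpg"), "w") as handle:
            handle.write("placeholder")

    split_class_folder(
        source,
        os.path.join(root, "train", "Car"),
        os.path.join(root, "test", "Car"),
        n_test=2,
    )
    n_train_files = len(os.listdir(os.path.join(root, "train", "Car")))
    n_test_files = len(os.listdir(os.path.join(root, "test", "Car")))

print(n_train_files, n_test_files)  # 8 2
```

Using a dedicated `random.Random(seed)` instance keeps the split repeatable without affecting any other random state.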

In [25]:
training_set_gen = train_datagen.flow_from_directory(directory=train, target_size=(64, 64), batch_size=32, class_mode='binary')

Found 1837 images belonging to 2 classes.


### Step 11: Repeat step 10 for the test set
This time, point the generator at the folder containing the test images, that is, `test`.

In [26]:
test_set_gen = test_datagen.flow_from_directory(test, target_size=(64, 64), batch_size=32, class_mode='binary')

Found 160 images belonging to 2 classes.


### Step 12: Finally, fit the data. 
Set the `steps_per_epoch` to `STEP_SIZE_TRAIN` and the `validation_steps` to `STEP_SIZE_TEST`. 

Why do we need `steps_per_epoch`?

Keep in mind that a Keras data generator is meant to loop infinitely; it should never return or exit.

Since the generator loops infinitely, Keras cannot determine on its own when one epoch ends and the next begins.

Therefore, we compute the `steps_per_epoch` value as the total number of training data points divided by the batch size. Once Keras reaches this step count, it knows that a new epoch has started.
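
With the image counts reported by the generators above (1837 training and 160 test images) and a batch size of 32, the step counts work out as follows. The helper name is illustrative.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of full batches that fit into n_samples."""
    return n_samples // batch_size


train_steps = steps_per_epoch(1837, 32)
test_steps = steps_per_epoch(160, 32)
print(train_steps, test_steps)  # 57 5

# Floor division leaves the final partial batch out of the step count.
leftover = 1837 - train_steps * 32
steps_with_partial_batch = math.ceil(1837 / 32)
print(leftover, steps_with_partial_batch)  # 13 58
```

Floor division is the choice made in this notebook; `math.ceil` would be the alternative if every image should be counted in each epoch.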

* When I ran this code, I got an error while loading images because some of them were corrupted. Corrupted images need to be removed first; I used the code below to do that.

In [21]:
[x[0] for x in os.walk(car_flower_small)]

['D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\Car',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\Flower',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\test',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\test\\Car',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\test\\Flower',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\train',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Projects\\car_flower_small\\train\\Car',
 'D:\\Study 2018 and later\\Mignimind Bootcamp\\Code\\P5-WarmUp\\Data for all Pro

In [22]:
from keras.utils import load_img
from PIL import UnidentifiedImageError
import matplotlib.pyplot as plt
%matplotlib inline

# Walk every folder under car_flower_small and try to open each file as an image.
for folder in [x[0] for x in os.walk(car_flower_small)]:
    for filename in os.scandir(folder):
        if filename.is_file():
            try:
                load_img(filename.path)
            except UnidentifiedImageError:
                # PIL could not decode the file, so treat it as corrupted and delete it.
                print("Removing: ", filename.path)
                os.remove(filename.path)

Removing:  D:\Study 2018 and later\Mignimind Bootcamp\Code\P5-WarmUp\Data for all Projects\car_flower_small\train\Car\car.1115.jpg


* Since we deleted one corrupted picture, we need to re-run the generator/iterator code if it throws a missing-file error.

In [27]:
STEP_SIZE_TRAIN = training_set_gen.n // training_set_gen.batch_size
STEP_SIZE_TEST = test_set_gen.n // test_set_gen.batch_size
print(STEP_SIZE_TRAIN)
print(STEP_SIZE_TEST)

classifier.fit(training_set_gen, steps_per_epoch=STEP_SIZE_TRAIN, epochs=5, validation_data=test_set_gen, validation_steps=STEP_SIZE_TEST)

57
5
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1eeeea2e880>