# Stanford Dogs Classification with CNN

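**Kernel:** A small matrix of weights that slides over the image during the convolution step. At each position it is multiplied elementwise with the pixels underneath, and the products are summed into a single output value. Hand-crafted kernels can sharpen or blur an image or detect specific features such as edges; in a CNN, the kernel values are learned during training.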
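
To make the sliding-window idea concrete, here is a minimal NumPy sketch. The function name `convolve2d_valid`, the toy image, and the edge-detecting kernel are illustrative assumptions, not part of the notebook. Like CNN layers, it does not flip the kernel (strictly speaking this is cross-correlation).

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x5 image: dark on the left (0), bright on the right (1).
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Hypothetical vertical-edge kernel: responds where brightness increases left to right.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
# Each row is [0, 3, 3]: no response in the flat dark area, a strong response at the edge.
```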

**Stride:** The number of pixels the kernel moves across the input at each step of the convolution. A larger stride produces a smaller feature map.
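
As a rough illustration, the sketch below lists the positions at which a kernel window starts along one axis for different strides. The helper `window_starts` and the sizes used are assumptions chosen for the example.

```python
def window_starts(n, k, stride):
    """Start indices of a length-k window sliding over n pixels with the given stride (no padding)."""
    return list(range(0, n - k + 1, stride))

starts_s1 = window_starts(7, 3, stride=1)
starts_s2 = window_starts(7, 3, stride=2)
print(starts_s1)  # [0, 1, 2, 3, 4]
print(starts_s2)  # [0, 2, 4]

# The number of positions matches the usual output-size formula: (n - k) // stride + 1
print(len(starts_s2) == (7 - 3) // 2 + 1)  # True
```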

**Pooling:** After convolution, the feature map still encodes the exact location of every feature, which is not always desirable. We want the network to recognize features and patterns even when they are slightly shifted or distorted, and pooling also helps reduce overfitting. In pooling, we split the feature map into groups of pixels (non-overlapping when the stride equals the pool size) and keep one value from each group, depending on the type of pooling. In max pooling, for example, we keep the largest value of each group in the resulting matrix. This makes the network more tolerant to a pattern appearing at a slightly different position or orientation in a test image. Pooling has no trainable parameters of its own, but it shrinks the feature maps, which reduces the computation and the number of parameters in the layers that follow.
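
A small NumPy sketch of 2x2 max pooling with stride 2 is shown below. The function `max_pool2d` and the example arrays are illustrative assumptions. The second part shows why pooling tolerates small shifts: a feature that moves within the same 2x2 block gives the same pooled output.

```python
import numpy as np

def max_pool2d(feature_map, pool=2):
    """Non-overlapping max pooling: keep the largest value of each pool x pool block."""
    h, w = feature_map.shape
    h, w = h - h % pool, w - w % pool  # drop leftover rows/columns that do not fill a block
    blocks = feature_map[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))

a = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 0, 5, 6],
              [1, 2, 7, 8]])
pooled = max_pool2d(a)
print(pooled)
# [[4 2]
#  [2 8]]

# The same "feature" (value 9) at two nearby positions inside one 2x2 block.
shifted_a = np.zeros((4, 4)); shifted_a[0, 0] = 9
shifted_b = np.zeros((4, 4)); shifted_b[1, 1] = 9
print(np.array_equal(max_pool2d(shifted_a), max_pool2d(shifted_b)))  # True
```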

**Padding:** The number of pixels added around the border of an image before it is processed by the kernel of a CNN. Padding addresses the border effect: without it, the output shrinks with every convolution, and pixels near the edges are covered by fewer kernel positions than pixels in the center, so information at the borders is under-represented. Adding artificial pixels (usually zeros) around the border lets the kernel be centered on edge pixels as well. Padding is not strictly necessary for large images but is important for small images.
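
The effect of zero padding can be sketched with `np.pad`. The 3x3 toy image and the 3x3 kernel size are assumptions for the example; the model in this notebook uses the Keras default (`padding='valid'`, no padding).

```python
import numpy as np

image = np.arange(1, 10).reshape(3, 3)

# Add a one-pixel border of zeros on every side.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(padded)

kernel_size = 3
# Output width of a stride-1 convolution: input width - kernel size + 1
out_without_padding = image.shape[0] - kernel_size + 1
out_with_padding = padded.shape[0] - kernel_size + 1
print(out_without_padding, out_with_padding)  # 1 3
```

With the border added, the output keeps the 3x3 size of the input, which is what Keras calls `padding='same'` for a stride of 1.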

**Data generator:** A Keras class (`ImageDataGenerator`) that loads the dataset from disk and feeds the images to the model in batches.

**Dropout:** A regularization method that approximates training a large number of neural networks with different architectures in parallel. During training, it randomly sets a fraction of a layer's outputs to zero at each update, so the network cannot rely too heavily on any single unit.
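
The following NumPy sketch shows the training-time behavior of so-called inverted dropout, the variant that rescales the surviving values by `1 / (1 - rate)` so the expected sum stays the same. The function name `dropout` and the toy input are assumptions; in the notebook the Keras `Dropout` layer handles this internally.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero each value with probability `rate` and rescale the rest by 1 / (1 - rate)."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True where the unit is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((4, 5))
y = dropout(x, rate=0.2, rng=rng)
print(y)  # every entry is either 0.0 (dropped) or 1.25 (kept and rescaled)
```

At prediction time dropout is switched off and the activations pass through unchanged.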

**Image data augmentation:** Random transformations applied to the training images, such as shifts, flips, and zooms, that create modified copies and artificially enlarge the training set.
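
Two of these transforms can be sketched on a tiny array standing in for an image. The array and the zero fill used for the shift are assumptions for the example; `ImageDataGenerator` applies such transforms randomly and supports other fill modes.

```python
import numpy as np

image = np.arange(12).reshape(3, 4)  # toy 3x4 "image"

# Horizontal flip: reverse the column order.
flipped = image[:, ::-1]

# Shift one pixel to the right, filling the vacated column with zeros.
shifted = np.zeros_like(image)
shifted[:, 1:] = image[:, :-1]

print(image[0])    # [0 1 2 3]
print(flipped[0])  # [3 2 1 0]
print(shifted[0])  # [0 0 1 2]
```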

Dataset: https://www.kaggle.com/jessicali9530/stanford-dogs-dataset

Article about optimization: https://blog.paperspace.com/intro-to-optimization-in-deep-learning-gradient-descent/  

A full description: https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb  

A detailed tutorial: https://machinelearningmastery.com/padding-and-stride-for-convolutional-neural-networks/

A basic train example: https://towardsdatascience.com/building-a-convolutional-neural-network-cnn-in-keras-329fbbadc5f5

Image kernels: https://setosa.io/ev/image-kernels/   

Code reference: https://www.kaggle.com/hengzheng/dog-breeds-classifier

In [1]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # same Keras as the tf.keras model below

In [2]:
import os
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
# Import from tensorflow.keras so these layers match the tf.keras model built below
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dropout
from shutil import copyfile

## Preprocessing

In [15]:
train_datagen = ImageDataGenerator(rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    #width_shift_range=0.2,
    #height_shift_range=0.2,  # not sure whether these shifts help
    validation_split=0.2) # set validation split

train_generator = train_datagen.flow_from_directory(
    "./images/Images",
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical',
    subset='training') # set as training data

validation_generator = train_datagen.flow_from_directory(
    "./images/Images", # same directory as training data
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical',
    subset='validation') # set as validation data

Found 16508 images belonging to 120 classes.
Found 4072 images belonging to 120 classes.


## Training

In [16]:
cnn = tf.keras.models.Sequential()
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
# we do not add the input shape since it is not the input layer anymore
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
cnn.add(tf.keras.layers.Flatten())
cnn.add(Dropout(0.2))
cnn.add(tf.keras.layers.Dense(units=1024, activation='relu'))
cnn.add(Dropout(0.2))   #0.2 of the inputs will be randomly excluded from each update cycle.
cnn.add(tf.keras.layers.Dense(units=120, activation='softmax'))
cnn.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
cnn.fit(x = train_generator, validation_data = validation_generator, epochs = 12)

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<tensorflow.python.keras.callbacks.History at 0x27bb90497c0>
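
To see how the layers above shrink the 64x64 input, the feature-map sizes can be traced by hand. This is a sketch that assumes the Keras defaults used in the model (`padding='valid'` and stride 1 for `Conv2D`); the helper `output_size` is a hypothetical name.

```python
def output_size(n, k, stride=1, padding=0):
    """Output length along one axis for a window of size k."""
    return (n + 2 * padding - k) // stride + 1

size = 64
size = output_size(size, k=3)            # Conv2D 3x3    -> 62
size = output_size(size, k=2, stride=2)  # MaxPool2D 2x2 -> 31
size = output_size(size, k=3)            # Conv2D 3x3    -> 29
size = output_size(size, k=2, stride=2)  # MaxPool2D 2x2 -> 14

flattened = size * size * 32  # 32 filters in the last Conv2D layer
print(size, flattened)  # 14 6272
```

The `Flatten` layer therefore passes 6272 values to the first `Dense` layer, which is where most of the model's weights sit.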

In [60]:
import numpy as np
import cv2
import os
from tensorflow.keras.preprocessing import image

In [63]:
path = './images/Images/'
num_tests = 20
counter = 0  # number of test images whose true breed is among the top 5 predictions

# Class folder names ordered by output index, so class_names[k] labels pred[0][k].
# Built as a new list so that train_generator.class_indices is not overwritten.
class_names = sorted(train_generator.class_indices, key=train_generator.class_indices.get)

for i in range(num_tests):

    # Pick a random breed folder, then a random image inside it
    type_name = np.random.choice(os.listdir(path))
    image_name = np.random.choice(os.listdir(os.path.join(path, type_name)))
    newpath = os.path.join(path, type_name, image_name)

    # Preprocess as in training: resize to 64x64 and rescale to [0, 1]
    img = image.load_img(newpath, target_size=(64, 64))
    img_array = image.img_to_array(img)/255
    img_batch = np.expand_dims(img_array, axis=0)
    pred = cnn.predict(img_batch)

    # Indices of the 5 highest probabilities, best first
    top5 = np.argsort(pred[0])[::-1][:5]
    top5_names = [class_names[k] for k in top5]

    typeshort = type_name[type_name.find("-")+1:]
    text1 = "Actual Type: "+ typeshort
    text2 = "Predictions:\n"
    for k in top5:
        name = class_names[k]
        text2 += name[name.find("-")+1:]+": "+str(pred[0][k])+"\n"

    print(text1)
    print(text2)
    print("")
    # Compare full folder names; a substring test on the printed text
    # could match one breed name inside another
    if type_name in top5_names:
        counter+=1

    img2 = cv2.imread(newpath,0)  # flag 0 loads the image as grayscale
    cv2.imshow(typeshort ,img2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


print("The rate of reaching the correct type in first 5 predictions: "+str(counter/num_tests))

Actual Type: bloodhound
Predictions:
dingo: 0.3430635
basenji: 0.19292621
Leonberg: 0.18629146
dhole: 0.061713573
Afghan_hound: 0.04099191


Actual Type: redbone
Predictions:
redbone: 0.7349755
bloodhound: 0.050924223
Irish_terrier: 0.03833142
golden_retriever: 0.018872371
vizsla: 0.018436732


Actual Type: miniature_schnauzer
Predictions:
Scottish_deerhound: 0.35837412
keeshond: 0.09457108
Irish_wolfhound: 0.08901828
Weimaraner: 0.06453181
miniature_schnauzer: 0.041819356


Actual Type: Brabancon_griffon
Predictions:
Brabancon_griffon: 0.1443465
Bernese_mountain_dog: 0.14097139
boxer: 0.10542709
collie: 0.086436845
Yorkshire_terrier: 0.03927331


Actual Type: African_hunting_dog
Predictions:
African_hunting_dog: 0.48388088
Greater_Swiss_Mountain_dog: 0.19346884
Norwegian_elkhound: 0.04793422
bluetick: 0.039568275
Walker_hound: 0.025860952


Actual Type: Sussex_spaniel
Predictions:
Sussex_spaniel: 0.60307276
redbone: 0.031826302
Rhodesian_ridgeback: 0.031076202
silky_terrier: 0.0271793