
# Chess classifier (using CNN and data augmentation)

First, let's import the main libraries that we will need in this project.

In [0]:
import numpy as np
import pandas as pd

# and for visualization

import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

Now we start the exploratory data analysis (EDA). We need to see what our data looks like:

1. The number of samples
2. The distribution of samples between classes
3. The quality of the images

Let's scan our directory to get the labels from the folder names.

In [0]:
# We need the os library to handle paths and directories
import os

folder_name = 'drive/My Drive/chess/Chess'
chess_folders = os.listdir(folder_name)

# Map each class (folder name) to the number of images it contains
chess_pieces = {}
for piece in chess_folders:
    chess_pieces[piece] = len(os.listdir(os.path.join(folder_name, piece)))
    
print(chess_pieces)
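The counting step above can also be written as a small reusable function. This is a minimal sketch rather than the notebook's own code: `count_images_per_class` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list is an assumption. It counts only files with image-like extensions, so stray files do not inflate a class.

```python
import os

# Assumed set of image extensions; adjust it to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images_per_class(root):
    """Return {class_name: image_count} for each sub-folder of `root`."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # skip loose files that sit next to the class folders
        counts[entry] = sum(
            1
            for name in os.listdir(class_dir)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class(folder_name)` should give a dictionary comparable to `chess_pieces`, provided every file in the class folders is an image.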

As we can see:

1. The dataset is small
2. The samples are reasonably evenly distributed between classes
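One way to put a number on the second observation is the ratio between the largest and smallest class. The sketch below is illustrative only: `imbalance_ratio` is a hypothetical helper, and the counts are made up, not taken from this dataset.

```python
def imbalance_ratio(class_counts):
    """Largest class size divided by smallest class size (1.0 means perfectly balanced)."""
    sizes = list(class_counts.values())
    if not sizes or min(sizes) == 0:
        raise ValueError("every class needs at least one sample")
    return max(sizes) / min(sizes)


# Hypothetical counts for illustration only; pass the real `chess_pieces` dict instead.
example_counts = {"A": 80, "B": 100, "C": 90}
print(imbalance_ratio(example_counts))  # 1.25
```

A ratio close to 1 supports the claim that the classes are fairly balanced.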

Let's visualize the counts to make this clearer.

In [0]:
plt.figure(figsize=(14, 6))
plt.bar(range(len(chess_pieces)), list(chess_pieces.values()), color="purple")
plt.xticks(range(len(chess_pieces)), list(chess_pieces))
plt.show()

In [0]:
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

In [0]:
datagen = ImageDataGenerator(
        rotation_range=30,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.05,
        zoom_range=0.1,
        horizontal_flip=True,
        fill_mode='nearest')


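for piece in chess_folders:
    source_dir = os.path.join(folder_name, piece)
    target_dir = os.path.join('drive/My Drive/chess/pieces', piece)
    os.makedirs(target_dir, exist_ok=True)  # create the output folder if it is missing

    for image in os.listdir(source_dir):
        # splitext keeps file names that contain extra dots intact
        image_name = os.path.splitext(image)[0]
        img = load_img(os.path.join(source_dir, image))
        x = img_to_array(img)
        x = x.reshape((1,) + x.shape)  # add a batch axis: (1, height, width, channels)

        # Save 5 augmented variants of each image
        i = 0
        for batch in datagen.flow(x, batch_size=1,
                                  save_to_dir=target_dir,
                                  save_prefix=piece + "_" + image_name,
                                  save_format='jpeg'):
            i += 1
            if i > 4:
                break  # otherwise the generator would loop indefinitely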



# img = load_img('Chess/Pawn/00000035.jpg')
# x = img_to_array(img)
# x = x.reshape((1,) + x.shape)


# i = 0
# for batch in datagen.flow(x, batch_size=1,
#                           save_to_dir='preview', save_prefix='cat', save_format='jpeg'):
#     i += 1
#     if i > 20:
#         break  # otherwise the generator would loop indefinitely
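Two details of the augmentation loop are easy to miss: the `reshape` call prepends a batch axis, and the `i > 4` check stops after five batches, not four. The sketch below reproduces both with stand-ins. `endless_batches` is a hypothetical replacement for `datagen.flow(...)`, and the image size is arbitrary.

```python
import itertools

import numpy as np

# Stand-in for one RGB image as returned by img_to_array: (height, width, channels).
x = np.zeros((64, 48, 3), dtype="float32")
x = x.reshape((1,) + x.shape)  # prepend a batch axis
print(x.shape)  # (1, 64, 48, 3)


def endless_batches():
    """Hypothetical stand-in for datagen.flow(...): it never stops yielding."""
    for n in itertools.count():
        yield n


saved = 0
i = 0
for batch in endless_batches():
    saved += 1  # with batch_size=1, each yielded batch corresponds to one saved file
    i += 1
    if i > 4:
        break
print(saved)  # 5
```

So, assuming one file is written per yielded batch, each original image contributes five augmented images to the `pieces` folder.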

Now, after data augmentation, let's check the data again.

In [0]:
chess_pieces = {}
for piece in chess_folders:
    chess_pieces[piece] = len(os.listdir("drive/My Drive/chess/pieces/"+piece))
    
print(chess_pieces)

plt.figure(figsize=(14, 6))
plt.bar(range(len(chess_pieces)), list(chess_pieces.values()), color="purple")
plt.xticks(range(len(chess_pieces)), list(chess_pieces))
plt.show()

Now we have more data, with an acceptable number of samples per class. Let's start the preprocessing.

Preprocessing steps:

1. Convert the images to grayscale, because color isn't a useful feature for classifying a chess piece
2. Resize the images to a fixed size so they can be fed to the neural network
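To make the two steps concrete, here is a rough NumPy sketch of what they do to a single image, followed by the `1/255` rescaling used by the generator. It is not how Keras implements them: the luma weights, the nearest-neighbour resize, and both helper names are assumptions chosen for illustration.

```python
import numpy as np


def to_grayscale(rgb):
    """Weighted sum of R, G, B (ITU-R BT.601 luma weights); keeps a channel axis."""
    weights = np.array([0.299, 0.587, 0.114], dtype="float32")
    return (rgb.astype("float32") @ weights)[..., np.newaxis]


def resize_nearest(image, height, width):
    """Crude nearest-neighbour resize by sampling row and column indices."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]


rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(120, 80, 3), dtype=np.uint8)  # fake image

gray = to_grayscale(rgb)                  # (120, 80, 1)
resized = resize_nearest(gray, 300, 300)  # (300, 300, 1)
scaled = resized / 255.0                  # same idea as rescale=1./255
print(scaled.shape)  # (300, 300, 1)
```

The final shape matches the `input_shape = (300, 300, 1)` that the network expects.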

In [0]:
train_datagen = ImageDataGenerator(rescale=1./255)
input_shape = (300, 300, 1)
batch_size = 16

training_generator = train_datagen.flow_from_directory(
    'drive/My Drive/chess/pieces',
    target_size=(300, 300),
    color_mode='grayscale',
    batch_size=batch_size,
    class_mode='categorical',
#     subset='training',
    shuffle=True, #we shuffle our images for better performance
    seed=8)

validation_generator = train_datagen.flow_from_directory(
    'drive/My Drive/chess/testing',
    target_size=(300, 300),
    color_mode='grayscale',
    batch_size=batch_size,
    class_mode='categorical',
#     subset='validation',
    shuffle=False,  # keep file order so predictions line up with validation_generator.classes
    seed=7)


Now it's time to build our neural network

In [0]:
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Dropout, Activation, BatchNormalization

In [0]:
model = Sequential([
  Conv2D(16, (5, 5), input_shape=input_shape, padding='same', activation='relu'),
  Conv2D(32, (3, 3), padding='same', activation='relu'),
  MaxPool2D((2, 2)),
  Dropout(0.2),
  Conv2D(64, (3, 3), padding='same', activation='relu'),
  Conv2D(128, (3, 3), padding='same', activation='relu'),
  MaxPool2D((2, 2)),
  Dropout(0.2),
  Flatten(),
  Dense(128, activation='relu'),
  Dropout(0.2),
  Dense(6, activation='softmax')
])
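It helps to estimate the size of this network by hand before looking at `model.summary()`. The sketch below is plain arithmetic under the usual assumptions: `'same'` padding keeps the spatial size, each 2x2 max-pool halves it, and pooling and dropout layers have no trainable parameters. The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


h, w, c = 300, 300, 1                      # input_shape
total = 0
total += conv2d_params(5, 5, c, 16); c = 16
total += conv2d_params(3, 3, c, 32); c = 32
h, w = h // 2, w // 2                      # MaxPool2D((2, 2)) -> 150 x 150
total += conv2d_params(3, 3, c, 64); c = 64
total += conv2d_params(3, 3, c, 128); c = 128
h, w = h // 2, w // 2                      # MaxPool2D((2, 2)) -> 75 x 75

flat = h * w * c                           # size after Flatten()
dense1 = dense_params(flat, 128)
total += dense1
total += dense_params(128, 6)

print(flat)    # 720000
print(dense1)  # 92160128
print(total)   # 92258310
```

Almost all of the roughly 92 million parameters sit in the first `Dense` layer, because it is connected to a 720,000-element flattened feature map.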

In [0]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
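For intuition about the loss chosen here, the following NumPy sketch computes softmax followed by categorical cross-entropy for one sample. It is a simplified illustration, not the Keras implementation, and the function names and `eps` value are assumptions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=-1).mean())


logits = np.zeros((1, 6))   # a model with no preference among the 6 classes
y_true = np.eye(6)[[2]]     # one-hot label for class index 2
loss = categorical_crossentropy(y_true, softmax(logits))
print(loss, np.log(6))
```

A model that assigns equal probability to all six classes has a loss of ln 6, about 1.79, which is a useful reference point when reading the first epochs of the training log.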

Now we train the model

In [0]:
history = model.fit_generator(
        training_generator,
        steps_per_epoch=25,
        epochs=20,
        validation_data=validation_generator,
        validation_steps=15)

I trained the model several times and found that 20 epochs are enough and do not cause overfitting.
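The step counts in the training call translate into image counts as follows. This is a small arithmetic sketch; `steps_to_cover` is a hypothetical helper, and 122 is the number of validation samples used later in this notebook.

```python
import math


def steps_to_cover(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


batch_size = 16
images_per_epoch = 25 * batch_size                        # steps_per_epoch=25
validation_steps_needed = steps_to_cover(122, batch_size)

print(images_per_epoch)         # 400
print(validation_steps_needed)  # 8
```

With `steps_per_epoch=25`, each epoch draws 400 training images. Eight steps are already enough for one full pass over 122 validation images, so `validation_steps=15` goes through the validation set more than once per epoch.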

Now let's see how the training went, and the loss.

In [0]:
from sklearn.metrics import classification_report, confusion_matrix

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
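Besides reading the curves by eye, the epoch with the lowest validation loss can be picked out programmatically. The sketch below uses made-up numbers, not results from this model, and `best_epoch` is a hypothetical helper.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Hypothetical values for illustration only; use history.history in the notebook.
example_history = {
    "loss": [1.7, 1.2, 0.8, 0.5, 0.3],
    "val_loss": [1.6, 1.3, 1.0, 1.1, 1.3],
}
epoch = best_epoch(example_history["val_loss"])
print(epoch)  # 3
```

If the training loss keeps falling after that epoch while the validation loss rises, as in the made-up numbers above, the model has started to overfit.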


Let's evaluate the model with some well-known measures such as accuracy, recall, precision, and F1 score.

Now let's look at the results in more detail.

In [0]:
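num_of_test_samples = 122
# Confusion matrix and classification report
# Note: the true labels in validation_generator.classes follow file order, so the
# predictions only line up with them if the generator was created with shuffle=False.
validation_generator.reset()  # start predicting from the first batch
Y_pred = model.predict_generator(validation_generator, num_of_test_samples // batch_size + 1)
y_pred = np.argmax(Y_pred, axis=1)
matrix1 = confusion_matrix(validation_generator.classes, y_pred)

sns.heatmap(matrix1, annot=True, cbar=False)
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.title('Confusion Matrix')
plt.show()

print('\nClassification Report')
# Take the class names in the generator's own index order instead of hard-coding them
class_indices = validation_generator.class_indices
target_names = sorted(class_indices, key=class_indices.get)
class_report = classification_report(validation_generator.classes, y_pred, target_names=target_names)
print(class_report)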
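The numbers in the classification report can be derived directly from the confusion matrix. The following sketch shows the relationship on a made-up 2x2 matrix, using the scikit-learn convention that rows are true labels and columns are predictions. `per_class_metrics` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np


def per_class_metrics(matrix):
    """Precision, recall and F1 per class from a confusion matrix (rows = true labels)."""
    matrix = np.asarray(matrix, dtype=float)
    true_pos = np.diag(matrix)
    predicted = matrix.sum(axis=0)  # column sums: how often each class was predicted
    actual = matrix.sum(axis=1)     # row sums: how often each class really occurs
    zeros = np.zeros_like(true_pos)
    precision = np.divide(true_pos, predicted, out=zeros.copy(), where=predicted > 0)
    recall = np.divide(true_pos, actual, out=zeros.copy(), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)
    return precision, recall, f1


# Made-up matrix for illustration only.
example_matrix = [[8, 2],
                  [1, 9]]
precision, recall, f1 = per_class_metrics(example_matrix)
accuracy = np.trace(np.asarray(example_matrix)) / np.sum(example_matrix)
print(precision, recall, f1, accuracy)
```

Passing `matrix1` to the same function should reproduce the per-class columns of the report, up to rounding.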