* In this notebook we build a model that classifies images of cats and dogs.
* The input data comes from a dataset provided by Kaggle.
* The dataset contains 25,000 labeled training images of cats and dogs, plus 12,500 test images.
* The dataset is stored in '/kaggle/input/dogs-vs-cats'.
* The following cell lists the files in that directory to show how the data is stored.

In [None]:
import os
for file in os.listdir('/kaggle/input/dogs-vs-cats'):
    print(file)

**As shown above, the data is stored in two zip files, test1.zip and train.zip, so we first have to extract them before we can work with the data.**

In [None]:
from zipfile import ZipFile

# Extract both archives into the working directory.
# The context manager closes each archive even if extraction fails.
for archive in ('train.zip', 'test1.zip'):
    with ZipFile(os.path.join('/kaggle/input/dogs-vs-cats', archive), 'r') as zip_file:
        zip_file.extractall()

Now we have two directories containing the dataset: one for the training set and one for the test set. Both are in the working directory "./". Let's see how big the dataset is.

In [None]:
print('there exist ' + str(len(os.listdir('./train'))) + ' training examples')
print('there exist ' + str(len(os.listdir('./test1'))) + ' test examples')
print(os.listdir('./train')[0:10]) # print the first ten file names in the training set
print(os.listdir('./test1')[0:10]) # print the first ten file names in the test set

Now that we have 25,000 training examples, we will split them into a training set and a validation set. We will not use the 12,500 test examples for now. As a first step, we sort the training images into one folder per class.

In [None]:
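# makedirs creates the parent './train1' as needed; exist_ok lets the cell be re-run safely.
os.makedirs('./train1/cats', exist_ok=True)
os.makedirs('./train1/dogs', exist_ok=True)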

In [None]:
import shutil

# The prefix before the first dot in each file name is the class label ('cat' or 'dog').
for file_name in os.listdir('./train'):
    label = file_name.split('.')[0]
    if label == 'cat':
        shutil.copy(os.path.join('./train', file_name), os.path.join('./train1/cats', file_name))
    elif label == 'dog':
        shutil.copy(os.path.join('./train', file_name), os.path.join('./train1/dogs', file_name))
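
The labeling rule used in the cell above can be tried in isolation. The following is a minimal sketch, assuming file names follow a `<label>.<index>.jpg` pattern; the helper `label_from_filename` and the sample names are hypothetical and only serve as illustration.

```python
from collections import Counter

def label_from_filename(file_name):
    """Return the class prefix ('cat' or 'dog'), or None for any other file."""
    prefix = file_name.split('.')[0]
    return prefix if prefix in ('cat', 'dog') else None

# Made-up sample names for illustration only.
sample_names = ['cat.0.jpg', 'dog.0.jpg', 'cat.1.jpg', 'notes.txt']

# Count how many files fall into each class (None collects unrecognized files).
counts = Counter(label_from_filename(name) for name in sample_names)
print(counts)
```

Counting by label this way is a quick check that the two classes are balanced before any files are copied.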

In [None]:
print(len(os.listdir('./train1/cats')))
print(len(os.listdir('./train1/dogs')))

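Now we split the data into training and validation sets: 10,000 randomly chosen images per class go to training, and the remaining images of each class go to validation.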

In [None]:
# Create one sub-folder per class inside each split; exist_ok lets the cell be re-run safely.
for split_dir in ('./training_set', './val_set'):
    for class_dir in ('cats', 'dogs'):
        os.makedirs(os.path.join(split_dir, class_dir), exist_ok=True)

In [None]:
import random

# Shuffle the file names of each class so the split is random.
traincats = os.listdir('./train1/cats')
random.shuffle(traincats)
traindogs = os.listdir('./train1/dogs')
random.shuffle(traindogs)

# First 10,000 shuffled images of each class go to training, the rest to validation.
for file_name in traincats[:10000]:
    shutil.copy(os.path.join('./train1/cats', file_name), os.path.join('./training_set/cats', file_name))
for file_name in traincats[10000:]:
    shutil.copy(os.path.join('./train1/cats', file_name), os.path.join('./val_set/cats', file_name))
for file_name in traindogs[:10000]:
    shutil.copy(os.path.join('./train1/dogs', file_name), os.path.join('./training_set/dogs', file_name))
for file_name in traindogs[10000:]:
    shutil.copy(os.path.join('./train1/dogs', file_name), os.path.join('./val_set/dogs', file_name))
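
Because `random.shuffle` is unseeded here, each run of the notebook produces a different split. The sketch below shows one possible way to make the split reproducible; the function `split_train_val`, its `seed` parameter, and the sample names are assumptions added for illustration, not part of the original notebook.

```python
import random

def split_train_val(file_names, n_train, seed=0):
    """Shuffle a copy of file_names deterministically and split it into (train, val)."""
    # Sort first: os.listdir does not guarantee any particular order.
    shuffled = sorted(file_names)
    # A private Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Made-up file names for illustration only.
names = [f'cat.{i}.jpg' for i in range(10)]
train, val = split_train_val(names, n_train=8, seed=42)
print(len(train), len(val))
```

With a fixed seed, the same files land in the validation set on every run, which makes results from different training runs easier to compare.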

In [None]:
print(len(os.listdir('./training_set/cats/')))
print(len(os.listdir('./training_set/dogs/')))
print(len(os.listdir('./val_set/cats/')))
print(len(os.listdir('./val_set/dogs/')))

In [None]:
import tensorflow as tf
import tensorflow.keras.layers as tfl

model = tf.keras.Sequential([
    tfl.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tfl.BatchNormalization(),
    tfl.MaxPooling2D(2,2),
    tfl.Conv2D(128, (3,3), activation='relu'),
    tfl.BatchNormalization(),
    tfl.MaxPooling2D(2,2),
    tfl.Conv2D(256, (3,3), activation='relu'),
    tfl.BatchNormalization(),
    tfl.MaxPooling2D(2,2),
    tfl.Flatten(),
    tfl.Dense(512, activation='relu'),
    tfl.Dense(1, activation='sigmoid')
])

In [None]:
model.summary()
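
The shapes that `model.summary()` reports can be reproduced by hand. This is a small arithmetic sketch, assuming the Keras defaults used above (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial output size of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial output size of max pooling with stride equal to the pool size."""
    return size // pool

size = 150                      # input images are 150 x 150
sizes = []
for _ in range(3):              # three Conv2D + MaxPooling2D stages
    size = pool_out(conv_out(size))
    sizes.append(size)

flat_features = size * size * 256           # the last conv stage has 256 channels
dense_params = flat_features * 512 + 512    # weights plus biases of Dense(512)
print(sizes, flat_features, dense_params)
```

Under these assumptions the feature map shrinks from 150 to 74, 36, and finally 17 pixels per side, so the `Flatten` layer outputs 73,984 values and the first `Dense` layer alone holds about 37.9 million parameters.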

In [None]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.001), loss='binary_crossentropy', metrics=['binary_accuracy'])
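
To make the loss and metric chosen above more concrete, here is a plain-Python sketch of what binary cross-entropy and binary accuracy compute. It is an illustration only, not the Keras implementation; the clipping constant `eps`, the 0.5 threshold, and the sample values are assumptions for this example.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)   # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(int((p > threshold) == bool(y)) for y, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Made-up labels and sigmoid outputs for illustration only.
y_true = [0, 1, 1, 0]
y_pred = [0.1, 0.8, 0.4, 0.5]
print(binary_crossentropy(y_true, y_pred), binary_accuracy(y_true, y_pred))
```

A prediction of exactly 0.5 costs `log(2)` regardless of the label, which is why an untrained binary classifier typically starts with a loss near 0.69.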

We are almost ready to train our model. First we need to feed it our labeled data, and for this we will use ImageDataGenerator.

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [None]:
# Training images are rescaled to [0, 1] and randomly augmented.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
train_generator = train_datagen.flow_from_directory(
    './training_set',
    target_size=(150, 150),
    batch_size=8,
    class_mode='binary'
)
# Validation images are only rescaled, never augmented.
val_datagen = ImageDataGenerator(rescale=1./255)
val_generator = val_datagen.flow_from_directory(
    './val_set/',
    target_size=(150, 150),
    batch_size=8,
    class_mode='binary'
)
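
The batch size chosen above determines how many batches make up one epoch. The following sketch works through that arithmetic, assuming 10,000 training images per class and 2,500 validation images per class (the latter follows only if each class has 12,500 images); `steps_per_epoch` is a hypothetical helper name.

```python
import math

def steps_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(n_images / batch_size)

batch_size = 8
train_steps = steps_per_epoch(2 * 10000, batch_size)   # two classes, 10,000 images each
val_steps = steps_per_epoch(2 * 2500, batch_size)      # assumed 2,500 images per class

# rescale=1./255 maps 8-bit pixel values from [0, 255] to roughly [0, 1].
rescale = 1. / 255
scaled = [pixel * rescale for pixel in (0, 128, 255)]
print(train_steps, val_steps, scaled)
```

Under these assumptions each epoch consists of 2,500 training batches followed by 625 validation batches.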

Now we are ready to train our model.

In [None]:
# Model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x.
history = model.fit(train_generator, epochs=100, validation_data=val_generator)

In [None]:
import matplotlib.pyplot as plt

# Per-epoch metrics recorded during training.
acc = history.history['binary_accuracy']
val_acc = history.history['val_binary_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(len(acc))

# Two panels side by side: accuracy on the left, loss on the right.
plt.figure(figsize=(15, 5))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
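
Besides reading the curves by eye, the recorded metrics can be inspected numerically. Below is a minimal sketch that finds the epoch with the highest validation accuracy in a history dictionary shaped like `history.history`; the helper `best_epoch` is hypothetical and the numbers are made up for illustration, not results from this notebook.

```python
def best_epoch(history_dict, metric='val_binary_accuracy'):
    """Return (epoch_index, value) of the highest value recorded for the metric."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index, values[best_index]

# Made-up numbers for illustration only.
example_history = {
    'binary_accuracy': [0.60, 0.70, 0.80, 0.90],
    'val_binary_accuracy': [0.58, 0.69, 0.74, 0.72],
}

epoch, value = best_epoch(example_history)
# Gap between training and validation accuracy at the best epoch.
gap = example_history['binary_accuracy'][epoch] - value
print(epoch, value, gap)
```

A training accuracy that keeps rising while validation accuracy stalls or drops, as in the made-up numbers after epoch index 2, is the usual sign of overfitting to look for in the plots.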