### **Initialization**
* I put these three lines at the top of every notebook. The first two reload imported modules automatically, so edited code is picked up without restarting the kernel while I rework a project or problem. The third line renders Matplotlib visualizations inside the notebook.

In [None]:
# Initialization
# I use these three lines of code at the top of every notebook
%reload_ext autoreload
%autoreload 2
%matplotlib inline

**Importing the Dependencies**
* I prefer to import all the necessary libraries and dependencies in one cell dedicated to them.

In [None]:
# Importing all necessary Libraries and Dependencies
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os
import random
import zipfile
import tensorflow as tf

#from google.colab import files
from shutil import copyfile
from tensorflow.keras.preprocessing import image
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

**Getting the Data**
* I am using Google Colab for this project, so reading the data may work differently on other platforms. The cell below downloads a zip archive of the full [Dog vs. Cat](https://www.kaggle.com/c/dogs-vs-cats/overview) data, which is hosted as a competition on [Kaggle](https://www.kaggle.com/). You can also download the data manually from [Kaggle](https://www.kaggle.com/c/dogs-vs-cats/data).

In [None]:
# Downloading the Data.
# Using the notebook shell escape (!) in Google Colab to fetch the archive with wget.
!wget --no-check-certificate \
    "https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip" \
    -O "/tmp/cats-and-dogs.zip"


**Processing the Data**
* The following Python code uses the `zipfile` library to unzip the archive into `/tmp`. The `os` library is used afterwards to access the extracted files.

In [None]:
# Processing the zip file of the Data
local_zip = "/tmp/cats-and-dogs.zip"
zip_ref = zipfile.ZipFile(local_zip, "r")
zip_ref.extractall("/tmp")
zip_ref.close()
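
The same extraction can also be written with a `with` statement, which closes the archive even if extraction raises. This is only a minimal sketch: the helper name `extract_archive` and the throwaway `demo.zip` are assumptions made for illustration, so that the example runs without the real download.

```python
import os
import tempfile
import zipfile

def extract_archive(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return the member names."""
    # The with-statement closes the archive even if extractall raises.
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(target_dir)
        return archive.namelist()

# Build a tiny throwaway archive (hypothetical content) instead of the real download.
with tempfile.TemporaryDirectory() as workdir:
    demo_zip = os.path.join(workdir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("PetImages/Cat/0.jpg", b"not a real image")
    extracted = extract_archive(demo_zip, workdir)
    cat_file_exists = os.path.exists(os.path.join(workdir, "PetImages", "Cat", "0.jpg"))
```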

* Now, let's find the total number of cat and dog images in the data directories.

In [None]:
# Finding the total number of Cats and Dogs images in the directory.
# Total number of Cats.
print(f"Total number of Cats is {len(os.listdir('/tmp/PetImages/Cat/'))}")
# Total number of Dogs.
print(f"Total number of Dogs is {len(os.listdir('/tmp/PetImages/Dog/'))}")

**Creating new Directories**
* I create a new `cats-vs-dogs` directory with `training` and `validation` subdirectories. Each of these gets its own `Cats` and `Dogs` subdirectories.

In [None]:
# Using os.makedirs to create new directories
# Creating new directories for training and validation, with one subdirectory per class
# exist_ok=True lets the cell be rerun without failing on directories that already exist
for subset in ("training", "validation"):
  for label in ("Cats", "Dogs"):
    os.makedirs(os.path.join("/tmp/cats-vs-dogs", subset, label), exist_ok=True)

**Splitting the Data into Training and Validation**
* I will write a function that takes a SOURCE directory containing the files, a TRAINING directory and a VALIDATION directory that each receive a portion of the files, and a SPLIT_SIZE that determines the portion. 90% of the images will be copied into the TRAINING directory and the remaining 10% into the VALIDATION directory. Every image is checked first, and any image with zero file length is not copied.

In [None]:
# Writing the function which splits the data into Training and Validation or Testing.
def split_data(SOURCE, TRAINING, VALIDATION, SPLIT_SIZE):
  # Keep only the files that are not empty
  files = []
  for filename in os.listdir(SOURCE):
    file = os.path.join(SOURCE, filename)
    if os.path.getsize(file) > 0:
      files.append(filename)
    else:
      print(filename, "is zero length, so ignoring!")

  # Shuffle once, then slice into two sets that do not overlap
  training_length = int(len(files) * SPLIT_SIZE)
  shuffled_set = random.sample(files, len(files))
  training_set = shuffled_set[:training_length]
  validation_set = shuffled_set[training_length:]

  for filename in training_set:
    copyfile(os.path.join(SOURCE, filename), os.path.join(TRAINING, filename))

  for filename in validation_set:
    copyfile(os.path.join(SOURCE, filename), os.path.join(VALIDATION, filename))


CAT_SOURCE_DIR = "/tmp/PetImages/Cat/"
TRAINING_CAT_DIR = "/tmp/cats-vs-dogs/training/Cats/"
VALIDATION_CAT_DIR = "/tmp/cats-vs-dogs/validation/Cats/"

DOG_SOURCE_DIR = "/tmp/PetImages/Dog/"
TRAINING_DOG_DIR = "/tmp/cats-vs-dogs/training/Dogs/"
VALIDATION_DOG_DIR = "/tmp/cats-vs-dogs/validation/Dogs/"

SPLIT_SIZE = 0.9

split_data(CAT_SOURCE_DIR, TRAINING_CAT_DIR, VALIDATION_CAT_DIR, SPLIT_SIZE)
split_data(DOG_SOURCE_DIR, TRAINING_DOG_DIR, VALIDATION_DOG_DIR, SPLIT_SIZE)
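
A useful property to check in any such split is that the training and validation sets share no files, since shared files would make the validation accuracy look better than it is. Here is a minimal sketch of the slicing logic on plain filenames, with no copying involved. The helper `split_names`, its `seed` parameter, and the demo filenames are assumptions made for illustration.

```python
import random

def split_names(names, split_size, seed=None):
    """Shuffle names and return two disjoint lists: (training, validation)."""
    rng = random.Random(seed)                  # seeded only to make the demo repeatable
    shuffled = rng.sample(names, len(names))   # shuffled copy, input list is untouched
    training_length = int(len(shuffled) * split_size)
    # The validation slice starts where the training slice ends.
    return shuffled[:training_length], shuffled[training_length:]

demo_names = [f"{i}.jpg" for i in range(10)]   # hypothetical filenames
train_names, val_names = split_names(demo_names, 0.9, seed=0)
overlap = set(train_names) & set(val_names)
```

With ten names and a split size of 0.9, nine names go to training, one to validation, and `overlap` is empty.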

* Let's count the images in the training and validation datasets. The training dataset should hold about 90% of the non-empty images in each source directory and the validation dataset the remaining 10%.

In [None]:
# Total number of images in Training
print(f"Total number of training Cats is {len(os.listdir('/tmp/cats-vs-dogs/training/Cats/'))}")
print(f"Total number of training Dogs is {len(os.listdir('/tmp/cats-vs-dogs/training/Dogs/'))}")

# Total number of images in Validation
print(f"Total number of validation Cats is {len(os.listdir('/tmp/cats-vs-dogs/validation/Cats/'))}")
print(f"Total number of validation Dogs is {len(os.listdir('/tmp/cats-vs-dogs/validation/Dogs/'))}")

* Let's define each of these directories as follows:

In [None]:
# Directory with training cats images
train_cats_dir = os.path.join("/tmp/cats-vs-dogs/training/Cats")

# Directory with training dogs images.
train_dogs_dir = os.path.join("/tmp/cats-vs-dogs/training/Dogs")

# Directory with validation cats images
validation_cats_dir = os.path.join("/tmp/cats-vs-dogs/validation/Cats")

# Directory with validation dogs images
validation_dogs_dir = os.path.join("/tmp/cats-vs-dogs/validation/Dogs")

* Now, let's look at the filenames in the cat and dog training and validation directories.

In [None]:
# Training Cat directory
train_cat_names = os.listdir(train_cats_dir)
print(train_cat_names[:10])

# Training Dog directory
train_dog_names = os.listdir(train_dogs_dir)
print(train_dog_names[:10])

# Validation Cat directory
validation_cat_names = os.listdir(validation_cats_dir)
print(validation_cat_names[:10])

# Validation Dog directory
validation_dog_names = os.listdir(validation_dogs_dir)
print(validation_dog_names[:10])

**Data Visualization**
* Now, let's look at a few of the images to get a sense of what the data actually looks like.

In [None]:
# Parameters for our graph 
nrows = 4
ncols = 4

# Index for iterating over images
pic_index = 0

# Setup matplotlib figure
fig = plt.gcf()
fig.set_size_inches(ncols*4, nrows*4)

pic_index += 8
next_cat_px = [os.path.join(train_cats_dir, fname) for fname in train_cat_names[pic_index-8:pic_index]]
next_dog_px = [os.path.join(train_dogs_dir, fname) for fname in train_dog_names[pic_index-8:pic_index]]

for i, img_path in enumerate(next_cat_px + next_dog_px):
  # Set subplots
  sp = plt.subplot(nrows, ncols, i+1)
  sp.axis("Off")

  img = mpimg.imread(img_path)
  plt.imshow(img)

plt.show()

### **Convolutional Neural Network : InceptionV3**
* Building a Convolutional Neural Network with TensorFlow and the Keras API.
* Using InceptionV3 with pretrained ImageNet weights for transfer learning, rather than training the convolutional layers from scratch.
* Since this is a two-class (binary) classification problem, I will use a sigmoid activation so that the output of my network is a single scalar between 0 and 1, encoding the probability that an image belongs to the class labeled 1.
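
To make the role of the sigmoid concrete, here is a minimal sketch in plain Python. The function names and the 0.5 decision threshold are assumptions made for illustration. They are not taken from the model code in this notebook.

```python
import math

def sigmoid(z):
    """Map any real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(probability, threshold=0.5):
    """Turn a probability into a hard 0/1 label (the threshold is an assumption)."""
    return 1 if probability >= threshold else 0

neutral = sigmoid(0.0)                           # a score of 0 maps to exactly 0.5
confident = sigmoid(4.0)                         # large positive scores approach 1
negative_label = predict_label(sigmoid(-2.0))    # negative scores fall below 0.5
```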

In [None]:
# Creating convolutional neural network with transfer learning.
# Creating the pretrained model
pre_trained_model = InceptionV3(input_shape=(300, 300, 3),
                                weights="imagenet",
                                include_top=False)

# Making all the layers in pretrained model nontrainable
# Freezing all the layers of pretrained model
for layer in pre_trained_model.layers:
  layer.trainable = False
  
# Summary of the Model
pre_trained_model.summary()

**Processing the Model**
* Working on pretrained model.

In [None]:
# Working on pretrained model.
# Taking the output of the "mixed9" layer as the feature map for the new classifier
last_layer = pre_trained_model.get_layer("mixed9")
print(f"The shape of last output layer is {last_layer.output.shape}")
last_output = last_layer.output

**Callbacks**
* A callback can stop training once a chosen accuracy is reached. I will build a callback that stops training as soon as the model exceeds 99% training accuracy at the end of an epoch.

In [None]:
# Building the Callbacks
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # logs may be None or may lack the metric, so guard before comparing
    accuracy = (logs or {}).get("accuracy")
    if accuracy is not None and accuracy > 0.99:
      print("\nReached 99% accuracy so stopping the execution of the program!")
      self.model.stop_training = True

# Instantiation
callbacks = myCallback()
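
The stopping rule itself can be tried out without Keras. The sketch below isolates the decision as a plain function. The name `should_stop` and its parameters are assumptions made for illustration, and the dictionaries stand in for the `logs` argument a callback receives at the end of an epoch.

```python
def should_stop(logs, metric="accuracy", threshold=0.99):
    """Return True only when the metric is present and above the threshold."""
    value = (logs or {}).get(metric)   # tolerate logs being None or missing the key
    return value is not None and value > threshold

stop_when_high = should_stop({"accuracy": 0.995})
stop_when_low = should_stop({"accuracy": 0.90})
stop_when_missing = should_stop({"loss": 0.1})
stop_when_none = should_stop(None)
```

Only the first call returns `True`. A missing metric is treated as "keep training" instead of raising an error.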

**Processing the Model**
* Preparing the Final Model from pretrained model.

In [None]:
# Processing the Model

# Flatten the output layer of pretrained model into 1 dimension
x = tf.keras.layers.Flatten()(last_output)
# Adding fully connected layer with relu activation
x = tf.keras.layers.Dense(units=1024, activation="relu")(x)
# Adding dropout with rate 0.2
x = tf.keras.layers.Dropout(0.2)(x)
# Adding final sigmoid layer for activation
x = tf.keras.layers.Dense(1, activation="sigmoid")(x)

# Preparing the final Model
model = Model(pre_trained_model.input, x)

* Next, I will configure the specifications for model training. I will train the model with `binary_crossentropy` loss because this is a binary classification problem and the final activation is a sigmoid.
* Here, I will use RMSprop, which I prefer over plain Stochastic Gradient Descent because it adapts the effective step size for each parameter during training.
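
For a label $y \in \{0, 1\}$ and a predicted probability $p$, binary cross-entropy is $-\left(y \log p + (1 - y) \log (1 - p)\right)$, averaged over the examples. Here is a minimal plain-Python sketch of that formula. The function name and the clipping constant `eps` are assumptions made for illustration, and this is not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip so that log(0) cannot occur
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

confident_loss = binary_crossentropy([1, 0], [0.9, 0.1])   # correct and confident
unsure_loss = binary_crossentropy([1, 0], [0.5, 0.5])      # no better than a coin flip
```

Confident correct predictions give a small loss (about 0.105 here), while predicting 0.5 for everything gives $\log 2 \approx 0.693$.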

In [None]:
# Compile the Model
model.compile(loss="binary_crossentropy",
              optimizer=RMSprop(learning_rate=0.0001),
              metrics=["accuracy"])

* Let's look at the summary of the Neural Network.

In [None]:
# Summary of Neural Network
model.summary()

**Data Processing**
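* I will normalize the pixel values of the images from their original range of [0, 255] to the range [0, 1]. The training generator also applies random augmentations (rotation, shifts, shear, zoom, and horizontal flips), while the validation generator only rescales.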

In [None]:
# Normalizing all the images: pixel values are rescaled by 1./255
# The training generator additionally applies random augmentations
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode="nearest")

validation_datagen = ImageDataGenerator(rescale=1./255)

# Flow training images in batches of 128 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
    "/tmp/cats-vs-dogs/training",
    target_size=(300, 300),
    batch_size=128,
    class_mode="binary"
)

# Flow validation images in batches of 32 using validation_datagen generator
validation_generator = validation_datagen.flow_from_directory(
    "/tmp/cats-vs-dogs/validation",
    target_size=(300, 300),
    batch_size=32, 
    class_mode="binary"
)
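
The `rescale=1./255` argument multiplies every pixel value by that factor. Here is a minimal sketch of the effect on a few raw values. The helper `rescale` and the sample pixel values are assumptions made for illustration.

```python
def rescale(pixels, factor=1.0 / 255):
    """Multiply each raw pixel value by factor, mapping [0, 255] onto [0, 1]."""
    return [value * factor for value in pixels]

raw_pixels = [0, 51, 255]       # hypothetical 8-bit intensities
scaled = rescale(raw_pixels)    # approximately [0.0, 0.2, 1.0]
```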

### **Training the Model**
* I will train the model for up to 50 epochs with 8 steps per epoch. The callback may stop training earlier.
* The loss and accuracy are good indicators of training progress. The model makes a guess at the classification of the training data, and the guess is measured against the known label to calculate the loss. Accuracy is the fraction of correct guesses.
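
As a small worked example of accuracy as the fraction of correct guesses, the sketch below thresholds predicted probabilities and compares them with the labels. The function name, the 0.5 threshold, and the sample values are assumptions made for illustration.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for y, guess in zip(y_true, predictions) if y == guess)
    return correct / len(y_true)

# The third example is a miss: its label is 1 but its probability is below 0.5.
demo_accuracy = accuracy([1, 0, 1, 1], [0.8, 0.3, 0.4, 0.9])
```

Three of the four guesses match their labels, so `demo_accuracy` is 0.75.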

In [None]:
# Training the Model
history = model.fit(
    train_generator,
    steps_per_epoch=8, 
    epochs=50,
    verbose=2,
    validation_data=validation_generator,
    validation_steps=8,
    callbacks=[callbacks]
)
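
The arguments above, together with the generator batch sizes, fix how many images the model sees. The short calculation below uses only values that appear in this notebook. The variable names are assumptions made for illustration, and the totals are upper bounds, because the callback may stop training early and the last batch of a pass can be smaller.

```python
steps_per_epoch = 8
train_batch_size = 128        # batch_size of train_generator
validation_steps = 8
validation_batch_size = 32    # batch_size of validation_generator
epochs = 50

# Images drawn from each generator during one epoch
train_images_per_epoch = steps_per_epoch * train_batch_size
validation_images_per_epoch = validation_steps * validation_batch_size

# Upper bound on augmented training images seen over the whole run
max_train_images_seen = train_images_per_epoch * epochs
```

So each epoch trains on up to 1,024 augmented images and validates on up to 256, for at most 51,200 training images over 50 epochs.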

**Model Visualization**
* Plotting training and validation accuracy and loss per epoch

In [None]:
# Plotting accuracy and loss per epoch

acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]

epochs = range(len(acc))

plt.plot(epochs, acc, "r", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and Validation accuracy")
# Adding the legend to the accuracy figure before starting the loss figure
plt.legend()

plt.figure()

plt.plot(epochs, loss, "r", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and Validation loss")

plt.legend()
plt.show()

**Important points to take:**

* Implementation of ImageDataGenerator API
* Implementation of Convolutional Neural Networks with Transfer Learning
* Implementation of Inception V3

**Please upvote my work. It inspires and motivates me a lot.**
* You can access this notebook and my other projects on my GitHub account, [ThinamXx](https://github.com/ThinamXx).