# Working with Custom Images

So far everything we've worked with has been nicely formatted for us already by Keras.

Let's explore what it's like to work with a more realistic data set.

## The Data

-----------

**Please note: this dataset is very large. It can be downloaded from the previous lecture. Please watch the video lecture on how to get the data.**

**Use our version of the data. We have already organized it for you!**

-----------

ORIGINAL DATA SOURCE:

https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765

-----------

The Kaggle Competition: [Cats and Dogs](https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition) includes 25,000 images of cats and dogs. We will be building a classifier that works with these images and attempt to detect dogs versus cats!

The pictures are numbered 0-12499 for both cats and dogs, thus we have 12,500 images of Dogs and 12,500 images of Cats. This is a huge dataset!!

-----------


**Note: We will be dealing with real image files, NOT NumPy arrays, which means a large part of this process will be learning how to work with large groups of image files. This is too much data to fit in memory as a NumPy array, so we'll need to feed it into our model in batches.**
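
To get a rough sense of why the full dataset is awkward to hold in memory, here is a back-of-the-envelope estimate. It assumes every image is resized to 150 x 150 RGB (the size used later in this notebook) and stored as 32-bit floats. The original files vary in size, so treat the result as an illustration rather than a measurement.

```python
# Rough memory estimate for holding every image in one array.
# Assumptions (not measurements): 25,000 images, each resized to
# 150 x 150 pixels with 3 color channels, stored as float32 (4 bytes).
n_images = 25_000
height, width, channels = 150, 150, 3
bytes_per_value = 4

bytes_per_image = height * width * channels * bytes_per_value
total_bytes = n_images * bytes_per_image

print(f"Per image: {bytes_per_image:,} bytes")
print(f"Whole dataset: {total_bytes / 1e9:.2f} GB")
```

Under these assumptions the array alone would need several gigabytes before the model and its activations are counted, which is why batches are the more practical route.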

### Visualizing the Data


-------
Let's take a closer look at the data.

In [1]:
import matplotlib.pyplot as plt
# cv2 is provided by the opencv-python package (pip install opencv-python)
import cv2
# Technically not necessary in newest versions of jupyter
%matplotlib inline

In [None]:
cat4 = cv2.imread(r'C:\Users\ACER\Downloads\CATS_DOGS\train\CAT\4.jpg')
# OpenCV loads images in BGR channel order; convert to RGB so matplotlib shows the right colors
cat4 = cv2.cvtColor(cat4, cv2.COLOR_BGR2RGB)

In [None]:
type(cat4)

In [None]:
cat4.shape

In [None]:
plt.imshow(cat4)

In [None]:
dog2 = cv2.imread(r'C:\Users\ACER\Downloads\CATS_DOGS\train\DOG\2.jpg')
dog2 = cv2.cvtColor(dog2, cv2.COLOR_BGR2RGB)

In [None]:
dog2.shape

In [None]:
plt.imshow(dog2)

## Preparing the Data for the model

There is too much data for us to read into memory all at once. We can use some built-in functions in Keras to automatically process the data, generate a flow of batches from a directory, and also manipulate the images.

### Image Manipulation

It's usually a good idea to manipulate the images with rotation, resizing, and scaling so the model becomes more robust to different images that our data set doesn't have. We can use the **ImageDataGenerator** to do this automatically for us. Check out the documentation for a full list of the parameters you can use.

In [None]:
from keras.preprocessing.image import ImageDataGenerator

In [None]:
image_gen = ImageDataGenerator(rotation_range = 30,
                              width_shift_range = 0.1,
                              height_shift_range = 0.1,
                              rescale = 1/255,
                              shear_range = 0.2,
                              zoom_range = 0.2,
                              horizontal_flip = True,
                              fill_mode = 'nearest')

In [None]:
plt.imshow(image_gen.random_transform(dog2))

In [None]:
plt.imshow(image_gen.random_transform(dog2))

In [None]:
plt.imshow(image_gen.random_transform(dog2))

### Generating many manipulated images from a directory


To use .flow_from_directory, you must organize the images into sub-directories. This is an absolute requirement; otherwise the method won't work. Each sub-directory should contain images of only one class, so use one folder per class of images.

Structure Needed:

* Image Data Folder
    * Class 1
        * 0.jpg
        * 1.jpg
        * ...
    * Class 2
        * 0.jpg
        * 1.jpg
        * ...
    * ...
    * Class n
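
Before pointing Keras at a folder, it can help to confirm the layout matches the structure above. The sketch below uses only the standard library; the helper name `count_images_per_class` and the list of file extensions are assumptions made for illustration, not part of Keras.

```python
from pathlib import Path

def count_images_per_class(data_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_folder_name: number_of_image_files} for data_dir.

    Hypothetical helper: each immediate sub-directory is treated as one class,
    and files sitting directly in data_dir are ignored.
    """
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts
```

Calling it on the training folder should return one entry per class folder. An empty dictionary, or a class with a count of zero, suggests the images are not organized the way the structure above requires.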

In [None]:
image_gen.flow_from_directory(r'C:\Users\ACER\Downloads\CATS_DOGS\train')

In [None]:
image_gen.flow_from_directory(r'C:\Users\ACER\Downloads\CATS_DOGS\test')

### Resizing Images

Let's have Keras resize all the images to 150 pixels by 150 pixels as it loads them.

In [None]:
# height, width, channels
image_shape = (150, 150, 3)

## Creating the Model

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense, Conv2D, MaxPooling2D

In [None]:
model = Sequential()

# input_shape is only needed on the first layer; later layers infer their input shape
model.add(Conv2D(filters = 32, kernel_size = (3, 3), input_shape = (150, 150, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Flatten())

model.add(Dense(128))
model.add(Activation('relu'))

# Dropouts help reduce overfitting by randomly turning neurons off during training.
# Here we say randomly turn off 50% of neurons.
model.add(Dropout(0.5))

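# Last layer, remember it's binary, 0=cat , 1=dog
# A single sigmoid unit outputs the probability of class 1;
# softmax over a single unit would always output 1.0
model.add(Dense(1))
model.add(Activation('sigmoid'))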

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In [None]:
model.summary()
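
As a cross-check on the summary, the layer output sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults for the layers used above ('valid' padding and stride 1 for Conv2D, non-overlapping 2 x 2 max pooling); the helper names are made up for this illustration.

```python
# Hand-computed shapes and parameter counts for the architecture above.
# Assumes Keras defaults: 'valid' padding and stride 1 for Conv2D,
# and non-overlapping 2 x 2 max pooling.

def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

size, channels = 150, 3
total = 0
for filters in (32, 64, 64):
    total += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after conv + pool with {filters} filters: {size} x {size} x {channels}")

flat = size * size * channels
total += flat * 128 + 128   # Dense(128): weights + biases
total += 128 * 1 + 1        # Dense(1): weights + bias
print("flattened features:", flat)
print("total parameters:", total)
```

Most of the parameters sit in the first Dense layer, because the flattened feature map is large compared with the small convolution kernels.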

### Training the Model

In [None]:
batch_size = 16
train_image_gen = image_gen.flow_from_directory(r'C:\Users\ACER\Downloads\CATS_DOGS\train',
                                               target_size = image_shape[:2],
                                               batch_size = batch_size,
                                               class_mode = 'binary')

In [None]:
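# Why only the height and width of the image for target_size?
# target_size expects (height, width); the number of channels is set separately
# by color_mode, which defaults to 'rgb' (3 channels).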

In [None]:
batch_size = 16
test_image_gen = image_gen.flow_from_directory(r'C:\Users\ACER\Downloads\CATS_DOGS\test',
                                               target_size = image_shape[:2],
                                               batch_size = batch_size,
                                               class_mode = 'binary')

In [None]:
train_image_gen.class_indices

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
# model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
results = model.fit(train_image_gen, epochs = 10,
                    steps_per_epoch = 15,
                    validation_data = test_image_gen,
                    validation_steps = 5)
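
The relationship between `batch_size`, `steps_per_epoch`, and how many images the model sees in each epoch is simple arithmetic. The sketch below works it through; the function names and the training-set size `n_train` are hypothetical values chosen for illustration, not counts taken from the course data.

```python
import math

def images_seen_per_epoch(steps_per_epoch, batch_size):
    """Number of images drawn from the generator in one epoch."""
    return steps_per_epoch * batch_size

def steps_for_full_pass(n_images, batch_size):
    """Steps needed so that one epoch covers every image once."""
    return math.ceil(n_images / batch_size)

# Settings used in the training cell above
print(images_seen_per_epoch(steps_per_epoch=15, batch_size=16))

# Hypothetical training-set size, for illustration only
n_train = 20_000
print(steps_for_full_pass(n_train, batch_size=16))
```

With 15 steps of 16 images, each epoch only samples a small slice of the training folder. That keeps the run short, at the cost of noisier accuracy numbers.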

In [None]:
# model.save('cat_dog2.h5')

## Evaluating the Model

In [None]:
results.history['accuracy']

In [None]:
plt.plot(results.history['accuracy'])
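
Since validation data was passed during training, the history should also hold validation metrics. The sketch below finds the epoch with the best validation accuracy. It assumes the key is named `val_accuracy`, which is what recent Keras versions use with `metrics=['accuracy']` (older versions used `val_acc`). The `best_epoch` helper and the numbers in `example_history` are made up for illustration.

```python
def best_epoch(history, key="val_accuracy"):
    """Return (1-based epoch number, value) of the highest entry under key."""
    values = history[key]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Hypothetical numbers, for illustration only (not real training results)
example_history = {
    "accuracy": [0.52, 0.58, 0.61],
    "val_accuracy": [0.50, 0.60, 0.57],
}

epoch, value = best_epoch(example_history)
print(f"best validation accuracy {value:.2f} at epoch {epoch}")
```

With a real run, `best_epoch(results.history)` would be the equivalent call.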

In [None]:
# model.save('cat_dog_100epochs.h5')

## Predicting on new images

In [None]:
train_image_gen.class_indices

In [None]:
import numpy as np
from tensorflow.keras.utils import load_img, img_to_array

dog_file = r'C:\Users\ACER\Downloads\CATS_DOGS\train\DOG\2.jpg'

# Load the image at the size the model was trained on
dog_img = load_img(dog_file, target_size = (150, 150))
dog_img = img_to_array(dog_img)
# Add a batch dimension: (150, 150, 3) -> (1, 150, 150, 3)
dog_img = np.expand_dims(dog_img, axis = 0)
# Apply the same 1/255 rescaling used by the training generator
dog_img = dog_img/255

In [None]:
prediction_prob = model.predict(dog_img)

In [None]:
# Output prediction
# model.predict returns an array of shape (1, 1); index into it to get the number
print('Probability that image is a dog is:', prediction_prob[0][0])
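
To turn the probability into a class name, it can be compared against a threshold and looked up in the `class_indices` mapping. The sketch below assumes the model outputs a single probability for the class with index 1, and that the mapping looks like `{'CAT': 0, 'DOG': 1}`. The helper name, the 0.5 threshold, and the example probabilities are illustrative choices, not values from the notebook.

```python
def label_from_probability(prob, class_indices, threshold=0.5):
    """Map a class-1 probability to a class name using a class_indices dict."""
    # Invert {'name': index} into {index: 'name'}
    index_to_class = {index: name for name, index in class_indices.items()}
    predicted_index = int(prob >= threshold)
    return index_to_class[predicted_index]

# Assumed mapping; check train_image_gen.class_indices for the real one
example_indices = {"CAT": 0, "DOG": 1}

print(label_from_probability(0.83, example_indices))
print(label_from_probability(0.12, example_indices))
```

With the real objects, the call would be `label_from_probability(float(prediction_prob[0][0]), train_image_gen.class_indices)`.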

# Great Job!