# Dog Project Workbook

#### Import Statements

In [1]:
import keras
from keras.preprocessing.image import ImageDataGenerator
import sklearn
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob
import random
import matplotlib.pyplot as plt                        
%matplotlib inline

Using TensorFlow backend.


## Organize the Data
The data needs to be organized so that it can be easily input into different models.

### Datasets

#### Dog/Human Data
This will be used to train the Dog/Human detectors.

In [2]:
# Collect all Dog Data
dog_data = np.expand_dims(np.array(glob('dogImages/*/*/*')), axis=1)
np.random.shuffle(dog_data)

# Collect all Human Data
human_data = np.expand_dims(np.array(glob('lfw/*/*')), axis=1)
np.random.shuffle(human_data)

# Use only as many human images as dog images so the two classes are balanced
human_data = human_data[:dog_data.shape[0]]

The file paths are now stored in NumPy arrays and shuffled. The next steps are:
<ol>
    <li>Assign Labels</li>
    <li>Combine the Arrays</li>
    <li>Reshuffle the Data</li>
</ol>

In [3]:
# Assign labels to data: 1 for Human, 0 for Dog
human_labels = np.ones(human_data.shape, dtype=np.float32)
dog_labels = np.zeros(dog_data.shape, dtype=np.float32)

# Combine labels with respective data, combine all data into one array
labeled_human_data = np.concatenate((human_data, human_labels), axis=1)
labeled_dog_data = np.concatenate((dog_data, dog_labels), axis=1)
detector_data = np.concatenate((labeled_human_data, labeled_dog_data), axis=0)

# Shuffle the combined data for the detector
np.random.shuffle(detector_data)

# Split into input and labels
detector_inputs = detector_data[:,0]
human_det_labels = detector_data[:,1].astype(np.float32)

# Flip labels for dog detector
dog_det_labels = 1 - human_det_labels
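
Concatenating file paths and float labels into one array coerces the labels to strings, which is why the cell above converts them back with `astype(np.float32)`. One possible alternative is to keep paths and labels in separate arrays and reorder both with a single shared permutation. The sketch below is only an illustration under assumptions: `shuffle_together` is a hypothetical helper and the four paths are placeholders, not files from the datasets.

```python
import numpy as np

def shuffle_together(paths, labels, seed=None):
    """Reorder paths and labels with the same random permutation (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(paths))
    return paths[order], labels[order]

# Placeholder paths standing in for the real glob() results
human_paths = np.array(['lfw/a.jpg', 'lfw/b.jpg'])
dog_paths = np.array(['dogImages/c.jpg', 'dogImages/d.jpg'])

# Labels stay numeric because they are never mixed into the string array
paths = np.concatenate((human_paths, dog_paths))
human_labels = np.concatenate((np.ones(len(human_paths), dtype=np.float32),
                               np.zeros(len(dog_paths), dtype=np.float32)))

paths, human_labels = shuffle_together(paths, human_labels, seed=0)

# Flip labels for the dog detector
dog_labels = 1 - human_labels
```

Because both arrays are indexed with the same `order`, each path keeps its label after shuffling.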

The next cell prints the first 10 entries to confirm that each filename is paired with the correct labels.

In [4]:
for item in zip(detector_inputs[:10], human_det_labels, dog_det_labels):
    print('Filename:', item[0], '     Human Label:', item[1], '     Dog Label:', item[2])

Filename: lfw\Tom_Brennan\Tom_Brennan_0001.jpg      Human Label: 1.0      Dog Label: 0.0
Filename: lfw\Jimmy_Carter\Jimmy_Carter_0002.jpg      Human Label: 1.0      Dog Label: 0.0
Filename: lfw\Hutomo_Mandala_Putra\Hutomo_Mandala_Putra_0001.jpg      Human Label: 1.0      Dog Label: 0.0
Filename: dogImages\train\048.Chihuahua\Chihuahua_03412.jpg      Human Label: 0.0      Dog Label: 1.0
Filename: dogImages\train\002.Afghan_hound\Afghan_hound_00138.jpg      Human Label: 0.0      Dog Label: 1.0
Filename: dogImages\train\114.Otterhound\Otterhound_07395.jpg      Human Label: 0.0      Dog Label: 1.0
Filename: lfw\Augustin_Calleri\Augustin_Calleri_0003.jpg      Human Label: 1.0      Dog Label: 0.0
Filename: dogImages\test\024.Bichon_frise\Bichon_frise_01759.jpg      Human Label: 0.0      Dog Label: 1.0
Filename: lfw\Brian_Pavlich\Brian_Pavlich_0001.jpg      Human Label: 1.0      Dog Label: 0.0
Filename: lfw\Sarah_Hughes\Sarah_Hughes_0002.jpg      Human Label: 1.0      Dog Label: 0.0
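
The visual check above covers only ten entries. A programmatic check over every entry could look like the following sketch. It assumes that human images are exactly the paths starting with `lfw`, as in the printed output; `labels_match_paths` is a hypothetical helper, and the two sample paths are copied from the output above.

```python
def labels_match_paths(paths, human_labels, human_prefix='lfw'):
    """Return True if paths under human_prefix have label 1.0 and all others 0.0."""
    return all(
        path.startswith(human_prefix) == (label == 1.0)
        for path, label in zip(paths, human_labels)
    )

# Two entries taken from the printed output
sample_paths = [r'lfw\Tom_Brennan\Tom_Brennan_0001.jpg',
                r'dogImages\train\048.Chihuahua\Chihuahua_03412.jpg']
sample_labels = [1.0, 0.0]

print(labels_match_paths(sample_paths, sample_labels))
```

In the notebook, the same call could be made with `detector_inputs` and `human_det_labels`.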


Finally, the data needs to be split into training, validation, and test sets.

In [5]:
detector_inputs.shape

(16702,)

In [6]:
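# Reserve the first 1000 examples for testing and the next 2000 for validation;
# the rest are used for training. Slice ends are exclusive, so an end of 16701
# leaves out the last of the 16702 examples.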
test_index = (0, 1000)
validation_index = (1000, 3000)
train_index = (3000, 16701)

def get_data(inputs, labels, indices):
    input_data = inputs[indices[0]: indices[1]]
    data_labels = labels[indices[0]: indices[1]]
    return input_data, data_labels

test_a, test_b = get_data(detector_inputs, human_det_labels, test_index)

In [7]:
for item in zip(test_a[:10], test_b[:10]):
    print(item)

('lfw\\Tom_Brennan\\Tom_Brennan_0001.jpg', 1.0)
('lfw\\Jimmy_Carter\\Jimmy_Carter_0002.jpg', 1.0)
('lfw\\Hutomo_Mandala_Putra\\Hutomo_Mandala_Putra_0001.jpg', 1.0)
('dogImages\\train\\048.Chihuahua\\Chihuahua_03412.jpg', 0.0)
('dogImages\\train\\002.Afghan_hound\\Afghan_hound_00138.jpg', 0.0)
('dogImages\\train\\114.Otterhound\\Otterhound_07395.jpg', 0.0)
('lfw\\Augustin_Calleri\\Augustin_Calleri_0003.jpg', 1.0)
('dogImages\\test\\024.Bichon_frise\\Bichon_frise_01759.jpg', 0.0)
('lfw\\Brian_Pavlich\\Brian_Pavlich_0001.jpg', 1.0)
('lfw\\Sarah_Hughes\\Sarah_Hughes_0002.jpg', 1.0)
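
The split boundaries above are typed in by hand. They could instead be derived from the dataset size, which avoids off-by-one gaps at the end of the array. This is a minimal sketch: `make_split_indices` is a hypothetical helper, and the sizes passed to it are the ones used in this notebook. Unlike a hand-written upper bound of 16701, it ends the training range at the full length, so the final example is included.

```python
def make_split_indices(n_total, n_test, n_validation):
    """Return (start, end) pairs for test, validation, and train slices.

    Ends are exclusive, matching Python slice semantics.
    """
    if n_test + n_validation > n_total:
        raise ValueError('test and validation sizes exceed the dataset size')
    test = (0, n_test)
    validation = (n_test, n_test + n_validation)
    train = (n_test + n_validation, n_total)
    return test, validation, train

test_index, validation_index, train_index = make_split_indices(16702, 1000, 2000)
print(test_index, validation_index, train_index)
```

The resulting tuples have the same form that `get_data` expects for its `indices` argument.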


#### Dog Breed Data
This will be used to train the breed classifiers.

### Data Augmentation

#### Create Generator that will Augment Data

In [8]:
training_generator = ImageDataGenerator(rescale=1.,
                                        rotation_range=10.0,
                                        width_shift_range=0.1,
                                        height_shift_range=0.1,
                                        shear_range=0.1,
                                        zoom_range=0.1,
                                        horizontal_flip=True,
                                        vertical_flip=False,
                                        fill_mode="reflect")

validation_generator = ImageDataGenerator(rescale=1)

In [9]:
from keras.preprocessing import image                  
from tqdm import tqdm
from PIL import ImageFile                            
ImageFile.LOAD_TRUNCATED_IMAGES = True 

def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)
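
The shape bookkeeping in `path_to_tensor` and `paths_to_tensor` can be checked without loading any images. The sketch below uses zero-filled arrays as stand-ins for decoded images (an assumption for illustration only) and applies the same `np.expand_dims` and `np.vstack` steps.

```python
import numpy as np

# Stand-ins for four decoded 224x224 RGB images
fake_images = [np.zeros((224, 224, 3), dtype=np.float32) for _ in range(4)]

# Each 3D image becomes a 4D tensor with a leading batch axis of size 1
single = np.expand_dims(fake_images[0], axis=0)

# Stacking the 4D tensors along the first axis yields one batch
batch = np.vstack([np.expand_dims(img, axis=0) for img in fake_images])

print(single.shape, batch.shape)
```

Stacking N tensors of shape (1, 224, 224, 3) gives a single array of shape (N, 224, 224, 3), which is the layout that `ImageDataGenerator.flow` receives as `x` in this notebook.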

In [10]:
x = paths_to_tensor(test_a)
training_data = training_generator.flow(x, test_b, batch_size=32, shuffle=False)

100%|█████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:04<00:00, 203.63it/s]


#### Create Augmented Train, Validation, and Test Datasets
One set for the Dog/Human data and one set for the Dog Breed data.

### Compute Bottleneck features


## Human/Dog Detector
Build a model that predicts whether an image contains a dog, a human, both, or neither.

### Human Detection

In [11]:
train_inputs, train_hlabels = get_data(detector_inputs, human_det_labels, train_index)

In [12]:
train_inputs = paths_to_tensor(train_inputs)

100%|███████████████████████████████████████████████████████████████████████████| 13701/13701 [01:09<00:00, 196.83it/s]
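
Loading all 13701 training images into one array takes a substantial amount of memory. A rough estimate, assuming each value is stored as a 4-byte float32 (the usual Keras default, but an assumption here), can be computed as below. `tensor_bytes` is a hypothetical helper.

```python
def tensor_bytes(n_images, height=224, width=224, channels=3, bytes_per_value=4):
    """Estimate the size in bytes of an image tensor of shape (n, h, w, c)."""
    return n_images * height * width * channels * bytes_per_value

total = tensor_bytes(13701)
print(total, 'bytes, about', round(total / 2**30, 2), 'GiB')
```

Under these assumptions the training tensor needs roughly 7.7 GiB, so on machines with less memory it may be preferable to load the images in batches.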


In [17]:
from keras.applications.inception_v3 import InceptionV3, preprocess_input

In [19]:
model = InceptionV3(include_top=False, weights='imagenet')
# Display the output tensor of the final layer
model.output

<tf.Tensor 'mixed10/concat:0' shape=(?, ?, ?, 2048) dtype=float32>

### Dog Detection

### Object Localization

### Split Image for Images Containing Both

## Breed Classifier for Dogs
Build a model to predict the dog breed, optimized for pictures of dogs.

### Build and Train Several Models
Use bottleneck features from several pretrained models.

### Organize Models into an Ensemble

### Test the Ensemble

## Breed Classifier for Humans
Build a model that predicts a breed that looks similar to the human in the image.

### Face Detection

### Face Cropping

### Breed Classifier for Dog Faces

### Test Model on Human Faces

## Face Matching
Find a dog in the data whose face is similar to the human face.

### Eye and Nose Detector

### Process Dog Face
Adjust the size and orientation so that the eye and nose locations match those of the human face.

## Feature Melding

### Generate Image of Human/Dog Face Combined

## Super Resolution

### Create a Super Resolution Model

### Apply Super Resolution to the Generated Image

## Final Algorithm

### Build Algorithm

### Test Algorithm