# Convolutional Neural Networks for Feature Extraction Trained on High Resolution Images

Training on low-resolution images was relatively fast, which left room for experimentation. As mentioned earlier, the task has no numerical metric for comparing models; quality is judged by human perception of the final output. Based on my own subjective judgment of the 5 most similar dresses that an unsupervised BallTree nearest-neighbor search returned for a given dress, I decided to use a CNN with 3 hidden convolutional layers.
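
To make the lookup step concrete, here is a minimal sketch of a nearest-neighbor search over predicted attribute vectors. It swaps BallTree for a plain brute-force Euclidean search, which should return the same neighbors on small data (barring ties), only more slowly. The feature values and the helper name `most_similar` are made up for illustration.

```python
import math

def most_similar(features, query_index, k=5):
    """Return indices of the k rows closest to features[query_index] (Euclidean)."""
    query = features[query_index]
    distances = []
    for i, row in enumerate(features):
        if i == query_index:
            continue  # skip the query dress itself
        distances.append((math.dist(query, row), i))
    distances.sort()  # smallest distance first
    return [i for _, i in distances[:k]]

# hypothetical predicted attribute vectors, one row per dress
features = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.7],
    [0.2, 0.8, 0.9],
    [0.85, 0.15, 0.05],
]
neighbors = most_similar(features, query_index=0, k=2)
print(neighbors)  # [4, 1]
```

The returned images are then compared by eye, which is why the choice of architecture stays a subjective call.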

Now I will build a similarly structured model and train it on high-resolution images. As I am running all notebooks on my local machine, training might be slow. Therefore I will start by training a model with just one hidden convolutional layer and then gradually add two more.

In [23]:
# load packages
import pandas as pd
import numpy as np
import os

from timeit import default_timer as timer
from datetime import timedelta

from keras.preprocessing import image  # provides load_img and img_to_array
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense

In [2]:
# load train, validation and test sets
train_data = pd.read_csv('data/train_images.csv', header=0)
validation_data = pd.read_csv('data/validation_images.csv', header=0)
test_data = pd.read_csv('data/test_images.csv', header=0)

# load two sets of unlabeled data for further model testing
test_dresses_small = pd.read_csv('data/test_dresses_small.csv', header=0)
test_dresses_large = pd.read_csv('data/test_dresses_large.csv', header=0)

notebook_start=timer()

print('Relevant dataframes loaded')

Relevant dataframes loaded


In [3]:
# create generators for train, validation and test images
# (validation images come from their own dataframe, so no validation_split is needed)
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2)

attr_columns = list(train_data.drop(['url', 'Unnamed: 0', 'short_path', 'low_res_url'], axis=1).columns)

train_generator = train_datagen.flow_from_dataframe(
        dataframe=train_data,
        directory='data/cropped_images_450x300/',
        x_col='short_path',
        y_col=attr_columns,
        target_size=(450,300),
        batch_size=32,
        class_mode='raw')

test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_dataframe(
        dataframe=test_data,
        directory='data/cropped_images_450x300/',
        x_col='short_path',
        y_col=attr_columns,
        target_size=(450,300),
        batch_size=32,
        class_mode='raw',
        shuffle=False)  # keep predictions in the same row order as test_data

validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_dataframe(
        dataframe=validation_data,
        directory='data/cropped_images_450x300/',
        x_col='short_path',
        y_col=attr_columns,
        target_size=(450,300),
        batch_size=32,
        class_mode='raw')



Found 40277 validated image filenames.
Found 11651 validated image filenames.
Found 9486 validated image filenames.


In [4]:
# define the model
model = Sequential()
n_row = 450
n_col = 300

model.add(Conv2D(10, kernel_size=3, activation='relu', padding = 'same', input_shape=[n_row,n_col,3]))
model.add(BatchNormalization())
model.add(MaxPooling2D(2))
model.add(Conv2D(128, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling2D(2))
model.add(Flatten())
model.add(Dense(38, activation='sigmoid'))
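
The following sketch traces the tensor shapes through this first model with plain arithmetic, as a rough check of where the weights end up. It assumes 3x3 kernels with `padding='same'` and 2x2 pooling with the default `'valid'` padding, as in the cell above; the helper names are hypothetical, and `model.summary()` should give the authoritative numbers.

```python
def conv_same(shape, filters):
    """A convolution with padding='same' keeps height and width."""
    h, w, _ = shape
    return (h, w, filters)

def max_pool_2(shape):
    """2x2 max pooling with 'valid' padding floors odd sizes."""
    h, w, c = shape
    return (h // 2, w // 2, c)

def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

shape = (450, 300, 3)
shape = conv_same(shape, 10)    # (450, 300, 10); BatchNormalization keeps the shape
shape = max_pool_2(shape)       # (225, 150, 10)
shape = conv_same(shape, 128)   # (225, 150, 128)
shape = max_pool_2(shape)       # (112, 75, 128)

flat = shape[0] * shape[1] * shape[2]
dense_params = flat * 38 + 38   # weights plus biases of Dense(38)

print(shape, flat, dense_params)                 # (112, 75, 128) 1075200 40857638
print(conv_params(3, 10), conv_params(10, 128))  # 280 11648
```

Under these assumptions almost all weights of the first model sit in the final Dense layer, and each further pooling stage shrinks that layer by roughly a factor of four.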

In [5]:
# compile model
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics = ['accuracy'])
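
The sigmoid output combined with `binary_crossentropy` treats each of the 38 outputs as its own yes/no prediction. The sketch below computes that loss for a single sample by hand; it assumes the attribute columns hold 0/1 labels, which the notebook does not state explicitly, and it uses 4 made-up attributes instead of 38.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the labels of one sample."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# hypothetical labels and predictions for one image
y_true = [1, 0, 0, 1]
confident = [0.9, 0.1, 0.1, 0.9]
uncertain = [0.5, 0.5, 0.5, 0.5]

print(binary_crossentropy(y_true, confident))  # about 0.105
print(binary_crossentropy(y_true, uncertain))  # about 0.693
```

Because every label contributes independently, an image can have several attributes switched on at once, which a softmax output would not allow.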

In [6]:
# train the model
start = timer()
history = model.fit_generator(train_generator, steps_per_epoch=10, epochs=10, 
                     validation_data=validation_generator, validation_steps=5)

end = timer()
print('First model trained in ',timedelta(seconds=end-start))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
First model trained in  0:06:59.436269
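
A quick calculation shows how much data these settings actually touch. It uses the counts printed by the generators above and the `steps_per_epoch`, `batch_size` and epoch values from the training cells in this notebook; treat it as a back-of-the-envelope sketch, not a measurement.

```python
import math

n_train = 40277        # "Found 40277 validated image filenames."
batch_size = 32
steps_per_epoch = 10

images_per_epoch = steps_per_epoch * batch_size
full_pass_steps = math.ceil(n_train / batch_size)
share = images_per_epoch / n_train

# all four training runs in this notebook: 10 + 10 + 10 + 5 epochs
all_runs_images = (10 + 10 + 10 + 5) * images_per_epoch

print(images_per_epoch)        # 320
print(full_pass_steps)         # 1259
print(round(100 * share, 2))   # 0.79 (percent of the training set per epoch)
print(all_runs_images)         # 11200
```

So one "epoch" here is a short slice of the data, not a full pass, which keeps each run to a few minutes on a local machine at the cost of seeing only part of the training set.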


In [8]:
# define a model reusing first layers of the previous one and adding extra convolutional layer
model2 = Sequential(model.layers[:-2])
model2.add(Conv2D(128, kernel_size=3, activation='relu', padding='same'))
model2.add(MaxPooling2D(2))
model2.add(Flatten())
model2.add(Dense(38, activation='sigmoid'))

# freeze layers from the first model
for layer in model2.layers[:-4]:
    layer.trainable = False
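
The two slices above are easy to misread, so here is the same bookkeeping on a plain list of layer labels. The labels are hypothetical stand-ins for the layers of the first model; only the slicing logic is the point.

```python
# hypothetical labels for the layers of the first model, in order
first_model = ['conv_10', 'batch_norm', 'pool_1', 'conv_128_a', 'pool_2',
               'flatten', 'dense_38']

# the four layers added on top in the second model
new_head = ['conv_128_new', 'pool_new', 'flatten', 'dense_38']

reused = first_model[:-2]       # drop the old Flatten and Dense layers
second_model = reused + new_head
frozen = second_model[:-4]      # everything except the four new layers

print(reused)            # ['conv_10', 'batch_norm', 'pool_1', 'conv_128_a', 'pool_2']
print(frozen == reused)  # True
```

In other words, `[:-2]` selects the feature-extracting part of the earlier model and `[:-4]` freezes exactly that part, so only the newly added block and the new output layer are trained.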

In [9]:
# compile model
model2.compile(optimizer='adam', loss='binary_crossentropy',
              metrics = ['accuracy'])

In [11]:
# train the model
start = timer()
history = model2.fit_generator(train_generator, steps_per_epoch=10, epochs=10, 
                     validation_data=validation_generator, validation_steps=5)

end = timer()
print('Model#2 trained in ',timedelta(seconds=end-start))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Model#2 trained in  0:05:41.354718


In [12]:
# define a model reusing first layers of the previous one and adding extra convolutional layer 
# model with three hidden layers
model3 = Sequential(model2.layers[:-2])
model3.add(Conv2D(128, kernel_size=3, activation='relu', padding='same'))
model3.add(MaxPooling2D(2))
model3.add(Flatten())
model3.add(Dense(38, activation='sigmoid'))

# freeze layers from the previous model
for layer in model3.layers[:-4]:
    layer.trainable = False

In [13]:
# compile model
model3.compile(optimizer='adam', loss='binary_crossentropy',
              metrics = ['accuracy'])

In [14]:
# train the model
start = timer()
history = model3.fit_generator(train_generator, steps_per_epoch=10, epochs=10, 
                     validation_data=validation_generator, validation_steps=5)

end = timer()
print('Model#3 trained in ',timedelta(seconds=end-start))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Model#3 trained in  0:05:07.327152


I will now unfreeze all layers and train the whole model for a few epochs, so that the layers, which so far were trained one stage at a time, can adjust to each other.

In [15]:
# unfreeze the layers that were frozen during the staged training
for layer in model3.layers[:-4]:
    layer.trainable = True

In [16]:
# compile model
model3.compile(optimizer='adam', loss='binary_crossentropy',
              metrics = ['accuracy'])

In [18]:
# train the model
start = timer()
history = model3.fit_generator(train_generator, steps_per_epoch=10, epochs=5, 
                     validation_data=validation_generator, validation_steps=5)

end = timer()
print('Model#3 retrained in ',timedelta(seconds=end-start))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Model#3 retrained in  0:04:44.317095


After training, I will use the final model to make predictions for the three test sets and save them for future use.

In [19]:
# predict features for test images from labeled dataset
test_predictions = model3.predict_generator(test_generator)

In [27]:
# predict features for small test set
predicted_small = []
directory = 'data/test_dresses_small/'
for i in range(len(test_dresses_small)):
    image_path = directory + 'img' + str(i) + '.png'
    img = image.load_img(image_path, target_size=(450,300))
    img = image.img_to_array(img) / 255.  # same rescaling as the training generator
    img = np.expand_dims(img, axis=0)
    pred = model3.predict(img)
    predicted_small.append(pred)

In [32]:
# predict features for large test set
predicted_large = []
directory = 'data/test_dresses_large/'
for i in range(len(test_dresses_large)):
    image_path = directory + 'img' + str(i) + '.png'
    img = image.load_img(image_path, target_size=(450,300))
    img = image.img_to_array(img) / 255.  # same rescaling as the training generator
    img = np.expand_dims(img, axis=0)
    pred = model3.predict(img)
    predicted_large.append(pred)

In [33]:
# create the output directory if it does not exist yet
directory = 'data/predictions/'
os.makedirs(directory, exist_ok=True)

np.save(directory+'test_predictions_hr.npy', test_predictions)
np.save(directory+'small_predictions_hr.npy', np.asarray(predicted_small))
np.save(directory+'large_predictions_hr.npy', np.asarray(predicted_large))
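
One detail worth keeping in mind when loading these files later: the two lists built image by image hold one `(1, 38)` array per image, so they stack into a 3-D array, while `predict_generator` returns a 2-D array. The sketch below illustrates this with zero-filled stand-ins (the image count of 3 is arbitrary) and shows one way to drop the extra axis.

```python
import numpy as np

n_images, n_attributes = 3, 38   # 3 is an arbitrary illustrative count

# stand-in for the lists built in the loops: one (1, 38) array per image
per_image = [np.zeros((1, n_attributes)) for _ in range(n_images)]

stacked = np.asarray(per_image)
print(stacked.shape)             # (3, 1, 38)

# drop the singleton batch axis to get one row per image
flat = stacked.reshape(len(per_image), -1)
print(flat.shape)                # (3, 38)
```

A nearest-neighbor search expects one row per image, so the reshape (or an equivalent squeeze) would be needed before fitting it on the small and large prediction files.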

In [30]:
# save model 
directory = 'data/models/'
os.makedirs(directory, exist_ok=True)
model3.save(directory+'CNN_trained_on_high_res_images.h5')

In [31]:
notebook_end = timer()
print(timedelta(seconds=notebook_end-notebook_start))

1:38:54.237723
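
For orientation, the four training times printed above can be added up and compared with the total notebook time. This is simple `timedelta` arithmetic on the printed values; it says nothing about what the remaining time was spent on, which may include prediction, saving, re-run cells and idle time between cells.

```python
from datetime import timedelta

training_runs = [
    timedelta(minutes=6, seconds=59, microseconds=436269),   # first model
    timedelta(minutes=5, seconds=41, microseconds=354718),   # Model#2
    timedelta(minutes=5, seconds=7, microseconds=327152),    # Model#3
    timedelta(minutes=4, seconds=44, microseconds=317095),   # Model#3 retrained
]
notebook_total = timedelta(hours=1, minutes=38, seconds=54, microseconds=237723)

training_total = sum(training_runs, timedelta())
share = training_total / notebook_total   # dividing two timedeltas gives a float

print(training_total)     # 0:22:32.435234
print(round(share, 3))    # 0.228
```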
