# The Price Is Right: Predicting Prices with Product Images

### Milestone Report

------------

**Steven Chen, Edward Chou, Richard Yang**

(Edward Chou and Richard Yang are not part of 230, but are part of 229.)



## Loading Packages

---------------

For this project, we use Keras with a TensorFlow backend. Keras is well suited for building complex CNNs, and we have experience with both TensorFlow and Keras from the CS230 programming assignments.

In [None]:
import csv
import math

import matplotlib.pyplot as plt
import numpy as np

from keras import applications
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense, Input
from keras.initializers import glorot_uniform

from keras.applications.vgg16 import preprocess_input

from sklearn.model_selection import train_test_split




We initialize the VGG-16 network without its fully connected top layers (`include_top=False`), using the weights learned on ImageNet. VGG-16 is a deep CNN trained for object recognition on the ImageNet challenge.

In [None]:
# build the VGG16 network
input_tensor = Input(shape=(224,224,3))
model = applications.VGG16(weights='imagenet', include_top=False, input_tensor = input_tensor)

We build our own layers on top of VGG. In particular, we flatten the final output of the VGG-16 convolutional base (512 feature maps, each 7 by 7) into a single dimension. We then add a fully connected layer of 256 hidden units with ReLU activations, using Xavier (Glorot) uniform initialization.

We finish our model with an output layer of a single linear activation neuron, which will output the predicted price.
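
As a quick sanity check on the sizes involved, the arithmetic behind the flattened feature vector and the parameter count of the new layers can be sketched in plain Python. This is only an illustration: it assumes a 224 by 224 input, the five 2x2 max-pooling layers of VGG-16, and the 256-unit hidden layer described above. The variable names are ours and are not used elsewhere in the notebook.

```python
# Each of VGG-16's five 2x2 max-pooling layers halves the height and width.
input_size = 224
num_pools = 5
feature_size = input_size // (2 ** num_pools)   # 224 / 32 = 7
channels = 512                                   # depth of the last conv block

# Length of the vector produced by Flatten.
flattened = feature_size * feature_size * channels

# A dense layer has (inputs * units) weights plus one bias per unit.
hidden_units = 256
hidden_params = flattened * hidden_units + hidden_units
output_params = hidden_units * 1 + 1

print(flattened)                       # 25088
print(hidden_params + output_params)   # 6423041
```

Almost all of the roughly 6.4 million trainable parameters sit in the first dense layer, because it is connected to every one of the 25,088 flattened features.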

In [None]:
# build a regression head to put on top of the convolutional model
top_model = Sequential()
print(model.output_shape[1:])
top_model.add(Flatten(input_shape=model.output_shape[1:]))

# Hidden layer: 256 ReLU units.
# Weights are randomly initialized (Glorot uniform);
# maybe this is why our loss is so bad?
top_model.add(Dense(256, activation='relu', kernel_initializer='glorot_uniform'))

# Output layer: a single linear unit that predicts the price
top_model.add(Dense(1, activation='linear', name='output', kernel_initializer='glorot_uniform'))

We set the pretrained VGG layers to be non-trainable so that we do not spend time learning them. Instead, our learning will focus on the new layers we have added.

In [None]:
# add the model on top of the convolutional base
new_model = Model(inputs= model.input, outputs = top_model(model.output))

# set the first 19 layers (the input layer plus every VGG-16
# conv and pooling layer) to non-trainable (weights will not be updated)
for layer in new_model.layers[:19]:
    layer.trainable = False

new_model.summary()

In the summary above, our added layers appear as a single Sequential block (`sequential_2` in our run). Only these new layers are trainable; the VGG layers are frozen.
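
To see where the number 19 in the slice `new_model.layers[:19]` comes from, the layers of the VGG-16 convolutional base can be enumerated by hand. The sketch below only builds a list of names for illustration; the names follow the usual Keras pattern (`block1_conv1`, `block1_pool`, ...), and `"input"` is a placeholder for whatever name Keras assigns to the input layer.

```python
# Number of 3x3 conv layers in each of VGG-16's five convolutional blocks.
convs_per_block = [2, 2, 3, 3, 3]

layer_names = ["input"]  # placeholder name for the input layer
for block, n_convs in enumerate(convs_per_block, start=1):
    # conv layers of this block, followed by its max-pooling layer
    layer_names += ["block%d_conv%d" % (block, c) for c in range(1, n_convs + 1)]
    layer_names.append("block%d_pool" % block)

print(len(layer_names))   # 19
print(layer_names[-1])    # block5_pool
```

One input layer, 13 convolutional layers, and 5 pooling layers give 19, so the slice covers the whole pretrained base and leaves only the added block trainable.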

We compile the model using mean squared error as the loss (since we are performing regression), and use an RMSprop optimizer.
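
For intuition, here is a minimal plain-Python sketch of the two ingredients named above: the mean squared error loss and a single RMSprop update for one scalar weight. It is not the Keras implementation. The update rule is the standard RMSprop form, the default hyperparameters simply mirror the values passed to `optimizers.RMSprop` in this notebook, and the prices are made-up numbers for illustration.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmsprop_step(w, grad, avg_sq, lr=0.01, rho=0.9, epsilon=1e-07):
    """One RMSprop update for a scalar weight.

    avg_sq is the running average of squared gradients.
    """
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    w = w - lr * grad / (math.sqrt(avg_sq) + epsilon)
    return w, avg_sq

# Hypothetical true and predicted prices in dollars.
y_true = [500.0, 1200.0]
y_pred = [450.0, 1300.0]

loss = mean_squared_error(y_true, y_pred)   # (50^2 + 100^2) / 2 = 6250.0
rmse = math.sqrt(loss)                      # error back in dollars

new_w, new_avg_sq = rmsprop_step(w=1.0, grad=2.0, avg_sq=0.0)
print(loss, rmse, new_w)
```

Because the targets are raw prices, the MSE is measured in squared dollars and can look very large; taking the square root gives an error in dollars that is easier to interpret.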

In [None]:

# SGD
#new_model.compile(loss='mean_squared_error',
#              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
#              metrics=['accuracy'])

# RMSprop
new_model.compile(loss='mean_squared_error',
                  optimizer=optimizers.RMSprop(lr=0.01, rho=0.9, epsilon=1e-07, decay=0.0))

We preprocess our images with the `preprocess_input` function that Keras provides for VGG-16. It performs mean subtraction on each image channel, which helps standardize our images and makes them more similar to the images the VGG network has seen before.
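
The data-loading code in this notebook calls `preprocess_input` from `keras.applications.vgg16`. To our understanding, in its default mode that function does not rescale pixels to the range 0 to 1. Instead it reorders the channels from RGB to BGR and subtracts the ImageNet mean of each channel. The following is a rough pure-Python sketch of that transform on a tiny image stored as nested lists; the helper names are hypothetical and the real function operates on NumPy arrays.

```python
# ImageNet per-channel means in BGR order (Caffe-style VGG preprocessing).
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)

def vgg_preprocess_pixel(rgb):
    """Convert one (R, G, B) pixel in [0, 255] to zero-centered (B, G, R)."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_BGR_MEANS))

def vgg_preprocess_image(img):
    """Apply the pixel transform to an image given as rows of (R, G, B) tuples."""
    return [[vgg_preprocess_pixel(px) for px in row] for row in img]

# A 1x2 image: one pure red pixel and one black pixel.
tiny = [[(255, 0, 0), (0, 0, 0)]]
out = vgg_preprocess_image(tiny)
print(out[0][0])  # red ends up in the last (R) position, minus its mean
print(out[0][1])  # black becomes the negated channel means
```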

Our data loading relies on the bike images and the price CSV being in the root directory, because that is where FloydHub puts them. As of now, we read the images into a large NumPy array and then feed it into the network. We hit memory issues when trying to load all 20,000-plus images, so for now we load a smaller subset.

In [None]:
# read the CSV into memory
prices = []
image_paths = []

data_path = "../datasets/bikes_im/"
with open("../datasets/bikes_filtered.csv") as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        # columns: image index, product name (unused), MSRP
        index, msrp = row[0], row[2]

        image_paths.append(data_path + index + '.jpg')
        prices.append(int(msrp))

        
def image_generator(indices, batch_size):
    """Yield (X, Y) batches forever, cycling through `indices`."""
    # round up so that a final, smaller batch covers the remainder;
    # this matches the ceil() used to compute the steps per epoch
    num_batches = math.ceil(len(indices) / batch_size)

    while True:
        for batch_i in range(num_batches):
            start_i = batch_i * batch_size
            # slicing past the end is safe: the last batch is just shorter
            batch_indices = indices[start_i:start_i + batch_size]

            X = np.zeros((len(batch_indices), 224, 224, 3))
            Y = np.zeros((len(batch_indices), 1))

            for i, index in enumerate(batch_indices):
                img = image.load_img(image_paths[index], target_size=(224, 224))
                X[i, :, :, :] = image.img_to_array(img)
                Y[i] = prices[index]

            # use vgg16 preprocessing
            X = preprocess_input(X)

            yield (X, Y)
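
One common way to cut N examples into minibatches is to round the number of batches up, so that the last batch simply holds whatever remains. The sketch below shows only that index arithmetic. `batch_slices` is a hypothetical helper for illustration and is not part of the notebook.

```python
import math

def batch_slices(num_items, batch_size):
    """Return (start, end) index pairs covering num_items in order.

    The final pair may span fewer than batch_size items.
    """
    num_batches = math.ceil(num_items / batch_size)
    return [(i * batch_size, min((i + 1) * batch_size, num_items))
            for i in range(num_batches)]

slices = batch_slices(10, 4)
print(slices)  # [(0, 4), (4, 8), (8, 10)]
```

With this scheme, the number of batches per pass equals `math.ceil(N / batch_size)`, which is the same quantity used for the steps per epoch when training.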

In [None]:
# create random permutation of number of data points, then cut
# into train/test split

# we have 21843 bike images total.
dataset_indices = np.random.permutation(21843)

# 90% train, 10% test
cutoff = int(len(dataset_indices) * 0.9)
train_indices = dataset_indices[:cutoff]
test_indices = dataset_indices[cutoff:]

print(len(train_indices))
print(len(test_indices))
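
The split above draws a fresh random permutation on every run, so the train and test sets change between runs. As a small sketch of the same 90/10 split made reproducible, the example below uses Python's standard `random` module with a fixed seed. The seed value and the helper name `split_indices` are assumptions for illustration only.

```python
import random

def split_indices(num_examples, train_fraction=0.9, seed=0):
    """Shuffle indices 0..num_examples-1 and split them into train/test lists."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)   # seeded, so the split is repeatable
    cutoff = int(num_examples * train_fraction)
    return indices[:cutoff], indices[cutoff:]

train_idx, test_idx = split_indices(21843)
print(len(train_idx), len(test_idx))  # 19658 2185
```

Fixing the seed makes it possible to compare losses across training runs on exactly the same held-out images.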

In [None]:
epochs = 3
minibatch_size = 32

train_steps = math.ceil(len(train_indices) / minibatch_size)
test_steps = math.ceil(len(test_indices) / minibatch_size)

# train the new top layers (the VGG-16 base is frozen)
new_model.fit_generator(
    image_generator(train_indices, minibatch_size),
    steps_per_epoch=train_steps,
    epochs=epochs,
    validation_data=image_generator(test_indices, minibatch_size),
    validation_steps=test_steps)