# Using Keras' Pretrained Neural Networks for Visual Similarity Recommendations

Let's explore an unsupervised, deep learning-based model. You'll find that the implementation is fairly simple and the results are remarkably promising, which is almost a smack in the face to all of the effort put in earlier.
We are going to build a model-to-model recommender that uses the models' thumbnail images as input and the visual similarity between models as the recommendation score. I was inspired to do this after reading Christopher Bonnett's post on product classification, so we will follow a similar approach.

Since our goal is to measure visual similarity, we will need to generate features from our images and then calculate some similarity measure between different images using said features. Back in the day, maybe one would employ fancy wavelets or SIFT keypoints or something for creating features, but this is the Era of Deep Learning and manual feature extraction is for old people.
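
As a rough illustration of "a similarity measure between feature vectors", here is a minimal sketch of cosine similarity, the measure this notebook uses later on. The vectors are made up for the example, and `cosine_sim` is a hypothetical helper rather than part of the notebook's code.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two 1D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up feature vectors standing in for real image features
feat_a = np.array([1.0, 2.0, 0.0, 3.0])
feat_b = np.array([2.0, 4.0, 0.0, 6.0])  # same direction as feat_a
feat_c = np.array([0.0, 0.0, 5.0, 0.0])  # orthogonal to feat_a

print(cosine_sim(feat_a, feat_b))  # close to 1.0: very similar
print(cosine_sim(feat_a, feat_c))  # 0.0: nothing in common
```

Because cosine similarity only looks at direction, two images whose features differ by an overall scale still count as highly similar.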

Staying on-trend, we will use a pretrained neural network (NN) to extract features. The NN was originally trained to classify images among 1000 labels (e.g. "dog", "train", etc.). We'll chop off the last 3 fully-connected layers of the network, which do the final mapping between deep features and class labels, and use the fourth-to-last layer as a long feature vector describing our images.

Thankfully, all of this is extremely simple to do with the pretrained models in Keras. Keras allows one to easily build deep learning models on top of either TensorFlow or Theano. Keras also now comes with pretrained models that can be loaded and used. For more information about the available models, visit the Applications section of the documentation. For our purposes, we'll use the VGG16 model because that's what other people seemed to use and I don't know enough to have a compelling reason to stray from the norm.

Our task is now as follows:

1. Load and process images.
2. Feed images through NN.
3. Calculate image similarities.
4. Recommend models!

## Load and process images
The first step, which we won't go through here, was to download all of the image thumbnails. There seems to be a standard thumbnail for each Sketchfab model accessible via their API, so I added a function to the rec-a-sketch crawl.py script to automate downloading of all the thumbnails.
Let's load in our libraries and take a look at one of these images.

In [None]:
import sys
import requests
import os
import glob
import pickle
import time

from IPython.display import display, Image, HTML
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image as kimage
import numpy as np
import pandas as pd
from scipy.sparse import lil_matrix as little_matrix
import skimage.io

sys.path.append('../')

In [None]:
file = './img/0fb30f82-d82a-49a2-850a-6bebcb9137f2.jpg'
img = skimage.io.imread(file)

In [None]:
img.shape

In [None]:
Image(filename=file) 

In [None]:
img = kimage.load_img(file, target_size=(224, 224))
x = kimage.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
print(x.shape)

We see that the image can be represented as a 3D array with two spatial dimensions (200 x 200) and a third RGB dimension. We have to do a couple of preprocessing steps before feeding an image through the VGG16 model. The images must be resized to 224 x 224, the color channels must be normalized, and an extra dimension must be added because Keras expects to receive a batch of images rather than a single one. Thankfully, Keras has built-in functions to handle most of this.
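
To make the normalization and extra-dimension steps concrete, here is a NumPy-only sketch of roughly what happens to one already-resized image. It assumes that the VGG16 `preprocess_input` flips RGB to BGR and subtracts the ImageNet channel means without any rescaling, which is my understanding of its behavior. `preprocess_sketch` and `fake_img` are hypothetical names, and in practice you would keep using the Keras function.

```python
import numpy as np

# ImageNet per-channel means in BGR order (assumed to match VGG16 preprocessing)
BGR_MEANS = np.array([103.939, 116.779, 123.68])

def preprocess_sketch(rgb_image):
    """Approximate VGG16 preprocessing for one (H, W, 3) RGB image."""
    x = rgb_image.astype('float64')
    x = x[..., ::-1]                  # RGB -> BGR
    x = x - BGR_MEANS                 # zero-center each color channel
    return np.expand_dims(x, axis=0)  # add the leading batch dimension

# Stand-in for a thumbnail that has already been resized to 224 x 224
fake_img = np.full((224, 224, 3), 128, dtype=np.uint8)
batch = preprocess_sketch(fake_img)
print(batch.shape)  # (1, 224, 224, 3)
```

The notebook's cell adds the batch dimension before calling `preprocess_input`; the order does not matter for the result, since the channel operations act on the last axis only.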

We can now load our model in and try feeding the image through.
[VGG16 model for Keras](https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3)
This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman
arXiv:1409.1556

In [None]:
# include_top=False removes the final fully-connected layers
model = VGG16(include_top=False, weights='imagenet')

In [None]:
pred = model.predict(x)
print(pred.shape)
print(pred.ravel().shape)

We will later have to flatten the output of the model into a long feature vector. One thing that should be noted is the time that it takes to run a single model through the NN on my 4-core machine:

In [None]:
%%timeit -n25
pred = model.predict(x)
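
For a sense of how long that flattened feature vector is, here is a small sketch using zero-filled stand-ins for the model output. It assumes the channels-last layout of the TensorFlow backend, where VGG16 with `include_top=False` and a 224 x 224 input yields a 7 x 7 grid of 512 channels per image. `fake_pred` and `fake_batch` are placeholders, not real predictions.

```python
import numpy as np

# Stand-in for model.predict(x) on a single image
fake_pred = np.zeros((1, 7, 7, 512))
feature_vector = fake_pred.ravel()
print(feature_vector.shape)  # (25088,)

# For a batch, flatten everything except the leading (image) axis
fake_batch = np.zeros((25, 7, 7, 512))
flat_batch = fake_batch.reshape(len(fake_batch), -1)
print(flat_batch.shape)  # (25, 25088)
```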

## Feed images through NN
With our set of valid model IDs in hand, we can now run through the long process of loading in all of the image files, preprocessing them, and running them through the VGG prediction. This takes a long time, and certain steps blow up memory. I've decided to batch things up below and include some print statements so that one can track progress. Beware: this takes a long time!

In [None]:
directory='img'
items=[]
for filename in os.listdir(directory):
    if filename.endswith(".jpg") or filename.endswith(".png"):
        items.append(filename)

idx_to_mid = {}
batch_size = 25
total_max = len(items)
print(total_max)
n_dims = pred.ravel().shape[0]
px = 224
print(n_dims)

# Initialize predictions matrix
preds = little_matrix((total_max, n_dims))

# Work through the images in non-overlapping batches to limit memory use
for min_idx in range(0, total_max, batch_size):
    # The final batch may hold fewer than batch_size images
    max_idx = min(min_idx + batch_size, total_max)
    n_batch = max_idx - min_idx

    t0 = time.time()
    X = np.zeros((n_batch, px, px, 3))
    # For each file in batch,
    # load as row into X
    for i in range(min_idx, max_idx):
        item = items[i]
        idx_to_mid[i] = item

        image = os.path.join(directory, item)
        img = kimage.load_img(image, target_size=(px, px))
        img_array = kimage.img_to_array(img)

        X[i - min_idx, :, :, :] = img_array
    t1 = time.time()
    print('Loaded images {}-{}: {:.4f} s per image'.format(
        min_idx, max_idx - 1, (t1 - t0) / n_batch))

    print('Preprocess input')
    t0 = time.time()
    X = preprocess_input(X)
    t1 = time.time()
    print('{}'.format(t1 - t0))

    print('Predicting')
    t0 = time.time()
    these_preds = model.predict(X)

    # Place predictions inside full preds matrix.
    preds[min_idx:max_idx, :] = these_preds.reshape(n_batch, n_dims)
    t1 = time.time()
    print('{}'.format(t1 - t0))
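
The batching idea itself is independent of Keras. As a minimal sketch, the batch boundaries can be computed so that no image is processed twice and a short final batch is handled cleanly. `batch_ranges` is a hypothetical helper, not something the notebook defines.

```python
def batch_ranges(n_items, batch_size):
    """Yield (start, stop) index pairs covering range(n_items) without overlap."""
    for start in range(0, n_items, batch_size):
        yield start, min(start + batch_size, n_items)

# 60 images in batches of 25: two full batches and one short one
print(list(batch_ranges(60, 25)))  # [(0, 25), (25, 50), (50, 60)]
```

Each `(start, stop)` pair can be used directly as a slice, e.g. `items[start:stop]` for the filenames and `preds[start:stop, :]` for the matching rows of the feature matrix.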

In [None]:
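def cosine_similarity(ratings):
    # Pairwise dot products between all rows (one row per image)
    sim = ratings.dot(ratings.T)
    if not isinstance(sim, np.ndarray):
        sim = sim.toarray()
    # The diagonal holds each row's squared norm
    norms = np.array([np.sqrt(np.diagonal(sim))])
    # Divide entry (i, j) by norm_i * norm_j
    return (sim / norms / norms.T)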

In [None]:
from keras.models import load_model

In [None]:
# Optionally save the model to disk and reload it later:
# from keras.models import load_model
# model.save('my_model.h5')  # creates an HDF5 file 'my_model.h5'
# del model  # deletes the existing model
# model = load_model('my_model.h5')

# Convert the predictions matrix to Compressed Sparse Row format
preds = preds.tocsr()

# Compute cosine similarity
sim = cosine_similarity(preds)

In [None]:
def get_thumbnails(sim, idx, idx_to_mid, N=10):
    row = sim[idx, :]
    thumbs = []
    mids = []
    for x in np.argsort(-row)[:N]:
        thumbs.append(idx_to_mid[x])
        mids.append(idx_to_mid[x])
    return thumbs, mids

def display_thumbs(thumbs, mids, N=5):
    thumb_html = "<a href='{}' target='_blank'>\
                  <img style='width: 300px; margin: 0px; \
                  float: left; border: 1px solid black; display:inline-block' \
                  src='./img/{}' /></a>"
    images = "<div class='line' style='max-width: 1024px; display: block;'>"
    display(HTML('<font size=5>'+'Input Model'+'</font>'))
    link = './'+ directory +'/{}'.format(mids[0])
    url = thumbs[0]
    display(HTML(thumb_html.format(link, url)))
    display(HTML('<font size=5>'+'Similar Models'+'</font>'))

    for (url, mid) in zip(thumbs[1:N+1], mids[1:N+1]):
        link = './'+ directory +'/{}'.format(mid)
        images += thumb_html.format(link, url)

    images += '</div>'
    display(HTML(images))
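
The ranking step inside `get_thumbnails` is easy to check on a toy example. This sketch uses a made-up 4 x 4 similarity matrix; `toy_sim` and `top_n` are hypothetical names used only for illustration.

```python
import numpy as np

# Toy similarity matrix for four items (symmetric, ones on the diagonal)
toy_sim = np.array([
    [1.0, 0.2, 0.9, 0.4],
    [0.2, 1.0, 0.1, 0.7],
    [0.9, 0.1, 1.0, 0.3],
    [0.4, 0.7, 0.3, 1.0],
])

def top_n(sim, idx, n):
    """Indices of the n items most similar to idx, most similar first."""
    # Negating the row makes argsort return the largest similarities first
    return np.argsort(-sim[idx, :])[:n]

print(top_n(toy_sim, 0, 3))  # [0 2 3]
```

The query item always ranks first because it is perfectly similar to itself, which is why `display_thumbs` shows the first thumbnail as the input model and the rest as recommendations.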


In [None]:
import pprint
# pprint.pprint(idx_to_mid)
# print(sim)
display_thumbs(*get_thumbnails(sim, 290, idx_to_mid, N=14), N=14)