# Building an Inverse Image Search Service

Transfer learning is useful in many applications. In this notebook we'll use a pre-trained network to build an inverse image search engine or, put more plainly, a "search by example" service.

Like most deep learning tasks, this one begins with data acquisition.

We will use a subset of Imagenet. Each image is passed through a pre-trained network to compile a dictionary of "embeddings" that will later let us fetch similar images with a simple nearest neighbor search.
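
To make the idea of a nearest neighbor search concrete, here is a minimal brute-force sketch in NumPy. It is only an illustration: the random vectors stand in for real embeddings, and the `nearest_neighbors` helper is a hypothetical name, not part of this notebook or of scikit-learn.

```python
import numpy as np


def nearest_neighbors(embeddings, query, k=3):
    """Return the indices of the k rows of `embeddings` closest to `query`."""
    # Euclidean distance between the query and every stored vector.
    distances = np.linalg.norm(embeddings - query, axis=1)
    # Indices sorted from the closest to the farthest vector.
    return np.argsort(distances)[:k]


# Random stand-ins for image embeddings (100 "images", 2048 dimensions each).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 2048))

# A query that is a slightly perturbed copy of item 42.
query = embeddings[42] + 0.01 * rng.normal(size=2048)

# Item 42 should come first, since the query is almost identical to it.
print(nearest_neighbors(embeddings, query))
```

A brute-force scan like this is fine for a few thousand images; an index structure becomes worthwhile for larger collections.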

## Prerequisites

Let's import the libraries we'll need.

In [12]:
%matplotlib inline

from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt
import requests
from functools import partial
import random
import os
import base64
from sklearn.decomposition import TruncatedSVD
from keras.models import Model
from hashlib import md5
import pickle
from urllib.parse import unquote
from urllib.request import urlretrieve
from PIL import Image
from io import BytesIO
from IPython.display import HTML, Image as IPImage, display
import numpy as np
import glob
from tqdm import tqdm
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.preprocessing import image
import io
import multiprocessing

## Getting Images from Imagenet

Imagenet is an enormous corpus of images spanning many thousands of classes. Training a network from scratch on a dataset this big is a titanic endeavor, which is why most of the pre-trained models available in Keras have already been trained on Imagenet, probably for lots of epochs and with lots of computational power.

Imagenet alone is around 100GB. However, we can work around this size constraint by fetching only the photos we need for our subset: if we want a subset of 1000 images, we only download those.

At `/floyd/input/imagenet_urls/fall11_urls.txt` there's a complete list of the URLs of the images that comprise the Imagenet dataset.

In [2]:
IMAGENET_URLS_LOCATION = '/floyd/input/imagenet_urls/fall11_urls.txt'

Let's load the images' URLs and IDs into memory:

In [3]:
images_metadata = []

with io.open(IMAGENET_URLS_LOCATION, mode='r', encoding='utf-8', errors='replace') as f:
    for row in f:
        elements = row.split()
        image_id = elements[0]
        url = elements[-1]
        url = url.strip()
        images_metadata.append({'id': image_id, 'url': url})
        
print(f'Loaded metadata of {len(images_metadata)} images.')
random.sample(images_metadata, 5)

Loaded metadata of 14197122 images.


[{'id': 'n14976759_2687',
  'url': 'http://www.kloeber-hpi.biz/uk/products/images/Profile-Line_Single_Pantile_Tile_Vent_installed.jpg'},
 {'id': 'n09763784_33380',
  'url': 'http://farm4.static.flickr.com/3475/3395486790_a2bd146ccf.jpg'},
 {'id': 'n13173259_157',
  'url': 'http://stevevaughn.com/panoramics/images/4.jpg'},
 {'id': 'n03540595_44191',
  'url': 'http://farm2.static.flickr.com/1057/988242337_d6cb527b74.jpg'},
 {'id': 'n03099147_6748',
  'url': 'http://www.whitesplumbing.com/images/forced_hot_water_steam_heat.jpg'}]
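
The loading loop assumes every line has an ID and a URL with no spaces inside it. A slightly more defensive parser could look like the sketch below. It assumes each line is an ID followed by whitespace and then the URL, as the sample output suggests; `parse_url_line` is a hypothetical helper name.

```python
def parse_url_line(line):
    """Split one 'id<whitespace>url' line; return None if it is malformed."""
    # Split on the first run of whitespace only, so a URL that happens to
    # contain spaces is kept whole.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        return None  # Blank line, or an ID without a URL.
    image_id, url = parts
    return {'id': image_id, 'url': url.strip()}


print(parse_url_line('n00004475_6590\thttp://example.com/some image.jpg\n'))
print(parse_url_line('\n'))
```

Lines for which the function returns `None` can simply be skipped when building `images_metadata`.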

Let's write a function to fetch a single image.

In [4]:
def fetch_image(destination_directory_path, image_metadata):
    # Save the image as <id>.jpg inside the destination directory.
    destination_path = os.path.join(destination_directory_path, image_metadata['id'] + '.jpg')
    try:
        urlretrieve(image_metadata['url'], destination_path)
    except Exception:
        # Many Imagenet URLs are dead; report the failure and carry on.
        print(f'Error fetching {image_metadata}')

In [5]:
SAMPLE_INDEX = 0
fetch_image('.', images_metadata[SAMPLE_INDEX])

In [6]:
# Use a distinct name so we don't shadow the `image` module imported from Keras.
sample_image = Image.open(f'./{images_metadata[SAMPLE_INDEX]["id"]}.jpg')
plt.imshow(sample_image)

FileNotFoundError: [Errno 2] No such file or directory: './n00004475_6590.jpg'
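
Since `urlretrieve` has no timeout parameter, a dead server can stall a download for a long time. One possible alternative, sketched below, uses `urlopen` with an explicit timeout and reports success as a boolean. The `fetch_with_timeout` name and the 10-second default are assumptions made for illustration.

```python
import shutil
from urllib.request import urlopen


def fetch_with_timeout(url, destination_path, timeout=10):
    """Download `url` to `destination_path`; return True on success."""
    try:
        # The timeout (in seconds) bounds how long an unresponsive server
        # can block us. The output file is only opened once the URL has
        # been opened successfully.
        with urlopen(url, timeout=timeout) as response, \
                open(destination_path, 'wb') as out:
            shutil.copyfileobj(response, out)
        return True
    except (OSError, ValueError) as error:
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        print(f'Error fetching {url}: {error}')
        return False
```

Returning a boolean makes it easy to count failed downloads afterwards. Note that a connection dropped mid-transfer can still leave a partial file behind, so the downloaded files are worth validating before use.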

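The `FileNotFoundError` above suggests that the download of this particular sample failed, which is common because many Imagenet URLs are no longer reachable. The fetching function itself is in place, so let's write a function to fetch a subset of images: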

In [7]:
def fetch_image_set(images_metadata, directory_name='./data', size=8000, randomize=True):
    # Create the target directory if it does not exist yet.
    os.makedirs(directory_name, exist_ok=True)

    if randomize:
        image_set_metadata = random.sample(images_metadata, size)
    else:
        image_set_metadata = images_metadata[:size]

    # Download in parallel; failed downloads are reported by fetch_image.
    with multiprocessing.Pool() as pool:
        pool.map(partial(fetch_image, directory_name), image_set_metadata)
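
Downloads are I/O-bound, so a thread pool is a reasonable alternative to a process pool here. The sketch below shows the pattern with a fake fetch function so that it runs without network access. `fetch_all`, `fake_fetch` and the `max_workers=16` default are hypothetical choices, not part of the original notebook.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch_one, items, max_workers=16):
    """Apply `fetch_one` to every item using a pool of threads.

    Results come back in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch_one, items))


# Hypothetical stand-in for a real download: "succeeds" on even ids only.
def fake_fetch(item):
    return item['id'] % 2 == 0


results = fetch_all(fake_fetch, [{'id': i} for i in range(10)])
print(f'{sum(results)} of {len(results)} downloads succeeded')
```

If the fetch function returns a success flag, the list of results gives the number of usable images directly, without listing the output directory.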

In [8]:
fetch_image_set(images_metadata, size=100)

Error fetching {'id': 'n12537253_1280', 'url': 'http://www.sparklinglotusink.com/sitebuilder/images/RSredsnaps-328x240.jpg'}
Error fetching {'id': 'n02768655_5263', 'url': 'http://www.lp-support.com/images/715.JPG'}
Error fetching {'id': 'n03770316_16038', 'url': 'http://www.yexian.gov.cn/uploadfile/jpg/2008-4/2008430104547301.jpg'}
Error fetching {'id': 'n03943714_3605', 'url': 'http://www.brolliesgalore.co.uk/acatalog/pic_CT_PinstripeDome5.jpg'}
Error fetching {'id': 'n02358091_5816', 'url': 'http://www.southwestbirders.com/SD_20020824/ca%20ground%20squirrel%20_001_1_thumb.jpg'}
Error fetching {'id': 'n04535370_16516', 'url': 'http://www.nieuwvast.nl/images/Vrijstaand-herenhuis.jpg'}
Error fetching {'id': 'n03968293_18514', 'url': 'http://www.ecofriend.org/images/die_electric1_2405.jpg'}
Error fetching {'id': 'n03852280_1974', 'url': 'http://www.termomed.net/images/71140.jpg'}
Error fetching {'id': 'n03415252_11946', 'url': 'http://img.2dehands.nl/f/view/58638551-zware-vissers-parapl

As we can see, a number of downloads failed. Let's check how many items our dataset actually has:

In [10]:
dataset_size = len(os.listdir('./data'))
print(f'Dataset size: {dataset_size}')

Dataset size: 76


That is enough images for the purposes of our project. Let's move on!

## Projecting Images into an N-Dimensional Space

Given the set of images we gathered in the last step, we can organize them so that similar images are near each other in the vector space produced by the second-to-last layer of a pre-trained network. In other words, we treat the outputs (activations) of this layer, not its weights, as _image embeddings_.

Let's start by loading a pre-trained model.

In [13]:
base_model = InceptionV3(weights='imagenet', include_top=True)
base_model.summary()

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels.h5
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None, None, 3 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, None, None, 3 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, None, None, 3 96          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, None, None, 3 0         

Our target layer is `avg_pool`, which produces outputs of 2048 elements. This will be the dimension of our image embeddings:

In [22]:
model = Model(inputs=base_model.input, outputs=base_model.get_layer('avg_pool').output)
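
The 2048 figure comes from global average pooling: each of the 2048 feature maps produced by the last convolutional block is averaged down to a single number. The NumPy sketch below mimics that operation on random data. The 8x8 spatial size is what InceptionV3 is expected to yield for 299x299 inputs, and is stated here as an assumption rather than read from the model.

```python
import numpy as np

# Random stand-in for the final convolutional feature map of one image:
# (batch, height, width, channels).
feature_map = np.random.default_rng(0).random((1, 8, 8, 2048)).astype('float32')

# Global average pooling: average over the two spatial axes,
# leaving one value per channel.
pooled = feature_map.mean(axis=(1, 2))

print(pooled.shape)  # (1, 2048)
```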

Good. Let's load the downloaded images, and then write a function that, given an image, returns its vector representation:

In [23]:
images = []

for picture in glob.glob('./data/*.jpg'):
    try:
        images.append(image.load_img(picture))
    except Exception:
        # Skip files that are not valid images (e.g. truncated downloads).
        print(f'Could not load image {picture}')

Could not load image ./data/n02127292_11034.jpg
Could not load image ./data/n00480993_5229.jpg
Could not load image ./data/n07713395_3431.jpg
Could not load image ./data/n07886463_606.jpg
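
One plausible cause for these failures (an assumption, not verified here) is that some URLs now return an HTML error page instead of an image, which still gets saved with a `.jpg` extension. A cheap way to spot such files is to check for the JPEG signature at the start of the file. The `looks_like_jpeg` helper below is a hypothetical sketch of that check.

```python
import os
import tempfile

# Start-of-image marker (FF D8) plus the first byte of the next marker (FF).
JPEG_MAGIC = b'\xff\xd8\xff'


def looks_like_jpeg(path):
    """Return True if the file at `path` starts with the JPEG signature."""
    with open(path, 'rb') as f:
        return f.read(len(JPEG_MAGIC)) == JPEG_MAGIC


# Demo on two throwaway files: a fake JPEG header and an HTML error page.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, 'good.jpg')
    bad = os.path.join(tmp, 'bad.jpg')
    with open(good, 'wb') as f:
        f.write(JPEG_MAGIC + b'\xe0 rest of file')
    with open(bad, 'wb') as f:
        f.write(b'<html>404 Not Found</html>')
    print(looks_like_jpeg(good), looks_like_jpeg(bad))  # True False
```

The check only inspects the first bytes, so a truncated JPEG would still pass; it complements, rather than replaces, the `try`/`except` around `load_img`.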


In [None]:
def get_vector(model, img):
    if not isinstance(img, list):
        images = [img]
    else:
        images = img
        
    target_size = (299, 299)  # Default input size for InceptionV3
    # Image.LANCZOS is the same filter as Image.ANTIALIAS, which was removed in Pillow 10.
    images = [i.resize(target_size, Image.LANCZOS) for i in images]
    
    numpy_images = [image.img_to_array(i) for i in images]
    pre_processed_images = preprocess_input(np.asarray(numpy_images))
    
    return model.predict(pre_processed_images)

x = get_vector(model, images)
x.shape
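
For InceptionV3, Keras's `preprocess_input` rescales pixel values from the [0, 255] range to [-1, 1]. The NumPy sketch below reproduces that arithmetic for illustration only. `scale_to_unit_range` is a hypothetical name, and in the notebook itself `preprocess_input` should still be used.

```python
import numpy as np


def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(pixels, dtype='float32') / 127.5 - 1.0


# One tiny "image" with a single pixel and three channel values:
# black, mid-gray and white.
batch = np.array([[[[0.0, 127.5, 255.0]]]])
print(scale_to_unit_range(batch))
```

Feeding the network inputs in the same range it was trained on matters: skipping this step usually degrades the quality of the resulting embeddings.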