# **Building an Image Search Engine - Image Encoding**


## Overview


This guided project will help you build an image querying prototype. The project is split into two parts. The first part focuses on building the image encoding system and the second part focuses on building the image query system.

An image search engine has many business use cases, such as:

1.  Letting customers search for similar apparel, furniture, auto parts, etc.
2.  Eliminating near-duplicate images from databases or catalogues.
3.  Using image embeddings as features for modeling tasks.
4.  Building image-based recommendation systems.


## Objectives

After completing this notebook, you will be able to:

*   Set up an image encoding service that accepts input images and produces embeddings
*   Explore techniques for generating embeddings
*   Generate the embeddings for the dataset and save them to disk


## Setup Runtime

*   We recommend using Anaconda to manage the runtime.
*   Install the dependencies within the Anaconda environment.


System requirements:

1.  Stable internet access (to download the dataset)
2.  TensorFlow 2.x
3.  Jupyter notebook
4.  2GB of storage


In [None]:
!pip install -U tensorflow

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from PIL import Image

from tensorflow.keras import backend as K
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# pretrained feature extractors: VGG16 is used in this notebook,
# the others are alternatives to explore
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.applications.xception import Xception

## Download Image Dataset

For this prototype, we will use a clothing dataset of T-shirts and other apparel created by [Alexey Grigorev](https://github.com/alexeygrigorev). A fork of the dataset can be found [here](https://github.com/CODAIT/clothing-dataset) on IBM CODAIT's GitHub.

*   Click [the link](https://github.com/CODAIT/clothing-dataset) to download the data manually.
*   Save the downloaded dataset to your local file system.

Alternatively, you can use the `git clone` command below to download the dataset from within the notebook kernel.


In [None]:
# Download the dataset
!git clone https://github.com/CODAIT/clothing-dataset.git

## Preprocessing the Dataset

We will now build a pandas dataframe that contains the file paths of all valid images and discard the invalid ones.
To do this, we try to open each image with the `PIL.Image.open()` method and drop any file that fails to open.


In [None]:
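from os import walk

path = 'clothing-dataset/images/'
filename_list = []

# collect the files in the top-level image directory only
for (dirpath, dirnames, filenames) in walk(path):
    filename_list.extend(filenames)
    break

# validate images: Image.open() reads the file lazily, so verify() is
# called as well to catch corrupt files; the with-block closes the file
filename_list_verified = []
for index, fname in enumerate(filename_list):
    try:
        with Image.open(path + fname) as im:
            im.verify()
        filename_list_verified.append(fname)
    except Exception as e:
        print('invalid image index:', index, 'file:', fname, 'error:', e)

df = pd.DataFrame(data={'filename': filename_list_verified})
df['full_path_file_name'] = path + df['filename']
# dummy label column, needed later by flow_from_dataframe (y_col="class")
df['class'] = "1"
print(df.shape)
df.head()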

Once we have a dataframe of all valid images and their full paths on disk, we store it as a CSV file for quick access later.


In [None]:
df.to_csv('image_dictionary.csv', index=False)

## Implement the LSH TF Layer

Locality-sensitive hashing (LSH) is a popular hashing technique, which we will use in our prototype to generate embeddings for the images.
Hyperplane hashing, the LSH variant used here, assigns each input vector one bit per random hyperplane, recording which side of the hyperplane the vector falls on, so similar vectors tend to share most of their bits.
To do that, we first need to create a custom `Layer` object in Keras/TensorFlow.
We can achieve that with the following steps:

*   Implement the custom TF layer by subclassing the `tf.keras.layers.Layer` class
*   Implement hyperplane hashing in the `call()` method of the layer class

For more details about locality-sensitive hashing and hyperplane hashing, see these [slides](https://web.stanford.edu/class/cs246/slides/03-lsh.pdf?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkQuickLabsIBMGPXX0W3UEN35891832-2022-01-01).
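
To make the idea concrete before wrapping it in a Keras layer, here is a minimal NumPy sketch of hyperplane hashing. It is an illustration only: the vector size, number of bits, and helper names (`hyperplane_hash`, `hamming_distance`) are assumptions chosen for the example and do not come from the dataset or the model.

```python
import numpy as np

def hyperplane_hash(vectors, hyperplanes):
    """Return one bit per hyperplane: 1 if the vector lies on its positive side."""
    return (vectors @ hyperplanes > 0).astype(np.int32)

def hamming_distance(a, b):
    """Count the positions where two bit vectors differ."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(628)      # fixed seed, for reproducibility only
dim, n_bits = 64, 256                 # hypothetical sizes for illustration
# one hyperplane (through the origin) per column
hyperplanes = rng.uniform(-1.0, 1.0, size=(dim, n_bits))

base = rng.normal(size=dim)
near = base + 0.05 * rng.normal(size=dim)   # small perturbation of base
far = rng.normal(size=dim)                  # unrelated vector

codes = hyperplane_hash(np.stack([base, near, far]), hyperplanes)
d_near = hamming_distance(codes[0], codes[1])
d_far = hamming_distance(codes[0], codes[2])
print("distance to near vector:", d_near)
print("distance to far vector: ", d_far)
```

The code for the slightly perturbed vector should differ from the original in only a few bits, while the unrelated vector should differ in roughly half of them. That gap is what lets Hamming distance between codes stand in for similarity between the original vectors.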


In [None]:
# perform hyperplane hashing
# the output is intended for the LSH algorithm
class HyperPlaneHashingLayer(tf.keras.layers.Layer):

    # perform input-independent initialization
    # and store the random seed
    def __init__(self, n, random_seed=628):
        """
        @param::n: the number of hash bits (hyperplanes) to generate
        @param::random_seed: the random seed used to generate the hyperplanes
        return none
        """
        super().__init__()
        self.random_seed = random_seed
        self.n = n
        self.hyperplanes = None

    # initialize the hyperplane matrix based on the input size
    def build(self, input_shape):
        """
        @param::input_shape: the shape of the input tensor
        return none
        """
        if len(input_shape) <= 3:
            raise ValueError('input must have more than 3 dimensions, '
                             'e.g. (batch, height, width, channels)')
        # note: this also sets the global TensorFlow seed
        tf.random.set_seed(self.random_seed)
        # each column represents one hashing vector
        self.hyperplanes = K.random_uniform((input_shape[-1], self.n),
                                            minval=-1.,
                                            maxval=1.,
                                            seed=self.random_seed)

    # return the hyperplane hashing result
    def call(self, inputs):
        """
        @param::inputs: the input tensor of shape (batch, height, width, channels)
        return the hyperplane-hashed representation of each input data point
        """
        # flatten the spatial grid: (batch, height * width, channels)
        flat = tf.reshape(inputs, (-1, inputs.shape[-2] * inputs.shape[-3], inputs.shape[-1]))
        # project every position onto every hyperplane: (batch, height * width, n)
        hash_val = tf.matmul(flat, self.hyperplanes)
        # one bit per hyperplane: 1 on the positive side, 0 otherwise
        hash_result = hash_val > 0
        return K.cast(hash_result, tf.int32)


## Build Encoding Network

*   Use the pretrained VGG16 network as the feature extraction network. See the Keras documentation on [pretrained models](https://keras.io/api/applications/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkQuickLabsIBMGPXX0W3UEN35891832-2022-01-01) for more detail.
*   Use the LSH layer built in the previous section to reduce the dimensionality of the embeddings output by the pretrained network.
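
The shapes involved can be checked with a small NumPy sketch. With `include_top=False` and 224x224 inputs, VGG16 produces a 7x7x512 feature map per image; the hashing layer flattens the 7x7 grid into 49 positions and projects each 512-dimensional vector onto 500 hyperplanes. The random array below is only a stand-in for real VGG16 features, and the batch size of 2 is an arbitrary assumption.

```python
import numpy as np

# VGG16 (include_top=False) output shape for a 224x224 input
batch, height, width, channels = 2, 7, 7, 512
n_hashes = 500   # matches HyperPlaneHashingLayer(500)

rng = np.random.default_rng(0)
# stand-in for real VGG16 features
features = rng.normal(size=(batch, height, width, channels))
hyperplanes = rng.uniform(-1.0, 1.0, size=(channels, n_hashes))

flat = features.reshape(-1, height * width, channels)   # flatten the spatial grid
bits = (flat @ hyperplanes > 0).astype(np.int32)         # one bit per hyperplane

print(flat.shape, bits.shape)
# (2, 49, 512) (2, 49, 500)
print("bits per image:", bits.shape[1] * bits.shape[2])
# bits per image: 24500
```

Each image therefore goes from 7 x 7 x 512 = 25,088 floating-point values to 24,500 single bits, which is where the reduction in storage comes from.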


In [None]:
# VGG16
# preprocess_input expects float32 pixel values in the range 0-255
# the spatial size is fixed so the hashing layer can read static shapes
i = tf.keras.layers.Input([224, 224, 3], dtype=tf.float32)
x = tf.keras.applications.vgg16.preprocess_input(i)
x = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))(x)
# hash each spatial position of the VGG16 feature map into 500 bits
x = HyperPlaneHashingLayer(500)(x)
model = tf.keras.Model(inputs=[i], outputs=[x])
model.summary()

## Encode the Images

We now encode the images in the dataset and store the embeddings on disk for retrieval tasks later in the pipeline.
Since the image dataset is large, generating embeddings for all the images may take a long time. Here we show how to generate embeddings for a subset of the images.


In [None]:
# image subset
df_subset = df[0:128]

In [None]:
df_subset

In [None]:
# init the image generator
image_datagen = ImageDataGenerator()
image_generator = image_datagen.flow_from_dataframe(dataframe=df_subset, 
                                                    x_col="full_path_file_name",
                                                    y_col="class", 
                                                    color_mode='rgb',
                                                    target_size=(224,224),
                                                    shuffle=False,
                                                    validate_filenames=True,
                                                    batch_size=128)

# run a forward pass through the network to generate the encodings
image_encodings = model.predict(image_generator)
print(image_encodings.shape)
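
To cover the whole dataset rather than the first 128 rows, one option is to walk the dataframe in fixed-size chunks and encode one chunk at a time. The helper below is a hypothetical sketch of the index bookkeeping only (the name `chunk_ranges` is not part of the original notebook); it uses no TensorFlow, so it can be tried on its own.

```python
def chunk_ranges(n_rows, chunk_size):
    """Yield (start, stop) index pairs that cover range(n_rows) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)

# Example: 300 rows split into chunks of at most 128 rows
print(list(chunk_ranges(300, 128)))
# [(0, 128), (128, 256), (256, 300)]
```

Each pair can then be used as `df.iloc[start:stop]` in place of `df_subset`, with the resulting encodings collected or saved chunk by chunk.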

In [None]:
# print out a single encoding/embedding
image_encodings[0,:,:]

In [None]:
# Visualize a single encoding/embedding
plt.imshow(image_encodings[0,:,:])
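
The encodings still need to be written to disk so that the query stage can load them. The cells above do not show this step, so the following is only one possible sketch: it assumes the encodings are a 0/1 integer array shaped like the output of `model.predict`, packs the bits with NumPy to shrink the file, and uses a temporary directory and a made-up file name. A random array stands in for real encodings so the example runs on its own.

```python
import os
import tempfile

import numpy as np

# stand-in for the (num_images, 49, 500) int32 array returned by model.predict
rng = np.random.default_rng(0)
image_encodings = rng.integers(0, 2, size=(4, 49, 500), dtype=np.int32)

# pack 8 bits into each byte along the last axis
packed = np.packbits(image_encodings.astype(np.uint8), axis=-1)

out_dir = tempfile.mkdtemp()                              # hypothetical output location
out_path = os.path.join(out_dir, "image_encodings.npy")   # hypothetical file name
np.save(out_path, packed)

# load and unpack; `count` trims the padding bits added by packbits
restored = np.unpackbits(np.load(out_path), axis=-1,
                         count=image_encodings.shape[-1])

print(packed.shape, restored.shape)
# (4, 49, 63) (4, 49, 500)
print(np.array_equal(restored, image_encodings))
# True
```

Saving the unpacked array directly with `np.save` also works; packing is just a way to store one bit per hash value instead of a full integer.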