<a href="https://colab.research.google.com/github/cmagliano/Proj/blob/main/ImageSearchOnE_CommerceSites.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Perform Image-Driven Reverse Image Search on E-Commerce Sites with ImageBind and Qdrant

Author: Cláudia Magliano

Date: 14/07/2024



**Install the dependencies first to get started with the reverse product image search.**

In [1]:
!pip install opendatasets gradio qdrant-client transformers sentence_transformers sentencepiece tqdm

Collecting opendatasets
  Downloading opendatasets-0.1.22-py3-none-any.whl (15 kB)
Collecting gradio
  Downloading gradio-4.38.1-py3-none-any.whl (12.4 MB)
Collecting qdrant-client
  Downloading qdrant_client-1.10.1-py3-none-any.whl (254 kB)
Collecting sentence_transformers
  Downloading sentence_transformers-3.0.1-py3-none-any.whl (227 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl (15 kB)
Collecting altair<6.0,>=5.0 (from gradio)
  Downloading altair-5.3.0-py3-none-any.whl (857 kB)

**Loading the Dataset**

Use the opendatasets library to download the Kaggle dataset. It prompts for your Kaggle username and API key, which you can obtain from the Settings page on Kaggle: click on “Access API Keys,” and a kaggle.json file containing your username and API key will be downloaded.
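If you want to check which username and key the downloaded file holds before pasting them into the prompt, the file can be read with the standard library. This is a minimal sketch: the helper name `load_kaggle_credentials` and the file location are assumptions for illustration, not part of opendatasets.

```python
import json
from pathlib import Path


def load_kaggle_credentials(path):
    """Return (username, key) read from a kaggle.json file."""
    with open(path, encoding="utf-8") as f:
        creds = json.load(f)
    # kaggle.json stores the two values under the "username" and "key" fields
    return creds["username"], creds["key"]


# Hypothetical location: adjust to wherever the downloaded file was saved
creds_file = Path("kaggle.json")
if creds_file.exists():
    username, _key = load_kaggle_credentials(creds_file)
    print(f"Found credentials for Kaggle user: {username}")
else:
    print("kaggle.json not found; enter the credentials at the prompt instead.")
```

Avoid printing or committing the key itself, since it grants access to your Kaggle account.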

In [2]:
import opendatasets as od
od.download("https://www.kaggle.com/datasets/vikashrajluhaniwal/fashion-images")
#user: cvmagliano

Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: cvmagliano@gmail.com
Your Kaggle Key: ··········
Dataset URL: https://www.kaggle.com/datasets/vikashrajluhaniwal/fashion-images
Downloading fashion-images.zip to ./fashion-images


100%|██████████| 335M/335M [00:11<00:00, 30.3MB/s]





**Store the image paths in lists so that we can easily access the images.**

In [3]:
import random
import gradio as gr
from PIL import Image
from qdrant_client import QdrantClient
from qdrant_client.http import models
import tempfile
import os
from tqdm import tqdm

def get_image_paths(directory):
  # Initialize an empty list to store the image paths
  image_paths = []
  # Iterate through all files and directories within the given directory
  for (root, dirs, files) in os.walk(directory):
    for file in files:
      # Check if the file has an image extension (e.g., .jpg, .png, .jpeg, etc.)
      if file.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.bmp')):
        # Construct the full path to the image file
        image_path = os.path.join(root, file)
        # Append the image path to the list
        image_paths.append(image_path)
  return image_paths

# Directory paths
women_directory = './fashion-images/data/Footwear/Women/Images/images_with_product_ids/'
men_directory = './fashion-images/data/Footwear/Men/Images/images_with_product_ids/'
girls_directory = './fashion-images/data/Apparel/Girls/Images/images_with_product_ids/'
boys_directory = './fashion-images/data/Apparel/Boys/Images/images_with_product_ids/'


# Get image paths for different categories
image_paths_Women = get_image_paths(women_directory)
image_paths_Men = get_image_paths(men_directory)
image_paths_Girls = get_image_paths(girls_directory)
image_paths_Boys = get_image_paths(boys_directory)

# Collect the per-category lists (this yields a list of four lists, not one flat list)
all_image_paths = []
all_image_paths.append(image_paths_Boys)
all_image_paths.append(image_paths_Girls)
all_image_paths.append(image_paths_Men)
all_image_paths.append(image_paths_Women)
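
The paths collected above are relative to the current working directory, and a later cell calls `os.chdir`, after which relative paths may no longer resolve. The following is a minimal sketch of that effect in a throwaway directory, with two hypothetical helpers (`to_absolute_paths` and `missing_files`, neither of which is part of the notebook) that resolve paths up front and report the ones that cannot be found.

```python
import os
import tempfile


def to_absolute_paths(paths, base_dir=None):
    """Resolve each path against base_dir (default: the current working directory)."""
    base_dir = os.path.abspath(base_dir or os.getcwd())
    return [os.path.normpath(os.path.join(base_dir, p)) for p in paths]


def missing_files(paths):
    """Return the subset of paths that do not point at an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Demonstration in a temporary directory tree (not the real dataset)
with tempfile.TemporaryDirectory() as workdir:
    os.makedirs(os.path.join(workdir, "images"))
    os.makedirs(os.path.join(workdir, "subdir"))
    open(os.path.join(workdir, "images", "1.jpg"), "w").close()

    original_cwd = os.getcwd()
    try:
        os.chdir(workdir)
        relative = ["./images/1.jpg"]
        absolute = to_absolute_paths(relative)  # resolved while cwd == workdir

        os.chdir("subdir")                      # mimics a later os.chdir call
        broken = missing_files(relative)        # the relative path no longer resolves
        intact = missing_files(absolute)        # the absolute path still does
    finally:
        os.chdir(original_cwd)

print("missing (relative):", broken)
print("missing (absolute):", intact)
```

Running a check like `missing_files` right before the images are loaded gives an early and readable signal if the working directory has changed in between.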



**Initializing the Qdrant Vector DB**

Initialize the Qdrant client with in-memory storage. The collection is named “imagebind_data”, holds a named vector “image” of size 1024, and uses cosine distance.

In [6]:
# Initialize an in-memory Qdrant client and (re)create the collection
client = QdrantClient(":memory:")
client.recreate_collection(
  collection_name="imagebind_data",
  vectors_config={"image": models.VectorParams(size=1024, distance=models.Distance.COSINE)},
)



True
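
To make the choice of cosine distance concrete, here is a minimal pure-Python sketch of cosine similarity. It only illustrates the metric and is not how Qdrant computes it internally; the function name and the toy two-dimensional vectors are assumptions for demonstration (the real embeddings have 1024 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors: same direction, orthogonal, and opposite direction
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
```

The measure depends only on direction, not on vector length, so two images whose embeddings point the same way score as similar even if the magnitudes differ.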

**Image Embeddings with ImageBind**

ImageBind is an innovative model developed by Meta AI’s FAIR Lab. This model is designed to learn a joint embedding across six different modalities: images, text, audio, depth, thermal, and IMU data. One of the key features of ImageBind is its ability to learn this joint embedding without requiring all combinations of paired data. It has been discovered that only image-paired data is necessary to bind the modalities together effectively. This unique capability allows ImageBind to leverage recent large-scale vision-language models and extend their zero-shot capabilities to new modalities simply by utilizing their natural pairing with images.

ImageBind will be used for creating embeddings, but first we need to follow a few steps to install it.

In [9]:
!git clone https://github.com/facebookresearch/ImageBind.git

Cloning into 'ImageBind'...
remote: Enumerating objects: 146, done.
remote: Counting objects: 100% (99/99), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 146 (delta 61), reused 48 (delta 39), pack-reused 47
Receiving objects: 100% (146/146), 2.65 MiB | 26.55 MiB/s, done.
Resolving deltas: 100% (66/66), done.


In [11]:
os.chdir('./ImageBind')

In [12]:
!pip install -r requirements.txt

Collecting pytorchvideo@ git+https://github.com/facebookresearch/pytorchvideo.git@28fe037d212663c6a24f373b94cc5d478c8c1a1d (from -r requirements.txt (line 4))
  Cloning https://github.com/facebookresearch/pytorchvideo.git (to revision 28fe037d212663c6a24f373b94cc5d478c8c1a1d) to /tmp/pip-install-zkypez5r/pytorchvideo_f7220c7e94124a2d993c5105c20e1fd0
  Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/pytorchvideo.git /tmp/pip-install-zkypez5r/pytorchvideo_f7220c7e94124a2d993c5105c20e1fd0
  Running command git rev-parse -q --verify 'sha^28fe037d212663c6a24f373b94cc5d478c8c1a1d'
  Running command git fetch -q https://github.com/facebookresearch/pytorchvideo.git 28fe037d212663c6a24f373b94cc5d478c8c1a1d
  Running command git checkout -q 28fe037d212663c6a24f373b94cc5d478c8c1a1d
  Resolved https://github.com/facebookresearch/pytorchvideo.git to commit 28fe037d212663c6a24f373b94cc5d478c8c1a1d
  Preparing metadata (setup.py) ... done
Collectin

**Load the model.**

In [13]:
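import sys
import torch
# The working directory is already the cloned ImageBind repo (see os.chdir above)
sys.path.append(os.getcwd())
import imagebind
from imagebind.models import imagebind_model

# Use the GPU when available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)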



Downloading imagebind weights to .checkpoints/imagebind_huge.pth ...


  0%|          | 0.00/4.47G [00:00<?, ?B/s]

ImageBindModel(
  (modality_preprocessors): ModuleDict(
    (vision): RGBDTPreprocessor(
      (cls_token): tensor((1, 1, 1280), requires_grad=True)
      
      (rgbt_stem): PatchEmbedGeneric(
        (proj): Sequential(
          (0): PadIm2Video()
          (1): Conv3d(3, 1280, kernel_size=(2, 14, 14), stride=(2, 14, 14), bias=False)
        )
      )
      (pos_embedding_helper): SpatioTemporalPosEmbeddingHelper(
        (pos_embed): tensor((1, 257, 1280), requires_grad=True)
        
      )
    )
    (text): TextPreprocessor(
      (pos_embed): tensor((1, 77, 1024), requires_grad=True)
      (mask): tensor((77, 77), requires_grad=False)
      
      (token_embedding): Embedding(49408, 1024)
    )
    (audio): AudioPreprocessor(
      (cls_token): tensor((1, 1, 768), requires_grad=True)
      
      (rgbt_stem): PatchEmbedGeneric(
        (proj): Conv2d(1, 768, kernel_size=(16, 16), stride=(10, 10), bias=False)
        (norm_layer): LayerNorm((768,), eps=1e-05, elementwise_affine=

**With the model initialized, we will now create embeddings.**

In [14]:
from imagebind.models.imagebind_model import ModalityType
from imagebind import data
import torch
embeddings_list = []

for image_paths in [image_paths_Boys, image_paths_Girls, image_paths_Men, image_paths_Women]:
  # The paths were collected before os.chdir('./ImageBind'); resolve them against the parent
  # directory (assumes the cwd is still the cloned repo) to avoid a FileNotFoundError.
  image_paths = [os.path.abspath(os.path.join(os.pardir, path)) for path in image_paths]
  inputs = {ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device)}
  with torch.no_grad():
    embeddings = model(inputs)
  embeddings_list.append(embeddings)
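
The loop above passes every image of a category to the model in a single call, which may need more memory than is available for large categories. A minimal sketch of splitting a path list into smaller chunks follows; the helper name `batched`, the placeholder file names, and the batch size are assumptions for illustration, and each chunk could then be handed to `data.load_and_transform_vision_data` in place of the full list.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each holding at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder paths standing in for one category's image list
example_paths = [f"{i}.jpg" for i in range(5)]
batches = list(batched(example_paths, 2))
print(batches)  # [['0.jpg', '1.jpg'], ['2.jpg', '3.jpg'], ['4.jpg']]
```

A suitable batch size depends on the available GPU memory, so it is worth starting small and increasing it while the run stays stable.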