# Azure Computer Vision 4 (Florence)

## Fashion Visual Search - Gradio App

![Image](florence.jpg)

![Image](fashionheader.png)

<br>
<i>Note: this image was generated with Azure OpenAI DALL-E 2</i>

### Visual search with vector embeddings
**Vector embeddings** represent content such as text or images as vectors of real numbers in a high-dimensional space. These embeddings are usually learned from large amounts of textual and visual data by machine learning models such as neural networks. Taken together, the dimensions of a vector encode features of the content, such as its semantic meaning or the context in which it commonly appears. Because the content is represented as vectors, we can apply mathematical operations to compare two items for similarity, or use the vectors as inputs to other machine learning models.
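
As a minimal sketch of what "comparing similarity" can mean, the snippet below computes the cosine similarity of two vectors in plain Python. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions. They are not the embeddings returned by Azure Computer Vision, and this is not necessarily how the notebook's own `get_cosine_similarity` helper is written.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (real embeddings have many more dimensions)
vec_a = [1.0, 0.0, 0.0]
vec_b = [2.0, 0.0, 0.0]  # same direction as vec_a, different length
vec_c = [0.0, 1.0, 0.0]  # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
```

Vectors that point in the same direction score 1 regardless of their length, and orthogonal vectors score 0, which is why cosine similarity is a common choice for ranking embeddings.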

![Image](embeddings.jpg)


### Business applications
- **Digital asset management**: Image retrieval can be used to manage large collections of digital images, such as in museums, archives, or online galleries. Users can search for images based on visual features and retrieve the images that match their criteria.
- **Medical image retrieval**: Image retrieval can be used in medical imaging to search for images based on their diagnostic features or disease patterns. This can help doctors or researchers to identify similar cases or track disease progression.
- **Security and surveillance**: Image retrieval can be used in security and surveillance systems to search for images based on specific features or patterns, such as in people and object tracking or threat detection.
- **Forensic image retrieval**: Image retrieval can be used in forensic investigations to search for images based on their visual content or metadata, such as in cases of cyber-crime.
- **E-commerce**: Image retrieval can be used in online shopping applications to search for similar products based on their features or descriptions, or to provide recommendations based on previous purchases.
- **Fashion and design**: Image retrieval can be used in fashion and design to search for images based on their visual features, such as color, pattern, or texture. This can help designers or retailers to identify similar products or trends.

### Visual Search Process
![Image](fashionprocess.png)

### Image Retrieval with Azure Computer Vision Documentation
- https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-image-retrieval
- https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/how-to/image-retrieval

### Demo images
Demo images are a sample of this collection of images: https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/data
<br><br>
> Serge Retkowsky | Microsoft | https://github.com/retkowsky | 3rd of May, 2023

## 1. <a name="chapt1"></a> Libraries

In [1]:
%load_ext nb_black

ModuleNotFoundError: No module named 'nb_black'

In [2]:
import datetime
import glob
import gradio as gr
import json
import os
import pandas as pd
import requests
import sys
import time

from PIL import Image

In [3]:
# Getting Azure CV endpoint and key from the azure.env file
from dotenv import load_dotenv

load_dotenv("azure.env")
key = os.getenv("azure_cv_key")
endpoint = os.getenv("azure_cv_endpoint")

### Importing our specific functions

In [4]:
pyfile = "azure.py"

print("Python file:", pyfile, "Date:", time.ctime(os.path.getmtime(pyfile)))

Python file: azure.py Date: Thu Jul  6 18:00:33 2023


In [5]:
from azure import (
    get_cosine_similarity,
    image_embedding,
    text_embedding,
    remove_background,
)

## 2. <a name="chapt2"></a> Information

In [6]:
sys.version

'3.11.1 (main, Mar 24 2023, 00:55:06) [GCC 11.3.0]'

In [7]:
print("Today is", datetime.datetime.today())

Today is 2023-07-06 19:28:19.842189


## 3. <a name="chapt3"></a> Our product images

In [8]:
IMAGES_DIR = "fashion"

In [9]:
image_files = glob.glob(IMAGES_DIR + "/*")

print("Directory of images:", IMAGES_DIR)
print("Total number of catalog images =", "{:,}".format(len(image_files)))

Directory of images: fashion
Total number of catalog images = 1,473


## 4. <a name="chapt4"></a> Loading vector embeddings

In [10]:
JSON_DIR = "json"

glob.glob(JSON_DIR + "/*.json")

['json/img_embed_06Jul2023_191005.json']

In [11]:
print("Importing vectors embeddings...")

# Keep only regular files with a .json extension
jsonfiles = [
    entry.name
    for entry in os.scandir(JSON_DIR)
    if entry.is_file() and entry.name.endswith(".json")
]

# Get the most recent file
modification_times = [
    (f, os.path.getmtime(os.path.join(JSON_DIR, f))) for f in jsonfiles
]
modification_times.sort(key=lambda x: x[1], reverse=True)
most_recent_file = JSON_DIR + "/" + modification_times[0][0]

# Loading the most recent file
print(f"Loading the most recent file of the vector embeddings: {most_recent_file}")

with open(most_recent_file) as f:
    list_emb = json.load(f)

print(f"\nDone: number of imported vector embeddings = {len(list_emb):,}")

Importing vectors embeddings...
Loading the most recent file of the vector embeddings: json/img_embed_06Jul2023_191005.json

Done: number of imported vector embeddings = 1,473
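
The "most recent file" selection above can also be expressed with `pathlib` and `max`. The sketch below is an alternative under stated assumptions, not the notebook's own code: the helper name `most_recent_json` is hypothetical, and the demo runs on throwaway files in a temporary directory rather than on the real `json` folder.

```python
import os
import tempfile
import time
from pathlib import Path


def most_recent_json(directory):
    """Return the most recently modified *.json file in directory, or None."""
    candidates = [p for p in Path(directory).glob("*.json") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Demo on throwaway files with explicitly set modification times
with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp) / "old.json"
    new_file = Path(tmp) / "new.json"
    old_file.write_text("[]")
    new_file.write_text("[]")
    now = time.time()
    os.utime(old_file, (now - 100, now - 100))
    os.utime(new_file, (now, now))
    newest_name = most_recent_json(tmp).name

print("Most recent file:", newest_name)
```

Returning `None` for an empty directory lets the caller decide how to react, instead of failing with an `IndexError` when no embeddings file has been generated yet.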


## 5. <a name="chapt5"></a> Gradio webapp

### Generic gradio elements

In [12]:
footnote = "Powered by Azure Computer Vision 4 (Florence)"

### Visual Search using an image

In [13]:
def visual_search_from_image_app(image, list_emb=list_emb, topn=3):
    """
    Function for visual search using an image for the gradio app
    """
    # Reference image embedding
    nobackground_image = remove_background(image)
    image_emb = image_embedding(nobackground_image)
    # Comparing with all the images embeddings
    results_list = [
        get_cosine_similarity(image_emb, emb_image) for emb_image in list_emb
    ]
    # Topn results
    df = pd.DataFrame(
        list(zip(image_files, results_list)), columns=["image_file", "similarity"]
    )
    df = df.sort_values("similarity", ascending=False)
    topn_list = df.nlargest(topn, "similarity")["image_file"].tolist()

    return topn_list
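
The ranking step in the function above pairs each catalog file with its similarity score by position and keeps the best matches. A minimal standalone sketch of that step, using only the standard library, could look like the following. The helper name `top_n_matches`, the file names, and the scores are made up for illustration. Like the notebook's code, it assumes the list of files and the list of scores are in the same order, which the notebook itself does not verify.

```python
import heapq


def top_n_matches(image_files, similarities, topn=3):
    """Return the topn file names with the highest similarity scores."""
    if len(image_files) != len(similarities):
        raise ValueError("image_files and similarities must have the same length")
    # Rank the (score, file) pairs on the score only
    best = heapq.nlargest(
        topn, zip(similarities, image_files), key=lambda pair: pair[0]
    )
    return [name for _, name in best]


# Hypothetical catalog files and similarity scores
demo_files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
demo_scores = [0.12, 0.87, 0.45, 0.91]

demo_top2 = top_n_matches(demo_files, demo_scores, topn=2)
print(demo_top2)
```

The length check is a cheap guard against an embeddings file that was generated from a different version of the image folder.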

In [15]:
header_image = "Visual Search with Azure Computer Vision (Florence) using an image"
images_examples = [
    "test/test1.jpg",
    "test/test2.jpg",
    "test/test3.jpg",
    "test/test4.jpg",
]

topn_list_images = [""] * 3
refimage = gr.components.Image(label="Your image:", type="filepath", shape=((200, 200)))
list_img_results_prompt = [
    gr.components.Image(
        label=f"Top {i+1}: {topn_list_images[i]}", type="filepath", shape=((200, 200))
    )
    for i in range(3)
]

webapp_image = gr.Interface(
    visual_search_from_image_app,
    refimage,
    list_img_results_prompt,
    title=header_image,
    examples=images_examples,
    theme="gstaff/sketch",
    article=footnote,
)

### We can run this app

In [16]:
webapp_image.launch(share=True)

Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://61cf1121b92457e9fb.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Removing background from the image using Azure Computer Vision 4.0...
Done


## 6. <a name="chapt6"></a> Visual search using some text

In [17]:
def visual_search_from_prompt_app(query, list_emb=list_emb, topn=3):
    """
    Function for visual search using a prompt for the gradio app
    """
    # Text Embedding of the prompt
    text_emb = text_embedding(query)
    # Comparing the Text embedding with all the images embeddings
    results_list = [
        get_cosine_similarity(text_emb, emb_image) for emb_image in list_emb
    ]
    # Top-n results
    df = pd.DataFrame(
        list(zip(image_files, results_list)), columns=["image_file", "similarity"]
    )
    df = df.sort_values("similarity", ascending=False)
    topn_list = df.nlargest(topn, "similarity")["image_file"].tolist()

    return topn_list

In [18]:
header_prompt = "Visual Search with Azure Computer Vision (Florence) using a prompt"

prompt_examples = [
    "a red dress",
    "a red dress with long sleeves",
    "blue shirt",
    "shirt with Italian cities name",
    "Ray-Ban",
    "NYC cap",
]

topn_list_prompt = [""] * 3
prompt = gr.components.Textbox(
    lines=2,
    label="What do you want to search?",
    placeholder="Enter your prompt for the visual search...",
)

list_img_results_image = [
    gr.components.Image(label=f"Top {i+1}: {topn_list_prompt[i]}", type="filepath")
    for i in range(3)
]

webapp_prompt = gr.Interface(
    visual_search_from_prompt_app,
    prompt,
    list_img_results_image,
    title=header_prompt,
    examples=prompt_examples,
    theme="gstaff/sketch",
    article=footnote,
)

### We can run this app

In [19]:
webapp_prompt.launch(share=True)

Running on local URL:  http://127.0.0.1:7861
Running on public URL: https://d88128f48a488ee8d7.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## Unified webapp
### We can combine the 2 apps into a single one

In [19]:
# Combining the two gradio apps into one

visual_search_webapp = gr.TabbedInterface(
    [webapp_prompt, webapp_image],
    ["1) Visual search from a prompt", "2) Visual search from an image"],
    css="body {background-color: black}",
    theme="freddyaboulton/dracula_revamped",  # Themes: https://huggingface.co/spaces/gradio/theme-gallery
)

visual_search_webapp.launch(share=True)



Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://c59fd6302c8c0d36e6.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces





![Image](webapp1.jpg)

![Image](webapp2.jpg)
