# Generating Image Embeddings

The multimodal embeddings APIs in Azure AI Vision let you search for images using text. They map both images and text to vectors in the same multi-dimensional space, so items that are close in meaning end up close together. This means you can find images without relying on tags or metadata, and the results are often more relevant.

To better understand how embeddings work, you can watch my LinkedIn Learning video on vector embeddings:
https://www.linkedin.com/learning/building-rag-solutions-with-azure-ai-foundry-formerly-azure-ai-studio/vector-embeddings-how-words-connect-to-each-other

Further Reading:

https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/concept-image-retrieval

https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/image-retrieval?tabs=python

## Load Azure Configuration

In [1]:
import os

azure_computer_vision_endpoint = os.environ["AZURE_COMPUTER_VISION_ENDPOINT"]
azure_computer_vision_key = os.environ["AZURE_COMPUTER_VISION_KEY"]
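
If either variable is missing, `os.environ[...]` raises a bare `KeyError`. The following is a minimal sketch of a more forgiving loader. The helper name `load_vision_config` is hypothetical and not part of any Azure SDK. It also strips a trailing slash from the endpoint, on the assumption that the endpoint may have been copied with one, so that the URL built in the next section does not contain `//`.

```python
import os

def load_vision_config(env=None):
    """Hypothetical helper: read the endpoint and key, failing with a clear message."""
    env = os.environ if env is None else env
    names = ("AZURE_COMPUTER_VISION_ENDPOINT", "AZURE_COMPUTER_VISION_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Drop any trailing slash so f"{endpoint}/computervision/..." has a single slash.
    endpoint = env[names[0]].rstrip("/")
    return endpoint, env[names[1]]
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.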

## Call the Vectorize Image API

In [2]:
import requests

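# Function to vectorize an image
def vectorize_image(image_source, is_url=True):
    """Return the vectorizeImage JSON response for an image URL (is_url=True)
    or a local file path (is_url=False). Returns None if the request fails."""
    # API URL
    url = f"{azure_computer_vision_endpoint}/computervision/retrieval:vectorizeImage?api-version=2024-02-01&model-version=2023-04-15"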

    headers = {
        "Ocp-Apim-Subscription-Key": azure_computer_vision_key
    }

    try:
        if is_url:
            # Set headers for URL
            headers["Content-Type"] = "application/json"
            data = {
                "url": image_source
            }
            # Make the request
            response = requests.post(url, headers=headers, json=data, timeout=30)
        else:
            # Read the image file
            with open(image_source, "rb") as image_file:
                image_data = image_file.read()

            # Set headers for image file
            headers["Content-Type"] = "application/octet-stream"
            # Make the request
            response = requests.post(url, headers=headers, data=image_data, timeout=30)

        response.raise_for_status()  # Raise an exception for HTTP errors

        # Return the response
        return response.json()

    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return None
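
Because `vectorize_image` returns `None` when a request fails, indexing the result with `["vector"]` would then raise a confusing `TypeError`. The sketch below shows one way to guard against that. The helper `extract_vector` is hypothetical, and it assumes only what this notebook already relies on: that a successful response is a dictionary with a `"vector"` list.

```python
import numpy as np

def extract_vector(result):
    """Hypothetical helper: turn an API result into a NumPy array or raise."""
    if result is None:
        raise ValueError("Vectorization request failed; see the error printed earlier.")
    if "vector" not in result:
        raise ValueError("Response has no 'vector' field.")
    # float32 is enough precision for similarity comparisons and halves memory use.
    return np.asarray(result["vector"], dtype=np.float32)
```

Under these assumptions, `extract_vector(vectorize_image(path, False))` either yields an array ready for NumPy operations or fails with a message that points to the real cause.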

In [4]:
bus1_result = vectorize_image("../Data/customclassificationchallenge/bus/bus1.jpeg", False)
print("Bus 1: ", bus1_result["vector"])

bus2_result = vectorize_image("../Data/customclassificationchallenge/bus/bus2.jpeg", False)
print("Bus 2: ", bus2_result["vector"])

japcherry1_result = vectorize_image("../Data/customclassification/japanese_cherry/japanese_cherry_1.jpg", False)
print("Japanese Cherry 1: ", japcherry1_result["vector"])

japcherry2_result = vectorize_image("../Data/customclassification/japanese_cherry/japanese_cherry_2.jpg", False)
print("Japanese Cherry 2: ", japcherry2_result["vector"])


Bus 1:  [0.73583984, -0.42285156, 0.9838867, -0.19787598, 0.1505127, -1.3271484, 0.4675293, 1.1806641, -1.4570312, 0.025466919, 1.2226562, -0.36572266, 1.9960938, -1.4658203, -0.7519531, -0.3803711, 1.2353516, -3.5585938, -0.24023438, -0.6586914, 1.4384766, -0.53027344, 0.6635742, 0.78808594, 0.27124023, 1.1201172, -53.34375, 0.79833984, -0.45874023, 1.7880859, 2.3808594, 1.4257812, 0.06506348, 2.0351562, -2.8554688, -2.1738281, -0.04119873, -0.47070312, 0.34228516, -1.5341797, -0.011695862, -1.2431641, -1.9960938, -1.5390625, -1.7548828, -1.5976562, -0.2763672, 4.3085938, -0.3334961, -0.34326172, -2.0507812, 0.47851562, 1.9414062, 0.91845703, 1.1484375, 2.1425781, 1.3857422, 1.1269531, 2.2753906, -0.13171387, -0.6328125, 0.09063721, 0.71484375, -1.4765625, -0.18518066, -1.3662109, 1.7666016, -1.1738281, -0.16748047, 0.99853516, -0.023040771, 0.24536133, -1.203125, -0.81396484, -1.4697266, 1.5273438, -2.4375, -0.8491211, -0.42358398, 1.8730469, 0.96533203, -0.75634766, -2.2539062, 3.33

## Calculate Vector Similarity

In [5]:
import numpy as np

def cosine_similarity(vector1, vector2):
    return np.dot(vector1, vector2) / (np.linalg.norm(vector1) * np.linalg.norm(vector2))
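
To build intuition for the function above, here is a small worked example on hand-made 2-dimensional vectors (toy values, not real embeddings). Cosine similarity depends only on the angle between two vectors, not on their lengths.

```python
import numpy as np

def cosine_similarity(vector1, vector2):
    return np.dot(vector1, vector2) / (np.linalg.norm(vector1) * np.linalg.norm(vector2))

a = [1.0, 0.0]
same_direction = [3.0, 0.0]   # parallel to a, different length
orthogonal = [0.0, 2.0]       # at a right angle to a
opposite = [-1.0, 0.0]        # pointing the other way

print(cosine_similarity(a, same_direction))  # 1.0
print(cosine_similarity(a, orthogonal))      # 0.0
print(cosine_similarity(a, opposite))        # -1.0
```

Values near 1 mean the vectors point the same way, values near 0 mean they are unrelated, and negative values mean they point in opposing directions.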

In [6]:
print("Bus 1 vs Bus 2")
print(cosine_similarity(bus1_result["vector"], bus2_result["vector"]))

print("Bus vs Japanese Cherry")
print(cosine_similarity(bus1_result["vector"], japcherry1_result["vector"]))
print(cosine_similarity(bus2_result["vector"], japcherry1_result["vector"]))
print(cosine_similarity(bus1_result["vector"], japcherry2_result["vector"]))
print(cosine_similarity(bus2_result["vector"], japcherry2_result["vector"]))

print("Japanese Cherry 1 vs Japanese Cherry 2")
print(cosine_similarity(japcherry1_result["vector"], japcherry2_result["vector"]))


Bus 1 vs Bus 2
0.9461144137794776
Bus vs Japanese Cherry
0.597166185503186
0.5631446370506417
0.5880458941278699
0.55466963384388
Japanese Cherry 1 vs Japanese Cherry 2
0.8691557113858598
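
The pairwise comparisons above can also be computed in a single step. The sketch below normalizes each vector to unit length and multiplies the matrix by its transpose, which gives every pairwise cosine similarity at once. The function name `pairwise_cosine` and the toy vectors are illustrative assumptions, not part of the Azure API.

```python
import numpy as np

def pairwise_cosine(vectors):
    """Return an (n, n) matrix whose [i, j] entry is the cosine similarity of rows i and j."""
    matrix = np.asarray(vectors, dtype=np.float64)
    # Scale each row to unit length; the dot product of unit vectors is their cosine.
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return unit @ unit.T

# Toy 3-dimensional stand-ins, not real embeddings.
toy = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
]
print(np.round(pairwise_cosine(toy), 3))
```

With the notebook's results, the input would be a list such as `[bus1_result["vector"], bus2_result["vector"], japcherry1_result["vector"], japcherry2_result["vector"]]`.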


## Call the Vectorize Text API

In [7]:
import requests

def vectorize_text(text):
    """Return the vectorizeText JSON response for a text string, or None if the request fails."""
    # API URL
    url = f"{azure_computer_vision_endpoint}/computervision/retrieval:vectorizeText?api-version=2024-02-01&model-version=2023-04-15"

    # Set headers
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": azure_computer_vision_key
    }

    # Set the data payload
    data = {
        "text": text
    }

    try:
        # Make the request
        response = requests.post(url, headers=headers, json=data, timeout=30)
        response.raise_for_status()  # Raise an exception for HTTP errors

        # Return the JSON response
        return response.json()

    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return None


In [9]:
# Compare each text query against all four image vectors, in the same order each time
for user_input in ["bus", "japanese cherry"]:
    text_vector = vectorize_text(user_input)
    print("Text: ", user_input)
    for image_result in (bus1_result, bus2_result, japcherry1_result, japcherry2_result):
        print(cosine_similarity(text_vector["vector"], image_result["vector"]))


Text:  bus
0.3951606958416321
0.38633370097158953
0.29928874714158005
0.2935154848052578
Text:  japanese cherry
0.2936482669444252
0.288866117294573
0.3772405648025272
0.3765293007548187
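
In the output above, text-to-image scores are lower in absolute terms than the image-to-image scores, but the matching images still score higher than the non-matching ones for each query. Searching therefore comes down to ranking images by their similarity to the query. The following is a minimal sketch of that ranking step; `rank_by_similarity` is a hypothetical helper, and the 2-dimensional vectors are toy stand-ins for real embeddings.

```python
import numpy as np

def rank_by_similarity(query_vector, image_vectors):
    """Hypothetical helper: return (name, score) pairs, best match first."""
    query = np.asarray(query_vector, dtype=np.float64)
    scores = {}
    for name, vector in image_vectors.items():
        candidate = np.asarray(vector, dtype=np.float64)
        scores[name] = float(
            np.dot(query, candidate) / (np.linalg.norm(query) * np.linalg.norm(candidate))
        )
    # Highest cosine similarity first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy 2-dimensional stand-ins for real embeddings.
images = {
    "bus1": [0.9, 0.1],
    "bus2": [0.8, 0.3],
    "japanese_cherry_1": [0.1, 0.9],
}
for name, score in rank_by_similarity([1.0, 0.0], images):
    print(name, round(score, 3))
```

With the notebook's data, the call would look like `rank_by_similarity(text_vector["vector"], {"bus1": bus1_result["vector"], "bus2": bus2_result["vector"]})`, extended with whichever images you want to search.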
