## Install FashionClip Model

https://github.com/patrickjohncyh/fashion-clip

In [1]:
%pip install fashion-clip
%pip install psycopg2-binary pgvector

Note: you may need to restart the kernel to use updated packages.


**If you have lists of texts and image paths, it is very easy to generate embeddings:**

In [2]:
from PIL import Image
import numpy as np
from fashion_clip.fashion_clip import FashionCLIP

fclip = FashionCLIP('fashion-clip')

#images = Image.open("images/01.jpg")
images = ["images/02.jpg"] 



# we create image embeddings (one row per image path)
image_embeddings = fclip.encode_images(images, batch_size=32)


# we normalize the embeddings to unit norm (so that we can use dot product instead of cosine similarity to do comparisons)
image_embeddings = image_embeddings/np.linalg.norm(image_embeddings, ord=2, axis=-1, keepdims=True)

print(image_embeddings)
print(np.shape(image_embeddings))

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
1it [00:00,  5.33it/s]

[[ 3.52903008e-02 -8.80019590e-02 -4.07405235e-02 -9.84467268e-02
   1.51528781e-02  3.51057691e-03 -2.94977333e-02  3.96311730e-02
   2.19209827e-02 -2.05044658e-03  3.22011509e-03 -6.30028127e-03
  -1.93404425e-02 -1.41926967e-02 -3.49319093e-02  2.61567086e-02
  -3.33395191e-02 -3.55669335e-02  2.80668754e-02 -3.06902062e-02
  -2.63958238e-02 -2.05375813e-02  1.38720975e-03 -1.62038265e-03
   4.76276595e-03 -2.08955258e-02  3.41099221e-03  3.74505073e-02
   6.57289699e-02  5.95800346e-03 -4.78201695e-02 -1.49145992e-02
   4.90564816e-02 -4.51627746e-02  1.83031335e-02 -4.00529653e-02
  -3.91408317e-02  1.74777117e-02 -3.48770781e-03  4.37955633e-02
  -3.66900004e-02 -1.15779918e-02  7.80193582e-02  3.20335105e-02
   1.13945943e-03  8.13993812e-03 -7.65624121e-02 -1.31227383e-02
  -1.61863696e-02 -3.66223790e-02  2.14825440e-02  1.56270750e-02
   5.85720222e-03 -1.79311191e-03 -4.54758406e-02  8.08460936e-02
  -1.74460206e-02  4.44497466e-02 -1.18747912e-02 -4.39580604e-02
  -4.57188
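
The normalization step above relies on a standard identity: once two vectors have unit L2 norm, their dot product equals their cosine similarity. The sketch below checks this on two made-up 4-dimensional vectors standing in for the 512-dimensional embeddings; the helper names `cosine_similarity` and `normalize` are illustrative and not part of FashionCLIP.

```python
import numpy as np

# Two made-up 4-dimensional vectors standing in for 512-dimensional embeddings.
a = np.array([3.0, 4.0, 0.0, 0.0])
b = np.array([4.0, 3.0, 0.0, 0.0])


def cosine_similarity(u, v):
    """Cosine similarity computed directly from its definition."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def normalize(x):
    """Scale x to unit L2 norm along the last axis, as in the cell above."""
    return x / np.linalg.norm(x, ord=2, axis=-1, keepdims=True)


a_unit, b_unit = normalize(a), normalize(b)

cos = cosine_similarity(a, b)          # computed on the raw vectors
dot = float(np.dot(a_unit, b_unit))    # plain dot product on unit vectors
print(cos, dot)
```

Both values agree up to floating-point rounding, which is why the normalized embeddings can be compared with a plain dot product.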




Insert the embedding into the database:

In [3]:
import psycopg2

## Database connection component

In [4]:
def connect_db():
    return psycopg2.connect( 
        host = 'localhost',
        database = 'postgres',
        user = 'postgres',
        password = 'password',
        port = '5432' 
    )
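
The connection settings above are hard-coded, which is fine for a local test database. As a minimal sketch of one alternative, the settings could be read from the standard libpq environment variables (`PGHOST`, `PGDATABASE`, `PGUSER`, `PGPASSWORD`, `PGPORT`), falling back to the notebook's values. The helper name `db_params_from_env` is hypothetical.

```python
import os


def db_params_from_env():
    """Build keyword arguments for psycopg2.connect from environment variables.

    Hypothetical helper; the defaults mirror the hard-coded values above.
    """
    return {
        "host": os.environ.get("PGHOST", "localhost"),
        "dbname": os.environ.get("PGDATABASE", "postgres"),
        "user": os.environ.get("PGUSER", "postgres"),
        "password": os.environ.get("PGPASSWORD", "password"),
        "port": os.environ.get("PGPORT", "5432"),
    }


params = db_params_from_env()
# Print everything except the password.
print({k: v for k, v in params.items() if k != "password"})
```

With this in place, `connect_db` could return `psycopg2.connect(**db_params_from_env())` instead of listing the values inline.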

## Table creation component
This component creates the `imagedocuments` table if it does not already exist. The table includes columns for `id`, `title`, `image_path`, and `embedding` (a 512-dimensional pgvector `VECTOR`).

In [5]:
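with connect_db() as conn:
    with conn.cursor() as cur:
        # The VECTOR type is provided by the pgvector extension
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
        cur.execute("""
            CREATE TABLE IF NOT EXISTS imagedocuments (
                id SERIAL PRIMARY KEY,
                title TEXT,
                image_path TEXT,
                embedding VECTOR(512)
            );
        """)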

## Dataset Preparation

In this tutorial, we use a single sample record: an image path, a short text description of the image as its title, and the image embedding computed above.

In [20]:
title = "Blue jean, suspenders, and a white tanktop."
image_path = "images/02.jpg"
# encode_images returns one row per image; keep the single row as a 1-D vector
image_embeddings = image_embeddings[0]

In [6]:
import psycopg2
from pgvector.psycopg2 import register_vector


In [21]:
with connect_db() as conn:
    with conn.cursor() as cur:
        register_vector(conn)
        cur.execute("INSERT INTO imagedocuments (title, image_path, embedding) VALUES (%s, %s, %s)", (title, image_path, image_embeddings))
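
Because the column is declared as `VECTOR(512)`, the database rejects a vector of any other length, and passing the whole 2-D batch instead of a single row would also fail. The sketch below shows one way to check the shape before inserting. The helper `to_row` and the constant `EMBEDDING_DIM` are hypothetical, and the embedding is a random stand-in rather than real FashionCLIP output.

```python
import numpy as np

EMBEDDING_DIM = 512  # must match VECTOR(512) in the table definition


def to_row(title, image_path, embedding):
    """Return a (title, image_path, embedding) tuple ready for the INSERT.

    Hypothetical helper: raises ValueError unless the embedding is a single
    1-D vector with EMBEDDING_DIM values.
    """
    vec = np.asarray(embedding, dtype=np.float32)
    if vec.shape != (EMBEDDING_DIM,):
        raise ValueError(f"expected shape ({EMBEDDING_DIM},), got {vec.shape}")
    return (title, image_path, vec)


# A random stand-in for the (1, 512) array returned for one image.
fake_batch = np.random.default_rng(0).normal(size=(1, EMBEDDING_DIM))
row = to_row("Blue jean, suspenders, and a white tanktop.", "images/02.jpg", fake_batch[0])
print(row[2].shape)
```

The returned tuple can be passed as the parameter tuple of the `INSERT` statement above, after `register_vector(conn)` has been called.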


## Search Image

In [9]:
#images = Image.open("images/01.jpg")
images = ["images/03.jpg"] 

# we create the embedding for the query image
query_embeddings = fclip.encode_images(images, batch_size=32)

# we normalize the embeddings to unit norm (so that we can use dot product instead of cosine similarity to do comparisons)
query_embeddings = query_embeddings/np.linalg.norm(query_embeddings, ord=2, axis=-1, keepdims=True)

print(query_embeddings[0])

1it [00:00,  6.91it/s]

[ 2.33304966e-02 -1.22211628e-01  3.95654961e-02 -3.76470610e-02
  8.54529161e-03 -5.44758653e-03 -3.34705710e-02 -8.03911034e-03
 -1.21029164e-03 -3.54502746e-03 -9.47846565e-03  2.67372504e-02
 -3.79736088e-02  2.89187636e-02 -4.18465063e-02  5.24023455e-03
 -9.31326598e-02 -2.88204011e-02  2.91170105e-02 -3.84499431e-02
 -4.80019636e-02 -5.50174452e-02 -3.87962279e-03  7.15688318e-02
  3.22999842e-02 -2.45879702e-02 -1.21555319e-02  5.06119011e-03
  1.74262710e-02  1.35777267e-02 -3.85218598e-02 -5.35769900e-03
  3.50072756e-02 -4.10348065e-02 -1.22901294e-02  4.46815491e-02
 -2.30495539e-02  8.23621973e-02  2.20025599e-04  6.32382371e-03
  3.90589493e-03 -1.89860389e-02  2.75444705e-02  3.68196331e-03
  5.84029127e-03  3.31564769e-02 -1.28580956e-02  2.63690650e-02
 -3.07912566e-02  3.54519784e-02 -1.79946534e-02 -3.90764000e-03
 -2.25560856e-03  9.76635050e-03  5.46092764e-02  7.12525398e-02
 -1.78975724e-02  6.32461458e-02  3.75246489e-03 -1.64733380e-02
 -4.15418297e-02 -2.14349




## Retrieve similar images component

This component takes the query embedding and retrieves the most similar stored images. pgvector's `<=>` operator returns the cosine distance between two vectors, so `1 - (embedding <=> query)` is their cosine similarity. The query returns up to three rows, ordered from most to least similar.

In [13]:
with connect_db() as conn:
    with conn.cursor() as cur:
        register_vector(conn)
        # Retrieve relevant documents based on cosine distance
        cur.execute("""
            SELECT title, image_path, 1 - (embedding <=> %s) AS similarity
            FROM imagedocuments
            ORDER BY similarity DESC
            LIMIT 3;
        """, (query_embeddings[0],))

        rows = cur.fetchall()
            
        print(rows)


[('Blue jean, suspenders, and a white tanktop.', 'images/02.jpg', 0.5453727841377258)]
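
To make the ranking concrete, the sketch below reproduces in NumPy what the SQL query computes: cosine distance per stored row, converted to similarity, sorted in descending order, and truncated to `k` results. The function `top_k_by_cosine` and the 2-dimensional toy vectors are illustrative assumptions; the real embeddings have 512 dimensions and the real ranking runs inside PostgreSQL.

```python
import numpy as np


def top_k_by_cosine(query, stored, k=3):
    """Return (row_index, similarity) pairs for the k most similar stored rows."""
    q = query / np.linalg.norm(query)
    s = stored / np.linalg.norm(stored, axis=-1, keepdims=True)
    cosine_distance = 1.0 - s @ q          # what `<=>` returns for each row
    similarity = 1.0 - cosine_distance     # the `similarity` column
    order = np.argsort(-similarity)[:k]    # ORDER BY similarity DESC LIMIT k
    return [(int(i), float(similarity[i])) for i in order]


# Toy data: three stored vectors and one query vector.
stored = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.0])

results = top_k_by_cosine(query, stored, k=2)
print(results)
```

Row 0 points in the same direction as the query and ranks first; row 2 sits at 45 degrees and ranks second; row 1 is orthogonal and is cut off by `k=2`.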
