# Explore CLIP

## Purpose of this notebook

> Assess how CLIP understands local attributes for the furniture domain.

The goal is to assess how well CLIP identifies specific details in images from our domain, i.e. how well it can differentiate the details of furniture.

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@4.0.0/dist/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">

<div class="alert alert-primary" role="alert">

  <h4 class="alert-heading">Example</h4>

Is "a metallic chair" the same as "a chair with metallic legs"? How about "a chair with metallic back"?
</div>
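
To probe this kind of question systematically, it helps to generate the prompt variants from one template rather than typing them by hand. The snippet below is a minimal sketch of that idea; `build_prompts` is a hypothetical helper introduced here for illustration and is not part of `src`.

```python
def build_prompts(obj, material, parts):
    """Build prompts going from the bare object to part-level attributes.

    Hypothetical helper: the article is hard-coded to "a", so it only
    suits nouns and materials that take "a".
    """
    prompts = [f"a {obj}", f"a {material} {obj}"]
    prompts += [f"a {obj} with {material} {part}" for part in parts]
    return prompts


# Global attribute vs. the same attribute attached to a single part
print(build_prompts("chair", "metallic", ["legs", "back"]))
```

Keeping the bare object and the global attribute in the same list gives a baseline to compare the part-level prompts against.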

It is crucial to have a fine-tuned CLIP model, since it affects the whole generation process.


In [4]:
%load_ext autoreload
%autoreload 2



In [5]:
# loads libs
import os
import pandas as pd
from PIL import Image
from IPython.display import HTML  # public import path; IPython.core.display is deprecated for this

from src.helpers import path_to_image_html
from src.models.clip import get_clip_df

In [8]:
# set input parameters
text_prompts = [
    'a chair',
    'a wooden chair',
    'a chair with wooden legs',
    'a chair with wooden back',
    'a chair with wooden seat',
    'a table',
    'a dog'
    ]

img_urls = ['https://www.iconmobel.de/2897-large_default/stuhl-tolix-wood-style.jpg',
             'https://3dwarehouse.sketchup.com/warehouse/v1.0/content/public/3c602043-655e-4259-b656-47e70224fa29',
             'https://media.designconnected.com/vfs/1f8f73b360dc9221777f7b1dcec8c357_1/756923d1ce62e45147d79e7d6df25f5a.jpg',
             'https://image.made-in-china.com/2f0j00bmuEdokzffcn/Emes-Chair-Replica-Design-Dining-Chair-Plastic-Chair-with-Wooden-Legs.jpg'
             ]

# infers similarities
df = get_clip_df(img_urls, text_prompts)

# presents data
df.style.background_gradient(axis=1).format_index(path_to_image_html)



| Image (row) | a chair | a wooden chair | a chair with wooden legs | a chair with wooden back | a chair with wooden seat | a table | a dog |
|---|---|---|---|---|---|---|---|
| 1 | 0.19873 | 0.064514 | 0.201904 | 0.239746 | 0.289307 | 0.00573 | 1.1e-05 |
| 2 | 0.185547 | 0.19751 | 0.188477 | 0.158691 | 0.261719 | 0.008026 | 1e-05 |
| 3 | 0.328125 | 0.075562 | 0.181274 | 0.222046 | 0.187012 | 0.006012 | 2.1e-05 |
| 4 | 0.073608 | 0.05304 | 0.437012 | 0.245239 | 0.187988 | 0.003086 | 4e-06 |
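
Each row of scores above sums to roughly 1, which is consistent with a softmax taken across the text prompts for each image. The implementation of `get_clip_df` is not shown here, so the following is only a sketch of how such scores are commonly computed with CLIP-style embeddings: cosine similarity between the image and each prompt, multiplied by a scale, then softmax. The toy 3-dimensional vectors and the `scale=100.0` default are assumptions for illustration, not values taken from this notebook.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prompt_probabilities(image_vec, text_vecs, scale=100.0):
    """Softmax over scaled cosine similarities (one image, many prompts)."""
    logits = [scale * cosine(image_vec, t) for t in text_vecs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings standing in for real CLIP outputs (assumption)
image_vec = [0.9, 0.1, 0.2]
text_vecs = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.3], [0.0, 1.0, 0.0]]

probs = prompt_probabilities(image_vec, text_vecs)
print([round(p, 4) for p in probs])
```

Because the softmax normalizes across prompts, a score only says how a prompt ranks relative to the others in the list. Adding or removing prompts changes every value in the row.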


### Conclusion

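CLIP seems to identify local details in some cases. In the last row, "a chair with wooden legs" scores highest (0.437), well above "a wooden chair" (0.053). In the other rows the scores are spread more evenly across the chair prompts, so the part-level signal is weaker. Every image scores close to zero for "a table" and "a dog".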
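
One way to make this reading more concrete is to look at the top-scoring prompt in each row and its margin over the runner-up. The sketch below copies the scores from the output above; `top_prompt` is a hypothetical helper, not part of `src`.

```python
prompts = [
    "a chair",
    "a wooden chair",
    "a chair with wooden legs",
    "a chair with wooden back",
    "a chair with wooden seat",
    "a table",
    "a dog",
]

# Scores copied from the output above, one row per image
scores = [
    [0.19873, 0.064514, 0.201904, 0.239746, 0.289307, 0.00573, 1.1e-05],
    [0.185547, 0.19751, 0.188477, 0.158691, 0.261719, 0.008026, 1e-05],
    [0.328125, 0.075562, 0.181274, 0.222046, 0.187012, 0.006012, 2.1e-05],
    [0.073608, 0.05304, 0.437012, 0.245239, 0.187988, 0.003086, 4e-06],
]


def top_prompt(row):
    """Return the best prompt and its margin over the second-best score."""
    ranked = sorted(zip(row, prompts), reverse=True)
    (best_score, best_prompt), (second_score, _) = ranked[0], ranked[1]
    return best_prompt, best_score - second_score


for i, row in enumerate(scores, start=1):
    prompt, margin = top_prompt(row)
    print(f"row {i}: {prompt!r} (margin {margin:.3f})")
```

A small margin means CLIP barely separates the top prompt from the next one, so only rows with a clear margin should be read as evidence that a local detail was picked up.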