# Find similar images with CLIP

Build visual similarity search to find images that look alike using OpenAI's CLIP model.


## Problem

You have a collection of images and need to find visually similar ones for duplicate detection, content recommendations, or visual search.

| Query | Expected matches |
|-------|------------------|
| sunset photo | Other sunset/beach images |
| product image | Similar products |
| user upload | Matching content in library |


## Solution

**What's in this recipe:**
- Create image embeddings with CLIP
- Search by image similarity
- Search by text description (cross-modal)

The recipe adds an embedding index backed by CLIP, which embeds images and text into the same vector space. With that index in place, you can find similar images or search images by text description.


### Setup


In [None]:
%pip install -qU pixeltable sentence-transformers


In [None]:
import pixeltable as pxt
from pixeltable.functions.huggingface import clip


### Load images


In [None]:
# Start from a clean directory (force=True drops any existing 'image_search_demo' and its contents)
pxt.drop_dir('image_search_demo', force=True)
pxt.create_dir('image_search_demo')


In [None]:
images = pxt.create_table('image_search_demo.images', {'image': pxt.Image})


In [None]:
# Insert sample images
images.insert([
    {'image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000036.jpg'},
    {'image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000090.jpg'},
    {'image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000106.jpg'},
    {'image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000127.jpg'},
])


### Create CLIP embedding index

Add an embedding index using CLIP for cross-modal search:


In [None]:
# Define the CLIP embedding function once so image and text queries share the same model
clip_embed = clip.using(model_id='openai/clip-vit-base-patch32')

# Add CLIP embedding index (supports both image and text queries)
images.add_embedding_index(
    column='image',
    image_embed=clip_embed,
    string_embed=clip_embed,
)


### Search by text description

Find images matching a text query:


In [None]:
# Search by text description
query = "people eating food"
sim = images.image.similarity(query)

results = (
    images
    .order_by(sim, asc=False)
    .select(images.image, score=sim)
    .limit(2)
)
results.collect()
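
The recipe lists image-to-image search, but the cell above only queries by text. Here is a minimal sketch of the image case. It assumes `similarity()` accepts a PIL image the same way it accepts a string above; the exact call signature may differ between Pixeltable releases, so check your installed version. `find_similar_images` is a hypothetical helper written for this example, not part of Pixeltable.

```python
def find_similar_images(table, query_image, k=2):
    """Return the k rows of `table` whose images are closest to `query_image`.

    `table` is assumed to be a Pixeltable table with an embedding index on
    its `image` column; `query_image` is assumed to be a PIL image.
    """
    sim = table.image.similarity(query_image)
    return (
        table
        .order_by(sim, asc=False)        # highest similarity first
        .select(table.image, score=sim)
        .limit(k)
        .collect()
    )

# Usage (assumes the `images` table and index created earlier in this recipe):
# query_image = images.select(images.image).limit(1).collect()[0]['image']
# find_similar_images(images, query_image, k=3)
```

Because the query image in the usage comment comes from the table itself, expect it to rank first among the results.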


## Explanation

**Why CLIP:**

CLIP (Contrastive Language-Image Pre-training) maps images and text into the same embedding space, so an image and a caption that describes it land close together. This enables:
- Image-to-image search (find similar photos)
- Text-to-image search (find photos matching a description)
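
To make the shared embedding space concrete, the sketch below ranks a few images against a text query by cosine similarity. The file names and 3-dimensional vectors are made up for illustration (real embeddings from `openai/clip-vit-base-patch32` have 512 dimensions), and it assumes the index ranks by cosine similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for image embeddings (hypothetical values, not real CLIP output)
image_embeddings = {
    'beach.jpg': [0.9, 0.1, 0.0],
    'dinner.jpg': [0.1, 0.9, 0.2],
    'street.jpg': [0.2, 0.3, 0.9],
}

# Toy stand-in for the embedding of the text query "people eating food"
text_query_embedding = [0.0, 1.0, 0.1]

# Rank image names by similarity to the text query, best match first
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine_similarity(image_embeddings[name], text_query_embedding),
    reverse=True,
)
print(ranked)
```

The same ranking step works when the query vector comes from an image instead of text, which is why a single index can serve both kinds of search.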

**Index parameters:**

| Parameter | Description |
|-----------|-------------|
| `image_embed` | Model for embedding images |
| `string_embed` | Model for embedding text queries |

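**Both must use the same model** for cross-modal search to work: text and image vectors produced by different models live in different spaces, so similarity scores between them are meaningless.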

**New images are indexed automatically:**

When you insert new images, Pixeltable computes their embeddings and updates the index as part of the insert; no extra code is needed.
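
A short sketch of that workflow: insert a row, then query straight away with no re-indexing step in between. `insert_and_search` is a hypothetical helper, and `image_source` is assumed to be an image URL like the ones used in the insert cell earlier in this recipe.

```python
def insert_and_search(table, image_source, query, k=2):
    """Insert one image, then immediately run a text query on the same table.

    There is deliberately no explicit embedding or re-indexing call between
    the insert and the query.
    """
    table.insert([{'image': image_source}])
    sim = table.image.similarity(query)
    return (
        table
        .order_by(sim, asc=False)        # highest similarity first
        .select(table.image, score=sim)
        .limit(k)
        .collect()
    )

# Usage (assumes the `images` table and index created earlier in this recipe;
# replace the placeholder with a real image URL):
# insert_and_search(images, '<image URL>', 'people eating food')
```

If the newly inserted image matches the query well, it should show up in the results of this first call.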


## See also

- [Semantic text search](./search-semantic-text.ipynb)
- [Vector database documentation](https://docs.pixeltable.com/datastore/vector-database)
