# Image Search with CLIP
This recipe demonstrates how to build image search with the `CLIP` model ([multi2vec-clip](https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules/multi2vec-clip)).

CLIP embeds text and images into the same vector space, which lets us search through both text and images.

This recipe focuses on searching through images only (it skips searching through text):
* [text-to-image search](#text-to-image-search) - provide text as input to search through images
* [image-to-image search](#image-to-image-search) - provide an image as input to search through images
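
To make the idea behind both search modes concrete: the query (text or image) is turned into a vector, and stored image vectors are ranked by how close they are to it. The sketch below uses tiny hand-made vectors and a hypothetical `rank_by_similarity` helper; the vectors are assumptions for illustration, not real CLIP output. Weaviate performs the equivalent ranking for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, image_vectors, limit=3):
    """Return up to `limit` (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in image_vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]

# Toy 3-dimensional "embeddings" (assumed values, for illustration only)
toy_image_vectors = {
    "cat1.jpg": [0.9, 0.1, 0.0],
    "dog1.jpg": [0.1, 0.9, 0.0],
    "meerkat1.jpg": [0.0, 0.2, 0.9],
}
toy_query_vector = [0.2, 0.8, 0.1]  # stands in for the embedding of a dog query

for name, score in rank_by_similarity(toy_query_vector, toy_image_vectors):
    print(f"{name}: {score:.3f}")
```

Because text and images share one vector space, the same ranking step serves both text-to-image and image-to-image search; only the way the query vector is produced differs.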

## Weaviate Setup

The CLIP model is only available with local Weaviate deployments with Docker or Kubernetes.

CLIP is not supported with Weaviate Cloud Services (WCS).

### Steps to deploy Weaviate locally with CLIP

1. Get a Docker Compose file.
    
    Run the following command in your terminal:

    ```
    curl -o docker-compose.yml "https://configuration.weaviate.io/v2/docker-compose/docker-compose.yml?clip_model=sentence-transformers-clip-ViT-B-32-multilingual-v1&generative_cohere=false&generative_openai=false&generative_palm=false&media_type=clip&modules=modules&ner_module=false&qna_module=false&ref2vec_centroid=false&reranker_cohere=false&reranker_transformers=false&runtime=docker-compose&spellcheck_module=false&sum_module=false&weaviate_version=v1.21.8&weaviate_volume=named-volume"
    ```

    This will download the `docker-compose.yml` file for you.

2. Run Weaviate+CLIP with Docker Compose

    > If you are new to `Docker Compose`, [here are instructions on how to install it](https://docs.docker.com/compose/install/).

    To start the services defined in the `docker-compose.yml` file, run:

    ```
    docker compose up
    ```
    
    > Note #1 - the first time you run the command, Docker will download a ~3GB image.
    
    > Note #2 - to shut down the running containers, press CTRL+C in the terminal where `docker compose up` is running.
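
Because the first start can take a while, it can help to wait for Weaviate before running the cells that follow. The sketch below polls Weaviate's readiness endpoint (`/v1/.well-known/ready`) using only the standard library. The helper name `wait_until_ready` and its default timeout and interval are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url="http://localhost:8080", timeout=60.0, interval=2.0):
    """Poll the readiness endpoint until it answers 200 or `timeout` seconds pass."""
    ready_url = base_url.rstrip("/") + "/v1/.well-known/ready"
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(ready_url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Not reachable or not ready yet; try again below
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_ready()` from a notebook cell returns `True` once the instance responds, or `False` if it is still unavailable after the timeout.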

### Dependencies

In [None]:
# This recipe uses the v3 Python client API (`weaviate.Client`), so pin the client below v4
!pip install "weaviate-client<4"

## Configuration

In [None]:
import weaviate

# Connect to Weaviate
client = weaviate.Client(
  url="http://localhost:8080",  # URL to your local Weaviate instance
)

client.is_ready() # Test the connection

### Create `Animals` collection

The collection has the following key characteristics:
1. Name: `"Animals"`
2. Vectorizer: `multi2vec-clip`
3. Image property: `"image"` - Weaviate uses the values in the `image` property to generate vectors. Note that you can give this property any name.

In [None]:
# Delete the collection if it exists.
# Note you should skip this step if you don't want to reimport the data every time.
if client.schema.exists("Animals"):
    client.schema.delete_class("Animals")

animals = {
    "classes": [
        {
            "class": "Animals",
            "vectorizer": "multi2vec-clip",
            "moduleConfig": {
                "multi2vec-clip": {
                    "textFields": ["name"],
                    "imageFields": ["image"],
                    "weights": {
                        "textFields": [0], # ignore text in the search
                        "imageFields": [1], # focus search on images
                    }
                }
            },
        }
    ]
}

client.schema.create(animals)
print("Successfully created Animals collection.")
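
The `weights` entries control how much each field contributes to an object's vector. As a minimal sketch, the hypothetical helper `build_clip_class` below produces the same dictionary shape for arbitrary field names and weights. The helper and its parameters are illustrative assumptions, not part of the Weaviate client.

```python
def build_clip_class(name, image_fields, text_fields=(), image_weight=1.0, text_weight=0.0):
    """Return a schema dict for a collection vectorized with multi2vec-clip.

    Every image field gets `image_weight`; every text field gets `text_weight`.
    """
    clip_config = {
        "imageFields": list(image_fields),
        "weights": {"imageFields": [image_weight] * len(image_fields)},
    }
    if text_fields:
        clip_config["textFields"] = list(text_fields)
        clip_config["weights"]["textFields"] = [text_weight] * len(text_fields)

    return {
        "classes": [
            {
                "class": name,
                "vectorizer": "multi2vec-clip",
                "moduleConfig": {"multi2vec-clip": clip_config},
            }
        ]
    }

# Same configuration as the Animals collection: images only, text ignored
animals_schema = build_clip_class("Animals", image_fields=["image"], text_fields=["name"])
```

Raising `text_weight` above zero would let the `name` text influence the vector as well, which this recipe deliberately avoids.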

### Import Images
For every object, we will store:
* `name` - the file name
* `path` - the path to the file, so that we can display returned images at query time
* `image` - a base64 representation of the image file; Weaviate uses it to generate a vector (see `imageFields`)

In [None]:

import base64

# Helper function to convert a file to base64 representation
def toBase64(path):
    with open(path, 'rb') as file:
        return base64.b64encode(file.read()).decode('utf-8')
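
A quick way to confirm the helper behaves as expected is a round trip: encode a file, decode the result, and compare with the original bytes. The sketch below repeats the helper under the assumed name `to_base64` so that it runs on its own, and uses a temporary file holding placeholder bytes rather than a real image.

```python
import base64
import os
import tempfile

def to_base64(path):
    """Read a file and return its contents as a base64-encoded str."""
    with open(path, "rb") as file:
        return base64.b64encode(file.read()).decode("utf-8")

# Placeholder bytes (not a real image), written to a temporary file
sample_bytes = b"\xff\xd8\xff\xe0 placeholder bytes"
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(sample_bytes)
    tmp_path = tmp.name

try:
    encoded = to_base64(tmp_path)
    # Decoding must give back exactly the bytes that were written
    round_trip_ok = base64.b64decode(encoded) == sample_bytes
finally:
    os.remove(tmp_path)

print(type(encoded).__name__, round_trip_ok)
```

The helper returns a `str` rather than `bytes` because the value is stored as a JSON property when the object is sent to Weaviate.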

In [None]:
# List of source images 
source = ["cat1.jpg", "cat2.jpg", "cat3.jpg",
          "dog1.jpg", "dog2.jpg", "dog3.jpg",
          "meerkat1.jpg", "meerkat2.jpg", "meerkat3.jpg"]

client.batch.configure(batch_size=3)  # Load images in batches of 3
with client.batch as batch:

    for name in source:
        print(f"Adding {name}")

        # Build the path to the image file
        path = "./source/image/" + name

        # Object to store in Weaviate
        properties = {
            "name": name,
            "path": path,
            "image": toBase64(path), # Weaviate will use the base64 representation of the file to generate a vector.
        }

        # Add the object to Weaviate
        batch.add_data_object(properties, "Animals")
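
The list of file names above is written out by hand. If the folder contents change often, the list could instead be built from the directory. This is a minimal sketch: the helper name `list_image_files` and the set of accepted extensions are assumptions.

```python
from pathlib import Path

# Assumed set of extensions to treat as images
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def list_image_files(folder, extensions=IMAGE_EXTENSIONS):
    """Return the sorted names of image files directly inside `folder`."""
    folder_path = Path(folder)
    if not folder_path.is_dir():
        return []  # A missing folder yields an empty list instead of an error
    return sorted(
        entry.name
        for entry in folder_path.iterdir()
        if entry.is_file() and entry.suffix.lower() in extensions
    )

found = list_image_files("./source/image/")
print(f"Found {len(found)} image file(s)")
```

The returned names can be used in place of the hand-written list in the import loop.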

### Check number of objects in the Animals collection

In [None]:
# Display the number of objects in the Animals collection
client.query.aggregate("Animals").with_meta_count().do()
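
The aggregate query returns a nested dictionary, so reading the count takes a few lookups. Below is a minimal sketch of a hypothetical helper, `extract_count`. The sample response is hand-written to illustrate the response shape and is not output captured from a real instance.

```python
def extract_count(response, collection):
    """Return the object count from an aggregate response with a meta count."""
    if "errors" in response:
        raise RuntimeError(f"Weaviate returned errors: {response['errors']}")
    return response["data"]["Aggregate"][collection][0]["meta"]["count"]

# Hand-written example of the response shape (values assumed)
sample_response = {"data": {"Aggregate": {"Animals": [{"meta": {"count": 9}}]}}}
print(extract_count(sample_response, "Animals"))
```

After importing the nine source images, the count should match the number of files in the list.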

## Query examples

In [None]:
# Helper functions to display results
import json
from IPython.display import Image, display

def json_print(data):
    print(json.dumps(data, indent=2))

def display_image(path):
    display(Image(path))

### Text to Image search

In [None]:
# Search for images matching a text concept: "dog", "dog with glasses", or "dog with a sign"
response = (
    client.query
    .get("Animals", ["name", "path"])
    .with_near_text(
        {"concepts": ["dog"]}
        # {"concepts": ["dog with glasses"]}
        # {"concepts": ["dog with a sign"]}
    )
    .with_limit(3)
    .do()
)

# Print results
result = response["data"]["Get"]["Animals"]
json_print(result)

# Display the first image
display_image(result[0]["path"])
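
The cell above indexes straight into the response, which raises an exception if the query fails or returns nothing. Below is a minimal sketch of a more defensive way to unpack results, using a hypothetical helper named `extract_results`. The sample response is hand-written to show the shape and is not real query output.

```python
def extract_results(response, collection):
    """Return the list of matched objects, or raise if the query reported errors."""
    errors = response.get("errors")
    if errors:
        messages = "; ".join(
            str(err["message"]) if isinstance(err, dict) and "message" in err else str(err)
            for err in errors
        )
        raise RuntimeError(f"Query failed: {messages}")
    # Missing keys or a null result are treated as "no matches"
    return response.get("data", {}).get("Get", {}).get(collection) or []

# Hand-written example of the response shape (values assumed)
sample_response = {
    "data": {"Get": {"Animals": [{"name": "dog1.jpg", "path": "./source/image/dog1.jpg"}]}}
}

results = extract_results(sample_response, "Animals")
if results:
    print("Best match:", results[0]["name"])
else:
    print("No matches found")
```

Checking for an empty list before reading `results[0]` avoids an `IndexError` when nothing matches.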

### Image to Image search

In [None]:
# Search for images similar to a query image: test-meerkat, test-dog, or test-cat
response = (
    client.query
    .get("Animals", ["name", "path"])
    .with_near_image(
        {"image": "./test/test-meerkat.jpg"}, # Use a file path as the input for the query
        # {"image": "./test/test-dog.jpg"},
        # {"image": "./test/test-cat.jpg"},
    )
    .with_limit(3)
    .do()
)

# Print results
result = response["data"]["Get"]["Animals"]
json_print(result)

# Display the first image
display_image(result[0]["path"])
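
To try several query images without editing the cell each time, the query can be wrapped in a function. This is a minimal sketch that uses the same client calls as the cell above. The function name `search_similar_images` and its parameters are assumptions made for illustration.

```python
def search_similar_images(client, image_path, collection="Animals", limit=3):
    """Run an image-to-image search and return the list of matching objects.

    `client` is expected to be the Weaviate client created in the Configuration
    section; `image_path` is the path to the query image file.
    """
    response = (
        client.query
        .get(collection, ["name", "path"])
        .with_near_image({"image": image_path})
        .with_limit(limit)
        .do()
    )
    return response["data"]["Get"][collection]
```

With the `client` from the Configuration section, `search_similar_images(client, "./test/test-dog.jpg")` would return up to three objects, each with `name` and `path`, and the same call can be repeated for the meerkat and cat test images.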