[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/weaviate/recipes/blob/main/weaviate-features/media-search/image_search_voyageai.ipynb)

# Image Search with VoyageAI
This recipe demonstrates how to build image search with VoyageAI's multimodal model, using Weaviate's [multi2vec-voyageai](https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules/multi2vec-voyageai) vectorizer.

Voyage multimodal embedding models support text and content-rich images — such as figures, photos, slide decks, and document screenshots — eliminating the need for complex text extraction or ETL pipelines.

## Weaviate Setup

The VoyageAI model is available on both Weaviate Cloud (WCD) and local Weaviate deployments.

You will need:
1. Weaviate version `1.25.28`, `1.26.12`, `1.27.8`, or newer
2. Weaviate Python client version `>=4.10.0`

[Documentation](https://weaviate.io/developers/weaviate/model-providers/voyageai/embeddings-multimodal)
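
The version requirement above can be checked programmatically. The following is a minimal sketch, not part of the original recipe: the helper name `supports_voyageai_multimodal` is hypothetical, and it assumes that release lines older than 1.25 do not qualify and that anything newer than the 1.27 line does. How you obtain the server's version string is left out.

```python
# Minimum patch release per minor line, taken from the requirements above.
MIN_PATCH = {(1, 25): 28, (1, 26): 12, (1, 27): 8}

def supports_voyageai_multimodal(version: str) -> bool:
    """Return True if a Weaviate version string meets the listed minimums."""
    # Drop a leading "v" and any pre-release or build suffix such as "-rc.0".
    core = version.lstrip("v").split("-")[0].split("+")[0]
    major, minor, patch = (int(part) for part in core.split(".")[:3])
    if (major, minor) in MIN_PATCH:
        return patch >= MIN_PATCH[(major, minor)]
    # Assumption: newer release lines qualify, older ones do not.
    return (major, minor) > (1, 27)

for v in ["1.25.27", "1.26.12", "1.27.8", "1.28.0", "1.24.30"]:
    print(v, supports_voyageai_multimodal(v))
```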

### Dependencies

In [None]:
!pip install -U weaviate-client

## Configuration

In [None]:
import weaviate

# Connect to Weaviate
client = weaviate.connect_to_local(headers={
    "X-VoyageAI-Api-Key": "<YOUR_API_KEY>",
})

client.is_ready() # Test the connection

### Create `Animals` collection

The collection has the following key characteristics:
1. Name: `"Animals"`
2. Vectorizer: `multi2vec-voyageai`
3. Image property: `"image"`. Weaviate uses the values of the `image` property to generate vectors. You can give this property any name.
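
The collection configuration in this recipe assigns a `weight` to each text and image field. As a rough illustration of what such weights can mean, here is a sketch that assumes the per-field vectors are combined as a weight-normalized average. That combination rule is an assumption made for illustration and is not confirmed by this recipe. The helper `combine_weighted` and the toy 3-dimensional vectors are hypothetical.

```python
def combine_weighted(vectors, weights):
    """Combine equal-length vectors as a weight-normalized average (assumed rule)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w / total * vec[i] for vec, w in zip(vectors, weights))
        for i in range(dim)
    ]

# Toy stand-ins for a text embedding and an image embedding.
text_vec = [1.0, 0.0, 0.0]
image_vec = [0.0, 1.0, 0.0]

# Equal weights give each field the same influence on the result.
print(combine_weighted([text_vec, image_vec], [0.5, 0.5]))
```

Under this assumption, raising the image weight relative to the text weight pulls the combined vector toward the image embedding.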

In [None]:
from weaviate.classes.config import Configure, Multi2VecField, Property, DataType

# Delete the collection if it already exists.
# Skip this step if you don't want to re-import the data on every run.
if client.collections.exists("Animals"):
    client.collections.delete("Animals")

animals = client.collections.create(
    name="Animals",
    vectorizer_config=Configure.Vectorizer.multi2vec_voyageai(
        model="voyage-multimodal-3",
        output_encoding=None,
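        # Each weight sets how much the field contributes to the object vector.
        # Note: the import step in this recipe does not set "text", so only "image" values are vectorized.
        text_fields=[Multi2VecField(name="text", weight=0.5)],
        image_fields=[Multi2VecField(name="image", weight=0.5)],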
    ),
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="image", data_type=DataType.BLOB)
    ]
)
print("Successfully created Animals collection.")

### Import Images
For every object, we will store:
* `name` - the file name
* `path` - the path to the file, so that we can display the returned images at query time
* `image` - a base64 representation of the image file. Weaviate uses it to generate a vector (see `image_fields` in the collection configuration).
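
The three properties listed above can be assembled ahead of the import. The sketch below is one possible way to do that, not the recipe's own code: the helper name `build_image_objects` is hypothetical, and skipping missing files rather than raising an error is a design choice made here so that a single bad path does not interrupt a batch.

```python
import base64
from pathlib import Path

def build_image_objects(names, image_dir="./source/image"):
    """Return (objects, missing): property dicts for existing files, plus skipped names."""
    objects, missing = [], []
    for name in names:
        path = Path(image_dir) / name
        if not path.is_file():
            # Record the missing file instead of failing the whole run.
            missing.append(name)
            continue
        encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
        objects.append({"name": name, "path": str(path), "image": encoded})
    return objects, missing

objects, missing = build_image_objects(["cat1.jpg", "dog1.jpg"])
print(f"{len(objects)} ready, {len(missing)} missing")
```

Each dictionary in `objects` has the same shape as the `properties` passed to `batch.add_object` in the import cell.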

In [None]:

import base64

# Helper function to convert a file to base64 representation
def toBase64(path):
    with open(path, 'rb') as file:
        return base64.b64encode(file.read()).decode('utf-8')

In [None]:
# List of source images 
source = ["cat1.jpg", "cat2.jpg", "cat3.jpg",
          "dog1.jpg", "dog2.jpg", "dog3.jpg",
          "meerkat1.jpg", "meerkat2.jpg", "meerkat3.jpg"]


with animals.batch.dynamic() as batch:
    for name in source:
        print(f"Adding {name}")
        # Build the path to the image file
        path = "./source/image/" + name
        # Object to store in Weaviate
        properties = {
            "name": name,
            "path": path,
            "image": toBase64(path), # Weaviate will use the base64 representation of the file to generate a vector.
        }
        batch.add_object(
            properties=properties,
        )
print(animals.batch.results)

### Check number of objects in the Animals collection

In [None]:
# Display the number of objects in the Animals collection
animals.aggregate.over_all(total_count=True)

In [None]:
# Fetch one object and inspect its vector
animals.query.fetch_objects(include_vector=True, limit=1).objects[0].vector

## Query examples

In [None]:
# Helper functions to display results
from IPython.display import Image, display

def display_image(path):
    display(Image(path))

### Text to Image search

In [None]:
# Search for images by text. Other queries to try: "dog with glasses", "dog with a sign"
response = animals.query.near_text(
    query="dog",
    limit=3
)

# Print first result
result = response.objects[0]
print(result)

# Display the first image
display_image(result.properties.get("path"))

### Image to Image search

In [None]:
from weaviate.classes import query
response = animals.query.near_media(
    media="./test/test-dog.jpg",
    media_type=query.NearMediaType.IMAGE,
    limit=3
)

# Print the first result
result = response.objects[0]
print(result)

# Display the first image
display_image(result.properties.get("path"))
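
Both searches above return the stored objects whose vectors are closest to the query vector. The sketch below illustrates that ranking with cosine distance, which is Weaviate's default distance metric. The 3-dimensional vectors are made up for illustration and are not real embeddings, and the helper `cosine_distance` is hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Made-up vectors standing in for a query and three stored images.
query_vec = [0.9, 0.1, 0.0]
stored = {
    "dog1.jpg": [0.8, 0.2, 0.0],
    "cat1.jpg": [0.1, 0.9, 0.0],
    "meerkat1.jpg": [0.0, 0.2, 0.9],
}

# Sort file names from closest to farthest.
ranked = sorted(stored, key=lambda name: cosine_distance(query_vec, stored[name]))
print(ranked)
```

The `limit` argument in the queries above corresponds to keeping only the first few entries of such a ranking.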