# **Milvus Image Search**
This notebook demonstrates how to perform image similarity search and text-to-image search using Milvus and the CLIP model.

**Session Goal**: **To show how to store image embeddings in Milvus and perform visual similarity searches and cross-modal (text-to-image) searches.**

## **Requirements**


*   A running Milvus instance (e.g., Milvus Lite or Milvus Standalone)
*   An image embedding model, such as CLIP
*   Images to store in Milvus
*   Query images or text descriptions to search with



# Install Libraries and Set Up Milvus Lite


In [None]:
!pip install pymilvus[standalone_milvus] sentence-transformers matplotlib pillow==9.5.0 --quiet

# Image and Text Embedding Model
This notebook uses the CLIP embedding model **clip-ViT-B-32**, which can encode both images and text.

In [None]:
from sentence_transformers import SentenceTransformer

# Load CLIP Model
print("Loading CLIP model (this may take a moment)...")
# 'clip-ViT-B-32' is a good balance of performance and speed
model = SentenceTransformer('clip-ViT-B-32')
print("CLIP model loaded successfully.")

# **Generate Embeddings and Ingest them**
Next, we generate an embedding for each image with the ***clip-ViT-B-32*** model and insert the embeddings into Milvus.

In [None]:
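from tqdm import tqdm

# Assumes `client`, `COLLECTION_NAME`, and `image_data` were defined in earlier cells
print("Generating embeddings and inserting into Milvus...")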
milvus_entities = []
for item in tqdm(image_data, desc="Encoding & Ingesting"):
    # Encode the image
    img_embedding = model.encode(item["image_obj"]).tolist() # Convert numpy array to list
    milvus_entities.append({
        "image_id": item["id"],
        "image_path": item["path"],
        "embedding": img_embedding
    })

# Insert all entities in a single call (for very large datasets, consider inserting in smaller batches)
client.insert(collection_name=COLLECTION_NAME, data=milvus_entities)

# Ensure data is indexed and available for search
client.flush(collection_name=COLLECTION_NAME)
client.load_collection(collection_name=COLLECTION_NAME)

print(f"Successfully ingested {len(milvus_entities)} image embeddings into Milvus.")
print(f"Collection entities count: {client.get_collection_stats(COLLECTION_NAME)['row_count']}")
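
The cell above sends every entity to Milvus in one `insert` call. For larger image sets it may be safer to insert in fixed-size batches. The following is a minimal sketch under stated assumptions: `chunked`, `insert_in_batches`, and the default `batch_size` are hypothetical names and values introduced here, and `insert_fn` stands in for something like `lambda batch: client.insert(collection_name=COLLECTION_NAME, data=batch)`.

```python
from typing import Callable, Dict, Iterator, List


def chunked(items: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def insert_in_batches(
    entities: List[Dict],
    insert_fn: Callable[[List[Dict]], object],
    batch_size: int = 500,  # hypothetical default, tune for your data
) -> int:
    """Call `insert_fn` once per batch and return the number of entities sent."""
    total = 0
    for batch in chunked(entities, batch_size):
        insert_fn(batch)
        total += len(batch)
    return total
```

Passing the insert call as a function keeps the batching logic independent of the Milvus client, so it can be tried out with a plain list before touching a real collection.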

# **Perform Image Search (Image-to-Image)**
Here we pick an image and ask Milvus to find similar ones.

In [None]:
query_item # ... selected image

if query_item is None:
    print(f"Error: Image with ID {query_image_id} not found.")
else:
    query_image_path = query_item['path']
    query_image_obj = query_item['image_obj']
    query_vector = model.encode(query_image_obj).tolist()

    print(f"Querying with image (ID: {query_image_id}):")
    display_images([query_image_path], titles=["Query Image"], fig_size=(3,3))

    # Perform the search
    print("\nSearching for similar images...")
    search_results = client.search(
        collection_name=COLLECTION_NAME,
        data=[query_vector],
        anns_field="embedding",
        search_params={"metric_type": "COSINE", "params": {"nprobe": 10}},
        limit=5, # Get top 5 results
        output_fields=["image_path", "image_id"]
    )

    # Process and display results
    retrieved_image_paths = []
    retrieved_titles = []
    print("\nSearch Results (Image-to-Image):")
    for hit in search_results[0]: # Loop through results for the first query
        if hit.id != query_image_id: # Exclude the query image itself from results
            retrieved_image_paths.append(hit["image_path"])
            retrieved_titles.append(f"ID: {hit.id}, Dist: {hit.distance:.4f}")

    display_images(retrieved_image_paths, titles=retrieved_titles, fig_size=(15, 4))
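
Because the query image is itself stored in the collection, it usually comes back as one of the hits, so filtering it out of a top-5 search can leave only four results. One common workaround is to request one extra hit and then drop the query's own entry. The sketch below illustrates the idea on plain `(id, score)` tuples rather than real Milvus hit objects; `drop_query_hit` and the toy data are hypothetical and only for illustration.

```python
from typing import Hashable, List, Tuple

# (primary key, similarity score): a simplified stand-in for a Milvus hit
Hit = Tuple[Hashable, float]


def drop_query_hit(hits: List[Hit], query_id: Hashable, k: int) -> List[Hit]:
    """Remove the query's own entry and keep at most k remaining hits, order preserved."""
    filtered = [hit for hit in hits if hit[0] != query_id]
    return filtered[:k]


# Toy results, as if the search had been run with limit = k + 1
hits = [(3, 1.0), (8, 0.91), (1, 0.88), (5, 0.80), (9, 0.77), (2, 0.75)]
top5 = drop_query_hit(hits, query_id=3, k=5)
```

Slicing to `k` after filtering also covers the case where the query image is not among the hits, in which case the extra result is simply discarded.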

# **Perform Text Search (Text-to-Image)**
Thanks to CLIP, we can also search for images using a text description!

In [None]:
# Define your text query
text_query = "a furry animal in the snow" # Try changing this!
# text_query = "a person skiing"
# text_query = "a bridge over water"
# text_query = "a big grey animal"

# Generate embedding for the text query
query_text_vector = model.encode(text_query).tolist()

print(f"Querying with text: '{text_query}'")

# Perform the search
print("\nSearching for images based on text query...")
search_results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_text_vector],
    anns_field="embedding",
    search_params={"metric_type": "COSINE", "params": {"nprobe": 10}},
    limit=5, # Get top 5 results
    output_fields=["image_path", "image_id"]
)

# Process and display results
retrieved_image_paths = []
retrieved_titles = []
print("\nSearch Results (Text-to-Image):")
for hit in search_results[0]:
    retrieved_image_paths.append(hit["image_path"])
    retrieved_titles.append(f"ID: {hit.id}, Dist: {hit.distance:.4f}")

display_images(retrieved_image_paths, titles=retrieved_titles, fig_size=(15, 4))
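
Text-to-image search works because CLIP maps text and images into the same vector space, so a text vector can be compared directly with stored image vectors. As a rough illustration of what a `COSINE` search computes, the sketch below ranks a few stored vectors by brute force. The 3-dimensional vectors, file names, and function names are invented for illustration (real `clip-ViT-B-32` embeddings have 512 dimensions), and Milvus typically uses an index rather than a full scan like this one.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def brute_force_search(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    limit: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the best `limit`."""
    scored = [(path, cosine_similarity(query, vec)) for path, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # higher = more similar
    return scored[:limit]


# Hypothetical toy embeddings, not real CLIP output
toy_images = {
    "dog_snow.jpg": [0.9, 0.1, 0.0],
    "bridge.jpg": [0.0, 0.2, 0.9],
    "skier.jpg": [0.5, 0.8, 0.1],
}
toy_text_vector = [1.0, 0.2, 0.0]  # stands in for model.encode(text_query)
results = brute_force_search(toy_text_vector, toy_images, limit=2)
```

With the `COSINE` metric, a larger score means a closer match, so the value printed as `Dist` in the results above is best read as a similarity rather than a distance.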