# Zilliz Cloud (Managed Milvus) Vector Database Demo

This notebook demonstrates using **Zilliz Cloud** (managed Milvus) with 100 sample articles.

## Zilliz Cloud Key Features
- **Enterprise-Grade** - Used by eBay, Walmart, PayPal
- **AutoIndex** - Automatic index optimization (no manual tuning!)
- **Fully Managed** - No infrastructure management needed
- **Dynamic Fields** - Flexible metadata via $meta JSON
- **Free Tier** - 5GB storage, 2.5M vCUs
- **Scalability** - Handles billions of vectors
- **Simpler than Milvus** - Abstracts away manual index/load steps

In [1]:
# Reload
%reload_ext autoreload
%autoreload 2

## 1. Setup and Imports

In [2]:
import os
import sys
from pathlib import Path
import time

# Add parent directory to path
parent_dir = Path().resolve().parent
sys.path.insert(0, str(parent_dir))

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Import utilities
from utils.embeddings import EmbeddingGenerator
from utils.data_loader import load_articles, get_article_metadata
from utils.date_utils import timestamp_to_datetime_string

# Use MilvusClient for simpler Zilliz Cloud operations
from pymilvus import MilvusClient, DataType

print("✓ All imports successful")

✓ All imports successful


## 2. Load Embedding Model

Using `sentence-transformers/all-MiniLM-L6-v2` (384 dimensions)

In [3]:
# Initialize embedding model
embedding_model = EmbeddingGenerator()

# Test the model
test_text = "This is a test sentence for embedding generation."
test_embedding = embedding_model.embed_text(test_text)

print(f"  - Embedding dimension: {len(test_embedding)}")
print(f"  - Sample values: {test_embedding[:5]}")

Loading embedding model: sentence-transformers/all-MiniLM-L6-v2
✓ Model loaded successfully. Embedding dimension: 384
  - Embedding dimension: 384
  - Sample values: [0.00306019 0.00200206 0.05544939 0.07702641 0.00857853]


## 3. Load Sample Articles

In [4]:
import json

# Load articles
articles = load_articles("../sample_articles.json")

print(f"\nLoaded {len(articles)} articles")

# preview the first article
print("\nSample article:")
print(json.dumps(articles[0], indent=2))

Loaded 100 articles from ../sample_articles.json

Loaded 100 articles

Sample article:
{
  "id": 1212,
  "item_source": "MY_GRAND_CANYON",
  "item_title": "Make This Your First Stop Grand Canyon Stop",
  "item_subtitle": "Lunch, souvenirs, tour info and an immersive big screen experience.",
  "body_content": "Nothing can properly prepare you for the heart-pumping magnificence of the Grand Canyon\u2014except maybe a visit to the Grand Canyon Visitor Center and IMAX in Tusayan (Too-Say-Ann), located just 1.5 miles from the South Entrance to the national park. This is where your journey into one of the world\u2019s most awe-inspiring landscapes begins.\nSee the IMAX movie &#8220;The Grand Canyon: Rivers of Time&#8221; (Photo courtesy Grand Canyon Visitors Center/IMAX)\nOn the giant, six-story IMAX screen, catch the movie Grand Canyon: Rivers of Time, which won a 2023 award for best visual effects from the Giant Screen Cinema Association. In 37 breathtaking minutes, you\u2019ll be transpor

## 4. Connect to Zilliz Cloud

Using Zilliz Cloud (managed Milvus service)

In [5]:
ZILLIZ_ENDPOINT = os.getenv("ZILLIZ_ENDPOINT")
ZILLIZ_TOKEN = os.getenv("ZILLIZ_TOKEN")

# Connect to Zilliz Cloud using MilvusClient
try:
    client = MilvusClient(
        uri=ZILLIZ_ENDPOINT,
        token=ZILLIZ_TOKEN
    )
    
    print(f"✓ Connected to Zilliz Cloud")
    
    # Test connection by listing collections
    collections = client.list_collections()
    print(f"  - Existing collections: {collections}")
    
except Exception as e:
    error_msg = str(e)
    if "authentication" in error_msg.lower() or "401" in error_msg:
        raise ValueError(
            "Authentication failed. Please check your ZILLIZ_TOKEN.\n\n"
            "The API key format should be: db_admin:your_api_key\n"
            "Or just: your_api_key\n\n"
            "Create/find API keys at: https://cloud.zilliz.com/clusters → Your Cluster → API Keys"
        ) from e
    else:
        # Re-raise the original exception if it's something else
        raise

✓ Connected to Zilliz Cloud
  - Existing collections: ['articles']
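
The connection cell reads `ZILLIZ_ENDPOINT` and `ZILLIZ_TOKEN` from the environment without checking that either is set, so a missing `.env` entry only surfaces as a connection error. A minimal sketch of a pre-flight check follows. `require_env` is a hypothetical helper written for this illustration, not part of pymilvus or this project's `utils`, and the sample values are placeholders.

```python
import os
from typing import Iterable, Mapping, Optional


def require_env(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> dict:
    """Return {name: value} for every name, or raise listing what is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    values = {name: (env.get(name) or "").strip() for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError(
            "Missing environment variables: " + ", ".join(missing)
            + ". Add them to your .env file before connecting."
        )
    return values


# An explicit mapping is passed here so the check runs without real credentials.
sample_env = {"ZILLIZ_ENDPOINT": "https://example.invalid", "ZILLIZ_TOKEN": "placeholder"}
settings = require_env(["ZILLIZ_ENDPOINT", "ZILLIZ_TOKEN"], env=sample_env)
print(sorted(settings))
```

In the notebook, the returned values could be passed to `MilvusClient(uri=..., token=...)` in place of the bare `os.getenv` results.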


## 5. Create or Get Collection

Zilliz Cloud with AutoIndex - much simpler than self-hosted Milvus!

In [6]:
COLLECTION_NAME = "articles"

# Check if collection exists
if client.has_collection(COLLECTION_NAME):
    print(f"Collection '{COLLECTION_NAME}' already exists")
    
    # Get collection stats
    stats = client.get_collection_stats(COLLECTION_NAME)
    row_count = stats.get('row_count', 0)
    
    print(f"✓ Using existing collection: {COLLECTION_NAME}")
    print(f"  - Current count: {row_count} articles")
    
    # Ask user if they want to delete and recreate
    recreate = input("\nDo you want to delete and recreate? (y/n): ").lower().strip()
    if recreate == 'y':
        client.drop_collection(COLLECTION_NAME)
        print(f"✓ Deleted collection: {COLLECTION_NAME}")

# Create collection if it doesn't exist
if not client.has_collection(COLLECTION_NAME):
    print(f"Creating new collection with AutoIndex: {COLLECTION_NAME}")
    
    # Define schema fields
    schema = client.create_schema(
        auto_id=False,
        enable_dynamic_field=True  # Allows flexible metadata
    )
    
    schema.add_field(
        field_name="article_id",
        datatype=DataType.INT64,
        is_primary=True,
        description="Unique article identifier"
    )
    schema.add_field(
        field_name="embedding",
        datatype=DataType.FLOAT_VECTOR,
        dim=384  # all-MiniLM-L6-v2 dimensions
    )
    schema.add_field(field_name="title", datatype=DataType.VARCHAR, max_length=500)
    schema.add_field(field_name="subtitle", datatype=DataType.VARCHAR, max_length=500)
    schema.add_field(field_name="category", datatype=DataType.VARCHAR, max_length=100)
    schema.add_field(field_name="source", datatype=DataType.VARCHAR, max_length=100)
    schema.add_field(field_name="tags", datatype=DataType.VARCHAR, max_length=500)
    schema.add_field(field_name="evergreen", datatype=DataType.BOOL)
    schema.add_field(field_name="url", datatype=DataType.VARCHAR, max_length=1000)
    schema.add_field(field_name="created_at", datatype=DataType.INT64)
    
    # Prepare AutoIndex parameters (Zilliz Cloud feature!)
    index_params = client.prepare_index_params()
    index_params.add_index(
        field_name="embedding",
        index_type="AUTOINDEX",  # Automatic optimization!
        metric_type="COSINE"
    )
    
    # Create collection with AutoIndex
    # Zilliz Cloud automatically loads collection when created with index
    client.create_collection(
        collection_name=COLLECTION_NAME,
        schema=schema,
        index_params=index_params
    )
    
    print(f"✓ Created new collection: {COLLECTION_NAME}")
    print(f"  - Vector dimensions: 384")
    print(f"  - Index: AUTOINDEX (automatically optimized!)")
    print(f"  - Metric: COSINE")
    print(f"  - Auto-loaded: Yes (ready for queries immediately!)")
    print(f"  - Schema: 10 fields (+ dynamic fields enabled)")

Collection 'articles' already exists
✓ Using existing collection: articles
  - Current count: 0 articles
✓ Deleted collection: articles
Creating new collection with AutoIndex: articles
✓ Created new collection: articles
  - Vector dimensions: 384
  - Index: AUTOINDEX (automatically optimized!)
  - Metric: COSINE
  - Auto-loaded: Yes (ready for queries immediately!)
  - Schema: 10 fields (+ dynamic fields enabled)


## 6. Generate Embeddings and Insert Data

Process articles in batches, matching the approach used in other notebooks

In [8]:
# Check current count
stats = client.get_collection_stats(COLLECTION_NAME)
current_count = stats.get('row_count', 0)

if current_count >= len(articles):
    print(f"Collection already has {current_count} articles. Skipping insertion.")
    print("Run the cell above to recreate collection if needed.")
else:
    # Process in batches - same approach as other notebooks
    BATCH_SIZE = 20
    total_articles = len(articles)
    
    print(f"Processing {total_articles} articles in batches of {BATCH_SIZE}...\n")
    
    start_time = time.time()
    
    from tqdm.auto import tqdm
    
    for i in tqdm(range(0, total_articles, BATCH_SIZE), desc="Upserting batches"):
        batch = articles[i:i + BATCH_SIZE]
        
        # Generate embeddings for batch - same as other notebooks
        texts = [
            f"Title: {a['item_title']}\nSubtitle: {a.get('item_subtitle', '')}\nContent: {a['body_content'][:500]}"
            for a in batch
        ]
        embeddings = embedding_model.embed_batch(texts, show_progress=False)
        
        # Prepare metadata - use "milvus" to get timestamps
        metadatas = [get_article_metadata(a, db_type="milvus") for a in batch]
        
        # Prepare data in row-based format for Milvus
        data = []
        for metadata, embedding in zip(metadatas, embeddings):
            row = {
                "article_id": metadata["id"],
                "embedding": embedding.tolist(),
                "title": metadata["title"][:500],
                "subtitle": metadata["subtitle"][:500],
                "category": metadata["category"][:100],
                "source": metadata["source"][:100],
                "tags": metadata["tags"][:500],  # Stored as comma-separated string
                "evergreen": metadata["evergreen"],
                "url": metadata["url"][:1000],
                "created_at": metadata["created_at"],  # Timestamp (int)
            }
            data.append(row)
        
        # Upsert batch into Milvus (insert new or update existing by article_id)
        client.upsert(
            collection_name=COLLECTION_NAME,
            data=data
        )
    
    elapsed_time = time.time() - start_time
    
    # Get updated count
    stats = client.get_collection_stats(COLLECTION_NAME)
    final_count = stats.get('row_count', 0)
    
    print(f"\n✓ Successfully upserted {total_articles} articles")
    print(f"  - Time taken: {elapsed_time:.2f} seconds")
    print(f"  - Average: {elapsed_time/total_articles:.2f} seconds per article")
    print(f"  - Collection count: {final_count}")

Collection already has 100 articles. Skipping insertion.
Run the cell above to recreate collection if needed.
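
The insertion loop does two things that are easy to get subtly wrong: slicing the articles into fixed-size batches and clipping each string to the `max_length` declared in the schema. The sketch below isolates both steps. The names `FIELD_MAX_LENGTHS`, `batched`, `clip_utf8`, and `clip_row` are illustrative, and the byte-based clipping rests on an assumption that the VARCHAR limit is enforced on the UTF-8 encoded length rather than on the character count; that is worth confirming against the Milvus documentation for your version.

```python
from typing import Iterator, Sequence

# Illustrative copy of the VARCHAR limits declared in the schema in section 5.
FIELD_MAX_LENGTHS = {
    "title": 500, "subtitle": 500, "category": 100,
    "source": 100, "tags": 500, "url": 1000,
}


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def clip_utf8(text: str, max_bytes: int) -> str:
    """Truncate so the UTF-8 encoding fits in max_bytes without splitting a character."""
    encoded = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded.decode("utf-8", errors="ignore")


def clip_row(row: dict) -> dict:
    """Return a copy of row with every known VARCHAR field clipped to its limit."""
    return {
        key: clip_utf8(value, FIELD_MAX_LENGTHS[key]) if key in FIELD_MAX_LENGTHS else value
        for key, value in row.items()
    }


batch_sizes = [len(b) for b in batched(list(range(100)), 20)]
print(batch_sizes)
clipped = clip_row({"article_id": 1, "category": "é" * 80})
print(len(clipped["category"].encode("utf-8")))
```

Character slicing such as `metadata["title"][:500]` is equivalent for pure ASCII text; the two approaches only differ when titles contain multi-byte characters.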


## 7. Basic Semantic Search

Search using vector similarity (no manual loading needed!)

In [9]:
# Test query - SAME AS CHROMA
query_text = "Most haunted hikes in the US"

print(f"Query: '{query_text}'\n")

# Generate query embedding
query_embedding = embedding_model.embed_text(query_text)

# Perform search using MilvusClient
results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_embedding.tolist()],
    limit=5,
    output_fields=["title", "category", "source", "url"]
)

print(f"Top 5 Results:\n")
for i, hit in enumerate(results[0]):
    # With metric_type COSINE, 'distance' holds cosine similarity (higher = more similar)
    distance = hit['distance']
    entity = hit['entity']
    
    print(f"{i+1}. {entity.get('title')[:70]}...")
    print(f"   Category: {entity.get('category')} | Source: {entity.get('source')}")
    print(f"   Distance: {distance:.4f}")
    print(f"   URL: {entity.get('url')}...")
    print()

Query: 'Most haunted hikes in the US'

Top 5 Results:

1. 13 of the Most Haunted Hikes in the U.S....
   Category: Destinations | Source: OUTSIDE
   Distance: 0.8058
   URL: https://www.outsideonline.com/adventure-travel/destinations/haunted-hikes/...

2. A Missing Dog Helped a Stranded Hiker Return to Shadow Mountain Trail....
   Category: Hiking | Source: OUTSIDE
   Distance: 0.4578
   URL: https://www.outsideonline.com/outdoor-adventure/hiking-and-backpacking/arizona-lost-hiker-missing-dog-shadow-mountain/...

3. An Inside Look at Outside’s 2025 Winter Editors’ Choice Testing Trip...
   Category: Gear | Source: OUTSIDE
   Distance: 0.3660
   URL: https://www.outsideonline.com/outdoor-gear/winter-editors-choice-trip-maine/...

4. Two Hikers in British Columbia Were Hospitalized After a Grizzly Sow A...
   Category: Hiking | Source: OUTSIDE
   Distance: 0.3364
   URL: https://www.outsideonline.com/outdoor-adventure/hiking-and-backpacking/two-hikers-in-british-columbia-were-hospitalize
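
The scores printed as "Distance" fall from 0.8058 to 0.3364 down the ranked list, which matches cosine similarity, where larger values mean closer vectors. As a rough illustration of what that number measures, here is a small pure-Python sketch; the two-dimensional vectors are made up for the example and have nothing to do with the 384-dimensional embeddings used above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b: 1.0 = same direction, 0.0 = orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
close = [0.9, 0.1]   # nearly the same direction as the query
far = [0.0, 1.0]     # orthogonal to the query

print(round(cosine_similarity(query, close), 4))
print(round(cosine_similarity(query, far), 4))
```

The vector pointing almost the same way as the query scores near 1, and the orthogonal one scores 0, so sorting by this value in descending order puts the best match first.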

## 8. Metadata Filtering - Category

Filter using Milvus boolean expressions

In [10]:
# Filter by category - SAME AS CHROMA
query_text = "Women's Ironman World Championship"
target_category = "News"

print(f"Query: '{query_text}'")
print(f"Filter: category = '{target_category}'\n")

# Generate query embedding
query_embedding = embedding_model.embed_text(query_text)

# Search with category filter
results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_embedding.tolist()],
    limit=5,
    filter=f'category == "{target_category}"',  # Milvus expression syntax
    output_fields=["title", "category", "source", "created_at"]
)

print(f"Top 5 Results (Category: {target_category}):\n")
for i, hit in enumerate(results[0]):
    distance = hit['distance']
    entity = hit['entity']
    
    print(f"{i+1}. {entity.get('title')[:70]}...")
    print(f"   Category: {entity.get('category')} | Source: {entity.get('source')}")
    print(f"   Created: {entity.get('created_at')}")
    print(f"   Distance: {distance:.4f}")
    print()

Query: 'Women's Ironman World Championship'
Filter: category = 'News'

Top 5 Results (Category: News):

1. After Joy of Women's-Only Ironman World Championship, Grief Sets In...
   Category: News | Source: TRIATHLETE
   Created: 1760296873
   Distance: 0.7770

2. What a Race! Here's Where the Ironman Pro Series Stands After the Iron...
   Category: News | Source: TRIATHLETE
   Created: 1760352009
   Distance: 0.6464

3. The Fastest Shoes at 2025 Ironman World Championship Kona...
   Category: News | Source: TRIATHLETE
   Created: 1760353908
   Distance: 0.6182

4. The DNF Files: 2025 Ironman World Championship Kona...
   Category: News | Source: TRIATHLETE
   Created: 1760441445
   Distance: 0.5917

5. In Sweltering Conditions, Norway’s Solveig Løvseth Takes 2025 Ironman ...
   Category: News | Source: TRIATHLETE
   Created: 1760160735
   Distance: 0.5619
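
The category filter above is assembled with an f-string, which works for a value like `News` but would produce a malformed expression if the value contained a double quote. One possible way to guard against that is sketched below. `eq_filter` is a hypothetical helper, and quoting via `json.dumps` assumes that Milvus string literals accept the same backslash escapes as JSON strings, which should be verified before relying on it.

```python
import json


def eq_filter(field: str, value) -> str:
    """Build `field == literal` for a simple (non-nested) field name.

    Strings are double-quoted and escaped; booleans and numbers are emitted bare.
    """
    if not field.isidentifier():
        raise ValueError(f"unexpected field name: {field!r}")
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        literal = str(value)             # 'True' / 'False', as used elsewhere in this notebook
    elif isinstance(value, (int, float)):
        literal = repr(value)
    else:
        literal = json.dumps(str(value), ensure_ascii=False)
    return f"{field} == {literal}"


print(eq_filter("category", "News"))
print(eq_filter("title", 'The "Best" Trails'))
print(eq_filter("evergreen", True))
```

The first call reproduces the expression used in the cell above, so the helper could be dropped in as `filter=eq_filter("category", target_category)`.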



## 9. Metadata Filtering - Date Range

Filter by timestamp (stored as INT64)

In [11]:
from utils.date_utils import date_string_to_timestamp

# Filter by date - SAME AS CHROMA
query_text = "cycling deals"
cutoff_date = "2025-10-08"

print(f"Query: '{query_text}'")
print(f"Filter: created_at >= '{cutoff_date}'\n")

# Generate query embedding
query_embedding = embedding_model.embed_text(query_text)

# Convert date to timestamp
cutoff_timestamp = date_string_to_timestamp(cutoff_date)

# Search with date filter
results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_embedding.tolist()],
    limit=5,
    filter=f'created_at >= {cutoff_timestamp}',  # Numeric comparison
    output_fields=["title", "category", "created_at", "tags"]
)

print(f"Top 5 Recent Results (after {cutoff_date}):\n")
for i, hit in enumerate(results[0]):
    entity = hit['entity']
    created_timestamp = entity.get('created_at')
    created_str = timestamp_to_datetime_string(created_timestamp)
    
    print(f"{i+1}. {entity.get('title')}")
    print(f"   Category: {entity.get('category')}")
    print(f"   Created: {created_str}")
    print(f"   Tags: {entity.get('tags', 'No tags')}")
    print()

Query: 'cycling deals'
Filter: created_at >= '2025-10-08'

Top 5 Recent Results (after 2025-10-08):

1. Opinion: Cycling's Soccer-Inspired Relegation System Is a Hot Mess That Solves Nothing
   Category: Road Racing
   Created: 2025-10-15 22:42:10
   Tags: Analysis, ASO, Cofidis, Tour de France, Tour de Hoody

2. Deal: Tailwind Endurance Fuel Is the Cycling Nutrition I Actually Use
   Category: Road Gear
   Created: 2025-10-13 04:30:52
   Tags: Velo Deals

3. Pogačar's Bonuses and Brand Deals Revealed: Inside His $14 Million Pay Check
   Category: Road Racing
   Created: 2025-10-13 20:39:12
   Tags: Alex Carera, Remco Evenepoel, Tadej Pogačar, Transfers, UAE Emirates

4. Shop Evo's Anniversary Sale and Save up to 50% on Ski, Snowboard, and MTB Gear
   Category: Gear News
   Created: 2025-10-14 03:53:27
   Tags: Commerce, Deals

5. Deal: One of the Best Headphones for Cycling Is 50% Off
   Category: Road Gear
   Created: 2025-10-15 05:12:34
   Tags: headphones, Velo Deals
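
`date_string_to_timestamp` and `timestamp_to_datetime_string` come from this project's `utils.date_utils`, which is not shown here. For readers without that module, the following sketch shows what such helpers might look like. The function names are illustrative, and interpreting the date as midnight UTC is an assumption; the project's versions may use local time instead.

```python
from datetime import datetime, timezone


def to_epoch_seconds(date_str: str) -> int:
    """Parse 'YYYY-MM-DD' as midnight UTC and return Unix seconds."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def to_datetime_string(epoch_seconds: int) -> str:
    """Format Unix seconds as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


cutoff = to_epoch_seconds("2025-10-08")
print(cutoff)
print(f"created_at >= {cutoff}")      # the numeric filter expression
print(to_datetime_string(cutoff))
```

Whichever time zone is chosen, the same convention has to be used when writing `created_at` and when building the cutoff, otherwise articles near midnight can land on the wrong side of the filter.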



## 10. Combined Filters - Evergreen + Date

Combine multiple filters with boolean logic

In [12]:
# Combine multiple filters - SAME AS CHROMA
query_text = "Halloween outdoor activities"
cutoff_date = "2025-10-09"

print(f"Query: '{query_text}'")
print(f"Filters:")
print(f"  - evergreen = True (timeless content)")
print(f"  - created_at >= '{cutoff_date}'\n")

# Generate query embedding
query_embedding = embedding_model.embed_text(query_text)

# Convert date to timestamp
cutoff_timestamp = date_string_to_timestamp(cutoff_date)

# Combine filters with AND
combined_filter = f'evergreen == True and created_at >= {cutoff_timestamp}'

# Search with combined filters
results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_embedding.tolist()],
    limit=10,  # Increased to 10 since evergreen articles might be fewer
    filter=combined_filter,
    output_fields=["title", "category", "evergreen", "tags", "created_at"]
)

if results[0]:
    print(f"Top Evergreen Results (After {cutoff_date}):\n")
    for i, hit in enumerate(results[0]):
        entity = hit['entity']
        created_timestamp = entity.get('created_at')
        created_str = timestamp_to_datetime_string(created_timestamp)
        
        print(f"{i+1}. {entity.get('title')[:70]}...")
        print(f"   Category: {entity.get('category')} | Evergreen: {entity.get('evergreen')}")
        print(f"   Tags: {entity.get('tags', 'No tags')}")
        print(f"   Created: {created_str}")
    print(f"\nTotal results: {len(results[0])}")
else:
    print("No evergreen articles found after this date.")

Query: 'Halloween outdoor activities'
Filters:
  - evergreen = True (timeless content)
  - created_at >= '2025-10-09'

Top Evergreen Results (After 2025-10-09):

1. 13 of the Most Haunted Hikes in the U.S....
   Category: Destinations | Evergreen: True
   Tags: evergreen, Halloween, Hiking
   Created: 2025-10-16 04:22:41
2. The Thule Outset Hitch-Mounted Tent Turns Your Car Into a Campsite on ...
   Category: Camping | Evergreen: True
   Tags: 2025 Gear Reviews, Car Camping, Car Racks, Commerce, evergreen
   Created: 2025-10-14 03:30:11
3. The Best Daypacks for Every Kind of Hiker (2025)...
   Category: Daypacks | Evergreen: True
   Tags: 2025 Gear Reviews, 2025 Summer Gear Guide, backpack, Commerce, Day Packs
   Created: 2025-10-16 04:31:44
4. Everything You Need To Know Before Skiing Telluride For The First Time...
   Category: Resort Skiing | Evergreen: True
   Tags: evergreen, Telluride Ski Resort
   Created: 2025-10-13 07:39:24
5. He’s Hunted for Elk for 40 Years but Hasn’t Killed
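
When some filters are optional, concatenating clauses by hand tends to leave stray `and` keywords behind. A small sketch of one way to combine them follows. `and_all` is a hypothetical helper, and the timestamp is a placeholder standing in for the result of `date_string_to_timestamp(cutoff_date)`.

```python
from typing import Iterable, Optional


def and_all(clauses: Iterable[Optional[str]]) -> str:
    """Join the non-empty clauses with `and`, wrapping each in parentheses.

    Returns an empty string when no clause is given, and the bare clause
    when there is only one.
    """
    kept = [clause.strip() for clause in clauses if clause and clause.strip()]
    if len(kept) == 1:
        return kept[0]
    return " and ".join(f"({clause})" for clause in kept)


cutoff_timestamp = 1759968000   # placeholder value for illustration
only_evergreen = True
category = None                 # optional filter left unset

combined = and_all([
    "evergreen == True" if only_evergreen else None,
    f"created_at >= {cutoff_timestamp}",
    f'category == "{category}"' if category else None,
])
print(combined)
```

Parenthesising each clause keeps operator precedence predictable if one of the inputs itself contains an `or`.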

## 11. Advanced Expression Filtering

Demonstrate Milvus's powerful expression syntax

In [13]:
query_text = "outdoor activities"

print(f"Query: '{query_text}'")
print(f"Filter: category IN ['Attractions', 'Events', 'Hiking']\n")

# Generate query embedding
query_embedding = embedding_model.embed_text(query_text)

# Use IN operator for multiple categories
filter_expr = 'category in ["Attractions", "Events", "Hiking"]'

# Search with IN filter
results = client.search(
    collection_name=COLLECTION_NAME,
    data=[query_embedding.tolist()],
    limit=5,
    filter=filter_expr,
    output_fields=["title", "category", "source"]
)

print(f"Found {len(results[0])} results:\n")
for i, hit in enumerate(results[0]):
    distance = hit['distance']
    entity = hit['entity']
    
    print(f"{i+1}. {entity.get('title')[:70]}...")
    print(f"   Category: {entity.get('category')} | Source: {entity.get('source')}")
    print(f"   Distance: {distance:.4f}")

Query: 'outdoor activities'
Filter: category IN ['Attractions', 'Events', 'Hiking']

Found 4 results:

1. A Missing Dog Helped a Stranded Hiker Return to Shadow Mountain Trail....
   Category: Hiking | Source: OUTSIDE
   Distance: 0.2711
2. Make This Your First Stop Grand Canyon Stop...
   Category: Attractions | Source: MY_GRAND_CANYON
   Distance: 0.2397
3. Two Hikers in British Columbia Were Hospitalized After a Grizzly Sow A...
   Category: Hiking | Source: OUTSIDE
   Distance: 0.2142
4. She Became the First Woman to Complete This 3,600-Mile Thru-Hike—and B...
   Category: Hiking | Source: OUTSIDE
   Distance: 0.1772
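
The `in` expression above is written as a literal string. If the category list comes from user input or configuration, it can be generated instead. The sketch below uses a hypothetical `in_filter` helper and, as an assumption, relies on JSON-style quoting being acceptable for Milvus string literals.

```python
import json
from typing import Iterable


def in_filter(field: str, values: Iterable[str], negate: bool = False) -> str:
    """Build `field in [...]` (or `field not in [...]`) from a list of strings."""
    items = [json.dumps(str(value), ensure_ascii=False) for value in values]
    if not items:
        raise ValueError("values must not be empty")
    operator = "not in" if negate else "in"
    return f"{field} {operator} [{', '.join(items)}]"


filter_expr = in_filter("category", ["Attractions", "Events", "Hiking"])
print(filter_expr)
print(in_filter("source", ["OUTSIDE"], negate=True))
```

The first call yields the same expression as the hand-written `filter_expr` in the cell above.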


## 12. Performance Summary

In [14]:
from utils.benchmark import benchmark_queries

# Define query function for Milvus/Zilliz Cloud
def milvus_query_fn(query_text: str):
    """Query function for Milvus benchmarking."""
    query_embedding = embedding_model.embed_text(query_text)
    return client.search(
        collection_name=COLLECTION_NAME,
        data=[query_embedding.tolist()],
        limit=10
    )

# Run standardized benchmark
results = benchmark_queries(milvus_query_fn)

# Get collection stats
stats = client.get_collection_stats(COLLECTION_NAME)
print(f"\nCollection Statistics:")
print(f"  - Total articles: {stats.get('row_count', 0)}")
print(f"  - Vector dimensions: 384")
print(f"  - Index: AUTOINDEX (automatically optimized)")
print(f"  - Distance metric: COSINE")

Running performance benchmark...

'outdoor hiking adventures' -> 66.4ms
'cycling race performance' -> 41.5ms
'travel destinations and tips' -> 51.2ms
'fitness training techniques' -> 45.1ms
'gear reviews and recommendations' -> 49.4ms

Performance Summary:
  - Average query time: 50.7ms
  - Min query time: 41.5ms
  - Max query time: 66.4ms

Collection Statistics:
  - Total articles: 100
  - Vector dimensions: 384
  - Index: AUTOINDEX (automatically optimized)
  - Distance metric: COSINE


## 13. Cleanup (Optional)

## Key Takeaways - Zilliz Cloud (Managed Milvus)

### ✅ Strengths
1. **Enterprise-Grade** - Battle-tested by eBay, Walmart, PayPal
2. **AutoIndex** - Automatic index optimization (no manual HNSW tuning!)
3. **Fully Managed** - No infrastructure, index creation, or loading steps
4. **Powerful Expressions** - Rich boolean expressions with AND/OR/IN operators
5. **High Performance** - Optimized for large-scale applications
6. **Scalability** - Designed for billions of vectors
7. **Dynamic Fields** - Flexible metadata via $meta JSON
8. **Simpler API** - MilvusClient abstracts complexity

### ⚠️ Considerations
1. **Strict Schema** - VARCHAR fields need max_length
2. **Tags as Strings** - This demo stores tags as a comma-separated VARCHAR; Milvus also offers an ARRAY field type, which is not used here
3. **Timestamps Only** - No native date type, must use INT64 timestamps
4. **Expression Syntax** - Different from SQL (but powerful)
5. **Free Tier Limits** - 5GB storage, 2.5M vCUs (monitor usage)

### 🎯 Best For
- **Enterprise deployments** - Production-grade reliability
- **Large-scale applications** - Billions of vectors
- **Complex filtering** - Rich boolean expressions
- **Managed service** - Don't want to manage infrastructure
- **Performance-critical** - Optimized indexing and search

### 📊 Comparison with Self-Hosted Milvus
- **✅ AutoIndex** vs ❌ Manual HNSW configuration
- **✅ Auto-loading** vs ❌ Manual collection.load()
- **✅ Row-based insert** vs ❌ Columnar lists in the older `Collection` API
- **✅ MilvusClient API** vs ❌ More verbose `Collection` API (MilvusClient also works against self-hosted Milvus)
- **✅ No flush needed** vs ❌ Manual flush() required

### 📊 Comparison with Other DBs
- **vs Chroma**: More powerful expressions, better scalability, AutoIndex
- **vs Weaviate**: Faster at scale, but no GraphQL and no native date type
- **vs Qdrant**: Similar performance, but stricter schema, managed service
- **vs Pinecone**: More flexible expressions, and open-source Milvus provides a self-hosting option

### 💡 Key Zilliz Cloud Advantages
1. **No manual index creation** - AUTOINDEX handles it
2. **No manual loading** - Collection auto-loads after creation
3. **Simpler insertion** - Row-based format supported
4. **Managed service** - No infrastructure management
5. **Enterprise support** - Production SLAs available

In [None]:
# Close connection
client.close()
print("✓ Disconnected from Zilliz Cloud")