<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/vector_stores/ZeusDBIndexDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ZeusDB Vector Store

This document explains how to use ZeusDB as a vector store in LlamaIndex. 

[ZeusDB](https://www.zeusdb.com) is a high-performance vector database written in Rust, offering features like product quantization, persistent storage, and enterprise-grade logging. 

Follow the instructions and examples below to add ZeusDB's production capabilities to your LlamaIndex apps.

---

## Setup

Install the ZeusDB LlamaIndex integration package from PyPI:

In [None]:
pip install llama-index-vector-stores-zeusdb

*Setup in Jupyter Notebooks*

> 💡 Tip: If you’re working inside Jupyter or Google Colab, use the %pip magic command so the package is installed into the active kernel:

In [None]:
%pip install llama-index-vector-stores-zeusdb

---

## Getting Started

This example uses `OpenAIEmbedding`, which requires an OpenAI API key. [Get your OpenAI API key here](https://platform.openai.com/api-keys).

Install the LlamaIndex Core and OpenAI integration packages from PyPI:

In [None]:
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-embeddings-openai

# Use these commands if inside Jupyter Notebooks
# %pip install llama-index-core
# %pip install llama-index-llms-openai
# %pip install llama-index-embeddings-openai

#### Please choose an option below for your OpenAI key integration

*Option 1: 🔑 Enter your API key each time*  

Use getpass in Jupyter to securely input your key for the current session:

In [None]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

*Option 2: 🗂️ Use a .env file*

Keep your key in a local `.env` file and load it automatically with `python-dotenv` (install it with `pip install python-dotenv`).

In [None]:
from dotenv import load_dotenv

load_dotenv()  # reads .env and sets OPENAI_API_KEY
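
Whichever option you choose, it can help to confirm that the key is actually visible to the process before any API call is made. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper name introduced here for illustration and is not part of LlamaIndex or ZeusDB.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; use Option 1 or Option 2 above")
    return value


# Usage (raises RuntimeError if the key was not loaded):
# require_env("OPENAI_API_KEY")
```

Failing early like this tends to give a clearer error than the one raised later by the first embedding request.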

🎉🎉 That's it! You are good to go.

---

## Initialization

In [None]:
# Import required Packages and Classes
from llama_index.core import VectorStoreIndex, Document, StorageContext
from llama_index.vector_stores.zeusdb import ZeusDBVectorStore
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

In [None]:
# Set up embedding model and LLM
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.llm = OpenAI(model="gpt-5")

---

## Quickstart

In [None]:
# Create ZeusDB vector store
vector_store = ZeusDBVectorStore(
    dim=1536,  # OpenAI embedding dimension (text-embedding-3-small)
    distance="cosine",
    index_type="hnsw",
)

# Create storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create documents
documents = [
    Document(text="ZeusDB is a high-performance vector database."),
    Document(text="LlamaIndex provides RAG capabilities."),
    Document(text="Vector search enables semantic similarity."),
]

# Create index and store documents
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is ZeusDB?")
print(response)
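
The `dim` passed to `ZeusDBVectorStore` has to match the length of the vectors the embedding model produces. The sketch below checks a vector against the default output sizes of OpenAI's embedding models. `DEFAULT_DIMS`, `check_dim`, and the stand-in vector are hypothetical names used for illustration; in the notebook, the vector would come from `Settings.embed_model.get_text_embedding(...)`.

```python
# Default output sizes of OpenAI embedding models (before any `dimensions` override)
DEFAULT_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_dim(embedding: list, model: str) -> int:
    """Return the model's default size after confirming the vector matches it."""
    expected = DEFAULT_DIMS[model]
    if len(embedding) != expected:
        raise ValueError(f"{model}: expected {expected} values, got {len(embedding)}")
    return expected


# Stand-in vector so the sketch runs without an API call
fake_embedding = [0.0] * 1536
dim = check_dim(fake_embedding, "text-embedding-3-small")
print(dim)  # 1536
```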

---

## Direct Query Interface Example

In [None]:
from llama_index.core.vector_stores.types import VectorStoreQuery

# Create query
embed_model = Settings.embed_model
query_embedding = embed_model.get_text_embedding("machine learning")

query_obj = VectorStoreQuery(query_embedding=query_embedding, similarity_top_k=2)

# Execute query
results = vector_store.query(query_obj)

# Results contain IDs and similarities
print(f"Found {len(results.ids or [])} results:")
for node_id, similarity in zip(results.ids or [], results.similarities or []):
    print(f"  ID: {node_id}, Similarity: {similarity:.4f}")
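
Because the store above is created with `distance="cosine"`, it may help to see what cosine similarity measures. This is a generic pure-Python sketch of the formula, not ZeusDB's implementation; how ZeusDB scales or reports the values in `results.similarities` is not confirmed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```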

---

## MMR Search

In [None]:
# MMR search via direct query
mmr_results = vector_store.query(
    query_obj,
    mmr=True,
    fetch_k=10,
    mmr_lambda=0.7,  # 0.0=max diversity, 1.0=pure relevance
)

print(f"MMR Results: {len(mmr_results.ids or [])} items (with diversity)")
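
To make the effect of `mmr_lambda` concrete, here is a generic greedy Maximal Marginal Relevance (MMR) sketch in pure Python. It is a textbook formulation, not ZeusDB's internal code; `mmr_select` and the toy vectors are hypothetical, and only the parameter name `mmr_lambda` mirrors the call above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query, candidates, top_k, mmr_lambda):
    """Greedy MMR: return indices of up to `top_k` candidates, trading relevance for diversity."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < top_k:

        def score(i):
            relevance = cosine(query, candidates[i])
            # Redundancy is the similarity to the closest already-selected item
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
            )
            return mmr_lambda * relevance - (1.0 - mmr_lambda) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.0],  # 0: identical to the query
    [0.99, 0.14],  # 1: near-duplicate of candidate 0
    [0.6, 0.8],  # 2: less relevant, but different
]
print(mmr_select(query, candidates, top_k=2, mmr_lambda=1.0))  # [0, 1]
print(mmr_select(query, candidates, top_k=2, mmr_lambda=0.3))  # [0, 2]
```

With `mmr_lambda=1.0` the near-duplicate is kept because only relevance counts; at `0.3` the redundancy penalty pushes the second pick to the more distinct candidate.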

---

## Search with Metadata Filtering

In [None]:
from llama_index.core.vector_stores.types import (
    MetadataFilters,
    FilterOperator,
    FilterCondition,
)

# Create a fresh vector store for this example
vector_store = ZeusDBVectorStore(dim=1536, distance="cosine")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create documents with metadata
documents_with_meta = [
    Document(
        text="Python is great for data science",
        metadata={"category": "tech", "year": 2024},
    ),
    Document(
        text="JavaScript is for web development",
        metadata={"category": "tech", "year": 2023},
    ),
    Document(
        text="Climate change impacts ecosystems",
        metadata={"category": "environment", "year": 2024},
    ),
]

# Build index with metadata
index = VectorStoreIndex.from_documents(
    documents_with_meta, storage_context=storage_context
)

# Create metadata filter
filters = MetadataFilters.from_dicts(
    [
        {"key": "category", "value": "tech", "operator": FilterOperator.EQ},
        {"key": "year", "value": 2024, "operator": FilterOperator.GTE},
    ],
    condition=FilterCondition.AND,
)

# Use the retriever with filters (recommended approach)
retriever = index.as_retriever(similarity_top_k=5, filters=filters)
filtered_results = retriever.retrieve("programming")

# Process results
for r in filtered_results:
    print(f"- {r.node.get_content(metadata_mode='none')}")
    print(f"  Metadata: {r.node.metadata}\n")
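
The filter above combines two conditions with `FilterCondition.AND`, so a document must satisfy both. The sketch below reproduces that logic in plain Python on the same three documents to show which one should pass. `OPS` and `matches` are hypothetical stand-ins for illustration, not LlamaIndex or ZeusDB APIs.

```python
import operator

# Hypothetical stand-ins for FilterOperator.EQ and FilterOperator.GTE
OPS = {"==": operator.eq, ">=": operator.ge}


def matches(metadata, conditions):
    """True when every (key, op, value) condition holds (AND semantics)."""
    return all(
        key in metadata and OPS[op](metadata[key], value)
        for key, op, value in conditions
    )


docs = [
    ("Python is great for data science", {"category": "tech", "year": 2024}),
    ("JavaScript is for web development", {"category": "tech", "year": 2023}),
    ("Climate change impacts ecosystems", {"category": "environment", "year": 2024}),
]
conditions = [("category", "==", "tech"), ("year", ">=", 2024)]

kept = [text for text, meta in docs if matches(meta, conditions)]
print(kept)  # ['Python is great for data science']
```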

---

## Save and Load Indexes

In [None]:
# Save index
save_path = "my_index.zdb"
vector_store.save_index(save_path)
print(f"✅ Index saved to {save_path}")
print(f"   Vector count: {vector_store.get_vector_count()}")

In [None]:
# Load index
loaded_store = ZeusDBVectorStore.load_index(save_path)
print(f"✅ Index loaded from {save_path}")
print(f"   Vector count: {loaded_store.get_vector_count()}")

---

## Quantization Example

In [None]:
# Create quantized vector store for memory efficiency
quantization_config = {
    "type": "pq",
    "subvectors": 8,
    "bits": 8,
    "training_size": 1000,
    "storage_mode": "quantized_only",
}

vector_store = ZeusDBVectorStore(
    dim=1536,
    distance="cosine",
    index_type="hnsw",
    quantization_config=quantization_config,
)

# Check quantization status
print(f"Is quantized: {vector_store.is_quantized()}")
print(f"Can use quantization: {vector_store.can_use_quantization()}")
print(f"Training progress: {vector_store.get_training_progress():.1f}%")
print(f"Storage mode: {vector_store.get_storage_mode()}")
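
As a rough guide to what the configuration above implies, the arithmetic below estimates per-vector storage for product quantization in general. It assumes the original vectors are stored as 32-bit floats and that each subvector is encoded with `bits` bits; `pq_sizes` is a hypothetical helper, and the ratio ZeusDB reports may differ depending on what it counts.

```python
def pq_sizes(dim: int, subvectors: int, bits: int) -> dict:
    """Theoretical storage figures for product quantization (float32 originals assumed)."""
    if dim % subvectors != 0:
        raise ValueError("dim must be divisible by subvectors")
    raw_bytes = dim * 4  # one float32 per dimension
    code_bytes = subvectors * bits // 8  # one code per subvector
    # Shared codebooks: 2**bits centroids per subvector, stored once for the whole index
    codebook_bytes = (2**bits) * dim * 4
    return {
        "raw_bytes": raw_bytes,
        "code_bytes": code_bytes,
        "ratio": raw_bytes / code_bytes,
        "codebook_bytes": codebook_bytes,
    }


sizes = pq_sizes(dim=1536, subvectors=8, bits=8)
print(sizes["raw_bytes"])  # 6144
print(sizes["code_bytes"])  # 8
print(sizes["ratio"])  # 768.0
print(sizes["codebook_bytes"])  # 1572864
```

The codebooks are a fixed overhead, so the effective saving approaches the per-vector ratio only once the index holds many vectors.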

---

## Delete Operations Example

In [None]:
from llama_index.core import VectorStoreIndex, Document, StorageContext
from llama_index.vector_stores.zeusdb import ZeusDBVectorStore

# Create a fresh vector store for this example
delete_vs = ZeusDBVectorStore(dim=1536, distance="cosine")
delete_sc = StorageContext.from_defaults(vector_store=delete_vs)

# Create documents
delete_docs = [Document(text=f"Document {i}", metadata={"doc_id": i}) for i in range(5)]

# Build index
delete_index = VectorStoreIndex.from_documents(delete_docs, storage_context=delete_sc)

print(f"Before delete: {delete_vs.get_vector_count()} vectors")

# Get node IDs to delete
retriever = delete_index.as_retriever(similarity_top_k=10)
results = retriever.retrieve("document")

if results:
    # Extract node IDs from results
    node_ids_to_delete = [result.node.node_id for result in results[:2]]
    print(f"Deleting node IDs: {[node_id[:8] for node_id in node_ids_to_delete]}")

    # Delete by node IDs
    delete_vs.delete_nodes(node_ids=node_ids_to_delete)
    print(f"After delete: {delete_vs.get_vector_count()} vectors")
    print("✅ delete_nodes(node_ids=[...]) works!")

# Demonstrate unsupported delete by ref_doc_id
try:
    delete_vs.delete(ref_doc_id="doc_1")
    print("❌ Should have raised NotImplementedError")
except NotImplementedError as e:
    print(f"✅ delete(ref_doc_id='...') raises NotImplementedError: {e}")
    print("   (This is expected - not supported by backend)")

---

## Async Operations Example

In [None]:
import asyncio
from llama_index.core.schema import TextNode

# In Jupyter, use nest_asyncio to handle event loops
try:
    import nest_asyncio

    nest_asyncio.apply()
except ImportError:
    pass


async def async_operations():
    # Create nodes
    nodes = [TextNode(text=f"Document {i}", metadata={"doc_id": i}) for i in range(10)]

    # Generate embeddings (required before adding)
    embed_model = Settings.embed_model
    for node in nodes:
        node.embedding = embed_model.get_text_embedding(node.text)

    # Add nodes asynchronously
    node_ids = await vector_store.async_add(nodes)
    print(f"Added {len(node_ids)} nodes")

    # Query asynchronously
    query_embedding = embed_model.get_text_embedding("document")
    query_obj = VectorStoreQuery(query_embedding=query_embedding, similarity_top_k=3)

    results = await vector_store.aquery(query_obj)
    print(f"Found {len(results.ids or [])} results")

    # Delete asynchronously
    await vector_store.adelete_nodes(node_ids=node_ids[:2])
    print(f"Deleted 2 nodes, {vector_store.get_vector_count()} remaining")


# Run async function
await async_operations()  # In Jupyter
# asyncio.run(async_operations())  # In regular Python scripts
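
The example above calls the synchronous `get_text_embedding` inside an `async` function, which blocks the event loop while each request is in flight. One general way to avoid that is to offload blocking calls to worker threads with `asyncio.to_thread` (Python 3.9+). The sketch below uses a hypothetical `fake_embed` stand-in so it runs without an API key; it illustrates the pattern only and is not part of the ZeusDB integration.

```python
import asyncio
import time


def fake_embed(text: str) -> list:
    """Stand-in for a blocking embedding call (hypothetical, for illustration only)."""
    time.sleep(0.05)
    return [float(len(text))]


async def embed_all(texts: list) -> list:
    """Embed all texts concurrently in worker threads."""
    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(asyncio.to_thread(fake_embed, t) for t in texts))


# vectors = await embed_all(["a", "bb", "ccc"])  # In Jupyter
# vectors = asyncio.run(embed_all(["a", "bb", "ccc"]))  # In regular Python scripts
```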

---

## Performance Monitoring

In [None]:
# Get index statistics
stats = vector_store.get_zeusdb_stats()
print(f"Key stats: vectors={stats.get('total_vectors')}, space={stats.get('space')}")

# Get vector count
count = vector_store.get_vector_count()
print(f"Vector count: {count}")

# Get detailed index info
info = vector_store.info()
print(f"Index info: {info}")

# Check quantization status
if vector_store.is_quantized():
    progress = vector_store.get_training_progress()
    quant_info = vector_store.get_quantization_info()
    print(f"Quantization: {progress:.1f}% complete")
    print(f"Compression: {quant_info['compression_ratio']:.1f}x")
else:
    print("Index is not quantized")