# My Vector Database - Demo

A lightweight, production-ready vector database with a RESTful API and Python SDK.

**Topics Covered**:
1. Architecture Overview
2. Creating Data
3. Reading & Updating
4. Vector Search
5. Filtering
6. Persistence
7. Agno Integration
8. Design Patterns

---

## 1. Architecture Overview

The vector database is organized in a **3-tier hierarchy**:
- **Libraries**: Top-level containers with vector index configuration
- **Documents**: Logical groupings within a library
- **Chunks**: Individual searchable units with text, embeddings, and metadata

![Data model: libraries contain documents, which contain chunks](../docs/data_model.png)
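To make the hierarchy concrete, here is a minimal sketch of the three tiers as plain dataclasses. The class and field names (`SketchLibrary`, `SketchDocument`, `SketchChunk`, `all_chunks`) are illustrative assumptions, not the SDK's actual models.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class SketchChunk:
    """Smallest searchable unit: text, its embedding, and free-form metadata."""
    text: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)
    id: UUID = field(default_factory=uuid4)


@dataclass
class SketchDocument:
    """Logical grouping of chunks inside one library."""
    name: str
    chunks: list[SketchChunk] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)


@dataclass
class SketchLibrary:
    """Top-level container that also carries the index configuration."""
    name: str
    index_type: str = "flat"
    documents: list[SketchDocument] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)

    def all_chunks(self) -> list[SketchChunk]:
        # Flatten the hierarchy: every chunk of every document in this library.
        return [chunk for doc in self.documents for chunk in doc.chunks]


sketch_library = SketchLibrary(name="tech_articles")
sketch_doc = SketchDocument(name="doc_1")
sketch_doc.chunks.append(SketchChunk(text="hello", embedding=[0.1, 0.2]))
sketch_library.documents.append(sketch_doc)
print(len(sketch_library.all_chunks()))
```

A search runs over the flattened chunk list of one library, which is why the index configuration sits at the library level.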

### Connection & Validation

First, verify the API server is running and check initial state.

In [1]:
from my_vector_db.sdk import VectorDBClient

client = VectorDBClient(base_url="http://localhost:8000")

# Validate connection and check initial state
status = client.get_health_status()
status

{'status': 'healthy',
 'service': 'vector-db',
 'version': '0.1.0',
 'storage': {'libraries': 1, 'documents': 7, 'chunks': 7}}

---

## 2. Creating Data: Hierarchical Structure

The hierarchical design allows flexible organization:
- **Libraries** define index configuration (FLAT, HNSW) and distance metrics
- **Documents** group related chunks (e.g., chapters in a book)
- **Chunks** are the actual searchable units with embeddings

In [2]:
# Create library with index configuration
library = client.create_library(
    name="my_demo_library",
    index_type="flat",
    index_config={"metric": "cosine"},
    metadata={"description": "Sample library", "version": "1.0"},
)

print(f"Created library: {library.id}")
print(f"Index type: {library.index_type}")
print(f"Metric: {library.index_config['metric']}")

Created library: 84aed1c8-9272-4a35-8f7e-e0e47f231212
Index type: IndexType.FLAT
Metric: cosine


In [3]:
# Create document within library
document = client.create_document(
    library_id=library.id,
    name="demo_document",
    metadata={"year": 2024, "source": "tech blogs"},
)

print(f"Created document: {document.id}")
print(f"Name: {document.name}")

Created document: 97601ceb-4397-4e2c-904f-00fa900dfb91
Name: demo_document


In [4]:
from my_vector_db import Chunk

# Define chunks with Chunk objects
demo_chunks = [
    Chunk(
        document_id=document.id,
        text="The quick brown fox jumps over the lazy dog",
        embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
        metadata={"source": "example", "position": 1}
    ),
    Chunk(
        document_id=document.id,
        text="A fast red bird flies through the clear blue sky",
        embedding=[0.9, 0.1, 0.5, 0.3, 0.7],
        metadata={"source": "example", "position": 2}
    )
]

client.add_chunks(chunks=demo_chunks)

[Chunk(id=UUID('24840eb0-0138-4d12-8ca8-c1b148bfbfce'), text='The quick brown fox jumps over the lazy dog', embedding=[0.1, 0.2, 0.3, 0.4, 0.5], metadata={'source': 'example', 'position': 1}, document_id=UUID('97601ceb-4397-4e2c-904f-00fa900dfb91'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 950485), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 950485)),
 Chunk(id=UUID('d0b822a9-6e5e-41d6-8253-454a0e7a42bf'), text='A fast red bird flies through the clear blue sky', embedding=[0.9, 0.1, 0.5, 0.3, 0.7], metadata={'source': 'example', 'position': 2}, document_id=UUID('97601ceb-4397-4e2c-904f-00fa900dfb91'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 950496), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 950496))]

In [5]:
for chunk in client.list_chunks(document_id=document.id):
    chunk.metadata["reviewed_by"] = "demo_bot"
    client.update_chunk(chunk)

print("Updated all chunks with review metadata")

# Verify update
sample = client.list_chunks(document_id=document.id)[0]
sample

Updated all chunks with review metadata


Chunk(id=UUID('24840eb0-0138-4d12-8ca8-c1b148bfbfce'), text='The quick brown fox jumps over the lazy dog', embedding=[0.1, 0.2, 0.3, 0.4, 0.5], metadata={'source': 'example', 'position': 1, 'reviewed_by': 'demo_bot'}, document_id=UUID('97601ceb-4397-4e2c-904f-00fa900dfb91'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 950485), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 960591))

In [6]:
# Update document name
client.update_document(document=document.id, name="demo_document_v2")

Document(id=UUID('97601ceb-4397-4e2c-904f-00fa900dfb91'), name='demo_document_v2', chunk_ids=[UUID('24840eb0-0138-4d12-8ca8-c1b148bfbfce'), UUID('d0b822a9-6e5e-41d6-8253-454a0e7a42bf')], metadata={'year': 2024, 'source': 'tech blogs'}, library_id=UUID('84aed1c8-9272-4a35-8f7e-e0e47f231212'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 942356), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 38, 968985))

In [7]:
# Delete the document; its chunks are deleted with it (cascade)
client.delete_document(document_id=document.id)

### Sample Dataset

This dataset includes:
- **7 articles (chunks)** across 3 categories (AI, Cloud, Security)
- **Rich metadata**: category, topic, confidence scores, authors
- **5D embeddings**: Simplified for demo purposes (production typically uses 768-1536 dimensions)

In [8]:
chunks = [
    # AI/ML chunks
    {
        "text": "Machine learning models use neural networks for pattern recognition",
        "embedding": [0.9, 0.8, 0.1, 0.2, 0.3],
        "metadata": {
            "category": "ai",
            "topic": "machine learning",
            "confidence": 0.95,
            "author": "Alice",
        },
    },
    {
        "text": "Deep learning architectures enable complex AI applications",
        "embedding": [0.85, 0.75, 0.15, 0.25, 0.35],
        "metadata": {
            "category": "ai",
            "topic": "deep learning",
            "confidence": 0.88,
            "author": "Bob",
        },
    },
    {
        "text": "Reinforcement learning enables autonomous decision making",
        "embedding": [0.87, 0.77, 0.13, 0.23, 0.33],
        "metadata": {
            "category": "ai",
            "topic": "reinforcement learning",
            "confidence": 0.90,
            "author": "Alice",
        },
    },
    # Cloud computing chunks
    {
        "text": "Cloud infrastructure provides scalable computing resources",
        "embedding": [0.3, 0.2, 0.9, 0.8, 0.1],
        "metadata": {
            "category": "cloud",
            "topic": "infrastructure",
            "confidence": 0.92,
            "author": "Charlie",
        },
    },
    {
        "text": "Kubernetes orchestrates containerized applications in the cloud",
        "embedding": [0.35, 0.25, 0.85, 0.75, 0.15],
        "metadata": {
            "category": "cloud",
            "topic": "kubernetes",
            "confidence": 0.89,
            "author": "Alice",
        },
    },
    # Security chunks
    {
        "text": "Cybersecurity best practices protect against data breaches",
        "embedding": [0.1, 0.2, 0.3, 0.9, 0.8],
        "metadata": {
            "category": "security",
            "topic": "cybersecurity",
            "confidence": 0.91,
            "author": "Bob",
        },
    },
    {
        "text": "Machine learning detects anomalies in network security",
        "embedding": [0.6, 0.5, 0.4, 0.7, 0.6],
        "metadata": {
            "category": "security",
            "topic": "machine learning",
            "confidence": 0.82,
            "author": "Charlie",
        },
    },
]

print(f"Prepared {len(chunks)} chunks")

Prepared 7 chunks


In [9]:
# Create library with index configuration
library = client.create_library(
    name="tech_articles",
    index_type="flat",
    index_config={"metric": "cosine"},
    metadata={"description": "Technology articles", "version": "1.0"},
)

print(f"Created library: {library.id}")
print(f"Index type: {library.index_type}")
print(f"Metric: {library.index_config['metric']}")

Created library: 5fbbba2b-ca18-4ad0-8aae-a652c3a5de76
Index type: IndexType.FLAT
Metric: cosine


In [10]:
for i, chunk in enumerate(chunks):
    new_doc = client.create_document(
        library_id=library.id,
        name=f"doc_{i+1}",
        metadata={"source": "demo_dataset"},
    )
    client.add_chunks(chunks=[chunk], document_id=new_doc.id)

### Batch Insert

**Design Pattern**: Batch operations are preferred over individual inserts.

**Benefits**:
- Single API call reduces HTTP round-trips
- Atomic transactions (all-or-nothing)
- Better performance for large datasets

**Best Practice**: Always use `add_chunks()` when inserting multiple chunks into a single document, rather than looping over individual `add_chunk()` calls.

In [11]:
# Batch insert all chunks at once
my_doc = client.create_document(
    library_id=library.id,
    name="batch_doc",
    metadata={"source": "demo_dataset"},
)
chunks = client.add_chunks(document_id=my_doc.id, chunks=chunks)

print(f"Inserted {len(chunks)} chunks")
print(f"Sample: {chunks[0].text[:50]}...")

Inserted 7 chunks
Sample: Machine learning models use neural networks for pa...


---

## 3. Reading & Updating Data

After creation, entities are listed by their parent's UUID: documents by library ID, and chunks by document ID (or by library ID to list every chunk in a library). This simplifies API usage while maintaining referential integrity.

In [12]:
# List operations at each level
libraries = client.list_libraries()
print(f"Libraries: {len(libraries)}")

documents = client.list_documents(library_id=library.id)
print(f"Documents: {len(documents)}")

chunks = client.list_all_chunks(library_id=library.id)
print(f"Chunks: {len(chunks)}")

Libraries: 3
Documents: 8
Chunks: 14


### Update Operations

Updates are performed on full objects. Notice how the `updated_at` timestamp changes while `created_at` remains unchanged.

In [13]:
# Update metadata on all chunks
documents = client.list_documents(library_id=library.id)
for document in documents:
    for chunk in client.list_chunks(document_id=document.id):
        chunk.metadata["reviewed_by"] = "demo_bot"
        client.update_chunk(chunk)

print("Updated all chunks with review metadata")

# Verify update
sample = client.list_chunks(document_id=documents[-1].id)
sample

Updated all chunks with review metadata


[Chunk(id=UUID('82bce628-36e7-4f30-97de-b2ffdf6451cc'), text='Machine learning models use neural networks for pattern recognition', embedding=[0.9, 0.8, 0.1, 0.2, 0.3], metadata={'category': 'ai', 'topic': 'machine learning', 'confidence': 0.95, 'author': 'Alice', 'reviewed_by': 'demo_bot'}, document_id=UUID('d2bbe16e-9c5f-4a0d-9f85-987700dce22c'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 39, 14387), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 39, 54390)),
 Chunk(id=UUID('eeedae02-f631-45da-89f3-8fdba9946bcf'), text='Deep learning architectures enable complex AI applications', embedding=[0.85, 0.75, 0.15, 0.25, 0.35], metadata={'category': 'ai', 'topic': 'deep learning', 'confidence': 0.88, 'author': 'Bob', 'reviewed_by': 'demo_bot'}, document_id=UUID('d2bbe16e-9c5f-4a0d-9f85-987700dce22c'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 39, 14392), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 39, 55300)),
 Chunk(id=UUID('09026f40-714c-4563-be18-013965b36b9f'), t

---

## 4. Vector Search

The vector database performs **k-nearest neighbor (kNN)** search using the configured distance metric.

**Current Implementation**: FLAT index
- **Complexity**: O(n) - exhaustive search
- **Recall**: 100% (exact search, guaranteed true nearest neighbors)
- **Best for**: < 10,000 vectors

**Alternative**: HNSW index (planned)
- **Complexity**: O(log n) - approximate search
- **Recall**: ~95-99% (tunable)
- **Best for**: Millions of vectors
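The exhaustive search behind a FLAT index can be sketched in a few lines of plain Python. This is an illustration of the technique, not the server's implementation; `cosine_similarity` and `flat_knn` are hypothetical helper names, and the vectors are taken from the sample dataset above.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Assumes both vectors are non-zero and have the same length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flat_knn(query: list[float], vectors: list[list[float]], k: int) -> list[tuple[float, int]]:
    # Score every stored vector (the O(n) part), then keep the k best.
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


vectors = [
    [0.9, 0.8, 0.1, 0.2, 0.3],       # 0: machine learning
    [0.85, 0.75, 0.15, 0.25, 0.35],  # 1: deep learning
    [0.87, 0.77, 0.13, 0.23, 0.33],  # 2: reinforcement learning
    [0.3, 0.2, 0.9, 0.8, 0.1],       # 3: cloud infrastructure
]
query = [0.88, 0.78, 0.12, 0.22, 0.32]

top = flat_knn(query, vectors, k=3)
for score, index in top:
    print(f"index={index} score={score:.4f}")
```

Because every vector is scored, the result is exact; the cost is that query time grows linearly with the number of stored vectors.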

In [14]:
# Query about AI/ML topics
query_vector = [0.88, 0.78, 0.12, 0.22, 0.32]

search_results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
)

print(f"Query time: {search_results.query_time_ms:.2f}ms")
print(f"Searched {len(search_results.results)} chunks")
print(f"Index type: {library.index_type}\n")

for i, result in enumerate(search_results.results, 1):
    print(f"{i}. Score: {result.score:.4f}")
    print(f"   {result.text}")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Document: {client.get_document(result.document_id).name}")
    print()

Query time: 0.81ms
Searched 5 chunks
Index type: IndexType.FLAT

1. Score: 0.9999
   Reinforcement learning enables autonomous decision making
   Category: ai
   Document: doc_3

2. Score: 0.9999
   Reinforcement learning enables autonomous decision making
   Category: ai
   Document: batch_doc

3. Score: 0.9995
   Machine learning models use neural networks for pattern recognition
   Category: ai
   Document: doc_1

4. Score: 0.9995
   Machine learning models use neural networks for pattern recognition
   Category: ai
   Document: batch_doc

5. Score: 0.9987
   Deep learning architectures enable complex AI applications
   Category: ai
   Document: doc_2



### Add a new chunk and rebuild index

In [15]:
new_doc = client.create_document(
    library_id=library.id,
    name="additional_ai_article",
    metadata={"year": 2024, "source": "tech blogs"},
)

new_chunk = Chunk(
    document_id=new_doc.id,
    text="AI is super cool",
    embedding=[0.8, 0.78, 0.12, 0.22, 0.3],
    metadata={
        "category": "ai",
        "topic": "machine learning",
        "confidence": 0.95,
        "author": "Alice",
    },
)

client.add_chunk(chunk=new_chunk, document_id=new_doc.id)

build_index_response = client.build_index(library_id=library.id)
build_index_response

BuildIndexResult(library_id=UUID('5fbbba2b-ca18-4ad0-8aae-a652c3a5de76'), total_vectors=15, dimension=5, index_type=<IndexType.FLAT: 'flat'>, index_config={'metric': 'cosine'})

In [16]:
# Same query after adding new chunk
query_vector = [0.88, 0.78, 0.12, 0.22, 0.32]

# index is automatically updated
search_results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
)

print(f"Query time: {search_results.query_time_ms:.2f}ms")
print(f"Searched {len(search_results.results)} chunks")
print(f"Index type: {library.index_type}\n")

for i, result in enumerate(search_results.results, 1):
    print(f"{i}. Score: {result.score:.4f}")
    print(f"   {result.text}")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Document: {client.get_document(result.document_id).name}")
    print()

Query time: 0.15ms
Searched 5 chunks
Index type: IndexType.FLAT

1. Score: 0.9999
   Reinforcement learning enables autonomous decision making
   Category: ai
   Document: doc_3

2. Score: 0.9999
   Reinforcement learning enables autonomous decision making
   Category: ai
   Document: batch_doc

3. Score: 0.9995
   Machine learning models use neural networks for pattern recognition
   Category: ai
   Document: doc_1

4. Score: 0.9995
   Machine learning models use neural networks for pattern recognition
   Category: ai
   Document: batch_doc

5. Score: 0.9989
   AI is super cool
   Category: ai
   Document: additional_ai_article



---

## 5. Filtering: Two Approaches

The SDK provides two complementary filtering strategies:

### Approach 1: Custom Functions (SDK-Only)
- **Python functions** for maximum flexibility
- **Client-side** filtering with access to similarity scores
- Complex logic without API changes
- **Use for**: Common filtering scenarios, prototyping

### Approach 2: Declarative Filters (API-Compatible)
- **JSON-serializable** filter definitions
- **Server-side** filtering via REST API
- Works with any HTTP client
- **Use for**: Advanced use cases, production deployments, cross-language clients

#### Custom Filter Functions (SDK Only)

For complex filtering logic, pass a Python function to `filter_function`. The function receives `SearchResult` objects (not `Chunk` objects), which include the similarity score.

**Implementation Detail**: 
- Uses over-fetch strategy (k×3) to compensate for client-side filtering
- Operates on `SearchResult` to avoid circular dependencies
- Enables filtering based on similarity scores
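The over-fetch idea can be sketched as follows, assuming the ranked candidates are already sorted by score. The function name `filtered_top_k` and the dict-based results are illustrative assumptions; the SDK's internals may differ.

```python
from typing import Callable


def filtered_top_k(
    ranked: list[dict],
    k: int,
    keep: Callable[[dict], bool],
    overfetch: int = 3,
) -> list[dict]:
    # Take k * overfetch candidates from the ranked list, filter, then trim to k.
    candidates = ranked[: k * overfetch]
    return [r for r in candidates if keep(r)][:k]


def is_ai(result: dict) -> bool:
    return result["category"] == "ai"


ranked = [
    {"text": "a", "score": 0.99, "category": "ai"},
    {"text": "b", "score": 0.95, "category": "cloud"},
    {"text": "c", "score": 0.90, "category": "cloud"},
    {"text": "d", "score": 0.85, "category": "ai"},
    {"text": "e", "score": 0.80, "category": "ai"},
    {"text": "f", "score": 0.75, "category": "ai"},
]

no_overfetch = filtered_top_k(ranked, k=2, keep=is_ai, overfetch=1)
with_overfetch = filtered_top_k(ranked, k=2, keep=is_ai, overfetch=3)
print([r["text"] for r in no_overfetch])
print([r["text"] for r in with_overfetch])
```

Without over-fetching, the filter sees only the top 2 candidates and returns a single match; with a factor of 3 it sees 6 candidates and can fill both slots.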

In [17]:
query_vector = [0.88, 0.78, 0.12, 0.22, 0.32]

results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
    filter_function=lambda result: result.metadata.get("category") == "ai"
)

print("Filtered by category='ai':\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. {result.text[:60]}...")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Score: {result.score:.4f}")

Filtered by category='ai':

1. Reinforcement learning enables autonomous decision making...
   Category: ai
   Score: 0.9999
2. Reinforcement learning enables autonomous decision making...
   Category: ai
   Score: 0.9999
3. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995
4. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995
5. AI is super cool...
   Category: ai
   Score: 0.9989


In [18]:
from my_vector_db.sdk.models import SearchResult


def complex_filter_function(result: SearchResult) -> bool:
    """
    Returns True for:
    - High-confidence AI articles (category='ai' AND confidence > 0.9)
    - OR any security article (category='security')
    """
    category = result.metadata.get("category")
    confidence = result.metadata.get("confidence")

    # High-confidence AI articles
    is_high_confidence_ai = (
        category == "ai" and confidence is not None and confidence > 0.9
    )

    # Any security article
    is_security = category == "security"

    # OR logic: either condition passes
    return is_high_confidence_ai or is_security


results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=10,
    filter_function=complex_filter_function,
)

print("Filtered by (AI AND confidence>0.9) OR security:\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. [{result.metadata['category']}] {result.text[:50]}...")
    print(
        f"   Confidence: {result.metadata['confidence']}, Score: {result.score:.4f}\n"
    )

Filtered by (AI AND confidence>0.9) OR security:

1. [ai] Machine learning models use neural networks for pa...
   Confidence: 0.95, Score: 0.9995

2. [ai] Machine learning models use neural networks for pa...
   Confidence: 0.95, Score: 0.9995

3. [ai] AI is super cool...
   Confidence: 0.95, Score: 0.9989

4. [security] Machine learning detects anomalies in network secu...
   Confidence: 0.82, Score: 0.8285

5. [security] Machine learning detects anomalies in network secu...
   Confidence: 0.82, Score: 0.8285

6. [security] Cybersecurity best practices protect against data ...
   Confidence: 0.91, Score: 0.4679

7. [security] Cybersecurity best practices protect against data ...
   Confidence: 0.91, Score: 0.4679



In [19]:
from my_vector_db.sdk.models import SearchResult


# Custom filter combining multiple conditions
def high_quality_ai_by_alice(result: SearchResult) -> bool:
    return (
        result.score > 0.95
        and result.metadata.get("category") == "ai"
        and result.metadata.get("author") == "Alice"
        and result.metadata.get("confidence", 0) > 0.85
    )


results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
    filter_function=high_quality_ai_by_alice,
)

print("Custom filter: high score, AI, Alice, high confidence\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. {result.text}")
    print(
        f"   Score: {result.score:.4f}, Confidence: {result.metadata['confidence']}\n"
    )

Custom filter: high score, AI, Alice, high confidence

1. Reinforcement learning enables autonomous decision making
   Score: 0.9999, Confidence: 0.9

2. Reinforcement learning enables autonomous decision making
   Score: 0.9999, Confidence: 0.9

3. Machine learning models use neural networks for pattern recognition
   Score: 0.9995, Confidence: 0.95

4. Machine learning models use neural networks for pattern recognition
   Score: 0.9995, Confidence: 0.95

5. AI is super cool
   Score: 0.9989, Confidence: 0.95



#### Declarative Filters (API-Compatible)

When filters must run server-side or be sent from a non-Python client, use JSON-serializable filter definitions. The API evaluates them, so they work with any HTTP client.

- Supports logical operators (AND, OR, NOT) and comparison operators (==, !=, <, <=, >, >=)
- Enables filtering based on chunk metadata without custom code
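To show how a JSON-serializable filter tree might be evaluated against chunk metadata, here is a minimal sketch. The dict schema (`operator`, `filters`, `field`, `value`) and the operator names (`eq`, `gt`, and so on) are assumptions for illustration and may not match the API's wire format; NOT is left out because its exact semantics are not described here.

```python
import operator

# Hypothetical operator names mapped to Python comparison functions.
COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def matches(node: dict, metadata: dict) -> bool:
    if "filters" in node:
        # Group node: combine child results with AND (default) or OR.
        results = [matches(child, metadata) for child in node["filters"]]
        logic = node.get("operator", "and")
        if logic == "and":
            return all(results)
        if logic == "or":
            return any(results)
        raise ValueError(f"unsupported logical operator: {logic}")
    # Leaf node: a missing field never matches.
    if node["field"] not in metadata:
        return False
    compare = COMPARATORS[node["operator"]]
    return compare(metadata[node["field"]], node["value"])


# (category == "ai" AND confidence > 0.9) OR category == "security"
filter_tree = {
    "operator": "or",
    "filters": [
        {
            "operator": "and",
            "filters": [
                {"field": "category", "operator": "eq", "value": "ai"},
                {"field": "confidence", "operator": "gt", "value": 0.9},
            ],
        },
        {"field": "category", "operator": "eq", "value": "security"},
    ],
}

print(matches(filter_tree, {"category": "ai", "confidence": 0.95}))
print(matches(filter_tree, {"category": "ai", "confidence": 0.88}))
```

Because the tree is plain data, it can be serialized with `json.dumps` and sent by any client, which is the main advantage over a Python callable.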

In [20]:
from my_vector_db.domain.models import (
    SearchFilters,
    FilterGroup,
    MetadataFilter,
    FilterOperator,
    LogicalOperator,
)

# Simple metadata filter
filters = SearchFilters(
    metadata=FilterGroup(
        filters=[
            MetadataFilter(field="category", operator=FilterOperator.EQUALS, value="ai")
        ],
    )
)

results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
    filters=filters,
)

print("Filtered by category='ai':\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. {result.text[:60]}...")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Score: {result.score:.4f}")

Filtered by category='ai':

1. Reinforcement learning enables autonomous decision making...
   Category: ai
   Score: 0.9999
2. Reinforcement learning enables autonomous decision making...
   Category: ai
   Score: 0.9999
3. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995
4. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995
5. AI is super cool...
   Category: ai
   Score: 0.9989


### Complex AND/OR Filters

Filters support nested logic for sophisticated queries. This example finds:
- (AI articles with confidence > 0.9) **OR** (any security article)

In [21]:
complex_filters = SearchFilters(
    metadata=FilterGroup(
        operator=LogicalOperator.OR,
        filters=[
            # High-confidence AI articles
            FilterGroup(
                operator=LogicalOperator.AND,
                filters=[
                    MetadataFilter(
                        field="category", operator=FilterOperator.EQUALS, value="ai"
                    ),
                    MetadataFilter(
                        field="confidence",
                        operator=FilterOperator.GREATER_THAN,
                        value=0.9,
                    ),
                ],
            ),
            # OR any security article
            MetadataFilter(
                field="category", operator=FilterOperator.EQUALS, value="security"
            ),
        ],
    )
)

results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=10,
    filters=complex_filters,
)

print("Filtered by (AI AND confidence>0.9) OR security:\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. [{result.metadata['category']}] {result.text[:50]}...")
    print(
        f"   Confidence: {result.metadata['confidence']}, Score: {result.score:.4f}\n"
    )

Filtered by (AI AND confidence>0.9) OR security:

1. [ai] Machine learning models use neural networks for pa...
   Confidence: 0.95, Score: 0.9995

2. [ai] Machine learning models use neural networks for pa...
   Confidence: 0.95, Score: 0.9995

3. [ai] AI is super cool...
   Confidence: 0.95, Score: 0.9989

4. [security] Machine learning detects anomalies in network secu...
   Confidence: 0.82, Score: 0.8285

5. [security] Machine learning detects anomalies in network secu...
   Confidence: 0.82, Score: 0.8285

6. [security] Cybersecurity best practices protect against data ...
   Confidence: 0.91, Score: 0.4679

7. [security] Cybersecurity best practices protect against data ...
   Confidence: 0.91, Score: 0.4679



### Combined Filters
You can also combine both declarative filters and custom filter functions using the `SearchFiltersWithCallable` class. This allows you to leverage the strengths of both approaches in a single search operation.

- Server side: keep only chunks whose metadata category is "ai"
- Client side: keep only returned results whose text contains the word "pattern"

In [22]:
# Combine a server-side metadata filter with a client-side custom filter
from my_vector_db.domain.models import SearchFiltersWithCallable


filters = SearchFiltersWithCallable(
    metadata=FilterGroup(
        filters=[
            MetadataFilter(field="category", operator=FilterOperator.EQUALS, value="ai")
        ],
    ),
    custom_filter=lambda r: "pattern" in r.text.lower()
)

results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
    combined_filters=filters,
)

print("Filtered by category='ai' and containing 'pattern':\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. {result.text[:60]}...")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Score: {result.score:.4f}")

Filtered by category='ai' and containing 'pattern':

1. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995
2. Machine learning models use neural networks for pattern reco...
   Category: ai
   Score: 0.9995


In [23]:
docs = client.list_documents(library_id=library.id)
doc1 = client.get_document(docs[0].id)
doc2 = client.get_document(docs[1].id)
print(f"Document 1: {doc1.id}, Name: {doc1.name}")
print(f"Document 2: {doc2.id}, Name: {doc2.name}")

doc_filter = SearchFilters(
    document_ids=[doc1.id, doc2.id]
)

results = client.search(
    library_id=library.id,
    embedding=query_vector,
    k=5,
    filters=doc_filter,
)
print("Filtered by docs:\n")
for i, result in enumerate(results.results, 1):
    print(f"{i}. {result.text[:60]}...")
    print(f"   Category: {result.metadata['category']}")
    print(f"   Score: {result.score:.4f}")
    print(f"   Document: {client.get_document(result.document_id).name}")


Document 1: 3f0b52d1-4e46-4185-9533-cf382efae09c, Name: doc_1
Document 2: 3c95f54d-7faf-4238-b2c6-a0be5e4210d7, Name: doc_2
Filtered by docs:



---

## 6. Persistence & Durability

The database supports optional persistence with a simple snapshot-based approach.

**Design Decision**: JSON snapshots
- Saves entire state to JSON every N operations (default: 10)
- Atomic writes using temp file + rename pattern
- Human-readable format for debugging
- Configurable via environment variables

**Tradeoff**: 
- **Pro**: Simple implementation, easy debugging, no threading complexity
- **Con**: May lose last N operations on crash
- **Production Alternative**: PostgreSQL + pgvector with write-ahead logging
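The temp file + rename pattern mentioned above can be sketched with the standard library alone. This is an illustrative version, not the server's code; `save_snapshot_atomically` is a hypothetical name.

```python
import json
import os
import tempfile


def save_snapshot_atomically(state: dict, path: str) -> None:
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # Push the bytes to disk before renaming.
        # os.replace swaps the file in one step: readers see either the old
        # snapshot or the new one, never a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "snapshot.json")
    save_snapshot_atomically({"libraries": 3, "documents": 16}, target)
    with open(target, encoding="utf-8") as fh:
        restored = json.load(fh)

print(restored)
```

A crash during the write leaves the previous snapshot untouched, which is what makes the "may lose last N operations" tradeoff bounded rather than catastrophic.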

In [24]:
import json

# Check current persistence status
status = client.get_persistence_status()

print("Persistence Status:")
print(json.dumps(status, indent=2))

Persistence Status:
{
  "enabled": true,
  "snapshot_exists": true,
  "operations_since_save": 0,
  "snapshot_info": {
    "exists": true,
    "path": "/app/data/snapshot.json",
    "size_bytes": 7400,
    "last_modified": "2025-11-04T21:21:40.566651"
  },
  "save_threshold": -1,
  "stats": {
    "libraries": 3,
    "documents": 16,
    "chunks": 22
  }
}


### Save/Restore Demo

Demonstrate persistence by saving a snapshot, performing a destructive operation, then restoring.

In [25]:
# Save current state
result = client.save_snapshot()
print(json.dumps(result, indent=2))

{
  "message": "Snapshot saved successfully",
  "snapshot_path": "/app/data/snapshot.json",
  "timestamp": "2025-11-05T14:48:39.187521",
  "stats": {
    "libraries": 3,
    "documents": 16,
    "chunks": 22
  },
  "snapshot_info": {
    "exists": true,
    "path": "/app/data/snapshot.json",
    "size_bytes": 21087,
    "last_modified": "2025-11-05T14:48:39.123082"
  }
}


In [26]:
# Delete the library (destructive operation)
client.delete_library(library_id=library.id)

libraries_after_delete = client.list_libraries()
print(f"After delete - Libraries: {len(libraries_after_delete)}")

After delete - Libraries: 2


In [27]:
# Restore from snapshot
result = client.restore_snapshot()

# Verify restoration
libraries_after_restore = client.list_libraries()
print(f"\nVerification - Libraries: {len(libraries_after_restore)}")
for lib in libraries_after_restore:
    print(f"  {lib.name} ({lib.id})")


Verification - Libraries: 3
  tech_articles (d9ecebd5-0a6c-48dd-ab36-80768c5d13d8)
  my_demo_library (84aed1c8-9272-4a35-8f7e-e0e47f231212)
  tech_articles (5fbbba2b-ca18-4ad0-8aae-a652c3a5de76)


---

## 7. Real-World Integration: Agno RAG Agent

This section demonstrates integration with the [Agno](https://github.com/agno-ai/agno) agent framework for building RAG (Retrieval-Augmented Generation) applications.

**RAG Pattern**:
1. User asks a question
2. Agent searches vector DB for relevant context
3. Context augments the LLM prompt
4. LLM generates informed response

**Why This Works**: The hierarchical design (libraries/documents/chunks) naturally maps to knowledge organization, making integration straightforward.
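Step 3 of the RAG pattern, augmenting the prompt with retrieved context, can be sketched independently of any framework. The helper `build_rag_prompt` and the prompt wording are assumptions for illustration; Agno assembles its prompts internally and may do so differently.

```python
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    # Number each retrieved chunk so the model can refer back to it.
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(contexts, 1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}"
    )


# In a real pipeline these strings would be the text of the top search results.
retrieved = [
    "Machine learning models use neural networks for pattern recognition",
    "Deep learning architectures enable complex AI applications",
]

prompt = build_rag_prompt("How do ML models recognize patterns?", retrieved)
print(prompt)
```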

See `examples/agno_example.py` for a complete working implementation.

In [None]:
# Example integration pattern (see agno_example.py for full code)


from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.models.anthropic import Claude
from my_vector_db.db import MyVectorDB

# Create vector database connection
vector_db = MyVectorDB(
    api_base_url="http://localhost:8000",
    library_name="Python Programming Guide",
    index_type="flat",
)

# Create knowledge base that uses our vector DB
knowledge = Knowledge(name="Tech Knowledge Base", vector_db=vector_db, max_results=5)

# Create agent with RAG capabilities
agent = Agent(
    name="Tech Assistant",
    knowledge=knowledge,
    model=Claude(id="claude-sonnet-4-5"),
    search_knowledge=True,  # Enable RAG
)

# Ask a single question (the agent searches the vector DB automatically)
agent.print_response(
    "what are the latest trends in AI and cloud computing?", stream=False
)

---

## 8. Design Patterns & Best Practices

This section summarizes key design decisions and recommended practices.

### 1. Batch Operations Pattern

**Recommended**:
```python
chunks = client.add_chunks(document_id=doc.id, chunks=large_list)
```

**Avoid**:
```python
for chunk in large_list:
    client.add_chunk(chunk=chunk, document_id=doc.id)  # One HTTP round-trip per chunk
```

**Takeaway**: Batch operations reduce network overhead and enable atomic transactions.

In [29]:
from my_vector_db import Chunk

demo_doc = client.create_document(
    library_id=library.id,
    name="demo_document",
    metadata={"purpose": "demo"},
)

# Define chunks with Chunk objects
demo_chunks = [
    Chunk(
        document_id=demo_doc.id,
        text="The quick brown fox jumps over the lazy dog",
        embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
        metadata={"source": "example", "position": 1}
    ),
    Chunk(
        document_id=demo_doc.id,
        text="A fast red bird flies through the clear blue sky",
        embedding=[0.9, 0.1, 0.5, 0.3, 0.7],
        metadata={"source": "example", "position": 2}
    )
]

client.add_chunks(chunks=demo_chunks, document_id=demo_doc.id)

[Chunk(id=UUID('3c2756eb-ad87-41b9-94de-d5ec6123ae50'), text='The quick brown fox jumps over the lazy dog', embedding=[0.1, 0.2, 0.3, 0.4, 0.5], metadata={'source': 'example', 'position': 1}, document_id=UUID('184958ac-c6d4-4fc6-9162-cda31d8bc6a8'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 57, 386167), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 57, 386167)),
 Chunk(id=UUID('aaec4d6d-c055-4584-b076-b07244aec08d'), text='A fast red bird flies through the clear blue sky', embedding=[0.9, 0.1, 0.5, 0.3, 0.7], metadata={'source': 'example', 'position': 2}, document_id=UUID('184958ac-c6d4-4fc6-9162-cda31d8bc6a8'), created_at=datetime.datetime(2025, 11, 5, 14, 48, 57, 386191), updated_at=datetime.datetime(2025, 11, 5, 14, 48, 57, 386191))]

### 2. Index Selection: FLAT vs HNSW

| Metric | Flat Index | HNSW Index |
|--------|------------|------------|
| Search | O(n) - exact | O(log n) - approximate |
| Insert | O(1) | O(log n) |
| Recall | 100% | 95-99% (tunable) |
| Best For | <10K vectors | Millions of vectors |

**Recommendation**: Start with FLAT for accuracy and simplicity. Migrate to HNSW when dataset grows beyond 10,000 vectors.

### 3. Post-Filtering Strategy

**Algorithm**:
1. Perform kNN vector search → get candidates
2. Fetch full chunk data from storage
3. Apply metadata filters
4. Return top k results

**Tradeoff**: 
- **Pro**: Simple implementation, works with any index type
- **Pro**: Index layer doesn't need filter logic
- **Con**: May not return k results if filters are highly selective
- **Con**: Requires over-fetching (k×3 for custom filters)

**Production Alternative**: Pre-filtering with bitmap indexes for highly selective queries.
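The difference between the two strategies can be sketched on a toy result set with precomputed similarity scores. Both function names are hypothetical, and a real pre-filtering index would restrict the candidate set before scoring rather than sorting a Python list.

```python
from typing import Callable


def post_filter_search(items: list[dict], k: int, keep: Callable[[dict], bool]) -> list[dict]:
    # Rank everything, take the top k, then drop results that fail the filter.
    ranked = sorted(items, key=lambda r: r["score"], reverse=True)[:k]
    return [r for r in ranked if keep(r)]


def pre_filter_search(items: list[dict], k: int, keep: Callable[[dict], bool]) -> list[dict]:
    # Restrict the candidate set first, then rank only the survivors.
    allowed = [r for r in items if keep(r)]
    return sorted(allowed, key=lambda r: r["score"], reverse=True)[:k]


def is_security(result: dict) -> bool:
    return result["category"] == "security"


items = [
    {"id": 1, "score": 0.99, "category": "ai"},
    {"id": 2, "score": 0.98, "category": "ai"},
    {"id": 3, "score": 0.83, "category": "security"},
    {"id": 4, "score": 0.80, "category": "ai"},
    {"id": 5, "score": 0.47, "category": "security"},
    {"id": 6, "score": 0.40, "category": "security"},
]

post = post_filter_search(items, k=3, keep=is_security)
pre = pre_filter_search(items, k=3, keep=is_security)
print([r["id"] for r in post])
print([r["id"] for r in pre])
```

With a selective filter, post-filtering returns a single result for `k=3`, while pre-filtering fills all three slots. Over-fetching narrows this gap but cannot close it in general.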

### 4. Type Safety Throughout

**Pattern**: Pydantic models everywhere
```python
library: Library = client.create_library(...)  # Type-checked
document: Document = client.create_document(...)  # Validated
```

**Benefits**:
- Runtime validation catches errors early
- IDE autocomplete improves developer experience
- Auto-generated OpenAPI documentation
- Prevents entire classes of bugs

**Tradeoff**: Slight performance overhead (~10-15%), but worth it for reliability.

### 5. Architecture

**Layers**: API → Service → Storage → Index

**Benefits**:
- **Separation of concerns**: Each layer has clear responsibilities
- **Testable**: Can test each layer in isolation
- **Extensible**: Can swap implementations without affecting other layers


#### Key Design Principles

1. **Layered Architecture**: Clean separation (API → Service → Storage → Index)
2. **Thread-Safe**: RLock-based synchronization for concurrent operations
3. **Type-Safe**: Full Pydantic validation throughout
4. **Persistence**: JSON snapshots with atomic writes
5. **Filtering**: Post-filtering strategy with declarative and custom options
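The RLock-based synchronization in principle 2 can be illustrated with a small in-memory store. `ThreadSafeStore` is a hypothetical class, not the project's storage layer; the point is that a re-entrant lock lets one locked method call another without deadlocking.

```python
import threading


class ThreadSafeStore:
    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._items: dict[str, int] = {}

    def put(self, key: str, value: int) -> None:
        with self._lock:
            self._items[key] = value

    def put_many(self, pairs: dict[str, int]) -> None:
        # Holds the lock for the whole batch and calls put(), which acquires
        # the same lock again. A plain Lock would deadlock here.
        with self._lock:
            for key, value in pairs.items():
                self.put(key, value)

    def count(self) -> int:
        with self._lock:
            return len(self._items)


store = ThreadSafeStore()
workers = [
    threading.Thread(
        target=store.put_many,
        args=({f"{n}-{i}": i for i in range(100)},),
    )
    for n in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(store.count())
```

A single coarse lock like this is simple to reason about, at the cost of serializing all writers.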

![image.png](../docs/api.png)

### Summary: Pragmatic Design Choices

| Feature | Current | Production Alternative |
|---------|---------|------------------------|
| Persistence | JSON snapshots | PostgreSQL + WAL |
| Filtering | Post-filtering | Pre-filtering with bitmaps |
| Index | FLAT (O(n)) | HNSW (O(log n)) |
| Locking | Coarse RLock | Fine-grained locks |
| Storage | In-memory | Distributed (Redis, etc.) |

**Philosophy**: Start simple, scale where needed.

The current design is:
- Easy to understand and debug
- Production-ready for moderate scale (<100K vectors)
- Extensible for larger scale with clear upgrade paths

---

## Cleanup

In [30]:
# Clean up: delete the demo library, then close the client connection
client.delete_library(library_id=library.id)
client.close()
print("Client connection closed")

Client connection closed


---

## Next Steps

1. **API Documentation**: http://localhost:8000/docs
2. **SDK Reference**: `docs/README.md`
3. **More Examples**: `examples/` directory
4. **Run Tests**: `uv run pytest`

**Discussion Questions**:
- When would you switch from FLAT to HNSW index?
- How would you implement distributed storage?
- What monitoring and observability would you add?
- How would you handle schema migrations?