# Lesson 24: Vector Databases - Using Milvus

## Introduction (5 minutes)

Welcome to our lesson on Vector Databases, focusing on the Milvus database. In this 70-minute session, we'll explore what vector databases are, why they're crucial for RAG systems, and how to use Milvus to build efficient and scalable vector search capabilities.

## Lesson Objectives

By the end of this lesson, you will be able to:
1. Understand the concept of vector databases and their importance in RAG systems
2. Recognize the key features and advantages of Milvus
3. Set up and use Milvus for vector storage and retrieval
4. Implement a basic RAG system using Milvus as the vector store
5. Optimize Milvus for performance in RAG applications

## 1. Introduction to Vector Databases (10 minutes)

Vector databases are specialized database systems designed to store, manage, and query high-dimensional vector data efficiently.

Key concepts:
- Vector representation of data
- Similarity search in high-dimensional spaces
- Indexing techniques for fast retrieval

Importance in RAG systems:
- Efficient storage of document embeddings
- Fast similarity search for relevant document retrieval
- Scalability for large document collections
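
To make the idea of similarity search concrete, here is a minimal sketch in plain Python that ranks a few toy vectors by distance to a query. The vectors and helper names are made up for illustration. A vector database performs the same kind of ranking, but uses indexes to avoid comparing the query against every stored vector.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity: larger means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(vectors, query, top_k=2):
    """Return the indices of the top_k vectors closest to the query (L2)."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2_distance(vectors[i], query))
    return ranked[:top_k]

# Toy 3-dimensional "embeddings" (made-up values for illustration)
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

nearest = brute_force_search(vectors, query, top_k=2)
print(nearest)
```

This exhaustive scan costs one distance computation per stored vector for every query, which is why indexing matters once a collection grows large.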

## 2. Overview of Milvus (10 minutes)

Milvus is an open-source vector database built for scalable similarity search and AI applications.

Key features of Milvus:
- High performance and scalability
- Support for multiple index types (e.g., FLAT, IVF_FLAT, HNSW)
- Hybrid search capabilities (vector + scalar filtering)
- Cloud-native architecture
- Support for multiple programming languages

Advantages of Milvus in RAG systems:
- Optimized for large-scale vector operations
- Flexible deployment options (standalone, cluster)
- Integration with popular AI frameworks
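
The difference between a FLAT index and an IVF-style index can be sketched in plain Python. This is a toy illustration, not how Milvus implements its indexes: real IVF indexes learn centroids with k-means, while this sketch simply samples existing vectors as centroids. The function names are hypothetical. The variables `nlist` (number of buckets) and `nprobe` (buckets scanned per query) mirror the parameter names used with IVF_FLAT.

```python
import math
import random

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flat_search(vectors, query, top_k):
    """FLAT-style search: compare the query with every vector (exact)."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return ranked[:top_k]

def build_ivf(vectors, centroids):
    """Assign each vector to the bucket of its nearest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def ivf_search(vectors, centroids, buckets, query, top_k, nprobe):
    """IVF-style search: scan only the nprobe buckets closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in buckets[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:top_k]

random.seed(0)
nlist = 8
vectors = [[random.random() for _ in range(4)] for _ in range(200)]
centroids = random.sample(vectors, nlist)  # simplification: no k-means training
buckets = build_ivf(vectors, centroids)
query = [random.random() for _ in range(4)]

exact = flat_search(vectors, query, top_k=3)
approx = ivf_search(vectors, centroids, buckets, query, top_k=3, nprobe=2)
full = ivf_search(vectors, centroids, buckets, query, top_k=3, nprobe=nlist)
print(exact, approx, full)
```

Scanning every bucket (`nprobe == nlist`) reproduces the exact result. A smaller `nprobe` does less work per query but may miss some true neighbors, which is the accuracy/speed trade-off behind approximate indexes.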

## 3. Setting up Milvus (10 minutes)

Let's set up Milvus using Docker and connect to it using the Python SDK:

```bash
# Pull and run Milvus using Docker
docker pull milvusdb/milvus:latest
docker run -d --name milvus_standalone -p 19530:19530 -p 9091:9091 milvusdb/milvus:latest
# Note: Milvus 2.x standalone also needs etcd and object storage. If the container
# exits right after starting, follow the standalone install guide in the Milvus docs
# (install script or Docker Compose) instead of the bare `docker run` above.
```

Now, let's install the Milvus Python SDK and connect to our Milvus instance:

```bash
pip install pymilvus
```

```python
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

# Connect to Milvus
connections.connect("default", host="localhost", port="19530")

# Define collection schema
# 384 matches the all-MiniLM-L6-v2 embeddings used in the next section,
# which reuses this collection name
dim = 384  # Dimension of your embeddings
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim)
]
schema = CollectionSchema(fields, "A collection for document embeddings")

# Create collection
collection_name = "document_embeddings"
collection = Collection(collection_name, schema)

print(f"Collection '{collection_name}' created successfully.")
```

## 4. Using Milvus in a RAG System (20 minutes)

Let's implement a basic RAG system using Milvus as the vector store:

```python
# Requires: pip install pymilvus sentence-transformers
from pymilvus import (
    connections, Collection, CollectionSchema, DataType, FieldSchema, utility
)
from sentence_transformers import SentenceTransformer

class MilvusRAG:
    def __init__(self, collection_name="document_embeddings", dim=384):
        self.collection_name = collection_name
        self.dim = dim  # must match the embedding model's output size
        self.model = SentenceTransformer('all-MiniLM-L6-v2')
        
        # Connect to Milvus
        connections.connect("default", host="localhost", port="19530")
        
        # Get or create collection
        if utility.has_collection(collection_name):
            self.collection = Collection(collection_name)
        else:
            self._create_collection()
        
        # Create index if not exists
        if not self.collection.has_index():
            self._create_index()
        
        # A collection must be loaded into memory before it can be searched
        self.collection.load()

    def _create_collection(self):
        fields = [
            FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
            FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=self.dim)
        ]
        schema = CollectionSchema(fields, "Document embeddings for RAG")
        self.collection = Collection(self.collection_name, schema)

    def _create_index(self):
        index_params = {
            "metric_type": "L2",
            "index_type": "IVF_FLAT",
            "params": {"nlist": 1024}
        }
        self.collection.create_index("embedding", index_params)

    def add_documents(self, documents):
        embeddings = self.model.encode(documents)
        # Column-based insert: one list per non-auto field (here only "embedding")
        self.collection.insert([embeddings.tolist()])
        self.collection.flush()

    def query(self, query_text, top_k=3):
        query_embedding = self.model.encode(query_text)
        search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
        results = self.collection.search(
            data=[query_embedding.tolist()],
            anns_field="embedding",
            param=search_params,
            limit=top_k,
            output_fields=["id"]
        )
        return results

# Usage
rag = MilvusRAG()

# Add documents
documents = [
    "Milvus is a vector database optimized for similarity search.",
    "RAG systems combine retrieval and generation for better AI responses.",
    "Vector databases are crucial for efficient embedding storage and retrieval.",
    "Milvus supports multiple index types for fast similarity search."
]
rag.add_documents(documents)

# Query
query = "What is Milvus used for in RAG systems?"
results = rag.query(query)

print(f"Query: {query}")
print("Results:")
for hit in results[0]:
    # With the L2 metric, a smaller distance means a closer match
    print(f"ID: {hit.id}, Distance: {hit.distance}")
```
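
The search above returns only IDs and distances, so a generator has nothing to read yet. One way to close that gap, sketched below in plain Python, is to keep a mapping from ID to document text and assemble a prompt from the retrieved passages. The `build_prompt` helper, the `id_to_text` dictionary, and the IDs are hypothetical. In practice the mapping could be built from the primary keys Milvus returns on insert, or the text could be stored in an extra field of the collection.

```python
def build_prompt(question, hit_ids, id_to_text):
    """Assemble a generation prompt from the texts behind the retrieved IDs."""
    # Skip IDs with no stored text rather than failing
    passages = [id_to_text[i] for i in hit_ids if i in id_to_text]
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical IDs standing in for the primary keys assigned by Milvus
id_to_text = {
    101: "Milvus is a vector database optimized for similarity search.",
    102: "RAG systems combine retrieval and generation for better AI responses.",
}

prompt = build_prompt("What is Milvus used for in RAG systems?", [101, 102], id_to_text)
print(prompt)
```

The resulting string would then be passed to whichever language model handles the generation step.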

## 5. Optimizing Milvus for RAG Performance (10 minutes)

To optimize Milvus for RAG applications, consider the following:

1. Choose the right index:
   - HNSW for high recall and low-latency search, at the cost of higher memory use
   - IVF_FLAT for a balance of accuracy and speed
   - FLAT for small datasets or when 100% recall is required

2. Tune index parameters:
   - Adjust `nlist` for IVF_FLAT based on dataset size
   - Optimize `M` and `efConstruction` for HNSW based on performance requirements

3. Use hybrid search:
   - Combine vector similarity with metadata filtering for more accurate results

4. Implement caching:
   - Use query result caching to improve response times for repeated queries
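
As a starting point for the first two items, the sketch below builds an index parameter dictionary for each index type, in the same shape passed to `create_index` earlier. The helper name is hypothetical. The `nlist` heuristic of roughly 4 × sqrt(n) is a commonly cited rule of thumb, and the HNSW values are illustrative defaults, so treat all of them as assumptions to benchmark rather than recommended settings.

```python
import math

def suggest_index_params(index_type, num_vectors, metric_type="L2"):
    """Return a starting-point index_params dict for a Milvus vector field."""
    if index_type == "FLAT":
        params = {}  # exhaustive search, nothing to tune
    elif index_type == "IVF_FLAT":
        # Rule of thumb (assumption): nlist grows with the square root of the data size
        params = {"nlist": max(1, int(4 * math.sqrt(num_vectors)))}
    elif index_type == "HNSW":
        # Illustrative values: larger M / efConstruction raise recall,
        # but also memory use and build time
        params = {"M": 16, "efConstruction": 200}
    else:
        raise ValueError(f"Unsupported index type: {index_type}")
    return {"metric_type": metric_type, "index_type": index_type, "params": params}

ivf_params = suggest_index_params("IVF_FLAT", num_vectors=1_000_000)
hnsw_params = suggest_index_params("HNSW", num_vectors=1_000_000)
print(ivf_params)
print(hnsw_params)
```

Search-time parameters (`nprobe` for IVF_FLAT, `ef` for HNSW) are tuned separately and trade recall against latency in the same way.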

Example of hybrid search:

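```python
# Assumes the collection schema also defines a JSON field named "metadata";
# the schema in Section 4 would need that field added for this to work.
def hybrid_query(self, query_text, metadata_filter, top_k=3):
    query_embedding = self.model.encode(query_text)
    search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
    results = self.collection.search(
        data=[query_embedding.tolist()],
        anns_field="embedding",
        param=search_params,
        limit=top_k,
        expr=metadata_filter,  # scalar filter applied alongside the vector search
        output_fields=["id", "metadata"]
    )
    return results

# Attach the function as a method of the MilvusRAG class from Section 4
MilvusRAG.hybrid_query = hybrid_query

# Usage
# Keys inside a JSON field are accessed with bracket syntax in filter expressions
metadata_filter = 'metadata["category"] == "technology"'
results = rag.hybrid_query("What is Milvus?", metadata_filter)
```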
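
The caching suggestion can be sketched as a small LRU cache wrapped around any search function. This is a minimal illustration in plain Python. The `QueryCache` class and the `fake_search` stand-in are hypothetical and not part of pymilvus.

```python
from collections import OrderedDict

class QueryCache:
    """Small LRU cache keyed by (query text, top_k)."""

    def __init__(self, search_fn, max_size=128):
        self.search_fn = search_fn
        self.max_size = max_size
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def search(self, query_text, top_k=3):
        key = (query_text, top_k)
        if key in self._cache:
            self.hits += 1
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        self.misses += 1
        result = self.search_fn(query_text, top_k)
        self._cache[key] = result
        if len(self._cache) > self.max_size:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return result

    def clear(self):
        """Call after inserting documents so stale results are not served."""
        self._cache.clear()

# Stand-in for a real vector search, used only to exercise the cache
def fake_search(query_text, top_k):
    return [f"{query_text}-{i}" for i in range(top_k)]

cache = QueryCache(fake_search, max_size=2)
first = cache.search("What is Milvus?")
second = cache.search("What is Milvus?")  # served from the cache
print(cache.hits, cache.misses)
```

With the class from Section 4, the same wrapper could be created as `QueryCache(rag.query)`. The cache should be cleared whenever documents are added, since cached results would otherwise miss the new entries.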

## Conclusion and Q&A (5 minutes)

In this lesson, we've explored vector databases, focusing on Milvus and its application in RAG systems. We've learned how to set up Milvus, use it to store and retrieve document embeddings, and implement a basic RAG system. We've also discussed optimization strategies for better performance.

Are there any questions about Milvus or its use in RAG systems?

## Additional Resources

1. Milvus documentation: https://milvus.io/docs
2. "Vector Similarity Search: From Basics to Production" article: https://pinecone.io/learn/vector-similarity-search/
3. "Evaluating Vector Database Performance: A Milvus Case Study" blog post: https://milvus.io/blog/evaluating-vector-database-performance-a-milvus-case-study.md
4. Milvus Python SDK GitHub repository: https://github.com/milvus-io/pymilvus

In our next lesson, we'll explore advanced techniques for keyword search and vector retrieval in RAG systems.