![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)
# Migrating from FLAT to SVS-VAMANA

## Let's Begin!
<a href="https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/06_svs_vamana_migration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook demonstrates how to migrate existing FLAT vector indices to SVS-VAMANA for improved memory efficiency and cost savings.

## What You'll Learn

- How to assess your current FLAT index for migration
- Step-by-step migration from FLAT to SVS-VAMANA
- Memory usage comparison and cost analysis
- Search quality validation
- Performance benchmarking
- Migration decision framework

## Prerequisites

- Redis Stack 8.2.0+ with RediSearch 2.8.10+
- Existing vector index with substantial data (1000+ documents recommended)
- Vector embeddings (768 dimensions using sentence-transformers/all-mpnet-base-v2)

## 📦 Installation & Setup

This notebook requires **sentence-transformers** for generating embeddings and **Redis Stack** running in Docker.

**Requirements:**
- Redis Stack 8.2.0+ with RediSearch 2.8.10+
- sentence-transformers (for generating embeddings)
- numpy (for vector operations)
- redisvl (should be available in your environment)

**🐳 Docker Setup (Required):**

Before running this notebook, make sure Redis Stack is running in Docker:

```bash
# Start Redis Stack with Docker
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
```

Or if you prefer using docker-compose, create a `docker-compose.yml` file:

```yaml
version: '3.8'
services:
  redis:
    image: redis/redis-stack:latest
    ports:
      - "6379:6379"
      - "8001:8001"
```

Then run: `docker-compose up -d`

**📚 Python Dependencies Installation:**

Install the required Python packages:

```bash
# Install core dependencies
pip install redisvl numpy sentence-transformers

# Or install with specific versions for compatibility
pip install "redisvl>=0.2.0" "numpy>=1.21.0" "sentence-transformers>=2.2.0"
```

**For Google Colab users, run this cell:**

```python
!pip install redisvl sentence-transformers numpy
```

**For Conda users:**

```bash
conda install numpy
pip install redisvl sentence-transformers
```

In [35]:
# Standard library
import time

# Third-party
import numpy as np
import redis

# RedisVL
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery
from redisvl.redis.utils import array_to_buffer, buffer_to_array
from redisvl.utils import CompressionAdvisor
from redisvl.redis.connection import supports_svs


## Step 1: Verify SVS-VAMANA Support

First, let's ensure your Redis environment supports SVS-VAMANA.

In [36]:
# Check Redis connection and SVS support
REDIS_URL = "redis://localhost:6379"

try:
    client = redis.Redis.from_url(REDIS_URL)
    client.ping()
    print("✅ Redis connection successful")
    
    if supports_svs(client):
        print("✅ SVS-VAMANA supported")
        print("   Ready for migration!")
    else:
        print("❌ SVS-VAMANA not supported")
        print("   Requires Redis >= 8.2.0 with RediSearch >= 2.8.10")
        print("   Please upgrade Redis Stack before proceeding")
        
except Exception as e:
    print(f"❌ Redis connection failed: {e}")
    print("   Please ensure Redis is running and accessible")

✅ Redis connection successful
✅ SVS-VAMANA supported
   Ready for migration!
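
Where `supports_svs` is not available, the version requirement quoted in the failure message can be checked by hand. The sketch below compares dotted version strings as integer tuples. `parse_version` and `meets_minimum` are hypothetical helpers, not part of redisvl, and the version strings are illustrative; it assumes purely numeric components (no suffix such as `-rc1`).

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '8.2.0' into a 3-tuple of ints."""
    parts = [int(part) for part in version.split(".")[:3]]
    # Pad short versions ('8.2' -> (8, 2, 0)) so tuples compare consistently.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Return True when `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


# Illustrative inputs; a real check would read the version from the server.
for candidate in ["7.4.2", "8.2.0", "8.10.1"]:
    print(candidate, meets_minimum(candidate, "8.2.0"))
```

Comparing tuples of integers rather than raw strings matters for versions such as `8.10.1`, which sorts before `8.2.0` as text.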


## Step 2: Assess Your Current Index

For this demonstration, we'll create a sample FLAT index. In practice, you would analyze your existing index.

In [37]:
# Load the sample movie data shipped with redis-ai-resources (read from a local file)
print("📥 Loading sample movie data...")
import json

# Load the movies dataset
data_path = "resources/movies.json"
with open(data_path, "r") as f:
    movies_data = json.load(f)

print(f"Loaded {len(movies_data)} movie records")
print(f"Sample movie: {movies_data[0]['title']} - {movies_data[0]['description']}")

📥 Loading sample movie data...
Loaded 20 movie records
Sample movie: Explosive Pursuit - A daring cop chases a notorious criminal across the city in a high-stakes game of cat and mouse.


In [38]:
# Configuration for demonstration  
dims = 768  # sentence-transformers/all-mpnet-base-v2 - 768 dims

num_docs = len(movies_data)  # Use actual dataset size

print(
    "📊 Migration Assessment",
    f"Vector dimensions: {dims} (sentence-transformers/all-mpnet-base-v2)",
    f"Dataset size: {num_docs} movie documents",
    "Data includes: title, genre, rating, description",
    sep="\n"
)

📊 Migration Assessment
Vector dimensions: 768 (sentence-transformers/all-mpnet-base-v2)
Dataset size: 20 movie documents
Data includes: title, genre, rating, description
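
As a rough baseline for the assessment, the raw size of uncompressed float32 vectors is `num_docs × dims × 4` bytes. The sketch below works that arithmetic for the 20-document sample and for a hypothetical one-million-document corpus. It counts vector payload only and ignores index structures and allocation overhead; `raw_vector_bytes` is an illustrative helper.

```python
BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_docs: int, dims: int, bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Payload size of the vectors alone, excluding any index overhead."""
    return num_docs * dims * bytes_per_value


# The sample dataset used in this notebook
sample_bytes = raw_vector_bytes(20, 768)
print(f"20 docs: {sample_bytes} bytes ({sample_bytes / 1024**2:.3f} MB)")

# A hypothetical larger corpus with the same embedding model
large_bytes = raw_vector_bytes(1_000_000, 768)
print(f"1M docs: {large_bytes / 1024**3:.2f} GB")
```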


---
Next, let's configure a sample FLAT index. Note the `algorithm`, `dims`, and `datatype` values under the vector field's `attrs`.

In [39]:
flat_schema = {
    "index": {
        "name": "migration_demo_flat",
        "prefix": "demo:flat:",
    },
    "fields": [
        {"name": "movie_id", "type": "tag"},
        {"name": "title", "type": "text"},
        {"name": "genre", "type": "tag"},
        {"name": "rating", "type": "numeric"},
        {"name": "description", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": dims,
                "algorithm": "flat",
                "datatype": "float32",
                "distance_metric": "cosine"
            }
        }
    ]
}

# Create and populate FLAT index
print("Creating sample FLAT index...")
flat_index = SearchIndex.from_dict(flat_schema, redis_url=REDIS_URL)
flat_index.create(overwrite=True)
print(f"✅ Created FLAT index: {flat_index.name}")

Creating sample FLAT index...
✅ Created FLAT index: migration_demo_flat


---
Generate embeddings for movie descriptions


In [40]:
from sentence_transformers import SentenceTransformer

print("🔄 Generating embeddings for movie descriptions...")
embedding_model="sentence-transformers/all-mpnet-base-v2"

try:
    # Try to use sentence-transformers for real embeddings
    print("📦 Loading sentence transformer model...")
    model = SentenceTransformer(embedding_model)
    print(f"✅ Loaded embedding model with {dims} dimensions")
    
    # Generate real embeddings
    descriptions = [movie['description'] for movie in movies_data]
    embeddings = model.encode(descriptions, convert_to_numpy=True, normalize_embeddings=True)
    print(f"✅ Generated {len(embeddings)} real embeddings using sentence-transformers")
    
except ImportError:
    # Fallback to synthetic embeddings
    print("⚠️  sentence-transformers not available, using synthetic embeddings")
    print(f"📦 Using {dims} dimensions for synthetic embeddings")
    
    # Generate synthetic embeddings (normalized random vectors for demo)
    np.random.seed(42)  # For reproducible results
    embeddings = []
    for i, movie in enumerate(movies_data):
        # Create a pseudo-semantic embedding based on movie content
        vector = np.random.random(dims).astype(np.float32)
        # Add some structure based on genre
        if movie['genre'] == 'action':
            vector[:50] += 0.3  # Action movies cluster
        else:  # comedy
            vector[50:100] += 0.3  # Comedy movies cluster
        
        # Normalize
        vector = vector / np.linalg.norm(vector)
        embeddings.append(vector)
    
    embeddings = np.array(embeddings)
    print(f"✅ Generated {len(embeddings)} synthetic embeddings")

# Prepare data for loading
sample_data = []
for i, movie in enumerate(movies_data):
    sample_data.append({
        'movie_id': str(movie['id']),
        'title': movie['title'],
        'genre': movie['genre'],
        'rating': movie['rating'],
        'description': movie['description'],
        'embedding': array_to_buffer(embeddings[i].astype(np.float32), dtype='float32')
    })

🔄 Generating embeddings for movie descriptions...
📦 Loading sentence transformer model...
14:45:27 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
14:45:27 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
✅ Loaded embedding model with 768 dimensions



✅ Generated 20 real embeddings using sentence-transformers
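
The `embedding` field is stored as raw bytes rather than as a list of numbers. To illustrate what such a buffer contains, here is a standard-library sketch that packs floats as float32 values and unpacks them again. It is an illustration of the byte layout, not the implementation of `array_to_buffer`, and it assumes little-endian byte order; `floats_to_buffer` and `buffer_to_floats` are hypothetical names.

```python
import struct


def floats_to_buffer(values):
    """Pack a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def buffer_to_floats(buffer):
    """Unpack little-endian float32 bytes back into a list of floats."""
    count = len(buffer) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", buffer))


vector = [0.5, -0.25, 1.0]  # values chosen to be exactly representable in float32
buffer = floats_to_buffer(vector)
print(len(buffer), "bytes")
print(buffer_to_floats(buffer))
```

A 768-dimensional float32 vector therefore occupies 768 × 4 = 3072 bytes per document.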


In [41]:
# Load data into FLAT index
print("📥 Loading data into FLAT index...")
batch_size = 100  # Process in batches

for i in range(0, len(sample_data), batch_size):
    batch = sample_data[i:i+batch_size]
    flat_index.load(batch)
    print(f"  Loaded {min(i+batch_size, len(sample_data))}/{len(sample_data)} documents")

# Wait for indexing to complete
print("Waiting for indexing to complete...")
time.sleep(3)

flat_info = flat_index.info()
print(f"\n✅ FLAT index loaded with {flat_info['num_docs']} documents")
print(f"Index size: {flat_info.get('vector_index_sz_mb', 'N/A')} MB")

📥 Loading data into FLAT index...
  Loaded 20/20 documents
Waiting for indexing to complete...

✅ FLAT index loaded with 20 documents
Index size: 3.0168838500976563 MB
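
A fixed `time.sleep` can be too short on larger datasets. One alternative is to poll the index info until indexing reports completion. The sketch below assumes the dictionary returned by `info()` exposes the `percent_indexed` field from `FT.INFO` (a ratio between 0 and 1); `wait_until_indexed` is a hypothetical helper, and the stand-in `fake_info` function replaces a live index so the example runs without Redis.

```python
import time


def wait_until_indexed(get_info, timeout_s=30.0, poll_interval_s=0.5):
    """Poll get_info() until percent_indexed reaches 1 or the timeout expires.

    Returns True if indexing completed, False if the timeout was hit first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        info = get_info()
        if float(info.get("percent_indexed", 0)) >= 1.0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Stand-in for a live index: reports 40% on the first call, then complete.
_responses = iter([{"percent_indexed": "0.4"}, {"percent_indexed": "1"}])


def fake_info():
    return next(_responses)


print(wait_until_indexed(fake_info, timeout_s=5.0, poll_interval_s=0.01))
```

With a real index the call would look like `wait_until_indexed(flat_index.info)`.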


## Step 3: Get Compression Recommendation

`CompressionAdvisor` recommends compression settings for SVS-VAMANA vector indices from two inputs: the vector dimensionality and a priority (memory, speed, or balanced). This gives you a reasonable starting configuration without manual parameter tuning.

## Configuration Strategy
**High-Dimensional Vectors (≥1024 dims)**: Uses **LeanVec4x8** compression with dimensionality reduction. Memory priority reduces dimensions by 50%, speed priority by 25%, balanced by 50%. Achieves 60-80% memory savings.

**Lower-Dimensional Vectors (<1024 dims)**: Uses **LVQ compression** without dimensionality reduction. Memory priority uses LVQ4 (4 bits), speed uses LVQ4x8 (12 bits), balanced uses LVQ4x4 (8 bits). Achieves 60-87% memory savings.

**Our Configuration (768 dims)**: Will use **LVQ compression** as we're below the 1024 dimension threshold. This provides excellent compression without dimensionality reduction.
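
The strategy described above can be written down as a small decision function. This is a sketch of the rules as stated in this section, not the library's implementation; `sketch_recommendation` and `HIGH_DIM_THRESHOLD` are hypothetical names, and the returned dictionary only covers the compression type and the reduced dimension.

```python
HIGH_DIM_THRESHOLD = 1024  # threshold quoted in the strategy above


def sketch_recommendation(dims, priority="balanced"):
    """Apply the documented strategy: LeanVec at >=1024 dims, LVQ below."""
    if priority not in ("memory", "speed", "balanced"):
        raise ValueError(f"unknown priority: {priority!r}")

    if dims >= HIGH_DIM_THRESHOLD:
        # Speed priority removes 25% of the dimensions; memory and balanced remove 50%.
        fraction_removed = 0.25 if priority == "speed" else 0.50
        return {"compression": "LeanVec4x8", "reduce": int(dims * (1 - fraction_removed))}

    # Below the threshold: LVQ without dimensionality reduction.
    lvq_by_priority = {"memory": "LVQ4", "speed": "LVQ4x8", "balanced": "LVQ4x4"}
    return {"compression": lvq_by_priority[priority]}


for example_dims in (768, 1536):
    for example_priority in ("memory", "speed", "balanced"):
        print(example_dims, example_priority, sketch_recommendation(example_dims, example_priority))
```

The sketch rejects unknown priority names instead of falling back to a default, which makes a misspelled priority visible immediately.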

## Available Compression Types
- **LVQ4/LVQ4x4/LVQ4x8**: 4/8/12 bits per dimension
- **LeanVec4x8/LeanVec8x8**: 12/16 bits + dimensionality reduction for high-dim vectors
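
The bits-per-dimension figures translate directly into the savings range quoted for LVQ. The sketch below computes the per-vector size and the saving relative to float32 (32 bits per dimension) for 768-dimensional vectors. It is an idealized calculation that ignores any per-vector metadata and the graph structure, so real savings will be somewhat lower.

```python
FLOAT32_BITS = 32
BITS_PER_DIM = {"LVQ4": 4, "LVQ4x4": 8, "LVQ4x8": 12}


def savings_pct(bits_per_dim):
    """Percentage saved relative to storing each dimension as float32."""
    return (1 - bits_per_dim / FLOAT32_BITS) * 100


dims = 768
float32_bytes = dims * FLOAT32_BITS // 8
for name, bits in BITS_PER_DIM.items():
    compressed_bytes = dims * bits // 8
    print(f"{name}: {compressed_bytes} bytes vs {float32_bytes} bytes "
          f"({savings_pct(bits):.1f}% smaller)")
```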


In [42]:
# Get compression recommendation
print("🔍 Analyzing compression options...")
print()

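# Try different priorities to show options.
# The strategy text above calls the third priority "speed"; this run passes
# "performance" and gets the same result as "balanced" (see the output below).
priorities = ["memory", "balanced", "performance"]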
configs = {}

for priority in priorities:
    config = CompressionAdvisor.recommend(dims=dims, priority=priority)
    configs[priority] = config
    print(f"{priority.upper()} priority:")
    print(f"  Algorithm: {config['algorithm']}")
    print(f"  Compression: {config.get('compression', 'None')}")
    print(f"  Datatype: {config['datatype']}")
    if 'reduce' in config:
        reduction = ((dims - config['reduce']) / dims) * 100
        print(f"  Dimensionality: {dims} → {config['reduce']} ({reduction:.1f}% reduction)")
    print()

# Select memory-optimized configuration for migration
selected_config = configs["memory"]
print(f"📋 Selected configuration: {selected_config['compression']} with {selected_config['datatype']}")
print(f"Expected memory savings: Significant for {dims}-dimensional vectors")

🔍 Analyzing compression options...

MEMORY priority:
  Algorithm: svs-vamana
  Compression: LVQ4
  Datatype: float32

BALANCED priority:
  Algorithm: svs-vamana
  Compression: LVQ4x4
  Datatype: float32

PERFORMANCE priority:
  Algorithm: svs-vamana
  Compression: LVQ4x4
  Datatype: float32

📋 Selected configuration: LVQ4 with float32
Expected memory savings: Significant for 768-dimensional vectors


## Step 4: Create SVS-VAMANA Index

Now we'll create the new SVS-VAMANA index with the recommended compression settings.

In [43]:
# Fallback configuration if not defined (for CI/CD compatibility)
if 'selected_config' not in locals():
    from redisvl.utils import CompressionAdvisor
    selected_config = CompressionAdvisor.recommend(dims=dims, priority="memory")

# Create SVS-VAMANA schema with compression
svs_schema = {
    "index": {
        "name": "migration_demo_svs",
        "prefix": "demo:svs:",
    },
    "fields": [
        {"name": "movie_id", "type": "tag"},
        {"name": "title", "type": "text"},
        {"name": "genre", "type": "tag"},
        {"name": "rating", "type": "numeric"},
        {"name": "description", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
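                "dims": selected_config.get('reduce', dims),  # Reduced dims only if the advisor returns 'reduce'; 768 here
                "algorithm": "svs-vamana",
                "datatype": selected_config['datatype'],
                "distance_metric": "cosine",
                # Pass the recommended compression explicitly so the index actually uses it
                "compression": selected_config['compression']
                # Note: Don't include the full selected_config to avoid dims/reduce conflict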
            }
        }
    ]
}

print("Creating SVS-VAMANA index with compression...")
svs_index = SearchIndex.from_dict(svs_schema, redis_url=REDIS_URL)
svs_index.create(overwrite=True)
print(f"✅ Created SVS-VAMANA index: {svs_index.name}")
print(f"Compression: {selected_config.get('compression', 'None')}")
print(f"Datatype: {selected_config['datatype']}")

Creating SVS-VAMANA index with compression...
✅ Created SVS-VAMANA index: migration_demo_svs
Compression: LVQ4
Datatype: float32


## Step 5: Migrate Data

Extract data from the original index and load it into the SVS-VAMANA index with compression applied.

In [44]:
print("🔄 Migrating data to SVS-VAMANA...")

# Fallback configuration if not defined (for CI/CD compatibility)
if 'selected_config' not in locals():
    from redisvl.utils import CompressionAdvisor
    selected_config = CompressionAdvisor.recommend(dims=dims, priority="memory")

# Determine target vector dimensions (may be reduced by LeanVec)
target_dims = selected_config.get('reduce', dims)
target_dtype = selected_config['datatype']

print(f"Target dimensions: {target_dims} (from {dims})")
print(f"Target datatype: {target_dtype}")


🔄 Migrating data to SVS-VAMANA...
Target dimensions: 768 (from 768)
Target datatype: float32


In [45]:
# Extract data from FLAT index
print("Extracting data from original index...")
# SCAN iterates incrementally; KEYS would block the server on a large keyspace
keys = list(client.scan_iter(match="demo:flat:*"))
print(f"Found {len(keys)} documents to migrate")

# Process and transform data for SVS index
svs_data = []
for i, key in enumerate(keys):
    doc_data = client.hgetall(key)
    
    if b'embedding' in doc_data:
        # Extract original vector
        original_vector = np.array(buffer_to_array(doc_data[b'embedding'], dtype='float32'))
        
        # Apply dimensionality reduction if needed (LeanVec)
        if target_dims < dims:
            vector = original_vector[:target_dims]
        else:
            vector = original_vector
        
        # Convert to target datatype
        if target_dtype == 'float16':
            vector = vector.astype(np.float16)
        
        svs_data.append({
            "movie_id": doc_data[b'movie_id'].decode(),
            "title": doc_data[b'title'].decode(),
            "genre": doc_data[b'genre'].decode(),
            "rating": int(doc_data[b'rating'].decode()),
            "description": doc_data[b'description'].decode(),
            "embedding": array_to_buffer(vector, dtype=target_dtype)
        })
    
    if (i + 1) % 500 == 0:
        print(f"  Processed {i + 1}/{len(keys)} documents")

print(f"Prepared {len(svs_data)} documents for migration")

Extracting data from original index...
Found 40 documents to migrate
Prepared 40 documents for migration
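
In the run above the scan found 40 keys for a 20-movie dataset. One plausible cause is that the load cell was executed more than once against the same prefix. A uniqueness check on `movie_id` before loading makes such duplicates visible. The sketch below uses plain dictionaries in place of the extracted records; `find_duplicate_ids` and `deduplicate` are hypothetical helpers.

```python
from collections import Counter


def find_duplicate_ids(records, id_field="movie_id"):
    """Return {id: count} for every id that appears more than once."""
    counts = Counter(record[id_field] for record in records)
    return {value: n for value, n in counts.items() if n > 1}


def deduplicate(records, id_field="movie_id"):
    """Keep the first record seen for each id, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record[id_field] not in seen:
            seen.add(record[id_field])
            unique.append(record)
    return unique


# Illustrative records standing in for the extracted documents
records = [
    {"movie_id": "1", "title": "A"},
    {"movie_id": "2", "title": "B"},
    {"movie_id": "1", "title": "A"},
]
print("Duplicates:", find_duplicate_ids(records))
print("Unique records:", len(deduplicate(records)))
```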


In [46]:
# Load data into SVS index
print("Loading data into SVS-VAMANA index...")
batch_size = 100  # Define batch size for migration

if len(svs_data) > 0:
    for i in range(0, len(svs_data), batch_size):
        batch = svs_data[i:i+batch_size]
        svs_index.load(batch)
        print(f"  Migrated {min(i+batch_size, len(svs_data))}/{len(svs_data)} documents")

    # Wait for indexing to complete
    print("Waiting for indexing to complete...")
    time.sleep(5)

    svs_info = svs_index.info()
    print(f"\n✅ Migration complete! SVS index has {svs_info['num_docs']} documents")
else:
    print("⚠️  No data to migrate. Make sure the FLAT index was populated first.")
    print("   Run the previous cells to load data into the FLAT index.")
    svs_info = svs_index.info()

Loading data into SVS-VAMANA index...
  Migrated 40/40 documents
Waiting for indexing to complete...

✅ Migration complete! SVS index has 20 documents


## Step 6: Compare Memory Usage

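Let's compare the reported index sizes. With only 20 documents this comparison is not meaningful: the raw float32 vectors total about 60 KB (20 × 768 × 4 bytes), while each index reports about 3 MB, so the figure mostly reflects overhead rather than vector data. Repeat the measurement on a larger dataset before deciding.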

In [47]:
# Helper function to extract memory info
def get_memory_mb(index_info):
    """Extract memory usage in MB from index info"""
    memory = index_info.get('vector_index_sz_mb', 0)
    if isinstance(memory, str):
        try:
            return float(memory)
        except ValueError:
            return 0.0
    return float(memory)

# Get memory usage
flat_memory = get_memory_mb(flat_info)
svs_memory = get_memory_mb(svs_info)

print(
    "📊 Memory Usage Comparison",
    "=" * 40,
    f"Original FLAT index:    {flat_memory:.2f} MB",
    f"SVS-VAMANA index:       {svs_memory:.2f} MB",
    "",
    sep="\n"
)

if flat_memory > 0:
    if svs_memory > 0:
        savings = ((flat_memory - svs_memory) / flat_memory) * 100
        print(
            f"💰 Memory savings: {savings:.1f}%",
            f"Absolute reduction: {flat_memory - svs_memory:.2f} MB",
            sep="\n"
        )
    else:
        print("⏳ SVS index still indexing - memory comparison pending")
        
    # Cost analysis
    print("\n💵 Cost Impact Analysis:")
    cost_per_gb_hour = 0.10  # Example cloud pricing
    hours_per_month = 24 * 30
    
    flat_monthly_cost = (flat_memory / 1024) * cost_per_gb_hour * hours_per_month
    if svs_memory > 0:
        svs_monthly_cost = (svs_memory / 1024) * cost_per_gb_hour * hours_per_month
        monthly_savings = flat_monthly_cost - svs_monthly_cost
        print(
            f"Monthly cost reduction: ${monthly_savings:.2f}",
            f"Annual cost reduction: ${monthly_savings * 12:.2f}",
            sep="\n"
        )
    else:
        print(
            f"Current monthly cost: ${flat_monthly_cost:.2f}",
            "Projected savings: Available after indexing completes",
            sep="\n"
        )
else:
    print("⚠️  Memory information not available")

📊 Memory Usage Comparison
Original FLAT index:    3.02 MB
SVS-VAMANA index:       3.02 MB

💰 Memory savings: -0.0%
Absolute reduction: -0.00 MB

💵 Cost Impact Analysis:
Monthly cost reduction: $-0.00
Annual cost reduction: $-0.00
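
Because the sample is too small to show a difference, it can help to project the same cost formula onto a larger index. The sketch below reuses the example rate from the cell above ($0.10 per GB-hour, 720 hours per month). The 10 GB index size and the 75% saving are hypothetical inputs chosen for illustration, not measurements.

```python
def monthly_cost(memory_mb, cost_per_gb_hour=0.10, hours_per_month=24 * 30):
    """Monthly cost of keeping `memory_mb` of index data in memory."""
    return (memory_mb / 1024) * cost_per_gb_hour * hours_per_month


projected_flat_mb = 10 * 1024            # hypothetical 10 GB FLAT index
assumed_savings = 0.75                   # hypothetical 75% reduction
projected_svs_mb = projected_flat_mb * (1 - assumed_savings)

flat_cost = monthly_cost(projected_flat_mb)
svs_cost = monthly_cost(projected_svs_mb)
print(f"FLAT:       ${flat_cost:.2f}/month")
print(f"SVS-VAMANA: ${svs_cost:.2f}/month")
print(f"Savings:    ${flat_cost - svs_cost:.2f}/month, ${(flat_cost - svs_cost) * 12:.2f}/year")
```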


## Step 7: Validate Search Quality

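Compare the top-10 results from the SVS-VAMANA index against the FLAT index, which serves as ground truth, and report recall@10. The queries in this cell are random vectors; validate with real queries before a production migration.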

In [48]:
print("🔍 Validating search quality...")

# Create test queries
num_test_queries = 5
test_queries = []

for i in range(num_test_queries):
    # Generate normalized test vector
    query_vec = np.random.random(dims).astype(np.float32)
    query_vec = query_vec / np.linalg.norm(query_vec)
    test_queries.append(query_vec)

print(f"Generated {num_test_queries} test queries")

# Test FLAT index (ground truth)
print("\nTesting original FLAT index...")
flat_results = []
flat_start = time.time()

for query_vec in test_queries:
    query = VectorQuery(
        vector=query_vec,
        vector_field_name="embedding",
        return_fields=["movie_id", "title", "genre"],
        dtype="float32",
        num_results=10
    )
    results = flat_index.query(query)
    flat_results.append([doc["movie_id"] for doc in results])

flat_time = time.time() - flat_start
print(f"FLAT search time: {flat_time:.3f}s ({flat_time/num_test_queries:.3f}s per query)")

# Test SVS-VAMANA index
print("\nTesting SVS-VAMANA index...")
svs_results = []
svs_start = time.time()

for i, query_vec in enumerate(test_queries):
    # Adjust query vector for SVS index (handle dimensionality reduction)
    if target_dims < dims:
        svs_query_vec = query_vec[:target_dims]
    else:
        svs_query_vec = query_vec
    
    if target_dtype == 'float16':
        svs_query_vec = svs_query_vec.astype(np.float16)
    
    query = VectorQuery(
        vector=svs_query_vec,
        vector_field_name="embedding",
        return_fields=["movie_id", "title", "genre"],
        dtype=target_dtype,
        num_results=10
    )
    
    try:
        results = svs_index.query(query)
        svs_results.append([doc["movie_id"] for doc in results])
    except Exception as e:
        print(f"Query {i+1} failed: {e}")
        svs_results.append([])

svs_time = time.time() - svs_start
print(f"SVS search time: {svs_time:.3f}s ({svs_time/num_test_queries:.3f}s per query)")

# Calculate recall if we have results
if svs_results and any(svs_results):
    recalls = []
    for flat_res, svs_res in zip(flat_results, svs_results):
        if flat_res and svs_res:
            intersection = set(flat_res).intersection(set(svs_res))
            recall = len(intersection) / len(flat_res)
            recalls.append(recall)
    
    if recalls:
        avg_recall = np.mean(recalls)
        print(f"\n📈 Average recall@10: {avg_recall:.3f} ({avg_recall*100:.1f}%)")
        
        if avg_recall >= 0.9:
            print("✅ Excellent search quality maintained")
        elif avg_recall >= 0.8:
            print("✅ Good search quality maintained")
        else:
            print("⚠️  Search quality may be impacted - consider adjusting compression")
else:
    print("⚠️  SVS index may still be indexing - search quality test pending")

🔍 Validating search quality...
Generated 5 test queries

Testing original FLAT index...
FLAT search time: 0.012s (0.002s per query)

Testing SVS-VAMANA index...
SVS search time: 0.017s (0.003s per query)

📈 Average recall@10: 1.000 (100.0%)
✅ Excellent search quality maintained
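
The recall calculation in the cell above can be isolated into a small function, which makes it easier to reuse with real queries. This is a minimal sketch that treats the FLAT results as ground truth; `recall_at_k` is a hypothetical helper and the ID lists are illustrative.

```python
def recall_at_k(ground_truth, candidates, k=10):
    """Fraction of the top-k ground-truth IDs that also appear in the top-k candidates."""
    truth = set(ground_truth[:k])
    if not truth:
        return 0.0
    return len(truth & set(candidates[:k])) / len(truth)


# Illustrative result lists: two of the four ground-truth IDs were retrieved
flat_ids = ["a", "b", "c", "d"]
svs_ids = ["a", "c", "x", "y"]
print(recall_at_k(flat_ids, svs_ids, k=4))
```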


## Step 8: Migration Decision Framework

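Based on the results, let's determine if migration is recommended. The cell below applies three thresholds: memory savings above 25%, recall of at least 80%, and search time no more than 2x that of the FLAT index.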

In [49]:
print("🎯 Migration Analysis & Recommendation")
print("=" * 50)

# Fallback configuration if not defined (for CI/CD compatibility)
if 'selected_config' not in locals():
    from redisvl.utils import CompressionAdvisor
    selected_config = CompressionAdvisor.recommend(dims=dims, priority="memory")

# Summarize configuration
print(f"Dataset: {num_docs} documents, {dims}-dimensional vectors")
print(f"Compression: {selected_config.get('compression', 'None')}")
print(f"Datatype: float32 → {selected_config['datatype']}")
if 'reduce' in selected_config:
    reduction = ((dims - selected_config['reduce']) / dims) * 100
    print(f"Dimensions: {dims} → {selected_config['reduce']} ({reduction:.1f}% reduction)")
print()

# Decision criteria
memory_savings_significant = False
search_quality_acceptable = True
performance_acceptable = True

if flat_memory > 0 and svs_memory > 0:
    savings_pct = ((flat_memory - svs_memory) / flat_memory) * 100
    memory_savings_significant = savings_pct > 25  # 25%+ savings considered significant
    print(f"Memory savings: {savings_pct:.1f}% ({'Significant' if memory_savings_significant else 'Modest'})")
else:
    print("Memory savings: Pending (SVS index still indexing)")

if 'recalls' in locals() and recalls:
    avg_recall = np.mean(recalls)
    search_quality_acceptable = avg_recall >= 0.8  # 80%+ recall considered acceptable
    print(f"Search quality: {avg_recall*100:.1f}% recall ({'Acceptable' if search_quality_acceptable else 'Needs improvement'})")
else:
    print("Search quality: Pending validation")

if 'flat_time' in locals() and 'svs_time' in locals():
    performance_ratio = svs_time / flat_time if flat_time > 0 else 1
    performance_acceptable = performance_ratio <= 2.0  # Allow up to 2x slower
    print(f"Performance: {performance_ratio:.1f}x vs original ({'Acceptable' if performance_acceptable else 'Slower than expected'})")
else:
    print("Performance: Pending comparison")


# Final recommendation
print("\n🏆 RECOMMENDATION:")
if memory_savings_significant and search_quality_acceptable and performance_acceptable:
    print("✅ MIGRATE TO SVS-VAMANA")
    print("   • Significant memory savings achieved")
    print("   • Search quality maintained")
    print("   • Performance impact acceptable")
    print("   • Cost reduction benefits clear")
elif memory_savings_significant and search_quality_acceptable:
    print("⚠️  CONSIDER MIGRATION WITH MONITORING")
    print("   • Good memory savings and search quality")
    print("   • Monitor performance in production")
    print("   • Consider gradual rollout")
elif memory_savings_significant:
    print("⚠️  MIGRATION NEEDS TUNING")
    print("   • Memory savings achieved")
    print("   • Search quality or performance needs improvement")
    print("   • Try different compression settings")
else:
    print("❌ MIGRATION NOT RECOMMENDED")
    print("   • Insufficient benefits for current dataset")
    print("   • Consider larger dataset or different compression")
    print("   • SVS-VAMANA works best with high-dimensional data")

🎯 Migration Analysis & Recommendation
Dataset: 20 documents, 768-dimensional vectors
Compression: LVQ4
Datatype: float32 → float32

Memory savings: -0.0% (Modest)
Search quality: 100.0% recall (Acceptable)
Performance: 1.4x vs original (Acceptable)

🏆 RECOMMENDATION:
❌ MIGRATION NOT RECOMMENDED
   • Insufficient benefits for current dataset
   • Consider larger dataset or different compression
   • SVS-VAMANA works best with high-dimensional data
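
The decision logic above can also be expressed as a pure function, which is easier to test and to reuse across datasets. The thresholds (25% savings, 0.8 recall, 2x latency) are the ones used in the cell above; `recommend_migration` and its return strings are hypothetical names for this sketch.

```python
def recommend_migration(savings_pct, recall, latency_ratio,
                        min_savings_pct=25.0, min_recall=0.8, max_latency_ratio=2.0):
    """Map the three measurements onto one of four recommendations."""
    savings_ok = savings_pct > min_savings_pct
    quality_ok = recall >= min_recall
    latency_ok = latency_ratio <= max_latency_ratio

    if savings_ok and quality_ok and latency_ok:
        return "migrate"
    if savings_ok and quality_ok:
        return "migrate with monitoring"
    if savings_ok:
        return "needs tuning"
    return "not recommended"


# The measurements from this notebook's run: no savings, full recall, 1.4x latency
print(recommend_migration(savings_pct=0.0, recall=1.0, latency_ratio=1.4))
```

Note that memory savings act as a gate in this framework: without savings above the threshold, the outcome is "not recommended" regardless of recall and latency.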


## Step 9: Production Migration Checklist

If migration is recommended, follow this checklist for production deployment.

In [50]:
print(
    "📋 Production Migration Checklist",
    "=" * 40,
    "\nPRE-MIGRATION:",
    "□ Backup existing index data",
    "□ Test migration on staging environment",
    "□ Validate search quality with real queries",
    "□ Measure baseline performance metrics",
    "□ Plan rollback strategy",
    "\nMIGRATION:",
    "□ Create SVS-VAMANA index with tested configuration",
    "□ Migrate data in batches during low-traffic periods",
    "□ Monitor memory usage and indexing progress",
    "□ Validate data integrity after migration",
    "□ Test search functionality thoroughly",
    "\nPOST-MIGRATION:",
    "□ Monitor search performance and quality",
    "□ Track memory usage and cost savings",
    "□ Update application configuration",
    "□ Document new index settings",
    "□ Clean up old index after validation period",
    "\n💡 TIPS:",
    "• Start with a subset of data for initial validation",
    "• Use blue-green deployment for zero-downtime migration",
    "• Monitor for 24-48 hours before removing old index",
    "• Keep compression settings documented for future reference",
    sep="\n"
)

📋 Production Migration Checklist

PRE-MIGRATION:
□ Backup existing index data
□ Test migration on staging environment
□ Validate search quality with real queries
□ Measure baseline performance metrics
□ Plan rollback strategy

MIGRATION:
□ Create SVS-VAMANA index with tested configuration
□ Migrate data in batches during low-traffic periods
□ Monitor memory usage and indexing progress
□ Validate data integrity after migration
□ Test search functionality thoroughly

POST-MIGRATION:
□ Monitor search performance and quality
□ Track memory usage and cost savings
□ Update application configuration
□ Document new index settings
□ Clean up old index after validation period

💡 TIPS:
• Start with a subset of data for initial validation
• Use blue-green deployment for zero-downtime migration
• Monitor for 24-48 hours before removing old index
• Keep compression settings documented for future reference
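
For the "validate data integrity" item, one simple approach is to compare a checksum of each stored embedding between the source and target records. The sketch below assumes both sides store the vector bytes in the same datatype (float32 to float32, as in this notebook); it would report every record as changed if the target used a different datatype. `fingerprint` and `find_mismatches` are hypothetical helpers, and the byte strings are placeholders for real embeddings.

```python
import hashlib


def fingerprint(records, id_field="movie_id", vector_field="embedding"):
    """Map each record ID to the SHA-256 digest of its embedding bytes."""
    return {r[id_field]: hashlib.sha256(r[vector_field]).hexdigest() for r in records}


def find_mismatches(source, target):
    """Return (ids missing from target, ids whose embedding bytes differ)."""
    src, dst = fingerprint(source), fingerprint(target)
    missing = sorted(set(src) - set(dst))
    changed = sorted(key for key in src.keys() & dst.keys() if src[key] != dst[key])
    return missing, changed


# Placeholder records: ID "2" was altered and ID "3" was never migrated
source_records = [
    {"movie_id": "1", "embedding": b"\x00\x01"},
    {"movie_id": "2", "embedding": b"\x02\x03"},
    {"movie_id": "3", "embedding": b"\x04\x05"},
]
target_records = [
    {"movie_id": "1", "embedding": b"\x00\x01"},
    {"movie_id": "2", "embedding": b"\xff\xff"},
]
missing_ids, changed_ids = find_mismatches(source_records, target_records)
print("Missing:", missing_ids)
print("Changed:", changed_ids)
```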


## Step 10: Cleanup

Clean up the demonstration indices.

In [51]:
print("🧹 Cleaning up demonstration indices...")

# Clean up FLAT index
try:
    flat_index.delete(drop=True)
    print("✅ Deleted FLAT demonstration index")
except Exception as e:
    print(f"⚠️  Failed to delete FLAT index: {e}")

# Clean up SVS index
try:
    svs_index.delete(drop=True)
    print("✅ Deleted SVS-VAMANA demonstration index")
except Exception as e:
    print(f"⚠️  Failed to delete SVS index: {e}")

print(
    "\n🎉 Migration demonstration complete!",
    "\nNext steps:",
    "1. Apply learnings to your production data",
    "2. Test with your actual query patterns",
    "3. Monitor performance in your environment",
    "4. Consider gradual rollout strategy",
    sep="\n"
)

🧹 Cleaning up demonstration indices...
✅ Deleted FLAT demonstration index
✅ Deleted SVS-VAMANA demonstration index

🎉 Migration demonstration complete!

Next steps:
1. Apply learnings to your production data
2. Test with your actual query patterns
3. Monitor performance in your environment
4. Consider gradual rollout strategy
