# Complete MongoDB Vector Search Lab

This notebook walks through a complete, real-world MongoDB Vector Search workshop and uses all features of jupyter-lab-progress along the way.

## What You'll Learn
- Set up MongoDB Atlas for vector search
- Create and validate vector embeddings
- Build semantic search applications
- Use jupyter-lab-progress for an engaging lab experience

## Prerequisites
- MongoDB Atlas account (free tier works)
- Python 3.8+
- Basic knowledge of Python and pandas

In [None]:
# 🔧 Environment Setup - Run this cell first!
import sys
import subprocess

def install_if_missing(package_name, import_name=None):
    """Install a package if it's not already available."""
    if import_name is None:
        import_name = package_name.replace('-', '_')
    
    try:
        __import__(import_name)
        print(f"✅ {package_name} is available")
    except ImportError:
        print(f"📦 Installing {package_name}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
        print(f"✅ {package_name} installed successfully!")

# Install core dependencies
install_if_missing("jupyter-lab-progress", "jupyter_lab_progress")

# Import everything we need
from jupyter_lab_progress import *
import pandas as pd
import numpy as np

print("\n🎉 Setup complete! Ready to build a MongoDB Vector Search system.")
print("📚 This is a comprehensive lab demonstrating real-world vector search.")

## Lab Overview

In this comprehensive lab, you'll build a complete vector search system for an e-commerce platform. This simulates a real-world scenario where customers search for products using natural language.

### What We'll Build
1. **Product Database**: E-commerce product catalog with descriptions
2. **Vector Embeddings**: Convert product descriptions to semantic vectors
3. **Search System**: Find similar products using vector similarity
4. **Validation**: Ensure data quality throughout the process
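
To make steps 2 and 3 concrete before the lab begins, here is a minimal sketch of how vector similarity ranks products. It assumes cosine similarity as the metric; the three-dimensional vectors are hand-written stand-ins, not the output of any embedding model, and the names `cosine_similarity`, `toy_embeddings`, and `query_vector` are illustrative rather than part of jupyter-lab-progress or MongoDB.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical hand-written "embeddings" (assumption: real ones come from a model
# and have hundreds of dimensions).
toy_embeddings = {
    "Wireless Bluetooth Headphones": [0.9, 0.1, 0.0],
    "Yoga Mat": [0.0, 0.2, 0.9],
    "Smart Fitness Watch": [0.7, 0.1, 0.5],
}

# Hypothetical stand-in for an embedded natural-language query.
query_vector = [1.0, 0.0, 0.1]

# Rank product names from most to least similar to the query.
ranked = sorted(
    toy_embeddings,
    key=lambda name: cosine_similarity(query_vector, toy_embeddings[name]),
    reverse=True,
)
print(ranked)
```

With these toy vectors the headphones rank first and the yoga mat last, because the query vector points mostly along the first dimension.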

Let's get started! 🚀

In [None]:
# Create lab progress tracker
lab_steps = [
    "Environment Setup",
    "Load Sample Data",
    "Generate Embeddings",
    "Validate Data Quality",
    "Build Search Index",
    "Test Search Functionality",
    "Performance Optimization",
    "Complete Demo"
]

lab_progress = LabProgress(
    steps=lab_steps,
    lab_name="🔍 MongoDB Vector Search Lab"
)

# Mark first step complete
lab_progress.mark_done("Environment Setup", score=100, notes="All packages installed!")

# Create validator
validator = LabValidator(progress_tracker=lab_progress)

show_info("🎯 Lab initialized successfully! Your progress will be tracked throughout this workshop.")

In [None]:
# Create sample e-commerce product data
products_data = [
    {"id": 1, "name": "Wireless Bluetooth Headphones", "description": "High-quality wireless headphones with noise cancellation and 30-hour battery life", "category": "Electronics", "price": 199.99},
    {"id": 2, "name": "Organic Cotton T-Shirt", "description": "Comfortable organic cotton t-shirt in various colors and sizes", "category": "Clothing", "price": 29.99},
    {"id": 3, "name": "Smart Fitness Watch", "description": "Advanced fitness tracking watch with heart rate monitor and GPS", "category": "Electronics", "price": 299.99},
    {"id": 4, "name": "Ceramic Coffee Mug", "description": "Handcrafted ceramic coffee mug perfect for morning coffee", "category": "Home", "price": 19.99},
    {"id": 5, "name": "Yoga Mat", "description": "Non-slip yoga mat made from eco-friendly materials", "category": "Sports", "price": 49.99},
    {"id": 6, "name": "Smartphone Case", "description": "Protective smartphone case with wireless charging compatibility", "category": "Electronics", "price": 24.99},
    {"id": 7, "name": "Running Shoes", "description": "Lightweight running shoes with advanced cushioning technology", "category": "Sports", "price": 129.99},
    {"id": 8, "name": "Desk Lamp", "description": "Adjustable LED desk lamp with multiple brightness settings", "category": "Home", "price": 79.99},
    {"id": 9, "name": "Water Bottle", "description": "Insulated stainless steel water bottle keeps drinks cold for 24 hours", "category": "Sports", "price": 34.99},
    {"id": 10, "name": "Laptop Backpack", "description": "Durable laptop backpack with multiple compartments and USB charging port", "category": "Electronics", "price": 89.99}
]

# Create DataFrame
products_df = pd.DataFrame(products_data)

# Display the data
show_info(f"📊 Loaded {len(products_df)} sample products")
display(products_df)

# Mark step complete
lab_progress.mark_done("Load Sample Data", score=100, notes="Product catalog ready!")
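
Before generating embeddings, it can help to sanity-check the raw records. The sketch below is one possible plain-Python check over a list of product dictionaries shaped like `products_data`. The helper `find_catalog_issues` and its rules (unique ids, non-empty descriptions, positive prices) are assumptions made for illustration. It does not use `LabValidator`, whose methods are not shown in this section.

```python
def find_catalog_issues(records):
    """Return a list of human-readable problems found in product records."""
    issues = []
    seen_ids = set()
    required = ("id", "name", "description", "category", "price")
    for position, record in enumerate(records):
        missing = [field for field in required if field not in record]
        if missing:
            issues.append(f"record {position}: missing fields {missing}")
            continue  # remaining checks need every field to be present
        if record["id"] in seen_ids:
            issues.append(f"record {position}: duplicate id {record['id']}")
        seen_ids.add(record["id"])
        if not str(record["description"]).strip():
            issues.append(f"record {position}: empty description")
        if not isinstance(record["price"], (int, float)) or record["price"] <= 0:
            issues.append(f"record {position}: non-positive price")
    return issues


# Hypothetical sample: the second record is deliberately broken in three ways.
sample = [
    {"id": 1, "name": "Ceramic Coffee Mug", "description": "Handcrafted ceramic coffee mug",
     "category": "Home", "price": 19.99},
    {"id": 1, "name": "Yoga Mat", "description": "  ", "category": "Sports", "price": 0},
]

issues = find_catalog_issues(sample)
for issue in issues:
    print(issue)
```

Calling `find_catalog_issues(products_data)` on the catalog above should return an empty list, since every record has a unique id, a description, and a positive price.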