# VectorWrap Demo - Universal Vector Search

This notebook demonstrates how to use VectorWrap to perform vector similarity search across different database backends (PostgreSQL, MySQL, SQLite, DuckDB).

[![GitHub](https://img.shields.io/badge/GitHub-vectorwrap-blue)](https://github.com/mihirahuja1/vectorwrap)
[![PyPI](https://img.shields.io/pypi/v/vectorwrap)](https://pypi.org/project/vectorwrap/)

## 1. Installation

First, let's install vectorwrap with SQLite support (great for demos):

In [None]:
# Install vectorwrap with SQLite support
!pip install -q "vectorwrap[sqlite]"

# Optional: Install OpenAI for real embeddings (or use any other embedding library)
!pip install -q openai

## 2. Connect to a Database

VectorWrap supports multiple database backends. Here we'll use SQLite for simplicity:

In [None]:
from vectorwrap import VectorDB
import numpy as np

# Connect to SQLite (in-memory for this demo)
db = VectorDB("sqlite:///:memory:")
print("✅ Connected to SQLite database")

# Alternative connection strings:
# db = VectorDB("postgresql://user:pass@localhost/db")  # PostgreSQL
# db = VectorDB("mysql://user:pass@localhost/db")       # MySQL
# db = VectorDB("duckdb:///:memory:")                   # DuckDB
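
The backend is selected by the scheme at the front of the connection string. As a rough illustration of that idea, the sketch below maps a URL scheme to a backend name using only the standard library. `SCHEME_TO_BACKEND` and `backend_from_url` are hypothetical names invented for this example; they are not part of VectorWrap, and the library's own parsing may differ.

```python
from urllib.parse import urlsplit

# Hypothetical mapping for illustration; not taken from VectorWrap's source.
SCHEME_TO_BACKEND = {
    "postgresql": "PostgreSQL",
    "mysql": "MySQL",
    "sqlite": "SQLite",
    "duckdb": "DuckDB",
}


def backend_from_url(url: str) -> str:
    """Return a human-readable backend name for a connection string."""
    scheme = urlsplit(url).scheme.lower()
    try:
        return SCHEME_TO_BACKEND[scheme]
    except KeyError:
        raise ValueError(f"Unsupported connection scheme: {scheme!r}") from None


print(backend_from_url("sqlite:///:memory:"))  # → SQLite
```

A check like this can fail fast on a mistyped scheme before any connection is attempted.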

## 3. Create a Collection

Let's create a collection for storing product embeddings:

In [None]:
# Create a collection with 384-dimensional vectors
# (A small dimension keeps the demo light; OpenAI's text-embedding-3-small returns 1536 by default)
VECTOR_DIM = 384
db.create_collection("products", dim=VECTOR_DIM)
print(f"✅ Created collection 'products' with {VECTOR_DIM}-dimensional vectors")
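
A collection has a fixed dimensionality, so every vector written to it needs exactly `VECTOR_DIM` components. How VectorWrap reports a mismatch is not shown in this notebook, so the sketch below checks vectors on the application side instead. `check_vector` is a hypothetical helper, not a VectorWrap function.

```python
import math

VECTOR_DIM = 384  # same value as the collection created above


def check_vector(vector: list[float], dim: int = VECTOR_DIM) -> list[float]:
    """Raise ValueError if the vector has the wrong length or non-finite values."""
    if len(vector) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vector)}")
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("vector contains NaN or infinite values")
    return vector


check_vector([0.0] * VECTOR_DIM)  # passes silently
```

Because the helper returns the vector unchanged, it can wrap the embedding call directly before an insert.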

## 4. Simple Embedding Function

For this demo, we'll use a simple, deterministic embedding function so the notebook runs without an API key. In production, use OpenAI, Hugging Face, or another embedding model:

In [None]:
def simple_embed(text: str) -> list[float]:
    """Deterministic pseudo-random embedding for demo purposes.
    In production, use real embeddings from OpenAI, Hugging Face, etc."""
    # Derive a seed from the text so the same text always maps to the same vector
    seed = sum(ord(c) for c in text) % 10000
    # Use a local generator so NumPy's global random state is left untouched
    rng = np.random.default_rng(seed)
    return rng.standard_normal(VECTOR_DIM).tolist()

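# For real embeddings with OpenAI (requires API key):
# Note: text-embedding-3-small returns 1536-dimensional vectors by default,
# so create the collection with dim=1536 before using this function.
# from openai import OpenAI
# client = OpenAI(api_key="your-api-key")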
# 
# def embed(text: str) -> list[float]:
#     response = client.embeddings.create(
#         model="text-embedding-3-small",
#         input=text
#     )
#     return response.data[0].embedding
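
If no API key is available but the demo should still respond to word overlap, a hashing-trick bag-of-words vector is one lightweight option. The sketch below is an assumption-laden stand-in, not a real embedding model and not part of VectorWrap: `hashed_embed` is a hypothetical helper that hashes each token into one of `dim` buckets, so texts sharing words tend to share non-zero components (apart from occasional hash collisions).

```python
import hashlib
import math
import re


def hashed_embed(text: str, dim: int = 384) -> list[float]:
    """Hashing-trick bag-of-words embedding (hypothetical helper, demo only)."""
    vector = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0       # signed update reduces collision bias
        vector[index] += sign
    norm = math.sqrt(sum(x * x for x in vector))
    # Leave an all-zero vector unchanged to avoid dividing by zero
    return [x / norm for x in vector] if norm else vector


vec = hashed_embed("wireless noise-canceling headphones")
print(len(vec))  # → 384
```

This captures only literal word overlap; synonyms such as "phone" and "smartphone" still land in unrelated buckets, which is the gap a trained embedding model fills.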

## 5. Insert Vectors with Metadata

Let's add some products to our database:

In [None]:
# Sample products to insert
products = [
    (1, "Apple iPhone 15 Pro - Latest flagship smartphone with titanium design", 
     {"category": "phone", "brand": "Apple", "price": 999}),
    
    (2, "Samsung Galaxy S24 Ultra - Android flagship with S Pen", 
     {"category": "phone", "brand": "Samsung", "price": 1199}),
    
    (3, "Sony WH-1000XM5 - Premium noise-canceling wireless headphones", 
     {"category": "audio", "brand": "Sony", "price": 399}),
    
    (4, "Apple AirPods Pro - True wireless earbuds with active noise cancellation", 
     {"category": "audio", "brand": "Apple", "price": 249}),
    
    (5, "iPad Pro M2 - Powerful tablet for creative professionals", 
     {"category": "tablet", "brand": "Apple", "price": 1099}),
    
    (6, "Google Pixel 8 Pro - AI-powered Android phone with amazing camera", 
     {"category": "phone", "brand": "Google", "price": 999}),
]

# Insert products into the database
for product_id, description, metadata in products:
    vector = simple_embed(description)
    db.upsert("products", product_id, vector, metadata)
    print(f"✅ Inserted product {product_id}: {metadata['brand']} - {metadata['category']}")

print(f"\n📦 Total products in database: {len(products)}")
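
An upsert conventionally replaces an existing row that has the same ID, so a duplicated ID in the input list could silently drop a product. The sketch below validates a batch before anything is written. `validate_records` is a hypothetical helper, and the JSON check rests on the assumption that metadata is stored in a JSON-compatible form, which this notebook does not confirm.

```python
import json


def validate_records(records: list[tuple[int, str, dict]]) -> None:
    """Raise ValueError on duplicate IDs, empty text, or non-JSON metadata."""
    seen_ids = set()
    for record_id, text, metadata in records:
        if record_id in seen_ids:
            raise ValueError(f"duplicate id: {record_id}")
        seen_ids.add(record_id)
        if not text.strip():
            raise ValueError(f"empty description for id {record_id}")
        try:
            json.dumps(metadata)
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"metadata for id {record_id} is not JSON-serializable"
            ) from exc


sample = [
    (1, "Apple iPhone 15 Pro", {"category": "phone", "price": 999}),
    (2, "Sony WH-1000XM5", {"category": "audio", "price": 399}),
]
validate_records(sample)  # passes silently
```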

## 6. Semantic Search

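Now let's search for the products most similar to a query. Because `simple_embed` produces pseudo-random vectors, the ranking below is arbitrary; swap in a real embedding model to get semantically meaningful results: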

In [None]:
# Search query
query = "I need a high-end smartphone with great camera"
query_vector = simple_embed(query)

# Find top 3 most similar products
results = db.query(
    collection="products",
    query_vector=query_vector,
    top_k=3
)

print(f"🔍 Query: '{query}'")
print("\n📊 Top 3 most similar products:")
print("-" * 50)

for rank, (product_id, distance) in enumerate(results, 1):
    # Look up product details in the in-memory list. This works only because
    # the IDs are 1-based and contiguous; a real app would query its own store.
    _, desc, metadata = products[product_id - 1]
    
    print(f"\n{rank}. Product ID: {product_id}")
    print(f"   Distance: {distance:.4f}")
    print(f"   Brand: {metadata['brand']}")
    print(f"   Category: {metadata['category']}")
    print(f"   Price: ${metadata['price']}")
    print(f"   Description: {desc[:60]}...")
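
Conceptually, a top-k query scores every stored vector against the query vector and returns the closest ones. The brute-force sketch below shows that idea in plain Python. The distance metric VectorWrap uses is not stated in this notebook, so cosine distance is an assumption here, and `cosine_distance` and `brute_force_top_k` are hypothetical helpers rather than library functions.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0 means same direction, 2 means opposite.

    Assumes both vectors are non-zero.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_top_k(
    vectors: dict[int, list[float]], query: list[float], top_k: int = 3
) -> list[tuple[int, float]]:
    """Return the top_k (id, distance) pairs, nearest first."""
    scored = [(vec_id, cosine_distance(vec, query)) for vec_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top_k]


toy_vectors = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
nearest = brute_force_top_k(toy_vectors, [1.0, 0.1], top_k=2)
print([vec_id for vec_id, _ in nearest])  # → [1, 3]
```

A full scan like this costs time proportional to the number of stored vectors, which is why database backends typically rely on vector indexes for larger collections.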

## 7. Search with Filters

Pass a `filter` dict to restrict results to products whose metadata matches:

In [None]:
# Search for audio products only
query = "wireless listening device for music"
query_vector = simple_embed(query)

# Filter by category
results = db.query(
    collection="products",
    query_vector=query_vector,
    top_k=5,
    filter={"category": "audio"}  # Only return audio products
)

print(f"🔍 Query: '{query}'")
print("🎯 Filter: category = 'audio'")
print("\n📊 Results (audio products only):")
print("-" * 50)

for rank, (product_id, distance) in enumerate(results, 1):
    # Index lookup assumes 1-based, contiguous product IDs
    _, desc, metadata = products[product_id - 1]
    
    print(f"\n{rank}. {metadata['brand']} - ${metadata['price']}")
    print(f"   Distance: {distance:.4f}")
    print(f"   {desc[:80]}...")
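
One way to picture a filtered query is "keep the rows whose metadata matches, then rank what is left". The sketch below implements that with exact-equality matching and Euclidean distance. Both choices are assumptions for illustration, as is the hypothetical helper `filtered_top_k`; this notebook does not show how VectorWrap applies filters internally.

```python
import math


def filtered_top_k(rows, query, top_k, where):
    """Rank rows that match a metadata filter.

    rows: iterable of (id, vector, metadata) tuples.
    where: dict of metadata key/value pairs that must all match exactly.
    Returns up to top_k (id, distance) pairs, nearest first.
    """
    candidates = [
        (row_id, math.dist(vector, query))
        for row_id, vector, metadata in rows
        if all(metadata.get(key) == value for key, value in where.items())
    ]
    return sorted(candidates, key=lambda pair: pair[1])[:top_k]


rows = [
    (3, [0.9, 0.1], {"category": "audio", "brand": "Sony"}),
    (4, [0.8, 0.3], {"category": "audio", "brand": "Apple"}),
    (1, [1.0, 0.0], {"category": "phone", "brand": "Apple"}),
]
hits = filtered_top_k(rows, query=[1.0, 0.0], top_k=5, where={"category": "audio"})
print([row_id for row_id, _ in hits])  # → [3, 4]
```

Note that the result holds only two rows even though `top_k=5`: a filter can leave fewer matches than requested, so calling code should not assume exactly `top_k` results.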

## 8. Switch Database Backends

Switching backends only requires changing the connection string; the rest of the code stays the same:

In [None]:
# Example: Switch to DuckDB (uncomment to try)
# !pip install -q "vectorwrap[duckdb]"
# db_duckdb = VectorDB("duckdb:///:memory:")
# db_duckdb.create_collection("products", dim=VECTOR_DIM)

# The exact same code works!
# for product_id, description, metadata in products:
#     vector = simple_embed(description)
#     db_duckdb.upsert("products", product_id, vector, metadata)

print("💡 To switch backends, just change the connection string:")
print("   - PostgreSQL: 'postgresql://user:pass@host/db'")
print("   - MySQL:      'mysql://user:pass@host/db'")
print("   - SQLite:     'sqlite:///path/to/db.sqlite'")
print("   - DuckDB:     'duckdb:///path/to/db.duckdb'")
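
To keep application code independent of the backend, the loading logic can accept any object that offers the two calls used in this notebook. The sketch below expresses that with a `typing.Protocol` and exercises it against an in-memory stand-in. `VectorStore`, `load_products`, and `FakeStore` are hypothetical names, and the parameter names in the protocol are assumptions; only the call shapes `create_collection(name, dim=...)` and `upsert(collection, id, vector, metadata)` come from the cells above.

```python
from typing import Callable, Protocol


class VectorStore(Protocol):
    """The two calls this notebook uses; parameter names are assumed."""

    def create_collection(self, name: str, dim: int) -> None: ...

    def upsert(self, collection: str, id: int, vector: list[float], metadata: dict) -> None: ...


def load_products(
    store: VectorStore,
    records,
    embed: Callable[[str], list[float]],
    dim: int,
) -> int:
    """Create the 'products' collection and upsert every record; return the count."""
    store.create_collection("products", dim=dim)
    count = 0
    for record_id, text, metadata in records:
        store.upsert("products", record_id, embed(text), metadata)
        count += 1
    return count


class FakeStore:
    """In-memory stand-in so the sketch runs without any database installed."""

    def __init__(self) -> None:
        self.dim = None
        self.rows = {}

    def create_collection(self, name: str, dim: int) -> None:
        self.dim = dim

    def upsert(self, collection, id, vector, metadata) -> None:
        self.rows[(collection, id)] = (vector, metadata)


store = FakeStore()
n = load_products(store, [(1, "demo", {"category": "phone"})], embed=lambda t: [0.0, 1.0], dim=2)
print(n)  # → 1
```

With this shape, moving from SQLite to another backend means constructing a different `VectorDB` and passing it to the same function.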

## 🎉 Summary

You've just learned how to:
1. **Connect** to a vector database using VectorWrap
2. **Create** collections for storing vectors
3. **Insert** vectors with metadata
4. **Search** for similar vectors
5. **Filter** results by metadata
6. **Switch** between different database backends

### Next Steps
- Use real embeddings from OpenAI, Hugging Face, or Cohere
- Try different database backends for your use case
- Scale to production with PostgreSQL + pgvector
- Check out the [GitHub repo](https://github.com/mihirahuja1/vectorwrap) for more examples

**⭐ If you found this helpful, please star the [vectorwrap repo](https://github.com/mihirahuja1/vectorwrap)!**