# Vector Store Complete

## Overview

This notebook walks through an end-to-end vector store workflow: generate embeddings, store them in a vector database, run similarity and filtered searches, and combine vector and keyword search in a hybrid query.

### Learning Objectives

- Generate embeddings for documents
- Store embeddings in a vector database
- Perform similarity and filtered searches
- Use hybrid search combining vector and keyword search

---

## Workflow

**Generate Embeddings → Store in Vector DB → Search → Hybrid Search**

Each step feeds the next: the embeddings from Step 1 are stored in Step 2 and then queried in Steps 3 and 4.

---

## Step 1: Generate Embeddings

Start by generating embeddings for your documents.


In [None]:
from semantica.embeddings import EmbeddingGenerator
import numpy as np

documents = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing enables computers to understand text.",
]

generator = EmbeddingGenerator()

try:
    embeddings = generator.generate(documents)
    print("✓ Embeddings generated")
    print(f"  Documents: {len(documents)}")
    print(f"  Embeddings shape: {getattr(embeddings, 'shape', 'N/A')}")

except Exception as e:
    # Fall back to random vectors so the remaining steps can still run.
    # These are placeholders only and carry no semantic meaning.
    print(f"✗ Error generating embeddings: {e}")
    embeddings = np.random.rand(len(documents), 1536).astype(np.float32)
    print("  Using demo embeddings")
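
To make concrete what the generated embeddings are used for, the sketch below computes pairwise cosine similarity with NumPy. It does not use `semantica`; the function name `cosine_similarity_matrix` and the tiny four-dimensional `demo_embeddings` are hypothetical stand-ins chosen so the numbers are easy to check by hand.

```python
import numpy as np

def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Return the matrix of pairwise cosine similarities between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Clip the norms so an all-zero vector does not cause division by zero.
    unit = vectors / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Hypothetical stand-in embeddings: 3 documents, 4 dimensions.
demo_embeddings = np.array(
    [
        [1.0, 0.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ],
    dtype=np.float32,
)

similarity = cosine_similarity_matrix(demo_embeddings)
print(np.round(similarity, 3))
```

Rows that point in similar directions score close to 1, and orthogonal rows score 0. Similarity search in the later steps relies on this kind of comparison between a query vector and the stored vectors.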


## Step 2: Store in Vector Database

Store the embeddings along with documents and metadata in a vector database.


In [None]:
from semantica.vector_store import VectorStore

vector_store = VectorStore()

metadata = [
    {"id": i, "category": "technology", "source": "demo"}
    for i in range(len(documents))
]

try:
    vector_store.store(embeddings, documents, metadata)
    print("✓ Embeddings stored in vector database")
    print(f"  Stored {len(documents)} documents with embeddings")
    
except Exception as e:
    print(f"✗ Error storing embeddings: {e}")
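
This notebook does not show how `VectorStore` keeps its data internally. As a rough mental model only, a store can be pictured as three parallel collections (vectors, documents, metadata) that share an index. The class `InMemoryVectorStore` below is a hypothetical toy written for illustration, not the `semantica` implementation.

```python
from dataclasses import dataclass, field

import numpy as np

@dataclass
class InMemoryVectorStore:
    """Hypothetical toy store; NOT the semantica VectorStore."""
    vectors: list = field(default_factory=list)
    documents: list = field(default_factory=list)
    metadata: list = field(default_factory=list)

    def store(self, embeddings, documents, metadata):
        embeddings = np.asarray(embeddings, dtype=np.float32)
        # All three inputs must line up one-to-one.
        if not (len(embeddings) == len(documents) == len(metadata)):
            raise ValueError("embeddings, documents, and metadata must have equal length")
        for vec, doc, meta in zip(embeddings, documents, metadata):
            self.vectors.append(vec)
            self.documents.append(doc)
            self.metadata.append(dict(meta))  # copy so later caller edits do not leak in

    def __len__(self):
        return len(self.documents)

toy_store = InMemoryVectorStore()
toy_store.store(np.random.rand(2, 4), ["doc a", "doc b"], [{"id": 0}, {"id": 1}])
print(len(toy_store))
```

Validating the lengths up front avoids a silent mismatch between a vector and the document it is supposed to describe.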


## Step 3: Search

Perform similarity search and filtered search on the stored embeddings.


In [None]:
from semantica.vector_store import VectorRetriever

retriever = VectorRetriever(vector_store)

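query = "artificial intelligence"

# Embed the query with the same generator used for the documents. If that
# fails, fall back to a random demo vector with the stored embeddings' dimension.
try:
    query_embedding = generator.generate([query])[0]
except Exception as e:
    print(f"✗ Error embedding query: {e}")
    query_embedding = np.random.rand(np.asarray(embeddings).shape[-1]).astype(np.float32)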

try:
    results = retriever.retrieve(query_embedding, top_k=2)
    print("✓ Similarity search complete")
    print(f"  Found {len(results) if results else 0} results")
    
    if results:
        for i, result in enumerate(results[:2]):
            print(f"  Result {i+1}: Score = {getattr(result, 'score', 'N/A')}")
    
    filtered_results = vector_store.search(
        query_embedding,
        top_k=2,
        filters={"category": "technology"}
    )
    print("\n✓ Filtered search complete")
    print(f"  Found {len(filtered_results) if filtered_results else 0} filtered results")
    
except Exception as e:
    print(f"✗ Error performing search: {e}")
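
One plausible way a filtered similarity search could work is to drop records whose metadata does not match the filter, then rank the remainder by cosine similarity. The helper `filtered_top_k` and the toy data below are assumptions for illustration; the actual `vector_store.search` may filter and score differently.

```python
import numpy as np

def filtered_top_k(query, vectors, metadata, top_k=2, filters=None):
    """Return (index, score) pairs ranked by cosine similarity after filtering."""
    filters = filters or {}
    # Keep only records whose metadata matches every filter key exactly.
    candidates = [
        i for i, meta in enumerate(metadata)
        if all(meta.get(key) == value for key, value in filters.items())
    ]
    if not candidates:
        return []
    subset = np.asarray(vectors, dtype=np.float64)[candidates]
    q = np.asarray(query, dtype=np.float64)
    # Cosine similarity; the small epsilon guards against zero-length vectors.
    scores = subset @ q / (np.linalg.norm(subset, axis=1) * np.linalg.norm(q) + 1e-12)
    order = np.argsort(-scores)[:top_k]
    return [(candidates[j], float(scores[j])) for j in order]

# Hypothetical toy data: three 2-dimensional vectors with categories.
toy_vectors = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
toy_metadata = [
    {"category": "technology"},
    {"category": "science"},
    {"category": "technology"},
]

hits = filtered_top_k(
    [1.0, 0.0], toy_vectors, toy_metadata, top_k=2, filters={"category": "technology"}
)
print(hits)
```

In this toy data the second vector is close to the query but belongs to a different category, so the filter excludes it before ranking. Filtering first is the simpler design; some systems instead rank first and filter afterwards, which can return fewer than `top_k` results.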


## Step 4: Hybrid Search

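Hybrid search combines vector similarity with keyword matching, which can help when a query contains exact terms that embeddings alone rank poorly. In this example the vector score is weighted 0.7 and the keyword score 0.3.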


In [None]:
from semantica.vector_store import HybridSearch

hybrid_search = HybridSearch(vector_store)

try:
    hybrid_results = hybrid_search.search(
        query="artificial intelligence",
        vector_weight=0.7,
        keyword_weight=0.3,
        top_k=3
    )
    
    print("✓ Hybrid search complete")
    print(f"  Found {len(hybrid_results) if hybrid_results else 0} results")
    print("  Note: Hybrid search combines vector and keyword scores (weights 0.7 and 0.3)")

    if hybrid_results:
        for i, result in enumerate(hybrid_results[:3]):
            # Show a 50-character preview; add an ellipsis only if the text was truncated.
            text = getattr(result, 'document', None)
            preview = 'N/A' if text is None else text[:50] + ('...' if len(text) > 50 else '')
            print(f"  Result {i+1}: {preview}")
    
except Exception as e:
    print(f"✗ Error performing hybrid search: {e}")
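
The `vector_weight=0.7` and `keyword_weight=0.3` arguments suggest a weighted sum of two scores. The sketch below shows that idea under stated assumptions: `keyword_score` is a simple term-overlap scorer and the `vector_scores` are made-up numbers. How `HybridSearch` actually fuses scores is not shown in this notebook, so `hybrid_rank` is hypothetical.

```python
def keyword_score(query: str, document: str) -> float:
    """Toy scorer: fraction of query terms that appear in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().replace(".", " ").split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)

def hybrid_rank(query, vector_scores, documents,
                vector_weight=0.7, keyword_weight=0.3, top_k=3):
    """Rank documents by a weighted sum of vector and keyword scores."""
    combined = [
        vector_weight * v_score + keyword_weight * keyword_score(query, doc)
        for v_score, doc in zip(vector_scores, documents)
    ]
    order = sorted(range(len(documents)), key=lambda i: combined[i], reverse=True)
    return [(documents[i], combined[i]) for i in order[:top_k]]

docs = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing enables computers to understand text.",
]
# Made-up vector similarities; the second document scores slightly higher.
vector_scores = [0.80, 0.82, 0.40]

ranked = hybrid_rank("artificial intelligence", vector_scores, docs)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

With these made-up inputs, the first document contains both query terms, so its keyword score lifts it above the second document even though the second has the higher vector score. A weighted sum only behaves sensibly when both scores are on comparable scales, which is why the toy keyword scorer is normalized to the range 0 to 1.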
