# Notebook: FAISS Vector Store Proof of Concept

This notebook demonstrates an in-memory vector store built with **FAISS** (Facebook AI Similarity Search) through LangChain's wrapper. It depends on the project's own `Task` model, so run it from inside the repository.

**Goal:**
1.  Load a standard sentence-transformer model to create embeddings.
2.  Create sample tasks with varied content.
3.  Build a FAISS vector store in memory from these tasks.
4.  Perform a **semantic similarity search** and check that the store returns the most relevant tasks.

Semantic retrieval of this kind is a core building block of RAG (Retrieval-Augmented Generation) and other agentic systems.

In [1]:
import sys
sys.path.append('..')

from datetime import datetime

from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

from src.models.task import Task

print("All modules imported successfully.")

All modules imported successfully.


In [2]:
# 1. Initialize the embedding model.
print("Loading embedding model...")
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
print("Embedding model loaded.")

# 2. Create our sample tasks, being explicit with optional fields to satisfy the linter.
sample_tasks = [
    Task(title="Finalize the quarterly marketing report", category="Work", description="Focus on Q4 metrics.", due_date=None),
    Task(title="Prepare slides for the project presentation", category="Work", description="Include the latest user feedback.", due_date=None),
    Task(title="Schedule a dentist appointment", category="Personal", description="Annual check-up and cleaning.", due_date=None),
    Task(title="Review the Q3 financial report", category="Work", description="Check for budget discrepancies.", due_date=None),
]

# 3. Prepare the data for the vector store.
documents = [f"Task: {t.title} - Description: {t.description or ''}" for t in sample_tasks]

metadatas = []
for task in sample_tasks:
    meta = task.model_dump(exclude_none=True)
    for key, value in meta.items():
        if isinstance(value, datetime):
            meta[key] = value.isoformat()
    metadatas.append(meta)

print("Sample tasks prepared for ingestion.")

Loading embedding model...
Embedding model loaded.
Sample tasks prepared for ingestion.
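
The metadata loop in the cell above drops `None` fields and turns `datetime` values into ISO 8601 strings so the metadata stays serializable. The same idea can be sketched on plain dictionaries, without the `Task` model. The helper name `to_json_safe` and the sample dictionary are hypothetical and used only for illustration; this is a minimal sketch, not the project's implementation.

```python
from datetime import datetime, timezone


def to_json_safe(meta):
    """Return a copy of `meta` without None values and with datetimes as ISO strings.

    Hypothetical helper: it mirrors the loop in the cell above on a plain dict.
    """
    cleaned = {}
    for key, value in meta.items():
        if value is None:
            continue  # same effect as model_dump(exclude_none=True)
        if isinstance(value, datetime):
            value = value.isoformat()
        cleaned[key] = value
    return cleaned


# Illustrative input only; the real metadata comes from Task.model_dump().
raw = {
    "title": "Review the Q3 financial report",
    "created_at": datetime(2025, 11, 9, 5, 17, 9, tzinfo=timezone.utc),
    "due_date": None,
}
print(to_json_safe(raw))
```

Building a new dictionary, rather than editing the original in place, keeps the source record unchanged if it is needed again later.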


In [3]:
# Build the vector store: embed each document and add it to a searchable index.
print("\nCreating FAISS vector store...")
# FAISS.from_texts() is a convenient LangChain method that handles embedding and indexing.
db = FAISS.from_texts(documents, embedding_model, metadatas=metadatas)
print("FAISS vector store created successfully.")

# Perform the semantic search.
search_query = "What do I need to do regarding the financial reports?"
print(f"\nPerforming search for: '{search_query}'")

# The similarity_search method finds the documents whose embeddings are
# closest in vector space to the search query's embedding.
results = db.similarity_search(search_query, k=2)  # Get the top 2 results

print("\n--- Top Search Results ---")
for i, doc in enumerate(results, start=1):
    print(f"{i}. Text Found: '{doc.page_content}'")
    print(f"   Associated Metadata: {doc.metadata}")
    print("-" * 20)


Creating FAISS vector store...
FAISS vector store created successfully.

Performing search for: 'What do I need to do regarding the financial reports?'

--- Top Search Results ---
1. Text Found: 'Task: Review the Q3 financial report - Description: Check for budget discrepancies.'
   Associated Metadata: {'id': '75084bf1-efa3-4115-9f09-804aa4442fa5', 'created_at': '2025-11-09T05:17:09.335925+00:00', 'is_completed': False, 'title': 'Review the Q3 financial report', 'category': 'Work', 'priority': 'Medium', 'description': 'Check for budget discrepancies.'}
--------------------
2. Text Found: 'Task: Finalize the quarterly marketing report - Description: Focus on Q4 metrics.'
   Associated Metadata: {'id': '5102e143-979f-481b-a685-484c6c1f497d', 'created_at': '2025-11-09T05:17:09.335848+00:00', 'is_completed': False, 'title': 'Finalize the quarterly marketing report', 'category': 'Work', 'priority': 'Medium', 'description': 'Focus on Q4 metrics.'}
--------------------
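
The ranking above comes from comparing the query's embedding with each document's embedding and keeping the closest ones. The sketch below reproduces that idea with a brute-force search over hand-made 3-dimensional vectors, assuming Euclidean (L2) distance as the metric. The vectors, labels, and function names are invented for illustration; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and FAISS uses optimized index structures rather than a Python loop.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, doc_vecs, k=2):
    """Return (index, distance) pairs for the k document vectors nearest the query."""
    scored = [(i, l2_distance(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy vectors standing in for real embeddings (assumed values, not model output).
toy_docs = {
    "marketing report": [0.9, 0.1, 0.0],
    "presentation slides": [0.5, 0.5, 0.0],
    "dentist appointment": [0.0, 0.1, 0.9],
    "financial report": [1.0, 0.0, 0.1],
}
toy_query = [1.0, 0.0, 0.0]  # pretend embedding of a question about reports

labels = list(toy_docs)
nearest = top_k(toy_query, list(toy_docs.values()), k=2)
for rank, (idx, dist) in enumerate(nearest, start=1):
    print(f"{rank}. {labels[idx]} (distance {dist:.3f})")
```

With these toy values, the "financial report" vector is nearest to the query and the "marketing report" vector is second, which matches the order of the real results.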


### Conclusion

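The FAISS vector store returned the two report-related tasks as its top results: **"Review the Q3 financial report"** first, followed by **"Finalize the quarterly marketing report"**. The closest match to the query about financial reports ranks first, and neither of the unrelated tasks (the presentation slides and the dentist appointment) appears in the top two.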

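On this small sample, the demonstration shows that a FAISS-backed vector store can retrieve relevant context for context-aware AI agents. Four documents and one query are not a benchmark, so this confirms the approach works end to end rather than measuring its quality. The project is complete.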