## Vector Databases: The Heart of Semantic Search

Once we have embeddings for our data—be it recipe chunks, medical documents, or product descriptions—the next question is: **where do we store them?** Storing embeddings in a plain list or a traditional database works for very small datasets, but as soon as you have thousands or millions of high-dimensional vectors, querying efficiently becomes a challenge. This is where **vector databases** come in.

---

### What is a Vector Database?

A **vector database** is a specialized database designed to store embeddings (vectors) and perform **similarity search** efficiently. Instead of looking for exact matches like a traditional SQL database, a vector database answers questions like:

- “Which chunks of text are semantically closest to this query?”  
- “Which recipes are most similar to a chocolate cake?”  
- “Which product descriptions match a user’s intent?”

It essentially transforms your problem from keyword search to **semantic search**, where meaning, not exact wording, determines similarity.

---

### How it Works

1. **Storing embeddings**: Each data chunk is converted into a fixed-length vector (embedding) and stored in the database along with a reference to the original text.  
2. **Indexing**: Vector databases build indexes to speed up nearest-neighbor search. Common options include:
   - **FAISS** (Facebook AI Similarity Search): strictly a library rather than a single index type; it provides several index structures (flat, IVF, HNSW) optimized for CPU and GPU, and is commonly used with millions of vectors.  
   - **HNSW** (Hierarchical Navigable Small World graphs): a graph-based index for fast approximate search, well suited to real-time queries.  
3. **Querying**: When a query embedding is generated (from a user question, for example), the database computes a distance or similarity score (cosine similarity, Euclidean distance, etc.) between the query and the stored vectors, or only a candidate subset of them when an approximate index is used, and returns the most similar entries.
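
The storing and querying steps above can be sketched in a few lines of plain Python. This is only an illustration, not how FAISS is implemented: `TinyVectorStore`, its methods, and the 3-dimensional vectors are made up for the example, and the search is exact (brute force) with no index at all.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Hypothetical toy store: keeps each vector next to a reference to its text."""

    def __init__(self):
        self.vectors = []
        self.texts = []

    def add(self, vector, text):
        self.vectors.append(vector)
        self.texts.append(text)

    def search(self, query_vector, k=2):
        # Exact (brute-force) search: score every stored vector, keep the top k.
        scored = [
            (cosine_similarity(query_vector, v), t)
            for v, t in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

# Made-up 3-dimensional "embeddings"; real models produce hundreds of dimensions.
store = TinyVectorStore()
store.add([0.9, 0.1, 0.0], "chocolate cake")
store.add([0.8, 0.3, 0.1], "cocoa mug cake")
store.add([0.0, 0.2, 0.9], "garden salad")

top = store.search([1.0, 0.2, 0.0], k=2)
for score, text in top:
    print(f"{score:.3f}  {text}")
```

Scoring every vector like this is fine for a few thousand chunks; the indexing step exists because it becomes too slow as the collection grows.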

---

### Benefits of Using a Vector Database

- **Speed**: With an approximate index, searches over millions of embeddings typically return in milliseconds.  
- **Scalability**: Many vector databases are designed to scale horizontally, supporting large datasets efficiently.  
- **Semantic search**: Retrieves results based on meaning rather than exact text match.  
- **Integration with RAG**: Perfect for retrieval-augmented generation pipelines, where relevant chunks need to be fetched for an LLM.
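
The speed benefit comes from not comparing the query against every stored vector. One way to get that effect is to partition the vectors into buckets and scan only the bucket nearest to the query, which is the idea behind inverted-file (IVF) indexes such as FAISS's `IndexIVFFlat`. The sketch below is a simplified illustration under stated assumptions: the centroids are hand-picked rather than learned, the 2-dimensional vectors are invented, and only a single bucket is probed.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical, hand-picked centroids; real indexes learn them (e.g. with k-means).
centroids = [[1.0, 0.0], [0.0, 1.0]]

def nearest_centroid(v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: l2(v, centroids[i]))

# Build step: assign every vector to the bucket of its nearest centroid.
vectors = {
    "chocolate cake": [0.9, 0.1],
    "mug cake": [0.8, 0.3],
    "garden salad": [0.1, 0.9],
    "fruit salad": [0.2, 0.8],
}
buckets = {i: [] for i in range(len(centroids))}
for name, vec in vectors.items():
    buckets[nearest_centroid(vec)].append(name)

def approximate_search(query, k=1):
    # Query step: scan only the bucket closest to the query, not every vector.
    candidates = buckets[nearest_centroid(query)]
    return sorted(candidates, key=lambda n: l2(query, vectors[n]))[:k]

print(buckets)
print(approximate_search([0.9, 0.15], k=1))
```

The trade-off is that a true nearest neighbor sitting in a different bucket would be missed, which is why such indexes are called approximate.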

---

### Example Use Case: Recipe PDFs

Imagine you have a PDF cookbook that has been cleaned and chunked into 1,000 recipe sections.  

1. You generate embeddings for each chunk using an embedding model such as `all-mpnet-base-v2` (the code later in this section uses `intfloat/e5-base-v2`).  
2. You store these embeddings in a vector store like **FAISS**.  
3. When a user queries, *“Show me chocolate dessert recipes under 30 minutes”*, the system:  
   - Converts the query into an embedding.  
   - Searches the vector database for closest embeddings.  
   - Returns the corresponding recipe chunks.

This allows **semantic search** that understands intent, not just keywords. For example, it could retrieve *“Quick cocoa mug cake”* even though the query didn’t literally say “mug cake.”

---

### Popular Open-Source Vector Databases

**FAISS** (strictly a similarity-search library rather than a full database server), **Milvus**, **Weaviate**, **Chroma**

---

### Key Takeaways

- A **vector database** is the backbone of most semantic search and retrieval systems.  
- It allows you to store, index, and search high-dimensional embeddings efficiently.  
- For tasks like **recipe search, document QA, or recommendation systems**, a vector database turns raw embeddings into something you can query by meaning.  

> In short, embeddings give your data a “language” a machine can understand, and vector databases give that machine a “library” where it can find the right answers quickly.


#### Note:
I will stick to FAISS for this course.

In [2]:
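# imports and setup
import sys
import os
import json
import warnings

# Note: recent LangChain releases expose FAISS and the HuggingFace embedding classes
# from langchain_community.vectorstores and langchain_community.embeddings instead.
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceBgeEmbeddings
from langchain.schema import Document

warnings.filterwarnings('ignore')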

project_root = os.path.abspath(os.path.join("..", ".."))
sys.path.append(project_root)

from common.helper import read_pdf, extract_recipes_from_pdf

inp_dir = os.path.join(project_root, "data", "chunks")
MODEL_NAME = "intfloat/e5-base-v2"
vec_store_path = os.path.join(project_root, "data", "vector_store", "faiss_index")

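# load the embedding model
# Note: HuggingFaceBgeEmbeddings is written for BGE models and prepends a BGE-style
# instruction to queries, while the E5 model card recommends "query: " / "passage: "
# prefixes; the pairing runs, but may not give the best retrieval quality.
embeddings = HuggingFaceBgeEmbeddings(model_name=MODEL_NAME, encode_kwargs={"normalize_embeddings": True})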

# load the pre-computed recipe chunks, closing the file when done
with open(os.path.join(inp_dir, "recipe_chunks.json"), "r", encoding="utf-8") as f:
    chunks = json.load(f)
print(f"Loaded {len(chunks)} chunks")

# embed one chunk for example
chunk_text = chunks[0]['content']
chunk_embeddings = embeddings.embed_documents([chunk_text])

print(f"Chunk text: {chunk_text[:200]}...")
print(f"Embedding vector (first 10 values): {chunk_embeddings[0][:10]}  ...")

Loaded 23 chunks
Chunk text: Directions Step 1 Preheat oven to 350 degrees F (175 degrees C). Grease and fl our two 9 inch, round, cake pans; cover bottoms with waxed paper. Step 2 In a large bowl, combine fl our, 2 cups sugar, c...
Embedding vector (first 10 values): [-0.006904602516442537, 0.001459900289773941, -0.04020014405250549, 0.009997756220400333, 0.057940173894166946, -0.05789573863148689, 0.057525020092725754, 0.01736609824001789, -0.0022118366323411465, -0.017856286838650703]  ...


In [10]:
# Read a chunk and its metadata
# convert to a list of Document objects to be compatible with langchain
# create the vector store and save it locally

documents = []
for chunk in chunks:
    documents.append(
        Document(
            # prepend the recipe name so the title is part of the embedded text
            page_content=chunk["name"] + " " + chunk["content"],
            metadata={"page_num": chunk["metadata"]["page"], "name": chunk["name"]},
        )
    )

vector_store = FAISS.from_documents(documents, embeddings)
vector_store.save_local(vec_store_path)

With the vector store in place, we are ready to explore retrieval: using the embeddings and the vector database to find the most relevant chunks for a given query. Instead of relying on exact keyword matches, retrieval uses semantic similarity, returning the chunks whose meaning is closest to the user’s input. For example, for the query “quick chocolate desserts”, the system should return recipe chunks about chocolate cakes, brownies, or mug cakes, even if the text doesn’t contain the exact words “quick” or “dessert”.

Retrieving information by meaning rather than by literal matches is the foundation of Retrieval-Augmented Generation (RAG) systems, which we can now build on top of our recipe embeddings.
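
One detail worth checking is which notion of "closest" is in play. The embedding model above was loaded with `normalize_embeddings=True`, and, as far as I know, LangChain's FAISS wrapper defaults to Euclidean (L2) distance. For unit-length vectors the two measures are linked by ‖q − d‖² = 2 − 2·cos(q, d), so ranking by smallest L2 distance matches ranking by highest cosine similarity. A minimal check with made-up vectors (the names and numbers are illustrative only):

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Invented vectors standing in for a query and three recipe chunks.
query = normalize([1.0, 0.2, 0.0])
docs = {
    "banana cake": normalize([0.9, 0.1, 0.0]),
    "apple cake": normalize([0.6, 0.5, 0.2]),
    "garden salad": normalize([0.0, 0.2, 0.9]),
}

for name, vec in docs.items():
    cos = dot(query, vec)
    dist_sq = squared_l2(query, vec)
    # For unit vectors: ||q - d||^2 = 2 - 2 * cos(q, d)
    assert math.isclose(dist_sq, 2 - 2 * cos, abs_tol=1e-9)

# Highest cosine first vs. smallest distance first: the order is the same.
by_cosine = sorted(docs, key=lambda n: dot(query, docs[n]), reverse=True)
by_l2 = sorted(docs, key=lambda n: squared_l2(query, docs[n]))
print(by_cosine == by_l2)
```

Without normalization the two rankings can differ, because vector length then affects the L2 distance.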

Below are some query examples:

In [11]:
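# Load FAISS index
# allow_dangerous_deserialization=True is needed because the saved docstore is a pickle file;
# only enable it for index files you created yourself or otherwise trust.
vectorstore = FAISS.load_local(vec_store_path, embeddings, allow_dangerous_deserialization=True)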

def search_vector_db(query, k=3):
    results = vectorstore.similarity_search(query, k=k)
    return results

# Example search
query = "Give me recipe of banana cake"
results = search_vector_db(query)
print(f"Top {len(results)} results for the query: '{query}'")
for i, res in enumerate(results):
    print(f"\nResult {i+1}:")
    print(f"Content: {res.page_content[:300]}...")
    print(f"Metadata: {res.metadata}")

Top 3 results for the query: 'Give me recipe of banana cake'

Result 1:
Content:  BANANA CAKE Directions Step 1 Heat oven to 180C/160C fan/gas 4. Step 2 Butter your tin and line the base and sides with baking parchment. Step 3 Mix the butter and sugar until light and fluffy, then slowly add the eggs with a little flour. Fold in the remaining flour, baking powder and bananas. Ste...
Metadata: {'page_num': 35, 'name': ' BANANA CAKE'}

Result 2:
Content:  APPLE AND ALMOND DESSERT CAKE Directions Step 1 Preheat the oven to 170 C . Brush around the base of your tin with melted butter to grease. Line base and side with baking parchment. Step 2 Beat butter, caster sugar & vanilla in a bowl for 8 mins or till pale and creamy (by hand or electric beater)....
Metadata: {'page_num': 37, 'name': ' APPLE AND ALMOND DESSERT CAKE'}

Result 3:
Content:  BLACK FOREST CAKE WITH CREAM FILLING AND CHERRIES Directions Step 1 Preheat oven to 350 degrees F (175 degrees C). Grease and fl our two 9 inch, round

In [15]:
# Example search
query = "Give me cake recipe with chocolate and frosting"
results = search_vector_db(query)
print(f"Top {len(results)} results for the query: '{query}'")
for i, res in enumerate(results):
    print(f"\nResult {i+1}:")
    print(f"Content: {res.page_content[:300]}...")
    print(f"Metadata: {res.metadata}")

Top 3 results for the query: 'Give me cake recipe with chocolate and frosting'

Result 1:
Content:  CHOCOLATE CAKE WITH CHOCOLATE BUTTERCREAM FROSTING Directions Step 1 Preheat oven to 350º F. Prepare two 9-inch cake pans by spraying with baking spray or buttering and lightly fl ouring. Step 2 Add fl our, sugar, cocoa, baking powder, baking soda, salt and espresso powder to a large bowl or the bo...
Metadata: {'page_num': 45, 'name': ' CHOCOLATE CAKE WITH CHOCOLATE BUTTERCREAM FROSTING'}

Result 2:
Content:  RED VELVET WITH CREAM CHEESE FROSTING Directions Step 1 Preheat oven to 350 degrees F (175 degrees C). Grease two 9-inch round pans. Step 2 Beat Butter and sugar until very light and fluffy. Add eggs and beat well. Step 3 Make a paste of cocoa and red food coloring; add to creamed mixture. Step 4 M...
Metadata: {'page_num': 23, 'name': ' RED VELVET WITH CREAM CHEESE FROSTING'}

Result 3:
Content:  BLACK FOREST CAKE WITH CREAM FILLING AND CHERRIES Directions Step 1 Preheat oven to 3