### Difference Between Vector Index and Vector Database

The terms **vector index** and **vector database** are related but serve different purposes in managing and retrieving vectorized data.

**1. Vector Index**
A **vector index** is a data structure that enables efficient similarity search on high-dimensional vector data. It speeds up retrieval for nearest neighbor searches.

**Key Features:**
- Designed for fast similarity searches on high-dimensional data.
- Uses indexing methods like:
  - **HNSW (Hierarchical Navigable Small World graphs)** – Used in FAISS, Milvus, Weaviate.
  - **IVF (Inverted File Index)** – Used in FAISS for partitioning vectors.
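  - **LSH (Locality-Sensitive Hashing)** – Hashes similar vectors into the same buckets for approximate nearest neighbor (ANN) search.
  - **KD-Trees, Ball Trees, VP-Trees** – Used in classical nearest neighbor searches; they work best in low dimensions.
- **Is not a full storage system**: it organizes vectors (or compressed codes derived from them) for fast access, but leaves persistence, metadata, and record management to the surrounding system.
- Helps reduce the computational cost of brute-force searches.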

**Example Usage:**
- In FAISS, an IVF-PQ index speeds up approximate nearest neighbor searches.
- HNSW-based indexes allow rapid semantic search in vector databases.
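
To make the idea of an inverted-file index more concrete, here is a minimal sketch in plain Python. It is not how FAISS implements IVF: the class `ToyIVFIndex` and its parameters `n_lists` and `n_probe` are hypothetical names, and the centroids are simply sampled at random, whereas real IVF implementations typically train them with k-means. The point is only to show how partitioning lets a search skip most of the data.

```python
import random


def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, query, k):
    """Exact k-NN: compare the query against every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]


class ToyIVFIndex:
    """Hypothetical, simplified IVF: bucket vectors by their nearest centroid."""

    def __init__(self, vectors, n_lists, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors
        # Assumption: centroids are random samples of the data, not k-means centers.
        self.centroids = rng.sample(vectors, n_lists)
        self.lists = [[] for _ in range(n_lists)]
        for i, v in enumerate(vectors):
            self.lists[self._nearest_centroids(v, 1)[0]].append(i)

    def _nearest_centroids(self, v, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda c: l2(self.centroids[c], v))
        return order[:n]

    def search(self, query, k, n_probe=1):
        # Only scan the n_probe lists whose centroids are closest to the query.
        candidates = [i for c in self._nearest_centroids(query, n_probe)
                      for i in self.lists[c]]
        candidates.sort(key=lambda i: l2(self.vectors[i], query))
        return candidates[:k]


rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(500)]
query = [rng.gauss(0, 1) for _ in range(8)]

index = ToyIVFIndex(data, n_lists=10)
exact = brute_force_search(data, query, k=5)
approx = index.search(query, k=5, n_probe=3)   # scans 3 of the 10 lists
full = index.search(query, k=5, n_probe=10)    # scans every list

print("exact :", exact)
print("approx:", approx)
```

With a small `n_probe` the search touches only a fraction of the vectors and may miss some true neighbors; probing every list gives the same answer as brute force, which is the speed versus recall trade-off that approximate indexes expose.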



**2. Vector Database**
A vector database is a complete data management system designed to store, index, and query vectorized data efficiently. It includes a vector index but also provides other database functionalities.

**Key Features:**
- Stores both raw vector embeddings and metadata (e.g., text, labels, categories).
- Uses vector indexes internally for optimized searches.
- Provides scalability, persistence, and distributed search.
- Supports hybrid search (vector similarity + structured queries).
- **Examples:**
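  - **FAISS** – Strictly a similarity-search library rather than a full database; optimized for in-memory searches and often used as the index layer inside larger systems.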
  - **Milvus** – Open-source vector database with distributed storage.
  - **Weaviate** – Supports hybrid search with metadata filtering.
  - **Pinecone** – Managed vector database with automatic indexing.
  - **ChromaDB** – Lightweight, developer-friendly vector store.

**Example Usage:**
- A semantic search engine for legal documents using text embeddings.
- Recommendation systems that find similar products via image/text embeddings.
- Retrieval-Augmented Generation (RAG) pipelines for LLMs.
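
The features listed above can be sketched with a toy in-memory store. This is an illustration under simplifying assumptions, not the API of any real product: `ToyVectorStore`, `upsert`, `query`, and the `where` filter are hypothetical names, the vectors are made up, and the search is a brute-force scan where a real database would use a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyVectorStore:
    """Hypothetical store that keeps raw vectors together with metadata."""

    def __init__(self):
        self.records = {}  # record id -> (vector, metadata dict)

    def upsert(self, record_id, vector, metadata):
        self.records[record_id] = (vector, metadata)

    def delete(self, record_id):
        self.records.pop(record_id, None)

    def query(self, vector, k=3, where=None):
        """Hybrid search: filter on metadata, then rank by similarity."""
        where = where or {}
        hits = [
            (cosine(vector, vec), rid)
            for rid, (vec, meta) in self.records.items()
            if all(meta.get(key) == value for key, value in where.items())
        ]
        hits.sort(reverse=True)
        return [(rid, round(score, 3)) for score, rid in hits[:k]]


store = ToyVectorStore()
store.upsert("doc1", [0.9, 0.1, 0.0], {"type": "contract", "year": 2021})
store.upsert("doc2", [0.8, 0.2, 0.1], {"type": "statute", "year": 2019})
store.upsert("doc3", [0.0, 0.1, 0.9], {"type": "contract", "year": 2023})

# Only records whose metadata matches the filter are ranked.
results = store.query([1.0, 0.0, 0.0], k=2, where={"type": "contract"})
print(results)
```

Here `doc2` is similar to the query but is excluded by the metadata filter, which is the behavior that "vector similarity + structured queries" refers to.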


Traditional databases store strings, numbers, and other scalar data in rows and columns. A vector database, by contrast, operates on vectors, so it is optimized and queried quite differently.

In a traditional database, a query usually returns the rows whose values exactly match the query. In a vector database, we instead apply a similarity metric (such as cosine similarity or Euclidean distance) to find the vectors most similar to the query vector.
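
The contrast can be seen in a small sketch that uses Python's built-in `sqlite3` for the exact-match side and a hand-written cosine similarity for the vector side. The product names and the two-dimensional "embeddings" are invented for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Exact match: the query string must equal the stored value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("running shoes", "footwear"),
     ("trail sneakers", "footwear"),
     ("coffee mug", "kitchen")],
)
exact_rows = conn.execute(
    "SELECT name FROM products WHERE name = ?", ("sneakers",)
).fetchall()
print("exact match:", exact_rows)  # no row is named exactly "sneakers"

# Similarity search: rank every item by closeness to the query vector.
embeddings = {  # assumption: made-up 2-D vectors standing in for real embeddings
    "running shoes": [0.9, 0.1],
    "trail sneakers": [0.8, 0.3],
    "coffee mug": [0.1, 0.9],
}
query_vec = [0.8, 0.25]  # hypothetical embedding of the query "sneakers"
ranked = sorted(embeddings, key=lambda n: cosine(embeddings[n], query_vec),
                reverse=True)
print("most similar:", ranked)
```

The SQL query finds nothing because no stored name equals the query string, while the similarity ranking still surfaces the closest items.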

A vector database combines several algorithms to perform Approximate Nearest Neighbor (ANN) search. These algorithms speed up the search through hashing, quantization, or graph-based traversal, trading a small amount of accuracy for much lower query time.
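
As one example of the hashing approach, the sketch below implements random-hyperplane LSH in plain Python. The helper names (`make_hyperplanes`, `signature`, `build_buckets`) and the sample vectors are hypothetical, and production systems usually combine several hash tables to improve recall; this shows only the core idea.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (as normal vectors) through the origin."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        int(sum(h * x for h, x in zip(plane, vector)) >= 0)
        for plane in hyperplanes
    )


def build_buckets(vectors, hyperplanes):
    """Group vector ids by signature so a query only scans its own bucket."""
    buckets = {}
    for i, v in enumerate(vectors):
        buckets.setdefault(signature(v, hyperplanes), []).append(i)
    return buckets


planes = make_hyperplanes(dim=4, n_bits=6)

base = [1.0, 0.9, 0.0, 0.1]
same_direction = [2 * x for x in base]   # same angle, different length
opposite = [-x for x in base]            # points the other way

vectors = [base, same_direction, opposite]
buckets = build_buckets(vectors, planes)

query = [3 * x for x in base]
candidates = buckets.get(signature(query, planes), [])
print("candidate ids for query:", candidates)
```

Vectors that point in the same direction always share a signature, so they land in the same bucket, while a vector pointing the opposite way falls on the other side of the hyperplanes. A query therefore only needs to be compared against the contents of its own bucket.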

A Retrieval-Augmented Generation (RAG) pipeline consists of three main phases: indexing, retrieval, and generation. In the indexing phase, raw documents are loaded, split into chunks, transformed into vector embeddings by an embedding model, and stored in a vector database. In the retrieval phase, the user's query is embedded with the same model, and the system uses vector similarity search (provided by tools such as FAISS, Milvus, or Pinecone) to find the most relevant chunks. Finally, in the generation phase, a large language model (LLM) receives the query together with the retrieved chunks and generates a response grounded in that external knowledge, which makes the output more reliable.

![Example Image](RAG.png)

```python
!pip install -q langchain_community tiktoken langchain-openai langchainhub chromadb langchain
!pip install -q bs4
```

```python
import os

import bs4
from dotenv import load_dotenv
from IPython.display import Markdown, display
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Read OPENAI_API_KEY from a local .env file into the environment
load_dotenv()
# Fail early with a clear message instead of a TypeError when the key is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")

#### INDEXING ####

# Load documents: keep only the post title, header, and body from the page
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

# Split into overlapping chunks so each embedding covers a focused passage
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

# Embed the chunks and store them in a Chroma vector store
vectorstore = Chroma.from_documents(documents=splits,
                                    embedding=OpenAIEmbeddings())

# Expose the vector store as a retriever that returns the most similar chunks
retriever = vectorstore.as_retriever()

#### RETRIEVAL and GENERATION ####

# Prompt: pull a ready-made RAG prompt template from the LangChain hub
prompt = hub.pull("rlm/rag-prompt")

# LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Post-processing: join the retrieved chunks into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Chain: retrieve context, fill the prompt, call the LLM, parse the text output
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Question
display(Markdown(rag_chain.invoke("What is Task Decomposition?")))
```


