# Lesson 3.3: Embeddings and Vector Stores

---

In the previous lessons, we learned how to load and split large documents into smaller chunks. The next step to transform these chunks into retrievable knowledge for Large Language Models (LLMs) is to convert them into numerical representations. This lesson will introduce **Embeddings** and **Vector Stores**, two core concepts in building **Retrieval-Augmented Generation (RAG)** systems.

## 1. Concept of Embeddings

### 1.1. What are Embeddings?

**Embeddings** are numerical representations of text (or images, audio, etc.) as vectors in a multi-dimensional space. Each vector is a list of real numbers. The goal of embeddings is to capture the **semantic meaning** of the text.

* **Important Property:** Texts with similar meanings (e.g., "dog" and "puppy") map to vectors that lie close together in the vector space, typically measured by cosine similarity or Euclidean distance. Conversely, texts with unrelated meanings map to vectors that lie farther apart.
* **How it Works:** Embedding models (also called embedding encoders) are trained on a large amount of data to learn how to map text to this vector space.
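
To make "close to each other" concrete, here is a minimal sketch of cosine similarity, one common closeness measure. The three-dimensional vectors are hand-made, illustrative values rather than the output of any real embedding model, and the `toy_vectors` name is an assumption for this example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional "embeddings" (illustrative, not model output).
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

dog_vs_puppy = cosine_similarity(toy_vectors["dog"], toy_vectors["puppy"])
dog_vs_invoice = cosine_similarity(toy_vectors["dog"], toy_vectors["invoice"])

print(f"dog vs puppy:   {dog_vs_puppy:.3f}")
print(f"dog vs invoice: {dog_vs_invoice:.3f}")
```

The related pair scores much higher than the unrelated pair. Real embedding vectors have hundreds or thousands of dimensions, but the comparison works the same way.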



### 1.2. The Role of Embeddings in Semantic Search and RAG

Embeddings are the backbone of **semantic search** and **Retrieval-Augmented Generation (RAG)** systems.

* **Semantic Search:** Instead of searching based on exact keywords (like traditional search), semantic search allows you to find documents or text snippets based on their meaning.
    * **Process:**
        1.  Create an embedding for the user's query.
        2.  Compare the query's embedding with the embeddings of all stored document chunks.
        3.  Retrieve the chunks whose embeddings are closest (i.e., most semantically similar) to the query.
* **RAG (Retrieval-Augmented Generation):** Embeddings are an essential component of RAG, a technique that allows an LLM to retrieve relevant information from an external knowledge base before generating a response. This helps the LLM to:
    * **Answer more accurately:** Based on factual information, reducing "hallucinations."
    * **Use up-to-date knowledge:** Draw on recent information or private data that is absent from the LLM's training data.
    * **Be transparent:** Indicate the source of the information used to generate the answer.
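
The three search steps and the RAG hand-off can be sketched in a few lines. This is only an illustration under loud assumptions: `toy_embed` is a hypothetical stand-in that counts words from a tiny fixed vocabulary, so it matches keywords rather than meaning. A real system would call a trained embedding model at that point.

```python
import math
import re

# Hypothetical vocabulary for the stand-in embedding below.
VOCABULARY = ["langchain", "framework", "rag", "retrieval",
              "embeddings", "vectors", "python", "pip"]


def toy_embed(text: str) -> list[float]:
    """Stand-in for an embedding model: unit-length word counts over VOCABULARY."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = [float(words.count(term)) for term in VOCABULARY]
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


def search(query: str, chunks: list[str], k: int = 2) -> list[str]:
    query_vector = toy_embed(query)                       # step 1: embed the query
    scored = []
    for chunk in chunks:                                  # step 2: compare with every chunk
        chunk_vector = toy_embed(chunk)
        score = sum(q * c for q, c in zip(query_vector, chunk_vector))
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)   # step 3: keep the closest
    return [chunk for _, chunk in scored[:k]]


chunks = [
    "LangChain is a framework for LLM applications.",
    "RAG combines retrieval with generation.",
    "Pip installs Python packages.",
]

question = "Which framework is LangChain?"
top_chunks = search(question, chunks, k=1)

# RAG: the retrieved chunks go into the prompt ahead of the question.
prompt = ("Answer using only this context:\n" + "\n".join(top_chunks)
          + f"\n\nQuestion: {question}")
print(prompt)
```

Because the retrieved chunk travels with the prompt, the application also knows which source the answer was based on.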




---

## 2. Using Embedding Models in LangChain

LangChain provides a unified interface for working with various embedding models.
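
In practice, the shared interface comes down to two methods that every embedding class in this lesson exposes: `embed_documents` for a batch of texts and `embed_query` for a single query string. The sketch below only illustrates that shape. `EmbeddingsLike` and `LengthEmbeddings` are hypothetical names invented for this example and are not part of LangChain.

```python
from typing import Protocol


class EmbeddingsLike(Protocol):
    """Shape of the two methods the examples in this lesson rely on."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]: ...

    def embed_query(self, text: str) -> list[float]: ...


class LengthEmbeddings:
    """Hypothetical toy model: maps text to [character count, word count]."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> list[float]:
        return [float(len(text)), float(len(text.split()))]


def describe(model: EmbeddingsLike, texts: list[str]) -> tuple[int, int]:
    """Return (number of vectors, dimensions per vector) for any conforming model."""
    vectors = model.embed_documents(texts)
    return len(vectors), len(vectors[0])


count, dimensions = describe(LengthEmbeddings(), ["hello world", "vector stores"])
print(count, dimensions)
```

Code written against these two methods, like `describe` here, keeps working when one provider is swapped for another.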

In [None]:
# Install libraries if not already installed:
# pip install langchain-openai openai
# pip install langchain-google-genai google-generativeai
# pip install langchain-huggingface sentence-transformers # for local HuggingFaceEmbeddings

### 2.1. `OpenAIEmbeddings`

* **Concept:** Uses OpenAI's embedding models (e.g., `text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`). These are powerful and popular models.
* **Requirement:** OpenAI API Key (`OPENAI_API_KEY`).

In [None]:
import os
from langchain_openai import OpenAIEmbeddings

# Set environment variable for OpenAI API key
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# Initialize OpenAIEmbeddings
# Pass the model name via the `model` parameter, e.g., "text-embedding-ada-002"
openai_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Create an embedding for a text snippet
text_to_embed = "LangChain helps build LLM applications."
embedding = openai_embeddings.embed_query(text_to_embed)

print("--- OpenAIEmbeddings ---")
print(f"Text: '{text_to_embed}'")
print(f"Embedding vector size: {len(embedding)}")
print(f"First few elements of the vector: {embedding[:5]}...")

### 2.2. `HuggingFaceEmbeddings` (Local Models)

* **Concept:** Allows you to use embedding models from the Hugging Face Hub, including models that can run locally on your machine (e.g., `all-MiniLM-L6-v2`). This is useful when you want to avoid API costs or have data privacy requirements.
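* **Requirement:** Requires the `langchain-huggingface` and `sentence-transformers` packages: `pip install langchain-huggingface sentence-transformers`. The model weights are downloaded on first use and cached locally.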

In [None]:
# Install if not already installed:
# pip install langchain-huggingface sentence-transformers

from langchain_huggingface import HuggingFaceEmbeddings

# Initialize HuggingFaceEmbeddings with a local model
# 'sentence-transformers/all-MiniLM-L6-v2' is a small, efficient model
hf_embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Create an embedding for a text snippet
text_to_embed = "Semantic search is crucial."
embedding = hf_embeddings.embed_query(text_to_embed)

print("\n--- HuggingFaceEmbeddings (Local) ---")
print(f"Text: '{text_to_embed}'")
print(f"Embedding vector size: {len(embedding)}")
print(f"First few elements of the vector: {embedding[:5]}...")

### 2.3. `GoogleGenerativeAIEmbeddings`

* **Concept:** Uses Google Generative AI's embedding models (e.g., `models/embedding-001`).
* **Requirement:** A Gemini API key (created in Google AI Studio), set as `GOOGLE_API_KEY`.

In [None]:
# Install if not already installed:
# pip install langchain-google-genai google-generativeai

import os
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Set environment variable for Google API key
# os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# Initialize GoogleGenerativeAIEmbeddings
# The model ID is written with its "models/" prefix
gemini_embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Create an embedding for a text snippet
text_to_embed = "Vector Stores store embeddings."
embedding = gemini_embeddings.embed_query(text_to_embed)

print("\n--- GoogleGenerativeAIEmbeddings ---")
print(f"Text: '{text_to_embed}'")
print(f"Embedding vector size: {len(embedding)}")
print(f"First few elements of the vector: {embedding[:5]}...")


---

## 3. Introduction to Vector Stores

### 3.1. What are Vector Stores?

**Vector Stores** (Vector Databases) are specialized databases designed to store and manage vector embeddings. They are optimized for efficiently searching for the "closest" vectors (i.e., most semantically similar) to a given query vector.

* **Architecture:** A Vector Store typically uses Approximate Nearest Neighbor (ANN) algorithms to perform fast searches on millions or billions of vectors.
* **Types of Vector Stores:**
    * **In-memory:** Simple and fast for small datasets, but the data is lost when the program ends unless you explicitly save it to disk (e.g., local `FAISS`, local `Chroma`).
    * **Cloud-based/Managed:** Provides scalability, persistence, and management by the provider (e.g., Pinecone, Weaviate, Qdrant, Milvus, Chroma Cloud).
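
To give a feel for how an ANN index trades a little accuracy for speed, here is a minimal sketch of one common idea: group vectors into buckets around centroids and scan only the bucket nearest to the query. Everything in it is an assumption made for illustration: the 2-D vectors, the two fixed centroids (a real index would learn them, for example with k-means), and the single-bucket lookup. It is not how any particular Vector Store is implemented.

```python
import math


def l2(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 2-D vectors, keyed by id.
vectors = {
    "a": [0.1, 0.2], "b": [0.2, 0.1], "c": [0.15, 0.25],
    "d": [0.9, 0.8], "e": [0.8, 0.9], "f": [0.85, 0.95],
}
# Hypothetical fixed centroids; a real index would learn these from the data.
centroids = [[0.15, 0.15], [0.85, 0.85]]


def nearest_centroid(vector: list[float]) -> int:
    return min(range(len(centroids)), key=lambda i: l2(vector, centroids[i]))


# Index time: place each vector in the bucket of its nearest centroid.
buckets: dict[int, list[str]] = {i: [] for i in range(len(centroids))}
for vector_id, vector in vectors.items():
    buckets[nearest_centroid(vector)].append(vector_id)


def exact_search(query: list[float], k: int) -> list[str]:
    """Baseline: compare the query with every stored vector."""
    return sorted(vectors, key=lambda vid: l2(query, vectors[vid]))[:k]


def approximate_search(query: list[float], k: int) -> list[str]:
    """Scan only the bucket whose centroid is nearest to the query."""
    candidates = buckets[nearest_centroid(query)]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]


query = [0.82, 0.88]
exact = exact_search(query, 2)
approximate = approximate_search(query, 2)
scanned = len(buckets[nearest_centroid(query)])
print(exact, approximate, f"{scanned} of {len(vectors)} vectors scanned")
```

Here both searches agree while the approximate one scans half the data. For a query near a bucket boundary, the true nearest neighbor could sit in a bucket that is never scanned, which is why the result is called approximate.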



### 3.2. Why are Vector Stores Needed?

* **Store Embeddings:** Provides persistent storage for generated vector embeddings.
* **Efficient Search:** Performs fast and accurate semantic searches on a large dataset.
* **Easy Integration:** LangChain provides a unified interface for interacting with various Vector Stores.


---

## 4. Basic Vector Store Operations

Although there are many different Vector Stores, they generally share the following basic operations:

* **`add_documents`:** Adds one or more `Document` objects (already chunked) to the Vector Store. The Vector Store will automatically create embeddings for these `Document`s using the configured embedding model and store them.
* **`similarity_search`:** Takes a query string, creates an embedding for that query, and searches for the most semantically similar `Document`s (chunks) in the Vector Store. It returns a list of `Document`s sorted by relevance.
* **`as_retriever`:** A convenient method to turn a Vector Store into a `Retriever`. A `Retriever` is a standard interface in LangChain specifically used for retrieving relevant documents based on a query.
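
The three operations can be mimicked with a toy, brute-force store. This sketch borrows the method names from the list above, but everything else is hypothetical: `Doc` stands in for LangChain's `Document`, `lookup_embed` returns hand-assigned vectors instead of calling a model, and `as_retriever` returns a plain function, whereas LangChain returns a retriever object that you call with `.invoke(...)`.

```python
import math
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    """Hypothetical stand-in for LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class ToyVectorStore:
    """Brute-force illustration of the three operations; not LangChain's code."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self._embed = embed
        self._rows: list[tuple[list[float], Doc]] = []

    def add_documents(self, docs: list[Doc]) -> None:
        # Embed each document once and keep the vector next to it.
        for doc in docs:
            self._rows.append((self._embed(doc.page_content), doc))

    def similarity_search(self, query: str, k: int = 4) -> list[Doc]:
        # Embed the query, then rank stored documents by Euclidean distance.
        query_vector = self._embed(query)
        ranked = sorted(self._rows, key=lambda row: math.dist(query_vector, row[0]))
        return [doc for _, doc in ranked[:k]]

    def as_retriever(self, k: int = 4) -> Callable[[str], list[Doc]]:
        # Simplification: a function that takes a query and returns documents.
        return lambda query: self.similarity_search(query, k=k)


# Hand-assigned vectors (an assumption replacing a real embedding model).
HAND_VECTORS = {
    "Embeddings convert text into numerical vectors.": [1.0, 0.0],
    "Pip is the package manager for Python.": [0.0, 1.0],
    "What turns text into vectors?": [0.9, 0.1],
}


def lookup_embed(text: str) -> list[float]:
    return HAND_VECTORS[text]


store = ToyVectorStore(lookup_embed)
store.add_documents([
    Doc("Embeddings convert text into numerical vectors.", {"source": "embedding_guide"}),
    Doc("Pip is the package manager for Python.", {"source": "python_guide"}),
])

retriever = store.as_retriever(k=1)
results = retriever("What turns text into vectors?")
print(results[0].metadata["source"])
```

Note that the query is embedded with the same function as the documents. Mixing embedding models between indexing and querying would place the vectors in different spaces and make the distances meaningless.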

### 4.1. Practical Example with `FAISS` (In-memory Vector Store)

**FAISS (Facebook AI Similarity Search)** is a library that enables efficient similarity search on vectors. LangChain provides integration with FAISS to create a local, in-memory Vector Store.

* **Requirement:** Requires `faiss-cpu` (or `faiss-gpu` if you have a GPU) to be installed: `pip install faiss-cpu`.

In [None]:
# Install if not already installed:
# pip install langchain-openai openai faiss-cpu

import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

# Set environment variable for OpenAI API key
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# 1. Initialize Embeddings model
embeddings_model = OpenAIEmbeddings(model="text-embedding-ada-002")

# 2. Prepare documents (chunks)
documents = [
    Document(page_content="LangChain is a framework for developing LLM applications.", metadata={"source": "langchain_docs"}),
    Document(page_content="Retrieval-Augmented Generation (RAG) improves LLM accuracy.", metadata={"source": "rag_article"}),
    Document(page_content="Embeddings convert text into numerical vectors.", metadata={"source": "embedding_guide"}),
    Document(page_content="Vector Stores efficiently store and search embeddings.", metadata={"source": "vector_db_overview"}),
    Document(page_content="Machine learning is a branch of artificial intelligence.", metadata={"source": "ai_basics"}),
    Document(page_content="Pip is the package manager for Python.", metadata={"source": "python_guide"}),
]

# 3. Create Vector Store from documents and embeddings model
# FAISS.from_documents will automatically create embeddings and add them to the store
vector_store = FAISS.from_documents(documents, embeddings_model)

print("--- FAISS Vector Store ---")
print(f"Added {len(documents)} documents to FAISS Vector Store.")

# 4. Similarity Search
query = "How can LLMs access new knowledge?"
print(f"\nSearching for query: '{query}'")

# similarity_search returns the most similar Documents
# k=2 means return the 2 most relevant documents
retrieved_docs = vector_store.similarity_search(query, k=2)

print("\n--- Retrieved Documents ---")
for i, doc in enumerate(retrieved_docs):
    print(f"Document {i+1} (Source: {doc.metadata.get('source', 'Unknown')}):")
    print(f"  Content: {doc.page_content}")
    print("-" * 30)

# 5. Convert to Retriever
retriever = vector_store.as_retriever(search_kwargs={"k": 1}) # Only retrieve 1 document

print("\n--- Using Retriever ---")
query_retriever = "Which vector represents text meaning?"
retrieved_by_retriever = retriever.invoke(query_retriever)
print(f"Retrieved by Retriever for '{query_retriever}':")
for doc in retrieved_by_retriever:
    print(f"  Content: {doc.page_content}")
    print(f"  Metadata: {doc.metadata}")

**Explanation:**
* `FAISS.from_documents(documents, embeddings_model)`: This is a convenient way to create a Vector Store from a list of `Document`s and an `Embeddings` model. LangChain calls the embedding model for each `Document` and stores the resulting vectors in a FAISS index.
* `vector_store.similarity_search(query, k=2)`: Performs a semantic search. It will take the `query` string, create an embedding for it, and find the 2 `Document`s with the closest embeddings in the store.
* `vector_store.as_retriever()`: Converts the Vector Store into a `Retriever` object. A `Retriever` is a standard interface in LangChain for retrieving documents, very useful when building RAG Chains.

### 4.2. Storing and Loading a Vector Store (FAISS)

For `FAISS`, you can store the created Vector Store to disk and load it later, avoiding the need to recreate embeddings every time you run the application.
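
The principle can be sketched without FAISS: compute the vectors once, write them to disk next to their texts, and read them back later without calling the embedding model again. The vectors, the `toy_index.json` file name, and the JSON layout below are assumptions made for illustration, not the format FAISS uses.

```python
import json
import tempfile
from pathlib import Path

# Pretend these vectors came from an embedding model (illustrative values).
records = [
    {"text": "LangChain is a framework for developing LLM applications.",
     "vector": [0.12, 0.98]},
    {"text": "Embeddings convert text into numerical vectors.",
     "vector": [0.87, 0.21]},
]

with tempfile.TemporaryDirectory() as directory:
    index_file = Path(directory) / "toy_index.json"

    # Save: texts and vectors are written once.
    index_file.write_text(json.dumps(records), encoding="utf-8")

    # Load: a later run reads them back, with no embedding calls needed.
    reloaded = json.loads(index_file.read_text(encoding="utf-8"))

print(f"Reloaded {len(reloaded)} records; identical to original: {reloaded == records}")
```

JSON holds plain data, so loading it cannot run code. The LangChain FAISS integration stores part of its state as a pickle file instead, which is why loading it requires an explicit opt-in.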

In [None]:
# Install if not already installed:
# pip install langchain-openai openai faiss-cpu

import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
import shutil # For removing directory

# Set environment variable for OpenAI API key
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

embeddings_model = OpenAIEmbeddings(model="text-embedding-ada-002")

# Prepare documents (similar to the example above)
documents_to_save = [
    Document(page_content="LangChain is a framework for developing LLM applications.", metadata={"source": "langchain_docs"}),
    Document(page_content="Retrieval-Augmented Generation (RAG) improves LLM accuracy.", metadata={"source": "rag_article"}),
    Document(page_content="Embeddings convert text into numerical vectors.", metadata={"source": "embedding_guide"}),
]

# Create Vector Store
vector_store_to_save = FAISS.from_documents(documents_to_save, embeddings_model)

# Save Vector Store to disk
save_path = "faiss_index"
vector_store_to_save.save_local(save_path)
print(f"\nVector Store saved to directory: {save_path}")

# Load Vector Store from disk
loaded_vector_store = FAISS.load_local(save_path, embeddings_model, allow_dangerous_deserialization=True)
# allow_dangerous_deserialization=True is required because part of the saved index is a pickle file,
# and unpickling can execute arbitrary code. Only load index directories that you created or trust.

print(f"Vector Store reloaded from directory: {save_path}")

# Perform search with the reloaded Vector Store
query_loaded = "Which framework helps develop AI applications?"
retrieved_loaded_docs = loaded_vector_store.similarity_search(query_loaded, k=1)

print("\n--- Search with Reloaded Vector Store ---")
print(f"Query: '{query_loaded}'")
for doc in retrieved_loaded_docs:
    print(f"  Content: {doc.page_content}")
    print(f"  Metadata: {doc.metadata}")

# Clean up index directory
if os.path.exists(save_path):
    shutil.rmtree(save_path)
    print(f"Directory '{save_path}' removed.")

**Note:** For cloud-based Vector Stores (like Pinecone, Weaviate), you would not store locally but connect directly to their cloud service.


---

## Lesson Summary

This lesson covered two core concepts in building RAG systems: **Embeddings** and **Vector Stores**. **Embeddings** are numerical representations of text that capture semantic meaning and serve as the foundation for **semantic search**. You learned how to use popular embedding models like **`OpenAIEmbeddings`**, **`HuggingFaceEmbeddings`** (local), and **`GoogleGenerativeAIEmbeddings`** to convert text into vectors.

Next, we explored **Vector Stores** as efficient places to store and manage vector embeddings, enabling fast similarity search. Finally, through practical examples with **FAISS** (an in-memory Vector Store), you performed basic operations such as **adding documents**, running a **similarity search**, and converting a Vector Store into a **Retriever**, and you learned how to **store and load** a local Vector Store. Mastering these concepts is a crucial step towards building LLM applications that can retrieve and use external knowledge effectively.