### 📖 Where We Are

**In the last notebook**, we learned the fundamentals of vector embeddings. We explored how they capture semantic meaning in numbers, how to measure their similarity using cosine similarity, and how to generate them using free, open-source models from Hugging Face that run on our own machine.

**In this notebook**, we'll explore the other major category of embedding models: **proprietary, API-based models**, using OpenAI as our primary example. We'll learn how to use them with LangChain, compare the different models they offer, and build a simple yet powerful **semantic search** function from scratch to see embeddings in action.

### 1. OpenAI Embeddings

While open-source models are fantastic for their accessibility and control, proprietary models offered by services like OpenAI often provide state-of-the-art performance. The main difference is that instead of running the model on your computer, you send your text to the provider's API and receive the embedding back. This requires an API key and is billed by usage (OpenAI charges per token), but it offloads the computation to the provider.

**Analogy: Home Cooking vs. a Michelin-Star Restaurant 🧑‍🍳**

-   **Hugging Face (Local Models)** is like having a professional kitchen at home. You have full control, there's no cost per use (after buying the equipment), but you're responsible for the setup and computation.
-   **OpenAI (API Models)** is like dining at a Michelin-star restaurant. You simply send your order (text), and a team of world-class chefs (the model) prepares a perfect dish (the embedding) for you. It's convenient and high-quality, but you pay for the service.

In [None]:
# To use the OpenAI API, we need to manage our API key securely.
import os
# `dotenv` is a library that helps load environment variables from a .env file.
from dotenv import load_dotenv

# This loads the variables from your .env file into the environment.
# Make sure you have a .env file with OPENAI_API_KEY="sk-..."
load_dotenv()

In [None]:
# LangChain's OpenAI classes read OPENAI_API_KEY from the environment automatically,
# so after load_dotenv() no extra assignment is needed. We only check that it is set.
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set. Add it to your .env file.")

In [None]:
# Import the LangChain wrapper for OpenAI's embedding models.
from langchain_openai import OpenAIEmbeddings

# Instantiate the embeddings model.
# "text-embedding-3-small" is the smaller, cheaper of OpenAI's third-generation embedding models.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

In [None]:
# The interface is identical to the HuggingFaceEmbeddings wrapper.
single_text = "LangChain and RAG are amazing frameworks and projects to work on"
single_embeddings = embeddings.embed_query(single_text)

# By default, the 'text-embedding-3-small' model produces vectors with 1536 dimensions.
print(f"Vector dimensions: {len(single_embeddings)}")
print(f"Sample values: {single_embeddings[:5]}")

In [None]:
# We can also embed multiple documents at once for efficiency.
multiple_texts = [
    "Python is a programming language",
    "LangChain is a framework for LLM applications",
    "Embeddings convert text to numbers",
    "Vectors can be compared for similarity"
]

multiple_embeddings = embeddings.embed_documents(multiple_texts)

print(f"Number of embeddings: {len(multiple_embeddings)}")
print(f"Dimension of each embedding: {len(multiple_embeddings[0])}")

### 📊 OpenAI Embedding Models Comparison

OpenAI offers several embedding models that differ in vector size, quality, and cost. The prices below are list prices and can change, so check OpenAI's pricing page before budgeting.

| Model | Dimensions | Cost / 1M Tokens | Description | Best For |
| :--- | :---: | :---: | :--- | :--- |
| **`text-embedding-3-small`** | 1536 | $0.02 | Good balance of performance and cost. | General purpose, cost-effective applications. |
| **`text-embedding-3-large`** | 3072 | $0.13 | Highest quality and performance. | When accuracy is absolutely critical. |
| **`text-embedding-ada-002`** | 1536 | $0.10 | Previous generation, still widely used. | Legacy applications or specific compatibility needs. |
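
To make the cost column concrete, here is a rough estimate for a small workload. It is only a sketch: the prices are copied from the table above and may be out of date, and the helper name `estimate_cost` and the figure of 500 tokens per document are assumptions made for illustration.

```python
# Prices in USD per 1M tokens, copied from the table above (may be out of date).
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}


def estimate_cost(model: str, num_tokens: int) -> float:
    """Return the approximate USD cost of embedding `num_tokens` tokens."""
    return PRICE_PER_MILLION_TOKENS[model] * num_tokens / 1_000_000


# Hypothetical workload: 10,000 documents of roughly 500 tokens each.
total_tokens = 10_000 * 500
for model in PRICE_PER_MILLION_TOKENS:
    print(f"{model}: ${estimate_cost(model, total_tokens):.2f}")
```

At these prices, embedding 5 million tokens costs well under a dollar with any of the three models, so for small corpora the choice is usually driven by quality and vector size rather than price.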

### 2. Cosine Similarity With OpenAI Embeddings

The principles of vector similarity are universal. The same `cosine_similarity` function we used for Hugging Face embeddings works unchanged for OpenAI embeddings because, in the end, both are just vectors of numbers.

In [None]:
# This is the same function from the previous notebook.
import numpy as np


def cosine_similarity(vec1, vec2):
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot_product = np.dot(vec1, vec2)
    norm_a = np.linalg.norm(vec1)
    norm_b = np.linalg.norm(vec2)
    return dot_product / (norm_a * norm_b)
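
As a sanity check on the formula, the sketch below applies the same computation to hand-made 2-D vectors, using only the standard library so it runs without NumPy or an API key. The function name `cosine_similarity_py` and the toy vectors are illustrative assumptions, not real embeddings.

```python
import math


def cosine_similarity_py(vec1, vec2):
    """Pure-Python cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    norm_a = math.sqrt(sum(a * a for a in vec1))
    norm_b = math.sqrt(sum(b * b for b in vec2))
    return dot_product / (norm_a * norm_b)


# Toy vectors chosen by hand (not model output).
same_direction = cosine_similarity_py([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity_py([1.0, 0.0], [0.0, 3.0])      # perpendicular vectors
opposite = cosine_similarity_py([1.0, 2.0], [-1.0, -2.0])      # opposite direction

print(f"{same_direction:.4f}  {orthogonal:.4f}  {opposite:.4f}")
```

The three scores come out at (approximately) 1, 0, and -1. Note that `[2.0, 4.0]` is twice as long as `[1.0, 2.0]` yet still scores about 1: cosine similarity depends only on direction, not on vector length.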

In [None]:
sentences = [
    "The cat sat on the mat",
    "A feline rested on the rug", # Semantically very similar to the first
    "The dog played in the yard", # Related, but different
    "I love programming in Python" # Unrelated
]

# Generate embeddings for these sentences
sentence_embeddings = embeddings.embed_documents(sentences)

# Calculate the similarity between the two cat sentences.
cat_similarity = cosine_similarity(sentence_embeddings[0], sentence_embeddings[1])
print(f"Similarity between 'cat' sentences: {cat_similarity:.4f}")

# Calculate the similarity between the cat and dog sentences.
dog_similarity = cosine_similarity(sentence_embeddings[0], sentence_embeddings[2])
print(f"Similarity between 'cat' and 'dog' sentences: {dog_similarity:.4f}")

# Calculate the similarity between the cat and Python sentences.
python_similarity = cosine_similarity(sentence_embeddings[0], sentence_embeddings[3])
print(f"Similarity between 'cat' and 'Python' sentences: {python_similarity:.4f}")

### 3. Building a Simple Semantic Search

Now we can put everything together to build the core component of a RAG system: **semantic search**. The goal is to take a user's query, compare it against a collection of documents, and retrieve the most relevant ones based on semantic meaning, not just keyword matching.
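
To see why keyword matching alone falls short, the sketch below scores the cat/feline sentence pair from the previous section by plain word overlap (Jaccard similarity). The helper `keyword_overlap` is a hypothetical baseline written for this illustration; it is not part of LangChain or of the search function built in this notebook.

```python
def keyword_overlap(text_a: str, text_b: str) -> float:
    """Jaccard similarity between the sets of lowercase words in two texts."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)


# A paraphrase with almost no shared vocabulary.
paraphrase = keyword_overlap("The cat sat on the mat", "A feline rested on the rug")
# A near-copy that changes a single word.
near_copy = keyword_overlap("The cat sat on the mat", "The cat sat on the rug")

print(f"Paraphrase overlap: {paraphrase:.2f}")
print(f"Near-copy overlap:  {near_copy:.2f}")
```

The paraphrase shares only "the" and "on" with the original sentence, so a keyword-based score treats it as a weak match even though the meaning is nearly identical. Embedding-based similarity is meant to close exactly this gap.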

In [None]:
def semantic_search(query: str, documents: list[str], embeddings_model, top_k: int = 2):
    """Performs a simple semantic search.

    Args:
        query: The user's search query.
        documents: A list of documents to search through.
        embeddings_model: The embedding model to use.
        top_k: The number of top results to return.

    Returns:
        A list of (similarity, document) tuples, most similar first.
    """
    # 1. Embed the user's query.
    query_embedding = embeddings_model.embed_query(query)
    
    # 2. Embed all the documents.
    doc_embeddings = embeddings_model.embed_documents(documents)
    
    # 3. Calculate the cosine similarity between the query and each document.
    similarities = []
    for i, doc_emb in enumerate(doc_embeddings):
        similarity = cosine_similarity(query_embedding, doc_emb)
        similarities.append((similarity, documents[i]))
    
    # 4. Sort the documents by similarity score in descending order.
    similarities.sort(key=lambda x: x[0], reverse=True)
    
    # 5. Return the top_k most similar documents.
    return similarities[:top_k]

In [None]:
# Our corpus of documents to search.
documents = [
    "LangChain is a framework for developing applications powered by language models",
    "Python is a high-level programming language",
    "Machine learning is a subset of artificial intelligence",
    "Embeddings convert text into numerical vectors",
    "The weather today is sunny and warm"
]
query = "What is LangChain?"

# Perform the search.
results = semantic_search(query, documents, embeddings)

print(f"\n🔎 Semantic Search Results for: '{query}'")
for score, doc in results:
    print(f"Score: {score:.4f} | Document: {doc}")
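
`semantic_search` re-embeds every document on each call, which with an API model means paying to embed the same corpus for every query. One possible refinement, sketched below, is to embed the corpus once and reuse the vectors. To keep the sketch runnable offline, it uses a made-up `ToyEmbeddings` class that mimics the `.embed_query()` / `.embed_documents()` interface with simple word counts. That class, its vocabulary, and the `build_index` / `search_index` helpers are all assumptions for illustration, and word counts are not semantic embeddings.

```python
import math


class ToyEmbeddings:
    """Hypothetical stand-in for an embeddings model: counts vocabulary words."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary

    def embed_query(self, text):
        words = text.lower().replace("?", "").split()
        return [float(words.count(term)) for term in self.vocabulary]

    def embed_documents(self, texts):
        return [self.embed_query(text) for text in texts]


def cosine(vec1, vec2):
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm = math.sqrt(sum(a * a for a in vec1)) * math.sqrt(sum(b * b for b in vec2))
    return dot / norm if norm else 0.0  # guard against all-zero toy vectors


def build_index(documents, embeddings_model):
    """Embed the corpus once and keep each vector next to its text."""
    vectors = embeddings_model.embed_documents(documents)
    return list(zip(documents, vectors))


def search_index(query, index, embeddings_model, top_k=2):
    """Embed only the query, then rank the pre-embedded documents."""
    query_vector = embeddings_model.embed_query(query)
    scored = [(cosine(query_vector, vector), doc) for doc, vector in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


toy_model = ToyEmbeddings(["langchain", "python", "embeddings", "weather"])
toy_docs = [
    "LangChain is a framework for developing applications powered by language models",
    "Python is a high-level programming language",
    "The weather today is sunny and warm",
]
index = build_index(toy_docs, toy_model)  # documents are embedded once here

for question in ["What is LangChain?", "Is Python hard to learn?"]:
    best_score, best_doc = search_index(question, index, toy_model, top_k=1)[0]
    print(f"{question} -> {best_doc} (score {best_score:.2f})")
```

Because the helpers rely only on `.embed_query()` and `.embed_documents()`, they should also accept the `embeddings` object created earlier in this notebook, in which case the corpus would be sent to the API once rather than once per query.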

### 🔑 Key Takeaways

* **API vs. Local Models**: OpenAI provides high-performance embedding models via a simple API. This contrasts with local models (such as those from Hugging Face), which you run on your own hardware. API models are convenient, but you pay per use.
* **Consistent LangChain Interface**: A major advantage of LangChain is its consistent API. The `.embed_query()` and `.embed_documents()` methods work the same way for `OpenAIEmbeddings` as they do for `HuggingFaceEmbeddings`.
* **Semantic Search is the Goal**: The primary application of embeddings in RAG is semantic search. This process involves embedding a query and a set of documents, then using a similarity metric (like cosine similarity) to find and rank the most relevant documents.
* **Cost and Performance Trade-offs**: When using proprietary models, it's important to choose the right one for your task. Models like `text-embedding-3-small` offer a great balance of cost and performance, while `text-embedding-3-large` provides the highest quality for more demanding applications.