---
# OpenAI Embeddings Model

---

## OpenAI Platform 👉 [Learn more](https://platform.openai.com/docs/models)

---

## Create an OpenAI API Key 👉 [Learn more](https://platform.openai.com/api-keys)

---

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [4]:
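# load_dotenv() has already exported the key to os.environ; fail early if it is missing
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set; check your .env file"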

---

## OpenAIEmbeddings 👉 [Learn more](https://python.langchain.com/docs/integrations/text_embedding/openai/)

In [5]:
from langchain_openai import OpenAIEmbeddings

In [None]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

In [7]:
embeddings

OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x00000247C009A270>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x00000247C009ABA0>, model='text-embedding-3-small', dimensions=None, deployment='text-embedding-ada-002', openai_api_version=None, openai_api_base=None, openai_api_type=None, openai_proxy=None, embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)
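
The `dimensions=None` field in the output above means the model returns its default vector size. As far as I know, the `text-embedding-3` models can also return shorter vectors when `dimensions` is set. The sketch below shows the rough manual equivalent on a toy vector: keep the first values, then rescale to unit length. The helper name `shorten_embedding` and the toy data are assumptions made for illustration, not part of LangChain or the OpenAI SDK.

```python
import math


def shorten_embedding(vector, target_dims):
    """Keep the first target_dims values and rescale the result to unit length.

    Hypothetical helper for illustration only.
    """
    if not 0 < target_dims <= len(vector):
        raise ValueError("target_dims must be between 1 and len(vector)")
    truncated = vector[:target_dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # an all-zero vector cannot be rescaled
    return [x / norm for x in truncated]


# Toy 4-dimensional vector standing in for a real 1536-dimensional embedding
toy_vector = [0.6, 0.0, 0.8, 0.0]
short_vector = shorten_embedding(toy_vector, 2)
print(f"Shortened to {len(short_vector)} dimensions: {short_vector}")
```

Rescaling matters because similarity measures such as the dot product assume unit-length vectors; a truncated vector is generally shorter than 1 until it is normalized again.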

---

### Single Text Embeddings
#### **`Example-1`**


In [8]:
single_text = "Langchain and Rag are amazing frameworks and projects to work on"

single_text

'Langchain and Rag are amazing frameworks and projects to work on'

In [9]:
single_embeddings = embeddings.embed_query(single_text)
print(f"Length = {len(single_embeddings)}")
print(single_embeddings)

Length = 1536
[-0.0501275472342968, -0.0310883279889822, -0.0033981609158217907, -0.003270666114985943, 0.03263866528868675, -0.03130591660737991, -0.014497006312012672, 0.0014780894853174686, -0.010607565753161907, -0.03394421190023422, 0.018182456493377686, 0.004501415882259607, -0.038160037249326706, 0.05007315054535866, 0.005973555613309145, 0.014401810243725777, -0.00012292621249798685, -0.0621766559779644, 0.040526341646909714, 0.0652773305773735, -0.0014610901707783341, -0.006048352457582951, -0.018318450078368187, 0.032965049147605896, -0.006928916554898024, -0.0083092600107193, -0.007574889808893204, 0.06326460838317871, 0.017910467460751534, -0.01975999027490616, 5.253847484709695e-05, -0.03394421190023422, -0.006442736368626356, 0.03372661769390106, 0.0149049898609519, 0.021827107295393944, -0.005721965804696083, 0.004107031971216202, -0.008084869012236595, 0.022493479773402214, 0.009628405794501305, 0.037969645112752914, 0.004419818986207247, -0.0075340913608670235, 0.01784

In [10]:
print("📝 Single Text Embedding:")
print(f"Input: {single_text}")
print(f"Output: Vector of {len(single_embeddings)} dimensions")
print(f"Sample values: {single_embeddings[:5]}")

📝 Single Text Embedding:
Input: Langchain and Rag are amazing frameworks and projects to work on
Output: Vector of 1536 dimensions
Sample values: [-0.0501275472342968, -0.0310883279889822, -0.0033981609158217907, -0.003270666114985943, 0.03263866528868675]


---

### Multiple Text Embeddings
#### **`Example-2`**

In [11]:
multiple_texts = [
    "Python is a programming language",
    "LangChain is a framework for LLM applications",
    "Embeddings convert text to numbers",
    "Vectors can be compared for similarity"
]

multiple_texts

['Python is a programming language',
 'LangChain is a framework for LLM applications',
 'Embeddings convert text to numbers',
 'Vectors can be compared for similarity']

In [None]:
multiple_embeddings = embeddings.embed_documents(multiple_texts)

In [13]:
print(multiple_embeddings)

[[-0.011004673317074776, -0.020408110693097115, 0.018817074596881866, -0.0028302103746682405, 0.015716591849923134, -0.026639673858880997, 0.0005226965295150876, 0.03720579296350479, -0.0017197990091517568, 0.012993469834327698, 0.021540194749832153, -0.0247222688049078, -0.009428935125470161, 0.0018638592446222901, 0.003916399087756872, 0.015502413734793663, -0.03296302631497383, 0.029780952259898186, -0.027210814878344536, 0.010372338816523552, -0.001478848629631102, -0.009913384914398193, -0.05385048687458038, 0.01543102040886879, 0.03685902804136276, -0.04291720688343048, 0.005415645893663168, 0.03622669354081154, -0.01945960894227028, 0.0011199729051440954, 0.01297307200729847, -0.032351087778806686, -0.03653265908360481, 0.05123955383896828, -0.03118840791285038, -0.04507938772439957, 0.04585450515151024, -0.010464129038155079, 0.06837379932403564, -0.015074057504534721, 0.004041336476802826, -0.039163991808891296, 0.03133119270205498, -0.00040030901436693966, -0.0021201081108301

In [14]:
print("\n📚 Multiple Text Embeddings:")
print(f"Number of texts: {len(multiple_texts)}")
print(f"Number of embeddings: {len(multiple_embeddings)}")
print(f"Each embedding size: {len(multiple_embeddings[0])}")


📚 Multiple Text Embeddings:
Number of texts: 4
Number of embeddings: 4
Each embedding size: 1536
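
`embed_documents` returns one vector per input text, in the same order. The `chunk_size=1000` field shown earlier in the `OpenAIEmbeddings` output is, as I understand it, the maximum number of texts sent in a single request. The sketch below illustrates only that grouping idea with a hypothetical `batch_texts` helper and a deliberately small batch size; the real implementation also handles tokenization and long inputs.

```python
def batch_texts(texts, batch_size):
    """Split a list of texts into consecutive batches of at most batch_size items.

    Hypothetical helper for illustration only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


# Same four texts as in Example-2, grouped with a small batch size
texts = [
    "Python is a programming language",
    "LangChain is a framework for LLM applications",
    "Embeddings convert text to numbers",
    "Vectors can be compared for similarity",
]
batches = batch_texts(texts, 3)
for request_number, batch in enumerate(batches, start=1):
    print(f"Request {request_number}: {len(batch)} texts")
```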


In [16]:
print(multiple_embeddings[0])
print(multiple_embeddings[1])
print(multiple_embeddings[2])

[-0.011004673317074776, -0.020408110693097115, 0.018817074596881866, -0.0028302103746682405, 0.015716591849923134, -0.026639673858880997, 0.0005226965295150876, 0.03720579296350479, -0.0017197990091517568, 0.012993469834327698, 0.021540194749832153, -0.0247222688049078, -0.009428935125470161, 0.0018638592446222901, 0.003916399087756872, 0.015502413734793663, -0.03296302631497383, 0.029780952259898186, -0.027210814878344536, 0.010372338816523552, -0.001478848629631102, -0.009913384914398193, -0.05385048687458038, 0.01543102040886879, 0.03685902804136276, -0.04291720688343048, 0.005415645893663168, 0.03622669354081154, -0.01945960894227028, 0.0011199729051440954, 0.01297307200729847, -0.032351087778806686, -0.03653265908360481, 0.05123955383896828, -0.03118840791285038, -0.04507938772439957, 0.04585450515151024, -0.010464129038155079, 0.06837379932403564, -0.015074057504534721, 0.004041336476802826, -0.039163991808891296, 0.03133119270205498, -0.00040030901436693966, -0.00212010811083018
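
The raw numbers become useful once two vectors are compared. A common measure is cosine similarity, the dot product of the vectors divided by the product of their lengths. The sketch below is a minimal pure-Python version run on toy 3-dimensional vectors; the names `cosine_similarity`, `toy_embeddings`, and `query_vector` are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as unrelated
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
toy_embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.8, 0.6, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query_vector = [1.0, 0.0, 0.0]

# Rank documents from most to least similar to the query
ranked = sorted(
    toy_embeddings,
    key=lambda name: cosine_similarity(query_vector, toy_embeddings[name]),
    reverse=True,
)
print(ranked)
```

With the notebook's own variables, the same function could be called as `cosine_similarity(multiple_embeddings[0], multiple_embeddings[1])`.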

---

## Different OpenAI Embedding Models

In [17]:
from langchain_openai import OpenAIEmbeddings

models_comparison = {
    "text-embedding-3-small": {
        "dimensions": 1536,
        "description": "Good balance of performance and cost",
        "cost_per_1m_tokens": 0.02,
        "use_case": "General purpose, cost-effective"
    },
    "text-embedding-3-large": {
        "dimensions": 3072,
        "description": "Highest quality embeddings",
        "cost_per_1m_tokens": 0.13,
        "use_case": "When accuracy is critical"
    },
    "text-embedding-ada-002": {
        "dimensions": 1536,
        "description": "Previous generation model",
        "cost_per_1m_tokens": 0.10,
        "use_case": "Legacy applications"
    }
}

# Display comparison
print("📊 OpenAI Embedding Models Comparison:\n")
for model_name, details in models_comparison.items():
    print(f"Model: {model_name}")
    print(f"  📏 Dimensions: {details['dimensions']}")
    print(f"  💰 Cost: ${details['cost_per_1m_tokens']}/1M tokens")
    print(f"  📝 Description: {details['description']}")
    print(f"  🎯 Use case: {details['use_case']}\n")

📊 OpenAI Embedding Models Comparison:

Model: text-embedding-3-small
  📏 Dimensions: 1536
  💰 Cost: $0.02/1M tokens
  📝 Description: Good balance of performance and cost
  🎯 Use case: General purpose, cost-effective

Model: text-embedding-3-large
  📏 Dimensions: 3072
  💰 Cost: $0.13/1M tokens
  📝 Description: Highest quality embeddings
  🎯 Use case: When accuracy is critical

Model: text-embedding-ada-002
  📏 Dimensions: 1536
  💰 Cost: $0.1/1M tokens
  📝 Description: Previous generation model
  🎯 Use case: Legacy applications
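
The per-token prices above translate into a budget with simple arithmetic: tokens divided by one million, times the price per 1M tokens. The sketch below applies that to a hypothetical corpus of 5 million tokens. The helper name `estimate_embedding_cost` and the corpus size are assumptions, and the prices are copied from the table above, so they may be out of date.

```python
def estimate_embedding_cost(token_count, cost_per_1m_tokens):
    """Return the estimated cost in USD of embedding token_count tokens."""
    if token_count < 0:
        raise ValueError("token_count must not be negative")
    return token_count / 1_000_000 * cost_per_1m_tokens


# Prices in USD per 1M tokens, copied from the comparison above
prices = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}

corpus_tokens = 5_000_000  # hypothetical corpus size
for model_name, price in prices.items():
    cost = estimate_embedding_cost(corpus_tokens, price)
    print(f"{model_name}: ${cost:.2f}")
```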



---