In [None]:
import os

from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings

# Load variables from a local .env file, then read the OpenAI API key
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

In [12]:
embedding_model = OpenAIEmbeddings(api_key=api_key)

In [None]:
text = "I am PremKumar, you tutor and mentor for RAG and AI"
vector = embedding_model.embed_query(text)

In [None]:
# Let's see what it looks like (it will be a list of floating point numbers)
print(f"Vector length: {len(vector)}")
print(f"First 5 numbers: {vector[:5]}")

Vector length: 1536
First 5 numbers: [0.0004620793624781072, -0.0030264039523899555, -0.010309120640158653, -0.031093191355466843, -0.0004767622740473598]


In [21]:
# Print the whole vector this time (a list of 1536 floating point numbers)
text = "I am PremKumar, you tutor and mentor for RAG and AI"
vector = embedding_model.embed_query(text)
print(f"Vector length: {len(vector)}")
print(f"Full vector: {vector}")

Vector length: 1536
Full vector: [0.00520513579249382, -0.02007550187408924, -0.0007674781372770667, -0.025155333802103996, -0.0016848113154992461, ...]

In [None]:
# Embed a longer text to check whether the vector length changes
text = "I am PremKumar, you tutor and mentor for RAG and AI, I am an AI Architect"
vector = embedding_model.embed_query(text)
print(f"Vector length: {len(vector)}")
print(f"First 5 numbers: {vector[:5]}")

Does the vector length change based on the size of the text?

The original text was:

text = "I am PremKumar, you tutor and mentor for RAG and AI"

If I change it to the longer "I am PremKumar, you tutor and mentor for RAG and AI, I am an AI Architect", will the vector length change?

No. For a given embedding model, the vector length does not depend on the size of the text; it stays fixed whether you input a single keyword or a detailed paragraph.

Embedding models are designed to map variable-length text into a static, fixed-dimensional space (for example, OpenAI's text-embedding-ada-002 always outputs a vector of exactly 1,536 numbers).

This uniformity is essential because vector databases and similarity algorithms (like Cosine Similarity) require every data point to have the exact same number of dimensions to mathematically compare how "close" or related they are to one another.
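To make the dimension requirement concrete, here is a minimal sketch of cosine similarity in plain Python. The function name and the toy 3-dimensional vectors are illustrative assumptions, not part of the notebook above; real embeddings would have 1536 values each.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of the same length."""
    if len(a) != len(b):
        # The dot product pairs values position by position, so the
        # comparison is only defined when both vectors have the same size.
        raise ValueError(f"Dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumed values)
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]   # points in the same direction as v1
v3 = [3.0, 0.0, -1.0]  # orthogonal to v1 (their dot product is 0)

print(cosine_similarity(v1, v2))  # very close to 1.0
print(cosine_similarity(v1, v3))  # 0.0
```

Passing vectors of different lengths raises a ValueError here, which is the same kind of failure a vector database would report if stored embeddings did not share one dimensionality.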

**NOTE: Fixed Dimensionality**

Embedding models are neural networks with a fixed output layer. They are designed to map variable-length text into a fixed-size vector space.

For OpenAI's text-embedding-ada-002: The vector length is always 1536.

For OpenAI's text-embedding-3-large: The default is 3072 (though this model lets you request a shorter vector through the dimensions parameter).
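As a rough illustration of what "shortening" means, the sketch below keeps the first few values of a vector and rescales the result to unit length. It is a local approximation of the idea under stated assumptions, not a description of the API's internals; the helper name shorten_embedding and the toy 4-dimensional vector are hypothetical.

```python
import math


def shorten_embedding(vector, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError(f"dim must be between 1 and {len(vector)}")
    truncated = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0:
        return truncated  # nothing to rescale
    return [x / norm for x in truncated]


# Toy 4-dimensional unit vector standing in for a real embedding (assumed values)
full = [0.5, 0.5, 0.5, 0.5]
short = shorten_embedding(full, 2)

print(len(short))  # 2
print(short)       # both values are roughly 0.7071

# With the real model, the shorter size would normally be requested up front,
# e.g. OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024),
# which needs an API key and is therefore not executed in this sketch.
```

Whichever size is chosen, every vector stored in the same index must use it, so the fixed-length guarantee still holds within one configuration.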

If you ran your code with the word "Hi" and then again with a 500-word essay, len(vector) would be 1536 both times (assuming the default model, text-embedding-ada-002, as used in the code above).
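One way to check this yourself is a small helper that embeds several texts and collects the lengths. The helper vector_lengths and the FakeEmbedder class are hypothetical: the stand-in only exists so the sketch runs without an API key, and it returns zeros rather than real embeddings. To test the real behaviour, pass the notebook's embedding_model instead.

```python
def vector_lengths(embedder, texts):
    """Return the embedding length for each text, in order."""
    return [len(embedder.embed_query(text)) for text in texts]


class FakeEmbedder:
    """Hypothetical stand-in exposing embed_query like the real model.

    It always returns 1536 zeros, so it demonstrates the helper only;
    it says nothing about what the real model returns.
    """

    def embed_query(self, text):
        return [0.0] * 1536


short_text = "Hi"
long_text = "word " * 500  # roughly a 500-word input

lengths = vector_lengths(FakeEmbedder(), [short_text, long_text])
print(lengths)

# With the real model from the cells above (requires an API key):
# lengths = vector_lengths(embedding_model, [short_text, long_text])
```

If the lengths in the returned list are all equal, the model is producing fixed-size vectors for inputs of very different sizes.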