*Copyright 2025 Jaeyoung Chun / Winning Twelve*

You may not make copies of this and use or distribute it for any purpose.

# Embedding

OpenAI embeddings: https://platform.openai.com/docs/guides/embeddings/embedding-models
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002

In [1]:
from langchain_openai import OpenAIEmbeddings

In [2]:
model_names = {
    "small": "text-embedding-3-small",
    "large": "text-embedding-3-large",
    "ada": "text-embedding-ada-002"
}

In [3]:
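# English: "Summarize the Champions League match between Benfica and Barcelona played in January 2025."
text = ["2025년 1월에 있었던 챔피언스리그 벤피카 vs. 바르셀로나 경기에 대해 요약해줘."]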

## text-embedding-3-small

In [4]:
embedding_model1 = OpenAIEmbeddings(model=model_names["small"])

In [5]:
# embed_documents takes a list of strings and returns one vector (a list of floats) per string
embedding = embedding_model1.embed_documents(text)

In [6]:
len(embedding)

1

In [7]:
len(embedding[0])

1536

In [8]:
embedding

[[0.01707690954208374,
  -0.002315341727808118,
  -0.01484245341271162,
  -0.03838007524609566,
  -0.010595974512398243,
  0.013305632397532463,
  0.010393761098384857,
  0.029543356969952583,
  0.0023267162032425404,
  -0.00662753963842988,
  0.0015899017453193665,
  -0.03300120308995247,
  -0.0065314881503582,
  -0.0484907403588295,
  0.00013973252498544753,
  0.0007778891013003886,
  -0.06567886471748352,
  0.006460713688284159,
  -0.018724948167800903,
  0.0009535618592053652,
  0.020848186686635017,
  0.008897383697330952,
  0.03773299232125282,
  -0.016975803300738335,
  0.02274899184703827,
  -0.012436115182936192,
  0.01839129626750946,
  0.025256436318159103,
  0.008427237160503864,
  0.008169415406882763,
  -0.0149233378469944,
  -0.0026970193721354008,
  0.0326978825032711,
  -0.05916759744286537,
  0.02107062190771103,
  -0.04201991483569145,
  -0.002364631276577711,
  0.008973212912678719,
  0.023416295647621155,
  -0.010555531829595566,
  0.0015001696301624179,
  ... (output truncated)

## Other Embeddings

In [9]:
for name in model_names.values():
    embedding_model = OpenAIEmbeddings(model=name)
    embedding = embedding_model.embed_documents(text)
    # Print the model name and the dimensionality of its embedding vector
    print(name, len(embedding[0]))

text-embedding-3-small 1536
text-embedding-3-large 3072
text-embedding-ada-002 1536


## Multiple Documents

In [10]:
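docs = [
    # English: "Summarize the Champions League match between Benfica and Barcelona played in January 2025."
    "2025년 1월에 있었던 챔피언스리그 벤피카 vs. 바르셀로나 경기에 대해 요약해줘.",
    # English: "Summarize the 2022 World Cup match between South Korea and Portugal."
    "2022 월드컵 한국 vs. 포르투갈 경기에 대해 요약해줘."
]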

In [11]:
embeddings = embedding_model1.embed_documents(docs)

In [12]:
len(embeddings)

2

In [13]:
len(embeddings[0]), len(embeddings[1])

(1536, 1536)
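
Once several documents are embedded, a common next step is to compare their vectors. The sketch below computes cosine similarity in plain Python. The helper name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of the notebook; they stand in for real embeddings so the block runs without an API key.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical) standing in for real embedding vectors
toy_a = [1.0, 0.0, 1.0]
toy_b = [1.0, 1.0, 0.0]
toy_similarity = cosine_similarity(toy_a, toy_b)
print(toy_similarity)
```

With the vectors computed above, the same helper could be called as `cosine_similarity(embeddings[0], embeddings[1])`; values closer to 1 indicate more similar texts.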

## Pricing

| Model                  | Cost (USD per 1M tokens) |
|------------------------|--------------------------|
| text-embedding-3-small | $0.02                    |
| text-embedding-3-large | $0.13                    |
| text-embedding-ada-002 | $0.10                    |

- Price per 1M tokens (as of Feb 3, 2025)
- Reference: https://platform.openai.com/docs/pricing#embeddings
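
To make the per-token prices concrete, here is a minimal cost-estimate sketch. The prices are copied from the table above and may have changed since. The function name `estimate_cost` and the 5M-token workload are hypothetical and chosen only for illustration.

```python
# Prices in USD per 1M tokens, copied from the table above (as of Feb 3, 2025)
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}


def estimate_cost(model, num_tokens):
    """Estimate the embedding cost in USD for a given number of input tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS[model] * num_tokens / 1_000_000


# Hypothetical workload: 5 million input tokens
hypothetical_tokens = 5_000_000
for model in PRICE_PER_MILLION_TOKENS:
    print(f"{model}: ${estimate_cost(model, hypothetical_tokens):.2f}")
```

The token count itself has to come from a tokenizer or from the usage figures reported by the API; this sketch only does the arithmetic.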