Elasticsearch Version
9.1
Installed Plugins
No response
Java Version
bundled
OS Version
Any
Problem Description
Chunked inputs from semantic_text are automatically batched into a single request. If a bulk ingest request contains multiple semantic_text fields, they are batched together up to a certain batch size. The OpenAI embeddings API has a maximum batch size of 2048 inputs, and 2048 is also the value used to control the batch size in the OpenAI integration:
Line 19 in 4a39b4c:

```java
public static final int EMBEDDING_MAX_BATCH_SIZE = 2048;
```
The OpenAI embeddings API is also limited to 300,000 tokens per request. If a request contains 2048 inputs, that works out to roughly 146 tokens per input (300,000 / 2048), which is a small document.
The maximum number of items in a single embedding request needs to respect the 300,000-token limit. In practice this means that a batch size of 2048 will rarely be appropriate, and the chunk size should be taken into consideration; see the sketch below.
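A minimal sketch of the idea, not Elasticsearch's implementation: the helper below and its `estimatedTokensPerChunk` parameter are hypothetical, assuming an average token count per chunk can be estimated from the configured chunking settings. It caps the batch size by whichever limit binds first, the 2048-input cap or the 300,000-token budget.

```java
/**
 * Hypothetical sketch: derive a batch size that respects both the OpenAI
 * input-count limit (2048 inputs per request) and the token limit
 * (300,000 tokens per request).
 */
public final class BatchSizeSketch {

    public static final int MAX_INPUTS_PER_REQUEST = 2048;
    public static final int MAX_TOKENS_PER_REQUEST = 300_000;

    public static int effectiveBatchSize(int estimatedTokensPerChunk) {
        if (estimatedTokensPerChunk <= 0) {
            return MAX_INPUTS_PER_REQUEST;
        }
        // How many chunks fit in the token budget, capped by the input-count limit.
        int byTokens = MAX_TOKENS_PER_REQUEST / estimatedTokensPerChunk;
        return Math.max(1, Math.min(MAX_INPUTS_PER_REQUEST, byTokens));
    }

    public static void main(String[] args) {
        // With ~300-token chunks, only ~1000 inputs fit in one request,
        // so the token budget (not the 2048-input cap) is the binding limit.
        System.out.println(effectiveBatchSize(300)); // prints 1000
    }
}
```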
Steps to Reproduce
Create an index with a semantic_text field and bulk upload 2000 long documents, for example as shown below.
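For illustration, one way to set this up through the REST API; the inference endpoint name, model, index name, and field name are placeholders, not taken from the report:

```
PUT _inference/text_embedding/my-openai-endpoint
{
  "service": "openai",
  "service_settings": {
    "api_key": "<redacted>",
    "model_id": "text-embedding-3-small"
  }
}

PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-openai-endpoint"
      }
    }
  }
}

POST my-index/_bulk
{ "index": {} }
{ "content": "<a long document, several thousand words>" }
```

Repeating the bulk action/document pair for ~2000 long documents produces enough chunks that a 2048-input batch exceeds the 300,000-token request limit.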
Logs (if relevant)
No response