Description
Related dev issue: elastic/elasticsearch#130485
Starting in version 9.2, the PUT _inference/{task_type}/{elasticsearch_inference_id}
API with the rerank task type introduces two new parameters in service_settings:
- long_document_strategy
- max_chunks_per_doc
These parameters define how the reranker handles long documents during inference.
long_document_strategy
Type: string (optional)
Controls the strategy used for processing long documents.
Possible values:
- truncate (default): Processes only the beginning of each document.
- chunk: Splits long documents into smaller parts (chunks) before inference.
To enable chunking, this value must be set to "chunk".
max_chunks_per_doc
Type: integer (optional)
Limits how many chunks per document are sent for inference when chunking is enabled.
If not set, all chunks generated for the document are processed.
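As a rough illustration of how the two parameters fit into the request, the sketch below builds the path and JSON body for a rerank endpoint. The helper name build_rerank_endpoint_request, the endpoint ID my-reranker, and the model_id, num_allocations, and num_threads values are placeholders assumed for this example and are not taken from this issue. The client-side checks are this sketch's own guards and may not match the server's validation.

```python
import json

# Values for long_document_strategy as described in this issue.
VALID_STRATEGIES = {"truncate", "chunk"}


def build_rerank_endpoint_request(inference_id,
                                  long_document_strategy=None,
                                  max_chunks_per_doc=None):
    """Return (path, body) for PUT _inference/rerank/{inference_id}.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    if (long_document_strategy is not None
            and long_document_strategy not in VALID_STRATEGIES):
        raise ValueError(
            f"long_document_strategy must be one of {sorted(VALID_STRATEGIES)}"
        )
    # Assumption of this sketch: reject max_chunks_per_doc unless chunking
    # is enabled, since the limit only applies when chunking is on.
    if max_chunks_per_doc is not None and long_document_strategy != "chunk":
        raise ValueError(
            "max_chunks_per_doc requires long_document_strategy='chunk'"
        )

    service_settings = {
        # Placeholder values, assumed for illustration only.
        "model_id": ".rerank-v1",
        "num_allocations": 1,
        "num_threads": 1,
    }
    # Both parameters are optional, so include them only when set.
    if long_document_strategy is not None:
        service_settings["long_document_strategy"] = long_document_strategy
    if max_chunks_per_doc is not None:
        service_settings["max_chunks_per_doc"] = max_chunks_per_doc

    path = f"_inference/rerank/{inference_id}"
    body = {"service": "elasticsearch", "service_settings": service_settings}
    return path, body


if __name__ == "__main__":
    path, body = build_rerank_endpoint_request(
        "my-reranker", long_document_strategy="chunk", max_chunks_per_doc=5
    )
    print("PUT", path)
    print(json.dumps(body, indent=2))
```

Omitting both arguments leaves the two keys out of service_settings, which corresponds to the documented default of truncate with no chunk limit.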
Required updates
- Add long_document_strategy and max_chunks_per_doc to the service_settings object of the PUT inference API documentation.
- Update the Configuring chunking documentation to mention that chunking must be enabled by setting:
  "long_document_strategy": "chunk"
- Update the description of the chunking_settings object to clarify that enabling chunking requires setting the long_document_strategy parameter to chunk in the rerank inference endpoint configuration.