[9.2] Add new parameters to inference PUT API for the rerank task type #5451

@kosabogi

Related dev issue: elastic/elasticsearch#130485

Starting in version 9.2, the PUT _inference/{task_type}/{elasticsearch_inference_id}
API accepts two new parameters in service_settings when the task type is rerank:

  • long_document_strategy
  • max_chunks_per_doc

These parameters define how the reranker handles long documents during inference.

long_document_strategy

Type: string (optional)

Controls the strategy used for processing long documents.

Possible values:

  • truncate (default): Processes only the beginning of each document.
  • chunk: Splits long documents into smaller parts (chunks) before inference.

To enable chunking, this value must be set to "chunk".

max_chunks_per_doc

Type: integer (optional)

Limits how many chunks per document are sent for inference when chunking is enabled.
If not set, all chunks generated for the document are processed.
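
For the docs example, a request body along these lines might help. This is only a sketch: the helper name build_rerank_endpoint_body is hypothetical, and the service name, model_id, num_allocations and num_threads values are assumptions taken from typical elasticsearch-service rerank endpoints, not from this issue. The client-side checks (positive integer, limit only with chunk) are choices made for the sketch and do not describe documented server behavior.

```python
import json

# Values listed in this issue for long_document_strategy.
ALLOWED_STRATEGIES = {"truncate", "chunk"}


def build_rerank_endpoint_body(model_id, long_document_strategy=None,
                               max_chunks_per_doc=None):
    """Build a body for PUT _inference/rerank/{inference_id} (hypothetical helper)."""
    service_settings = {
        "model_id": model_id,      # assumption: supplied by the caller
        "num_allocations": 1,      # assumption: placeholder deployment values
        "num_threads": 1,
    }

    if long_document_strategy is not None:
        if long_document_strategy not in ALLOWED_STRATEGIES:
            raise ValueError(
                f"long_document_strategy must be one of {sorted(ALLOWED_STRATEGIES)}"
            )
        service_settings["long_document_strategy"] = long_document_strategy

    if max_chunks_per_doc is not None:
        # Sketch-only checks: the issue says the limit applies when chunking
        # is enabled, so other combinations are rejected here on the client.
        if isinstance(max_chunks_per_doc, bool) or not isinstance(max_chunks_per_doc, int):
            raise ValueError("max_chunks_per_doc must be an integer")
        if max_chunks_per_doc < 1:
            raise ValueError("max_chunks_per_doc must be positive")
        if long_document_strategy != "chunk":
            raise ValueError("max_chunks_per_doc requires long_document_strategy='chunk'")
        service_settings["max_chunks_per_doc"] = max_chunks_per_doc

    # Assumption: the endpoint is backed by the "elasticsearch" service.
    return {"service": "elasticsearch", "service_settings": service_settings}


if __name__ == "__main__":
    body = build_rerank_endpoint_body(".rerank-v1", "chunk", max_chunks_per_doc=5)
    print(json.dumps(body, indent=2))
```

Omitting both arguments leaves the two new keys out of service_settings entirely, which per the description above means the truncate default applies.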

Required updates

  • Update the description of the chunking_settings object to clarify that enabling chunking requires setting the long_document_strategy parameter to chunk in the rerank inference endpoint configuration:

    "long_document_strategy": "chunk"
