Adds new parameters to the elasticsearch inference API for the rerank task type #5476
| Following you can find the validation changes against the target branch for the APIs. 
 You can validate these APIs yourself by using the  | 
| */ | ||
| num_threads: integer | ||
| /** | ||
| * Only for the `rerank` task type. | 
A quick clarification. For 9.2, these two values are only configurable for rerank endpoints using the elastic reranker model.
| body: { | ||
| /** | ||
| * The chunking configuration object. | ||
| * The chunking configuration object. For the `rerank` task type, you can enable chunking by setting the `long_document_strategy` parameter to `chunk` in the `service_settings` object. | 
I'm not sure whether we need to be more specific about this anywhere, but with this new method of chunking the user cannot set `chunking_settings` the way they would for embeddings; we build the chunking settings for them. If we want to clarify somewhere how we build the chunking settings, we can.
| * | ||
| * Possible values: | ||
| * - `truncate` (default): Processes only the beginning of each document. | ||
| * - `chunk`: Splits long documents into smaller parts (chunks) before inference. | 
I'm not sure where it's best to clarify this, but with chunking enabled we still return a single score per document (the same as we do for truncating), with that score corresponding to the highest score of any chunk in the document. I just want to make it clear that the structure of the response will not change, only the rerank relevance scores.
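To make the scoring behavior described above concrete, here is a minimal sketch. It assumes the per-document score is simply the maximum of that document's chunk scores; the function name `aggregate_chunk_scores` and the sample scores are hypothetical and are not taken from the Elasticsearch implementation.

```python
from typing import Dict, List


def aggregate_chunk_scores(chunk_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Collapse per-chunk relevance scores into one score per document.

    Assumption for illustration: the document score is the highest score
    among its chunks. Documents with no chunk scores are skipped.
    """
    return {
        doc_id: max(scores)
        for doc_id, scores in chunk_scores.items()
        if scores
    }


# Hypothetical chunk-level scores for two documents.
scores = aggregate_chunk_scores({
    "doc-a": [0.12, 0.87, 0.40],  # three chunks
    "doc-b": [0.33],              # short document, one chunk
})

# One score per document, ordered from most to least relevant.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The result has one entry per document, so a caller sees the same shape as with `truncate`; only the score values differ.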
LGTM
This PR adds the new `long_document_strategy` and `max_chunks_per_doc` parameters to the `service_settings` object of the Create an Elasticsearch inference endpoint documentation.
It also updates the description of the `chunking_settings` object to clarify that this setting is only applicable for the `sparse_embeddings` and `text_embeddings` task types.
Related issue: #5451
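As a rough illustration of how the documented parameters might fit together, the sketch below builds a request body for a `rerank` endpoint with chunking enabled. The endpoint name, the `model_id` value, and the numeric values are assumptions chosen for illustration; only `num_threads`, `long_document_strategy`, `max_chunks_per_doc`, and `service_settings` come from this PR. Note that no `chunking_settings` object is included, in line with the review comment that those settings are built on the user's behalf.

```python
import json

# Hypothetical endpoint name, for illustration only.
endpoint_path = "_inference/rerank/my-rerank-endpoint"

request_body = {
    "service": "elasticsearch",
    "service_settings": {
        # Assumed model identifier; check the current docs for the exact value.
        "model_id": ".rerank-v1",
        "num_allocations": 1,
        "num_threads": 1,
        # Parameters added by this PR (rerank task type only).
        "long_document_strategy": "chunk",
        # Illustrative value, not a documented default.
        "max_chunks_per_doc": 5,
    },
}

# Print the request in a form that can be pasted into a console.
print(f"PUT {endpoint_path}")
print(json.dumps(request_body, indent=2))
```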