
Conversation

@davidkyle
Member

Default inference endpoints automatically deploy the model on inference. If the model download has already been started, deploying the model should wait for the download to complete. This happens, but with a default timeout of 30 seconds, which in some cases is not long enough.

The timeout parameter from the inference request is now used in the start deployment request. Semantic text sets the timeout to a large value so it will not time out; clients can control this with the inference timeout parameter.

Non-issue, as the code is behind a feature flag.
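A minimal sketch of the idea, assuming hypothetical names (InferenceRequest, StartDeploymentRequest, and DEFAULT_TIMEOUT are illustrative stand-ins, not the actual Elasticsearch classes): the caller's inference timeout replaces the hard-coded 30-second default when the start-deployment request is built.

import java.time.Duration;

// Illustrative only: these records stand in for the real request classes.
public final class DeploymentStarter {

    // Previously the start-deployment request always used a fixed 30s default.
    private static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(30);

    record InferenceRequest(String modelId, Duration timeout) {}

    record StartDeploymentRequest(String modelId, Duration timeout) {}

    // After this change, the caller's inference timeout is threaded through
    // to the start-deployment request instead of the hard-coded default.
    static StartDeploymentRequest toStartRequest(InferenceRequest inference) {
        Duration timeout = inference.timeout() != null ? inference.timeout() : DEFAULT_TIMEOUT;
        return new StartDeploymentRequest(inference.modelId(), timeout);
    }

    public static void main(String[] args) {
        // A client expecting a long model download can pass a large timeout,
        // as semantic text does, so the deployment waits for the download.
        InferenceRequest req = new InferenceRequest("my-model", Duration.ofMinutes(20));
        System.out.println(toStartRequest(req).timeout()); // prints PT20M
    }
}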

@elasticsearchmachine added the Team:ML (Meta label for the ML team) label on Nov 13, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

* @param modelVariant The configuration of the model variant to be downloaded
* @param listener The listener
*/
default void putModel(Model modelVariant, ActionListener<Boolean> listener) {
@davidkyle
Member Author

This method is only used by the ElasticsearchInternalService and does not need to be part of the InferenceService interface
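A rough sketch of what that refactor could look like; Model and ActionListener are stubbed here so the example is self-contained, and the real signatures may differ:

// Stubs standing in for the Elasticsearch types, so this compiles on its own.
interface Model {}

interface ActionListener<T> {
    void onResponse(T result);

    void onFailure(Exception e);
}

// putModel is no longer declared on the service interface.
interface InferenceService {
    // ... other inference methods
}

class ElasticsearchInternalService implements InferenceService {

    /**
     * @param modelVariant The configuration of the model variant to be downloaded
     * @param listener The listener
     */
    void putModel(Model modelVariant, ActionListener<Boolean> listener) {
        // Downloading and storing the model is now an implementation
        // detail of this service rather than part of the interface.
        listener.onResponse(true);
    }
}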

@davidkyle added the auto-backport (Automatically create backport pull requests when merged) label on Nov 13, 2024
@davidkyle merged commit 59602a9 into elastic:main on Nov 13, 2024
16 checks passed
@elasticsearchmachine
Collaborator

💚 Backport successful

Branch: 8.x

davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Nov 13, 2024
Default inference endpoints automatically deploy the model on inference; the inference timeout is now passed to start model deployment so users can control that timeout.
@davidkyle deleted the pass-timeout branch on November 13, 2024 at 14:25
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Nov 13, 2024
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Nov 13, 2024
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Nov 14, 2024
elasticsearchmachine pushed a commit that referenced this pull request Nov 14, 2024

* [ML] Pass inference timeout to start deployment (#116725)

* handle max time

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024

Labels

auto-backport (Automatically create backport pull requests when merged)
:ml (Machine learning)
>non-issue
Team:ML (Meta label for the ML team)
v8.17.0
v9.0.0
