
[ML] Undeploy elser when inference model deleted #104230

Merged
merged 7 commits into main from undeployElserWhenInferenceModelDeleted on Jan 11, 2024

Conversation

maxhniebergall
Member

Previously, when an _inference model was deleted, the delete action did not also delete the corresponding trained model deployment. Now, for the ELSER service, the trained model deployment associated with an inference model is undeployed when the inference model is deleted.

closes https://github.com/elastic/ml-team/issues/1099
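
As background, a minimal sketch of the shape of this change (the interface and method names below are assumptions for illustration, not the actual code added in the PR; see the diff excerpts later in the thread for the real implementation):

import org.elasticsearch.action.ActionListener;

/**
 * Sketch only: the idea is that an inference service (ELSER here) exposes a hook to tear
 * down whatever backs an _inference model, so the delete action can undeploy the trained
 * model deployment before removing the model configuration from the model registry.
 */
public interface InferenceServiceStopSketch {
    /** Undeploy/stop the resources backing the given inference model, then notify the listener. */
    void stop(String inferenceModelId, ActionListener<Boolean> listener);
}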

@elasticsearchmachine added the Team:ML (Meta label for the ML team) label on Jan 10, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Collaborator

Hi @maxhniebergall, I've created a changelog YAML for you.

@maxhniebergall
Member Author

To create an integration test for this feature, I added a blocking ELSER download call, which adds about 20s to the runtime. It was helpful for verifying the functionality of this change, but if downloading ELSER causes problems or takes too long, I can disable this IT.
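
For reference, a rough sketch of what a blocking ELSER download can look like in a REST integration test (the .elser_model_2 id, the request body, and the helper name are assumptions for illustration and may differ from what downloadElserBlocking() actually does in this PR):

import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;

// Inside an ESRestTestCase subclass: install the built-in ELSER model through the trained
// models API and block until the download has completed before the test continues.
private Response downloadElserBlockingSketch() throws Exception {
    Request request = new Request("PUT", "/_ml/trained_models/.elser_model_2");
    request.addParameter("wait_for_completion", "true"); // wait for the model download to finish
    request.setJsonEntity("""
        {"input": {"field_names": ["text_field"]}}
        """);
    return client().performRequest(request);
}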

@@ -57,11 +68,29 @@ protected void masterOperation(
ClusterState state,
ActionListener<AcknowledgedResponse> listener
) {
modelRegistry.deleteModel(request.getModelId(), listener.delegateFailureAndWrap((l, r) -> l.onResponse(AcknowledgedResponse.TRUE)));
SubscribableListener.<ModelRegistry.UnparsedModel>newForked(modelConfigListener -> {
Contributor

🙌
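
For readers skimming the hunk above, the new listener chain has roughly the following shape (a sketch under assumptions: getModelConfig and stopDeployment are hypothetical placeholders for the intermediate steps, while SubscribableListener.newForked / andThen / addListener are the actual Elasticsearch utilities in play):

SubscribableListener
    // 1. Look up the stored _inference model config so we know which service owns it.
    .<ModelRegistry.UnparsedModel>newForked(
        modelConfigListener -> getModelConfig(request.getModelId(), modelConfigListener)) // hypothetical step
    // 2. Ask the owning service (ELSER here) to undeploy its trained model deployment.
    .<Boolean>andThen(
        (stopListener, unparsedModel) -> stopDeployment(unparsedModel, stopListener)) // hypothetical step
    // 3. Only once the deployment is stopped, delete the model from the model registry.
    .<Boolean>andThen(
        (deleteListener, ignoredStopResult) -> modelRegistry.deleteModel(request.getModelId(), deleteListener))
    // 4. Acknowledge the request; a failure at any step propagates to the original listener.
    .addListener(listener.delegateFailure(
        (l, didDeleteModel) -> l.onResponse(AcknowledgedResponse.of(didDeleteModel))));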

// ELSER not downloaded case
{
String modelId = randomAlphaOfLength(10).toLowerCase();
expectThrows(ResponseException.class, () -> putModel(modelId, elserConfig, TaskType.SPARSE_EMBEDDING));
Contributor

nit: maybe check the exception message to verify that it says the model doesn't exist
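
One way to pick up that nit, as a sketch (assumes the Hamcrest containsString matcher that Elasticsearch tests already use; the asserted wording is an assumption, since the real error message is not shown in this excerpt):

// Capture the exception instead of discarding it, then check its message.
ResponseException e = expectThrows(
    ResponseException.class,
    () -> putModel(modelId, elserConfig, TaskType.SPARSE_EMBEDDING)
);
// The expected substring here is a guess; the point is to pin the failure to the
// "model does not exist / not downloaded" case rather than accepting any error.
assertThat(e.getMessage(), containsString("does not exist"));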

new ElasticsearchStatusException("Failed to stop model " + request.getModelId(), RestStatus.INTERNAL_SERVER_ERROR)
);
}
}).addListener(listener.delegateFailure((l3, didDeleteModel) -> listener.onResponse(AcknowledgedResponse.of(didDeleteModel))));
Contributor

I wonder if this should be listener.delegateFailureAndWrap() just in case 🤷‍♂️. The only thing it is doing is onResponse so I guess it probably doesn't matter.

Member Author

yea, I think the AndWrap method is only needed when the lambda could throw a runtime exception, and I don't think that should be possible here
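
To make the trade-off concrete, a small illustrative contrast (this paraphrases the ActionListener contract rather than quoting the PR's code; outer stands in for the transport action's org.elasticsearch.action.ActionListener<AcknowledgedResponse>):

ActionListener<AcknowledgedResponse> outer = ActionListener.wrap(r -> {}, e -> {}); // stand-in for the real listener

// delegateFailure: upstream failures are forwarded to outer.onFailure(), but an exception
// thrown inside the lambda itself is not caught by this wrapper.
ActionListener<Boolean> plain = outer.delegateFailure(
    (l, didDelete) -> l.onResponse(AcknowledgedResponse.of(didDelete)));

// delegateFailureAndWrap: same failure forwarding, and the lambda body is additionally
// wrapped, so a RuntimeException thrown while building or handing off the response also
// reaches outer.onFailure(). As noted above, AcknowledgedResponse.of(boolean) cannot
// realistically throw, so either choice behaves the same in this spot.
ActionListener<Boolean> wrapped = outer.delegateFailureAndWrap(
    (l, didDelete) -> l.onResponse(AcknowledgedResponse.of(didDelete)));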

Member

@davidkyle left a comment

LGTM

expectThrows(ResponseException.class, () -> putModel(modelId, elserConfig, TaskType.SPARSE_EMBEDDING));
}

// Downloads the ELSER trained model and blocks until the download completes; per the
// author's note above, this adds roughly 20 seconds to the test runtime.
downloadElserBlocking();
Member

This will be a slow test; we should look at testing the happy case elsewhere.

Member Author

Where should we be putting slow tests like this?

@maxhniebergall merged commit 31e8989 into main on Jan 11, 2024
16 checks passed
@maxhniebergall deleted the undeployElserWhenInferenceModelDeleted branch on January 11, 2024 at 15:32
jedrazb pushed a commit to jedrazb/elasticsearch that referenced this pull request Jan 17, 2024
* Added stop to InferenceService interface and Elser

* New integration tests

* undeploy ELSER deployment when _inf ELSER model deleted

* Update docs/changelog/104230.yaml

* Added check for platform architecture in integration test

* improvements from PR comments
Labels: >bug, :ml (Machine learning), Team:ML (Meta label for the ML team), v8.13.0