[ML] add new trained_models/{model_id}/_infer endpoint for all supervised models and deprecate deployment infer api #86361

Merged
merged 15 commits into from May 5, 2022
5 changes: 5 additions & 0 deletions docs/changelog/86361.yaml
@@ -0,0 +1,5 @@
pr: 86361
summary: "Add new _infer endpoint for all supervised models and deprecate deployment infer api"
area: Machine Learning
type: enhancement
issues: []
1 change: 1 addition & 0 deletions docs/reference/ml/trained-models/apis/index.asciidoc
@@ -11,6 +11,7 @@ include::delete-trained-models.asciidoc[leveloffset=+2]
include::get-trained-models.asciidoc[leveloffset=+2]
include::get-trained-models-stats.asciidoc[leveloffset=+2]
//INFER
include::infer-trained-model.asciidoc[leveloffset=+2]
include::infer-trained-model-deployment.asciidoc[leveloffset=+2]
//START/STOP
include::start-trained-model-deployment.asciidoc[leveloffset=+2]
@@ -8,6 +8,8 @@

Evaluates a trained model.

deprecated::[8.3.0,Replaced by <<infer-trained-model>>.]

[[infer-trained-model-deployment-request]]
== {api-request-title}

@@ -38,7 +40,7 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
(Optional, time)
Controls the amount of time to wait for {infer} results. Defaults to 10 seconds.

[[infer-trained-model-request-body]]
[[infer-trained-model-deployment-request-body]]
== {api-request-body-title}

`docs`::
271 changes: 271 additions & 0 deletions docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc
@@ -0,0 +1,271 @@
[role="xpack"]
[[infer-trained-model]]
= Infer trained model API
[subs="attributes"]
++++
<titleabbrev>Infer trained model</titleabbrev>
++++

Evaluates a trained model. The model may be any supervised model, either trained by {dfanalytics} or imported.

[[infer-trained-model-request]]
== {api-request-title}

`POST _ml/trained_models/<model_id>/_infer`

////
[[infer-trained-model-prereq]]
== {api-prereq-title}

////
////
[[infer-trained-model-desc]]
== {api-description-title}

////

[[infer-trained-model-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[infer-trained-model-query-params]]
== {api-query-parms-title}

`timeout`::
(Optional, time)
Controls the amount of time to wait for {infer} results. Defaults to 10 seconds.

[[infer-trained-model-request-body]]
== {api-request-body-title}

`docs`::
(Required, array)
An array of objects to pass to the model for inference. The objects should
contain the fields matching your configured trained model input. Typically for NLP models, the field
name is `text_field`. Currently for NLP models, only a single value is allowed. For {dfanalytics} or
imported classification or regression models, more than one value is allowed.
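
As a rough illustration of the request shape described above, the following Python sketch builds the path and JSON body for the `_infer` endpoint. The helper name `build_infer_request` and its `is_nlp` flag are hypothetical and exist only for this example. The sketch also assumes that "a single value" means a single object in `docs`; the check is a client-side convenience, not a description of how the server validates requests.

```python
import json
from urllib.parse import quote


def build_infer_request(model_id, docs, timeout=None, is_nlp=False):
    """Return (path, body) for POST _ml/trained_models/<model_id>/_infer.

    Hypothetical helper: `is_nlp` is an assumption used to mirror the
    documented single-document limit for NLP models.
    """
    if not docs:
        raise ValueError("docs is required and must not be empty")
    if is_nlp and len(docs) != 1:
        raise ValueError("NLP models currently accept a single doc")
    path = f"/_ml/trained_models/{quote(model_id, safe='')}/_infer"
    if timeout is not None:
        # `timeout` is the optional query parameter documented above.
        path += f"?timeout={quote(timeout, safe='')}"
    return path, json.dumps({"docs": docs})


path, body = build_infer_request(
    "lang_ident_model_1",
    [{"text": "The fool doth think he is wise."}],
    timeout="10s",
)
print(path)
print(body)
```

The returned path and body can be handed to any HTTP client; sending the request is left out so the sketch stays independent of a running cluster.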

////
[[infer-trained-model-results]]
== {api-response-body-title}
////
////
[[ml-get-trained-models-response-codes]]
== {api-response-codes-title}

////

[[infer-trained-model-example]]
== {api-examples-title}

The response depends on the kind of model.

For example, for language identification the response is the predicted language and the score:

[source,console]
--------------------------------------------------
POST _ml/trained_models/lang_ident_model_1/_infer
{
"docs":[{"text": "The fool doth think he is wise, but the wise man knows himself to be a fool."}]
}
--------------------------------------------------
// TEST[skip:TBD]

The results predict English with a high probability.

[source,console-result]
----
{
"inference_results": [
{
"predicted_value": "en",
"prediction_probability": 0.9999658805366392,
"prediction_score": 0.9999658805366392
}
]
}
----
// NOTCONSOLE
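
A client might read such a response along the following lines. This is a minimal Python sketch that parses the example response shown above; the function name `top_prediction` is hypothetical, and the sketch assumes the response has already been received as JSON text.

```python
import json

# The example response from above, embedded as a string for illustration.
response_text = """
{
  "inference_results": [
    {
      "predicted_value": "en",
      "prediction_probability": 0.9999658805366392,
      "prediction_score": 0.9999658805366392
    }
  ]
}
"""


def top_prediction(response):
    """Return (predicted_value, prediction_probability) of the first result."""
    first = response["inference_results"][0]
    return first["predicted_value"], first["prediction_probability"]


value, probability = top_prediction(json.loads(response_text))
print(value, probability)
```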


For a text classification model, the response contains the predicted class and its probability.

For example:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
[Review comment, Member] Can you add an example using the built in lang_ident model pls
{
"docs": [{"text_field": "The movie was awesome!!"}]
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence.

[source,console-result]
----
{
"inference_results": [{
"predicted_value" : "POSITIVE",
"prediction_probability" : 0.9998667964092964
}]
}
----
// NOTCONSOLE

For named entity recognition (NER) models, the response contains the annotated
text output and the recognized entities.

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
"docs": [{"text_field": "Hi my name is Josh and I live in Berlin"}]
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns in this case:

[source,console-result]
----
{
"inference_results": [{
"predicted_value" : "Hi my name is [Josh](PER&Josh) and I live in [Berlin](LOC&Berlin)",
"entities" : [
{
"entity" : "Josh",
"class_name" : "PER",
"class_probability" : 0.9977303419824,
"start_pos" : 14,
"end_pos" : 18
},
{
"entity" : "Berlin",
"class_name" : "LOC",
"class_probability" : 0.9992474323902818,
"start_pos" : 33,
"end_pos" : 39
}
]
}]
}
----
// NOTCONSOLE
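
The `start_pos` and `end_pos` values in this example are consistent with zero-based character offsets where the end is exclusive. The Python sketch below checks that reading against the example data; the helper name `check_entity_offsets` is hypothetical, and the offset interpretation is an assumption inferred from the sample, not a statement about the API contract.

```python
text = "Hi my name is Josh and I live in Berlin"

# Entities copied from the example response above.
entities = [
    {"entity": "Josh", "class_name": "PER", "start_pos": 14, "end_pos": 18},
    {"entity": "Berlin", "class_name": "LOC", "start_pos": 33, "end_pos": 39},
]


def check_entity_offsets(source_text, found_entities):
    """Return the entities whose [start_pos, end_pos) slice does not match.

    Assumes zero-based offsets with an exclusive end position.
    """
    return [
        e
        for e in found_entities
        if source_text[e["start_pos"]:e["end_pos"]] != e["entity"]
    ]


mismatches = check_entity_offsets(text, entities)
print("mismatched entities:", mismatches)
```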

Zero-shot classification models require extra configuration defining the class labels.
These labels are passed in the zero-shot inference config.

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
"docs": [
{
"text_field": "This is a very happy person"
}
],
"inference_config": {
"zero_shot_classification": {
"labels": [
"glad",
"sad",
"bad",
"rad"
],
"multi_label": false
}
}
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the predicted label and the confidence, as well as the top classes:

[source,console-result]
----
{
"inference_results": [{
"predicted_value" : "glad",
"top_classes" : [
{
"class_name" : "glad",
"class_probability" : 0.8061155063386439,
"class_score" : 0.8061155063386439
},
{
"class_name" : "rad",
"class_probability" : 0.18218006158387956,
"class_score" : 0.18218006158387956
},
{
"class_name" : "bad",
"class_probability" : 0.006325615787634201,
"class_score" : 0.006325615787634201
},
{
"class_name" : "sad",
"class_probability" : 0.0053788162898424545,
"class_score" : 0.0053788162898424545
}
],
"prediction_probability" : 0.8061155063386439
}]
}
----
// NOTCONSOLE
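
The zero-shot request body above can be assembled programmatically. The following Python sketch builds the same `inference_config` override and picks the highest-probability entry from `top_classes`; the helper names `zero_shot_body` and `best_class` are hypothetical, and the sample classes are copied from the example response.

```python
import json


def zero_shot_body(text, labels, multi_label=False):
    """Build a request body with a zero_shot_classification override."""
    if not labels:
        raise ValueError("at least one label is required")
    return {
        "docs": [{"text_field": text}],
        "inference_config": {
            "zero_shot_classification": {
                "labels": list(labels),
                "multi_label": multi_label,
            }
        },
    }


def best_class(top_classes):
    """Return the entry with the highest class_probability."""
    return max(top_classes, key=lambda c: c["class_probability"])


body = zero_shot_body("This is a very happy person", ["glad", "sad", "bad", "rad"])
print(json.dumps(body, indent=2))

# Values copied from the example response above.
top_classes = [
    {"class_name": "glad", "class_probability": 0.8061155063386439},
    {"class_name": "rad", "class_probability": 0.18218006158387956},
    {"class_name": "bad", "class_probability": 0.006325615787634201},
    {"class_name": "sad", "class_probability": 0.0053788162898424545},
]
print(best_class(top_classes)["class_name"])
```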


The tokenization truncate option can be overridden when calling the API:

[source,console]
--------------------------------------------------
POST _ml/trained_models/model2/_infer
{
"docs": [{"text_field": "The Amazon rainforest covers most of the Amazon basin in South America"}],
"inference_config": {
"ner": {
"tokenization": {
"bert": {
"truncate": "first"
}
}
}
}
}
--------------------------------------------------
// TEST[skip:TBD]

When the input has been truncated due to the limit imposed by the model's `max_sequence_length`,
the `is_truncated` field appears in the response.

[source,console-result]
----
{
"inference_results": [{
"predicted_value" : "The [Amazon](LOC&Amazon) rainforest covers most of the [Amazon](LOC&Amazon) basin in [South America](LOC&South+America)",
"entities" : [
{
"entity" : "Amazon",
"class_name" : "LOC",
"class_probability" : 0.9505460915724254,
"start_pos" : 4,
"end_pos" : 10
},
{
"entity" : "Amazon",
"class_name" : "LOC",
"class_probability" : 0.9969992804311777,
"start_pos" : 41,
"end_pos" : 47
}
],
"is_truncated" : true
}]
}
----
// NOTCONSOLE
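
A caller may want to detect truncation and react to it, for example by logging a warning. The Python sketch below builds the tokenization override used in the request above and lists which results carry `is_truncated`. Both helper names are hypothetical. The sketch relies only on the statement above that the field appears when input was truncated, and treats an absent field as "not truncated".

```python
def ner_truncate_override(truncate):
    """Build an inference_config that overrides the BERT truncate option."""
    return {"ner": {"tokenization": {"bert": {"truncate": truncate}}}}


def truncated_result_indexes(response):
    """Return indexes of inference results that report is_truncated."""
    return [
        index
        for index, result in enumerate(response["inference_results"])
        if result.get("is_truncated", False)
    ]


request_body = {
    "docs": [
        {"text_field": "The Amazon rainforest covers most of the Amazon basin in South America"}
    ],
    "inference_config": ner_truncate_override("first"),
}

# Trimmed-down version of the example response above.
sample_response = {
    "inference_results": [
        {"predicted_value": "The [Amazon](LOC&Amazon) rainforest", "is_truncated": True}
    ]
}
print(truncated_result_indexes(sample_response))
```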
@@ -12,7 +12,8 @@ You can use the following APIs to perform model management operations:
* <<delete-trained-models-aliases>>
* <<get-trained-models>>
* <<get-trained-models-stats>>
* <<infer-trained-model-deployment>>
* <<infer-trained-model>>
* <<infer-trained-model-deployment>> deprecated:[8.3.0]
* <<start-trained-model-deployment>>
* <<stop-trained-model-deployment>>

@@ -26,7 +26,7 @@ Requires the `manage_ml` cluster privilege. This privilege is included in the
Currently only `pytorch` models are supported for deployment. When deployed,
the model attempts allocation to every machine learning node. Once deployed
the model can be used by the <<inference-processor,{infer-cap} processor>>
in an ingest pipeline or directly in the <<infer-trained-model-deployment>> API.
in an ingest pipeline or directly in the <<infer-trained-model>> API.

[[start-trained-model-deployment-path-params]]
== {api-path-parms-title}
@@ -1,7 +1,7 @@
{
"ml.infer_trained_model_deployment":{
"ml.infer_trained_model":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/infer-trained-model-deployment.html",
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/infer-trained-model.html",
"description":"Evaluate a trained model."
},
"stability":"experimental",
@@ -12,6 +12,19 @@
},
"url":{
"paths":[
{
"path":"/_ml/trained_models/{model_id}/_infer",
"methods":[
"POST"
],
"parts":{
"model_id":{
"type":"string",
"description":"The unique identifier of the trained model.",
"required":true
}
}
},
{
"path":"/_ml/trained_models/{model_id}/deployment/_infer",
"methods":[
@@ -23,6 +36,10 @@
"description":"The unique identifier of the trained model.",
"required":true
}
},
"deprecated": {
"version":"8.3.0",
"description": "/_ml/trained_models/{model_id}/deployment/_infer is deprecated. Use /_ml/trained_models/{model_id}/_infer instead"
}
}
]
@@ -36,7 +53,7 @@
}
},
"body":{
"description":"The docs to apply inference on",
"description":"The docs to apply inference on and inference configuration overrides",
"required":true
}
}
@@ -109,7 +109,7 @@
import org.elasticsearch.xpack.core.ml.action.GetRecordsAction;
import org.elasticsearch.xpack.core.ml.action.GetTrainedModelsAction;
import org.elasticsearch.xpack.core.ml.action.GetTrainedModelsStatsAction;
import org.elasticsearch.xpack.core.ml.action.InternalInferModelAction;
import org.elasticsearch.xpack.core.ml.action.InferModelAction;
import org.elasticsearch.xpack.core.ml.action.IsolateDatafeedAction;
import org.elasticsearch.xpack.core.ml.action.KillProcessAction;
import org.elasticsearch.xpack.core.ml.action.MlInfoAction;
@@ -319,7 +319,8 @@ public List<ActionType<? extends ActionResponse>> getClientActions() {
StartDataFrameAnalyticsAction.INSTANCE,
EvaluateDataFrameAction.INSTANCE,
ExplainDataFrameAnalyticsAction.INSTANCE,
InternalInferModelAction.INSTANCE,
InferModelAction.INSTANCE,
InferModelAction.EXTERNAL_INSTANCE,
GetTrainedModelsAction.INSTANCE,
DeleteTrainedModelAction.INSTANCE,
GetTrainedModelsStatsAction.INSTANCE,