[7.16] [ML] Model snapshot upgrade needs a stats endpoint (#81706)
* [7.16] [ML] Model snapshot upgrade needs a stats endpoint

Previously the ML model snapshot upgrade endpoint did not
provide a way to reliably monitor progress. This could lead
to the upgrade assistant UI thinking that a model snapshot
upgrade had finished when it actually hadn't.

This change adds a new "stats" API that allows external
interested parties to find out the status of each model
snapshot upgrade and which node (if any) each is running on.

Backport of #81641

* Fixing compilation
droberts195 committed Dec 14, 2021
1 parent f8fa41b commit 0d40336
Showing 16 changed files with 1,062 additions and 116 deletions.
@@ -0,0 +1,156 @@
[role="xpack"]
[[ml-get-job-model-snapshot-upgrade-stats]]
= Get {anomaly-job} model snapshot upgrade statistics API

[subs="attributes"]
++++
<titleabbrev>Get model snapshot upgrade statistics</titleabbrev>
++++

Retrieves usage information for {anomaly-job} model snapshot upgrades.

[[ml-get-job-model-snapshot-upgrade-stats-request]]
== {api-request-title}

`GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_upgrade/_stats` +

`GET _ml/anomaly_detectors/<job_id>,<job_id>/model_snapshots/_all/_upgrade/_stats` +

`GET _ml/anomaly_detectors/_all/model_snapshots/_all/_upgrade/_stats`

[[ml-get-job-model-snapshot-upgrade-stats-prereqs]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-job-model-snapshot-upgrade-stats-desc]]
== {api-description-title}

{anomaly-detect-cap} job model snapshot upgrades are ephemeral. Only
upgrades that are in progress at the time this API is called will be
returned.

[[ml-get-job-model-snapshot-upgrade-stats-path-parms]]
== {api-path-parms-title}

`<job_id>`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection-wildcard]

`<snapshot_id>`::
(string)
Identifier for the model snapshot.
+
You can get statistics for multiple {anomaly-job} model snapshot upgrades in a
single API request by using a comma-separated list of snapshot IDs. You can also
use wildcard expressions or `_all`.

[[ml-get-job-model-snapshot-upgrade-stats-query-parms]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-jobs]

[role="child_attributes"]
[[ml-get-job-model-snapshot-upgrade-stats-results]]
== {api-response-body-title}

The API returns an array of {anomaly-job} model snapshot upgrade status objects.
All of these properties are informational; you cannot update their values.

`assignment_explanation`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=assignment-explanation-datafeeds]

`job_id`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]

`node`::
(object)
Contains properties for the node that runs the upgrade task. This information is
available only for upgrade tasks that are assigned to a node.
+
--
[%collapsible%open]
====
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]
`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]
`name`:::
(string)
The node name. For example, `0-o0tOo`.
`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
====
--

`snapshot_id`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-snapshot-id]

`state`::
(string)
One of `loading_old_state`, `saving_new_state`, `stopped` or `failed`.


[[ml-get-job-model-snapshot-upgrade-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.

[[ml-get-job-model-snapshot-upgrade-stats-example]]
== {api-examples-title}

[source,console]
--------------------------------------------------
GET _ml/anomaly_detectors/low_request_rate/model_snapshots/_all/_upgrade/_stats
--------------------------------------------------
// TEST[skip:it will be too difficult to get a reliable response in docs tests]

The API returns the following results:

[source,console-result]
----
{
"count" : 1,
"model_snapshot_upgrades" : [
{
"job_id" : "low_request_rate",
"snapshot_id" : "1828371",
"state" : "saving_new_state",
"node" : {
"id" : "7bmMXyWCRs-TuPfGJJ_yMw",
"name" : "node-0",
"ephemeral_id" : "hoXMLZB0RWKfR9UPPUCxXX",
"transport_address" : "127.0.0.1:9300",
"attributes" : {
"ml.machine_memory" : "17179869184",
"ml.max_open_jobs" : "512"
}
},
"assignment_explanation" : ""
}
]
}
----
// TESTRESPONSE[s/"7bmMXyWCRs-TuPfGJJ_yMw"/$body.$_path/]
// TESTRESPONSE[s/"node-0"/$body.$_path/]
// TESTRESPONSE[s/"hoXMLZB0RWKfR9UPPUCxXX"/$body.$_path/]
// TESTRESPONSE[s/"127.0.0.1:9300"/$body.$_path/]
// TESTRESPONSE[s/"17179869184"/$body.model_snapshot_upgrades.0.node.attributes.ml\\.machine_memory/]
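
The documented `state` values, together with the note that only in-progress upgrades are returned, suggest how a polling caller might interpret a result. The following is a minimal Java sketch and not code from this commit: the class `SnapshotUpgradeProgress`, the `Verdict` enum, and the decision to treat both `stopped` and a missing entry as "not running" are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Optional;

public final class SnapshotUpgradeProgress {

    /** States listed in the documentation for the `state` response field. */
    enum State {
        LOADING_OLD_STATE, SAVING_NEW_STATE, STOPPED, FAILED;

        /** Parses the lower-case wire value, for example "saving_new_state". */
        static State fromString(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    /** Hypothetical caller-side summary; not part of the API. */
    enum Verdict { IN_PROGRESS, NOT_RUNNING, FAILED }

    /**
     * Interprets the state reported for one snapshot upgrade.
     * An empty Optional models "no entry in the response": because the API
     * only returns upgrades that are in progress, this sketch assumes the
     * upgrade is not running (finished or never started).
     */
    static Verdict interpret(Optional<State> reported) {
        if (reported.isEmpty()) {
            return Verdict.NOT_RUNNING;
        }
        switch (reported.get()) {
            case LOADING_OLD_STATE:
            case SAVING_NEW_STATE:
                return Verdict.IN_PROGRESS;
            case FAILED:
                return Verdict.FAILED;
            default:
                // Assumption: "stopped" means no work is being done.
                return Verdict.NOT_RUNNING;
        }
    }

    public static void main(String[] args) {
        System.out.println(interpret(Optional.of(State.fromString("saving_new_state"))));
        System.out.println(interpret(Optional.empty()));
    }
}
```

A caller that needs to distinguish "finished" from "never started" would have to combine this with other information, such as the response of the upgrade request itself.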
1 change: 1 addition & 0 deletions docs/reference/ml/anomaly-detection/apis/index.asciidoc
@@ -38,6 +38,7 @@ include::get-job.asciidoc[leveloffset=+2]
include::get-job-stats.asciidoc[leveloffset=+2]
include::get-ml-info.asciidoc[leveloffset=+2]
include::get-snapshot.asciidoc[leveloffset=+2]
+include::get-job-model-snapshot-upgrade-stats.asciidoc[leveloffset=+2]
include::get-overall-buckets.asciidoc[leveloffset=+2]
include::get-calendar-event.asciidoc[leveloffset=+2]
include::get-filter.asciidoc[leveloffset=+2]
1 change: 1 addition & 0 deletions docs/reference/ml/anomaly-detection/apis/ml-apis.asciidoc
@@ -55,6 +55,7 @@ See also <<ml-df-analytics-apis>>.

* <<ml-delete-snapshot,Delete model snapshot>>
* <<ml-get-snapshot,Get model snapshot info>>
+* <<ml-get-job-model-snapshot-upgrade-stats,Get model snapshot upgrade statistics>>
* <<ml-revert-snapshot,Revert model snapshot>>
* <<ml-update-snapshot,Update model snapshot>>
* <<ml-upgrade-job-model-snapshot,Upgrade model snapshot>>
@@ -0,0 +1,40 @@
{
"ml.get_model_snapshot_upgrade_stats":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-get-job-model-snapshot-upgrade-stats.html",
"description":"Gets stats for anomaly detection job model snapshot upgrades that are in progress."
},
"stability":"stable",
"visibility":"public",
"headers":{
"accept": [ "application/json"]
},
"url":{
"paths":[
{
"path":"/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_upgrade/_stats",
"methods":[
"GET"
],
"parts":{
"job_id":{
"type":"string",
"description":"The ID of the job. May be a wildcard, comma separated list or `_all`."
},
"snapshot_id":{
"type":"string",
"description":"The ID of the snapshot. May be a wildcard, comma separated list or `_all`."
}
}
}
]
},
"params":{
"allow_no_match":{
"type":"boolean",
"required":false,
"description":"Whether to ignore if a wildcard expression matches no jobs or no snapshots. (This includes the `_all` string.)"
}
}
}
}
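
The path template and the `allow_no_match` parameter in the specification above can be turned into a concrete request. The sketch below uses the JDK's `java.net.http.HttpRequest` only to build (not send) such a request. The class name, the helper names, and the base URL `http://localhost:9200` are assumptions for illustration, and the code assumes the IDs need no percent-encoding.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public final class UpgradeStatsRequestSketch {

    /** Path template copied from the REST specification. */
    static final String PATH_TEMPLATE =
        "/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_upgrade/_stats";

    /** Substitutes the two path parts; each may be an ID, a comma-separated list, a wildcard or `_all`. */
    static String expandPath(String jobId, String snapshotId) {
        return PATH_TEMPLATE.replace("{job_id}", jobId).replace("{snapshot_id}", snapshotId);
    }

    /** Builds a GET request that asks for a JSON response, as the specification's headers section lists. */
    static HttpRequest buildRequest(String baseUrl, String jobId, String snapshotId, boolean allowNoMatch) {
        URI uri = URI.create(baseUrl + expandPath(jobId, snapshotId) + "?allow_no_match=" + allowNoMatch);
        return HttpRequest.newBuilder(uri)
            .header("Accept", "application/json")
            .GET()
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical local cluster address.
        HttpRequest request = buildRequest("http://localhost:9200", "low_request_rate", "_all", true);
        System.out.println(request.method() + " " + request.uri());
    }
}
```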
@@ -9,7 +9,9 @@
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.regex.Regex;

+import java.util.ArrayList;
import java.util.Collection;
+import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
@@ -43,7 +45,8 @@ public static String[] tokenizeExpression(String expression) {
return Strings.tokenizeToStringArray(expression, ",");
}

-private final LinkedList<IdMatcher> requiredMatches;
+private final List<IdMatcher> allMatchers;
+private final List<IdMatcher> requiredMatches;
private final boolean onlyExact;

/**
@@ -57,15 +60,18 @@ public static String[] tokenizeExpression(String expression) {
*/
public ExpandedIdsMatcher(String[] tokens, boolean allowNoMatchForWildcards) {
requiredMatches = new LinkedList<>();
+List<IdMatcher> allMatchers = new ArrayList<>();

if (Strings.isAllOrWildcard(tokens)) {
// if allowNoJobForWildcards == true then any number
// of jobs with any id is ok. Therefore no matches
// are required

+IdMatcher matcher = new WildcardMatcher("*");
+this.allMatchers = Collections.singletonList(matcher);
if (allowNoMatchForWildcards == false) {
// require something, anything to match
-requiredMatches.add(new WildcardMatcher("*"));
+requiredMatches.add(matcher);
}
onlyExact = false;
return;
@@ -78,23 +84,55 @@ public ExpandedIdsMatcher(String[] tokens, boolean allowNoMatchForWildcards) {
// specific job Ids are
for (String token : tokens) {
if (Regex.isSimpleMatchPattern(token)) {
+allMatchers.add(new WildcardMatcher(token));
atLeastOneWildcard = true;
} else {
-requiredMatches.add(new EqualsIdMatcher(token));
+IdMatcher matcher = new EqualsIdMatcher(token);
+allMatchers.add(matcher);
+requiredMatches.add(matcher);
}
}
} else {
// Matches are required for wildcards
for (String token : tokens) {
if (Regex.isSimpleMatchPattern(token)) {
-requiredMatches.add(new WildcardMatcher(token));
+IdMatcher matcher = new WildcardMatcher(token);
+allMatchers.add(matcher);
+requiredMatches.add(matcher);
atLeastOneWildcard = true;
} else {
-requiredMatches.add(new EqualsIdMatcher(token));
+IdMatcher matcher = new EqualsIdMatcher(token);
+allMatchers.add(matcher);
+requiredMatches.add(matcher);
}
}
}
onlyExact = atLeastOneWildcard == false;
+this.allMatchers = Collections.unmodifiableList(allMatchers);
}

+/**
+* Generate the list of required matches from the {@code expression}
+* and initialize.
+*
+* @param expression Expression that will be tokenized into a set of wildcards or full Ids
+* @param allowNoMatchForWildcards If true then it is not required for wildcard
+* expressions to match an Id meaning they are
+* not returned in the list of required matches
+*/
+public ExpandedIdsMatcher(String expression, boolean allowNoMatchForWildcards) {
+this(tokenizeExpression(expression), allowNoMatchForWildcards);
+}
+
+/**
+* Test whether an ID matches any of the expressions.
+* Unlike {@link #filterMatchedIds} this does not modify the state of
+* the matcher.
+* @param id ID to test.
+* @return Does the ID match one or more of the patterns in the expression?
+*/
+public boolean idMatches(String id) {
+return allMatchers.stream().anyMatch(idMatcher -> idMatcher.matches(id));
+}
+
/**
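
The hunks above split the matchers into two lists: `allMatchers`, which `idMatches` consults without side effects, and `requiredMatches`, which tracks expressions that must still find a match. The following standalone Java sketch illustrates that split in simplified form. It is not the Elasticsearch class: `SimpleIdsMatcher`, its glob-to-regex conversion, the treatment of `_all` as `*`, and the bodies of `filterMatchedIds` and `hasUnmatchedIds` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public final class SimpleIdsMatcher {

    // Every matcher derived from the expression; never modified after construction.
    private final List<Predicate<String>> allMatchers = new ArrayList<>();
    // Matchers that must still be satisfied; shrinks as IDs are filtered.
    private final List<Predicate<String>> requiredMatches = new ArrayList<>();

    public SimpleIdsMatcher(String expression, boolean allowNoMatchForWildcards) {
        for (String rawToken : expression.split(",")) {
            // Simplification: treat "_all" as the wildcard "*".
            String token = rawToken.equals("_all") ? "*" : rawToken;
            boolean wildcard = token.contains("*");
            Predicate<String> matcher = wildcard ? globToPredicate(token) : token::equals;
            allMatchers.add(matcher);
            // Exact IDs are always required; wildcards only when no-match is not allowed.
            if (!wildcard || !allowNoMatchForWildcards) {
                requiredMatches.add(matcher);
            }
        }
    }

    /** Converts a glob such as "foo*" into a whole-string regex predicate. */
    private static Predicate<String> globToPredicate(String glob) {
        String regex = Arrays.stream(glob.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return Pattern.compile(regex).asMatchPredicate();
    }

    /** Stateless test: does the ID match any token of the expression? */
    public boolean idMatches(String id) {
        return allMatchers.stream().anyMatch(matcher -> matcher.test(id));
    }

    /** Stateful: drops every required matcher satisfied by at least one of the IDs. */
    public void filterMatchedIds(Collection<String> ids) {
        requiredMatches.removeIf(matcher -> ids.stream().anyMatch(matcher));
    }

    /** True while some required token has not matched any ID seen so far. */
    public boolean hasUnmatchedIds() {
        return !requiredMatches.isEmpty();
    }

    public static void main(String[] args) {
        SimpleIdsMatcher matcher = new SimpleIdsMatcher("foo*,bar", true);
        System.out.println(matcher.idMatches("foo1"));    // stateless check
        matcher.filterMatchedIds(List.of("bar"));          // satisfies the exact ID
        System.out.println(matcher.hasUnmatchedIds());
    }
}
```

Keeping the stateless list separate means a caller can test many IDs with `idMatches` and still use the required-match bookkeeping afterwards to decide whether a "not found" error is warranted.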
@@ -305,6 +305,16 @@ public static Collection<PersistentTasksCustomMetadata.PersistentTask<?>> nonFai
});
}

+public static Collection<PersistentTasksCustomMetadata.PersistentTask<?>> snapshotUpgradeTasks(
+@Nullable PersistentTasksCustomMetadata tasks
+) {
+if (tasks == null) {
+return Collections.emptyList();
+}
+
+return tasks.findTasks(JOB_SNAPSHOT_UPGRADE_TASK_NAME, task -> true);
+}
+
public static Collection<PersistentTasksCustomMetadata.PersistentTask<?>> snapshotUpgradeTasksOnNode(
@Nullable PersistentTasksCustomMetadata tasks,
String nodeId
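
The new `snapshotUpgradeTasks` helper is a null-safe lookup of every task with the snapshot-upgrade task name, alongside the existing per-node variant. The sketch below mirrors that pattern with plain collections. The `Task` class, the placeholder constant value, and the per-node filtering logic are hypothetical stand-ins, since the diff shows neither the real task types nor the body of `snapshotUpgradeTasksOnNode`.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public final class SnapshotUpgradeTaskFilterSketch {

    /** Placeholder value; the real constant's value is not shown in this diff. */
    static final String JOB_SNAPSHOT_UPGRADE_TASK_NAME = "example-snapshot-upgrade-task";

    /** Hypothetical, minimal stand-in for a persistent task. */
    static final class Task {
        final String taskName;
        final String nodeId; // null when the task is not assigned to a node

        Task(String taskName, String nodeId) {
            this.taskName = taskName;
            this.nodeId = nodeId;
        }
    }

    /** Returns all snapshot upgrade tasks, or an empty list when there is no task metadata. */
    static Collection<Task> snapshotUpgradeTasks(Collection<Task> tasks) {
        if (tasks == null) {
            return Collections.emptyList();
        }
        return tasks.stream()
            .filter(task -> JOB_SNAPSHOT_UPGRADE_TASK_NAME.equals(task.taskName))
            .collect(Collectors.toList());
    }

    /** Assumed behaviour: narrows the result to tasks assigned to the given node. */
    static Collection<Task> snapshotUpgradeTasksOnNode(Collection<Task> tasks, String nodeId) {
        return snapshotUpgradeTasks(tasks).stream()
            .filter(task -> nodeId.equals(task.nodeId))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(
            new Task(JOB_SNAPSHOT_UPGRADE_TASK_NAME, "node-0"),
            new Task(JOB_SNAPSHOT_UPGRADE_TASK_NAME, null),
            new Task("some-other-task", "node-0")
        );
        System.out.println(snapshotUpgradeTasks(tasks).size());
        System.out.println(snapshotUpgradeTasksOnNode(tasks, "node-0").size());
    }
}
```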