[ML] adds new change_point pipeline aggregation (#83428)
adds a new `change_point` sibling pipeline aggregation.

This aggregation detects a change_point in a multi-bucket aggregation.

Example:
```
POST kibana_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "histo": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "3h"
      },
      "aggs": {
        "ticket_price": {
          "max": {
            "field": "AvgTicketPrice"
          }
        }
      }
    },
    "changes": {
      "change_point": {
        "buckets_path": "histo>ticket_price"
      }
    }
  }
}
```

Response
```
{
  /*<snip>*/
  "aggregations" : {
    "histo" : {
      "buckets" : [ /*<snip>*/ ]
    },
    "changes" : {
      "bucket" : {
        "key" : "2022-01-28T23:00:00.000Z",
        "doc_count" : 48,
        "ticket_price" : {
          "value" : 1187.61083984375
        }
      },
      "type" : {
        "distribution_change" : {
          "p_value" : 0.023753965139433175,
          "change_point" : 40
        }
      }
    }
  }
}
```
Showing 19 changed files with 2,077 additions and 8 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 83428 | ||
summary: Adds new `change_point` pipeline aggregation | ||
area: Machine Learning | ||
type: feature | ||
issues: [] |
99 changes: 99 additions & 0 deletions
docs/reference/aggregations/pipeline/change-point-aggregation.asciidoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
[role="xpack"] | ||
[[search-aggregations-change-point-aggregation]] | ||
=== Change point aggregation | ||
++++ | ||
<titleabbrev>Change point</titleabbrev> | ||
++++ | ||
|
||
experimental::[] | ||
|
||
A sibling pipeline that detects spikes, dips, and change points in a metric. Given a distribution of values | ||
provided by the sibling multi-bucket aggregation, this aggregation indicates the bucket of any spike or dip | ||
and/or the bucket at which the largest change in the distribution of values occurs, if they are statistically significant. | ||
|
||
|
||
|
||
[[change-point-agg-syntax]] | ||
==== Parameters | ||
|
||
`buckets_path`:: | ||
(Required, string) | ||
Path to the buckets that contain one set of values in which to detect a change point. There must be at least 21 bucketed | ||
values. Fewer than 1,000 is preferred. | ||
For syntax, see <<buckets-path-syntax>>. | ||
|
||
==== Syntax | ||
|
||
A `change_point` aggregation looks like this in isolation: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
{ | ||
  "change_point": { | ||
    "buckets_path": "date_histogram>_count" <1> | ||
  } | ||
} | ||
-------------------------------------------------- | ||
// NOTCONSOLE | ||
<1> The buckets containing the values to test against. | ||
|
||
[[change-point-agg-response]] | ||
==== Response body | ||
|
||
`bucket`:: | ||
(Optional, object) | ||
Values of the bucket that indicates the discovered change point. Not returned if no change point was found. | ||
All the aggregations in the bucket are returned as well. | ||
+ | ||
.Properties of bucket | ||
[%collapsible%open] | ||
==== | ||
`key`::: | ||
(value) | ||
The key of the bucket matched. Could be string or numeric. | ||
`doc_count`::: | ||
(number) | ||
The document count of the bucket. | ||
==== | ||
|
||
`type`:: | ||
(object) | ||
The found change point type and its related values. Possible types: | ||
+ | ||
-- | ||
* `dip`: a significant dip occurs at this change point | ||
* `distribution_change`: the overall distribution of the values has changed significantly | ||
* `non_stationary`: there is no change point, but the values are not from a stationary distribution | ||
* `spike`: a significant spike occurs at this point | ||
* `stationary`: no change point found | ||
* `step_change`: the change indicates a statistically significant step up or down in value distribution | ||
* `trend_change`: there is an overall trend change occurring at this point | ||
-- | ||
|
||
==== Response example | ||
[source,js] | ||
-------------------------------------------------- | ||
"changes" : { | ||
  "bucket" : { | ||
    "key" : "2022-01-28T23:00:00.000Z", <1> | ||
    "doc_count" : 48, <2> | ||
    "ticket_price" : { <3> | ||
      "value" : 1187.61083984375 | ||
    } | ||
  }, | ||
  "type" : { <4> | ||
    "distribution_change" : { | ||
      "p_value" : 0.023753965139433175, <5> | ||
      "change_point" : 40 <6> | ||
    } | ||
  } | ||
} | ||
-------------------------------------------------- | ||
// NOTCONSOLE | ||
<1> The bucket key that is the change point. | ||
<2> The number of documents in that bucket. | ||
<3> Aggregated values in the bucket. | ||
<4> Type of change found. | ||
<5> The `p_value` indicates how extreme the change is; lower values indicate greater change. | ||
<6> The specific bucket where the change occurs (indexing starts at `0`). |
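
The response fields documented above can be read with a few lines of client code. The following is a minimal sketch in plain Java, not part of this commit: it assumes the `changes` object has already been parsed into a `Map`, and that the `type` object holds exactly one key, as in the response example. The class and method names (`ChangePointResponseSketch`, `changeType`, `hasChangePoint`, `hasEnoughBuckets`) are hypothetical.

```java
import java.util.Map;

// Hypothetical helper, for illustration only; not an Elasticsearch class.
final class ChangePointResponseSketch {

    // Lower bound taken from the parameter docs: "at least 21 bucketed values".
    static final int MIN_BUCKETS = 21;

    /** Returns true when the bucket count meets the documented minimum. */
    static boolean hasEnoughBuckets(int bucketCount) {
        return bucketCount >= MIN_BUCKETS;
    }

    /** Returns the single key under `type`, for example "distribution_change". */
    static String changeType(Map<String, Object> changes) {
        Map<?, ?> type = (Map<?, ?>) changes.get("type");
        if (type == null || type.isEmpty()) {
            throw new IllegalArgumentException("response has no `type` object");
        }
        // Assumption: exactly one key is present, as in the response example.
        return (String) type.keySet().iterator().next();
    }

    /** `bucket` is optional: it is not returned if no change point was found. */
    static boolean hasChangePoint(Map<String, Object> changes) {
        return changes.containsKey("bucket");
    }

    public static void main(String[] args) {
        // Values copied from the response example above.
        Map<String, Object> changes = Map.<String, Object>of(
            "bucket", Map.<String, Object>of("key", "2022-01-28T23:00:00.000Z", "doc_count", 48),
            "type", Map.<String, Object>of(
                "distribution_change",
                Map.<String, Object>of("p_value", 0.023753965139433175, "change_point", 40)
            )
        );
        System.out.println(changeType(changes) + " found=" + hasChangePoint(changes));
    }
}
```

Checking for `bucket` before reading it matters because the field is documented as optional; the `stationary` and `non_stationary` types describe cases with no change point.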
90 changes: 90 additions & 0 deletions
.../main/java/org/elasticsearch/xpack/ml/aggs/changepoint/ChangePointAggregationBuilder.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
/* | ||
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
* or more contributor license agreements. Licensed under the Elastic License | ||
* 2.0; you may not use this file except in compliance with the Elastic License | ||
* 2.0. | ||
*/ | ||
|
||
package org.elasticsearch.xpack.ml.aggs.changepoint; | ||
|
||
import org.elasticsearch.Version; | ||
import org.elasticsearch.common.io.stream.StreamInput; | ||
import org.elasticsearch.common.io.stream.StreamOutput; | ||
import org.elasticsearch.plugins.SearchPlugin; | ||
import org.elasticsearch.search.aggregations.pipeline.BucketHelpers; | ||
import org.elasticsearch.search.aggregations.pipeline.BucketMetricsPipelineAggregationBuilder; | ||
import org.elasticsearch.search.aggregations.pipeline.PipelineAggregator; | ||
import org.elasticsearch.xcontent.ConstructingObjectParser; | ||
import org.elasticsearch.xcontent.ObjectParser; | ||
import org.elasticsearch.xcontent.ParseField; | ||
import org.elasticsearch.xcontent.XContentBuilder; | ||
|
||
import java.io.IOException; | ||
import java.util.Locale; | ||
import java.util.Map; | ||
|
||
import static org.elasticsearch.search.aggregations.pipeline.PipelineAggregator.Parser.GAP_POLICY; | ||
|
||
public class ChangePointAggregationBuilder extends BucketMetricsPipelineAggregationBuilder<ChangePointAggregationBuilder> { | ||
|
||
    public static final ParseField NAME = new ParseField("change_point"); | ||
    @SuppressWarnings("unchecked") | ||
    public static final ConstructingObjectParser<ChangePointAggregationBuilder, String> PARSER = new ConstructingObjectParser<>( | ||
        NAME.getPreferredName(), | ||
        false, | ||
        (args, context) -> new ChangePointAggregationBuilder(context, (String) args[0]) | ||
    ); | ||
|
||
    static { | ||
        PARSER.declareString(ConstructingObjectParser.constructorArg(), BUCKETS_PATH_FIELD); | ||
        PARSER.declareField( | ||
            ConstructingObjectParser.optionalConstructorArg(), | ||
            p -> BucketHelpers.GapPolicy.parse(p.text().toLowerCase(Locale.ROOT), p.getTokenLocation()), | ||
            GAP_POLICY, | ||
            ObjectParser.ValueType.STRING | ||
        ); | ||
    } | ||
|
||
    public ChangePointAggregationBuilder(String name, String bucketsPath) { | ||
        super(name, NAME.getPreferredName(), new String[] { bucketsPath }); | ||
    } | ||
|
||
    public ChangePointAggregationBuilder(StreamInput in) throws IOException { | ||
        super(in, NAME.getPreferredName()); | ||
    } | ||
|
||
    public static SearchPlugin.PipelineAggregationSpec buildSpec() { | ||
        return new SearchPlugin.PipelineAggregationSpec(NAME, ChangePointAggregationBuilder::new, ChangePointAggregationBuilder.PARSER) | ||
            .addResultReader(InternalChangePointAggregation::new); | ||
    } | ||
|
||
    @Override | ||
    public String getWriteableName() { | ||
        return NAME.getPreferredName(); | ||
    } | ||
|
||
    @Override | ||
    public Version getMinimalSupportedVersion() { | ||
        return Version.V_8_2_0; | ||
    } | ||
|
||
    @Override | ||
    protected void innerWriteTo(StreamOutput out) throws IOException {} | ||
|
||
    @Override | ||
    protected PipelineAggregator createInternal(Map<String, Object> metadata) { | ||
        return new ChangePointAggregator(name, bucketsPaths[0], metadata); | ||
    } | ||
|
||
    @Override | ||
    protected boolean overrideBucketsPath() { | ||
        return true; | ||
    } | ||
|
||
    @Override | ||
    protected XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException { | ||
        builder.field(BUCKETS_PATH_FIELD.getPreferredName(), bucketsPaths[0]); | ||
        return builder; | ||
    } | ||
|
||
} |
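
In the builder above, `doXContentBody` writes a single field, `buckets_path`, under the `change_point` aggregation type. The sketch below shows the equivalent request fragment being assembled in plain Java, without any Elasticsearch classes. It is an illustration only: `ChangePointRequestSketch`, `quote`, and `changePointAgg` are hypothetical names, and the escaping handles just quotes and backslashes.

```java
// Hypothetical helper, for illustration only; not part of the commit.
final class ChangePointRequestSketch {

    /** Minimal JSON string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds {"<aggName>":{"change_point":{"buckets_path":"<bucketsPath>"}}}. */
    static String changePointAgg(String aggName, String bucketsPath) {
        if (aggName == null || aggName.isEmpty()) {
            throw new IllegalArgumentException("aggregation name is required");
        }
        if (bucketsPath == null || bucketsPath.isEmpty()) {
            // buckets_path is the one required parameter of the aggregation.
            throw new IllegalArgumentException("buckets_path is required");
        }
        return "{" + quote(aggName) + ":{\"change_point\":{\"buckets_path\":" + quote(bucketsPath) + "}}}";
    }

    public static void main(String[] args) {
        // Same aggregation name and path as the example in the commit message.
        System.out.println(changePointAgg("changes", "histo>ticket_price"));
    }
}
```

The resulting fragment belongs alongside the multi-bucket aggregation it points to (here `histo`), since `change_point` is a sibling pipeline aggregation.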