[ML] fix weird change_point bug where all data values are equivalent #97588

Merged
merged 2 commits into from Jul 11, 2023

benwtrent
Member

If a user calls the change_point aggregation and all the bucket values are exactly the same, we may run into weird floating point errors when calculating statistics. When there is no variance and the standard deviation is 0, we should indicate that there is no change point and that it's a stationary set of data.

An example error that a user could run into:

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "not_strictly_positive_exception",
      "reason": "standard deviation (0)",
      "stack_trace": """org.apache.commons.math3.exception.NotStrictlyPositiveException: standard deviation (0)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:142)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:107)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:85)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.KDE.cdf(KDE.java:113)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.ChangePointAggregator.maxDeviationKdePValue(ChangePointAggregator.java:123)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.ChangePointAggregator.doReduce(ChangePointAggregator.java:88)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.search.aggregations.InternalAggregations.topLevelReduce(InternalAggregations.java:119)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:631)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:594)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.QueryPhaseResultConsumer.reduce(QueryPhaseResultConsumer.java:138)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:104)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:90)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1623)
"""
    },
    "stack_trace": """Failed to execute phase [fetch], 
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:729)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:95)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: org.apache.commons.math3.exception.NotStrictlyPositiveException: standard deviation (0)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:142)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:107)
	at commons.math3@3.6.1/org.apache.commons.math3.distribution.NormalDistribution.<init>(NormalDistribution.java:85)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.KDE.cdf(KDE.java:113)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.ChangePointAggregator.maxDeviationKdePValue(ChangePointAggregator.java:123)
	at org.elasticsearch.ml@8.8.2/org.elasticsearch.xpack.ml.aggs.changepoint.ChangePointAggregator.doReduce(ChangePointAggregator.java:88)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.search.aggregations.InternalAggregations.topLevelReduce(InternalAggregations.java:119)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:631)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:594)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.QueryPhaseResultConsumer.reduce(QueryPhaseResultConsumer.java:138)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:104)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:90)
	at org.elasticsearch.server@8.8.2/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	... 6 more
"""
  },
  "status": 400
}
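
To illustrate the idea, here is a minimal sketch of a zero-variance guard. It is not the code from this PR: `ZeroVarianceGuard`, `Result`, `isConstant`, and `classify` are hypothetical names. The sketch also assumes that comparing the values to each other directly is acceptable, which sidesteps rounding in a computed variance; the actual change may test the variance or standard deviation instead.

```java
// Hypothetical sketch, not the actual Elasticsearch implementation.
class ZeroVarianceGuard {

    // Hypothetical outcome type: either the series is flat, or it should
    // go on to the statistical tests that need a positive standard deviation.
    enum Result { STATIONARY, NEEDS_TESTING }

    // True when every value equals the first one (an empty series counts as constant).
    // Comparing values directly avoids rounding error in a computed variance.
    static boolean isConstant(double[] values) {
        for (double v : values) {
            if (v != values[0]) {
                return false;
            }
        }
        return true;
    }

    // Short-circuit before any distribution with standard deviation 0 is built.
    static Result classify(double[] values) {
        return isConstant(values) ? Result.STATIONARY : Result.NEEDS_TESTING;
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[] {5.0, 5.0, 5.0, 5.0})); // STATIONARY
        System.out.println(classify(new double[] {5.0, 5.0, 9.0, 9.0})); // NEEDS_TESTING
    }
}
```

With a guard like this, a flat series never reaches the `NormalDistribution` constructor that throws `NotStrictlyPositiveException` in the trace above.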

@benwtrent benwtrent added >bug :ml Machine learning v8.10.0 labels Jul 11, 2023
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jul 11, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

Member

@davidkyle davidkyle left a comment

LGTM

@benwtrent benwtrent merged commit 6a15ecf into elastic:main Jul 11, 2023
12 checks passed
@benwtrent benwtrent deleted the bug/fix-change-point-failure branch July 11, 2023 18:49
felixbarny pushed a commit to felixbarny/elasticsearch that referenced this pull request Aug 3, 2023
…lastic#97588)
