Skip to content

Conversation

@Jason-Whitmore
Copy link
Contributor

Issue #127517 describes ChangePointDetector.getChangeType() failing to detect a spike when all non spike values are constant (for example, the input [51,51,...,51,50001,51...,51]).

Prior to this change, the KDE.evaluate() method would return a special object for an input with bandwidth == 0. This would occur if the values excluding the spike had zero variance (see KDE constructor). This would then cause ChangePointDetector.getChangeType() and SpikeAndDipDetector.detect() to use the wrong p-value and fail to detect the spike.

This change removes the bandwidth == 0 condition for returning a special ValueAndMagnitude object. Statistical testing in ChangePointDetector.getChangeType() and SpikeAndDipDetector.detect() now properly detect the spike in the example mentioned above.

Unit tests are added to confirm the bug is fixed.

Closes #127517

…ct for bandwidth = 0

Prior to this change, the evaluate method would return a dummy object for an input
with bandwidth = 0. This would occur if the dataset had zero variance (see KDE constructor).
This would then cause ChangePointDetector to fail to detect a spike on a dataset containing
all equal numbers except for one spike.

This change removes the bandwidth = 0 condition for returning a dummy ValueAndMagnitude object.
Statistical testing in ChangePointDetector now properly detects the spike in the example mentioned
above.

Unit tests are added to confirm the bug is fixed.
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels May 29, 2025
@nielsbauman nielsbauman added >bug :ml Machine learning Team:ML Meta label for the ML team and removed needs:triage Requires assignment of a team area label labels Jun 4, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@davidkyle
Copy link
Member

@elasticmachine test this please

@davidkyle davidkyle self-assigned this Jun 10, 2025
@davidkyle
Copy link
Member

Thanks for your contribution @Jason-Whitmore and thanks for adding the tests, we will try to review this as soon as possible

@davidkyle
Copy link
Member

@elasticmachine test this please

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose a small tweak.

This change enables an early return and avoids numerical issues.
@prwhelan
Copy link
Member

@elasticmachine generate changelog

@prwhelan
Copy link
Member

@elasticmachine test this please

@prwhelan
Copy link
Member

@elasticmachine test this please

@davidkyle davidkyle enabled auto-merge (squash) October 21, 2025 14:07
@prwhelan
Copy link
Member

@elasticmachine test this please

@prwhelan
Copy link
Member

@elasticmachine test this please

@prwhelan
Copy link
Member

@elasticmachine test this please

@brianseeders
Copy link
Contributor

@elasticmachine test this please

@prwhelan
Copy link
Member

prwhelan commented Nov 7, 2025

@elasticmachine test this please

@prwhelan
Copy link
Member

@elasticmachine test this please

@prwhelan
Copy link
Member

@elasticmachine test this please

@davidkyle davidkyle merged commit 352229b into elastic:main Nov 14, 2025
35 checks passed
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 16, 2025
* main: (135 commits)
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=1} elastic#138130
  Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=2} elastic#138129
  Mute org.elasticsearch.search.basic.SearchWithRandomDisconnectsIT testSearchWithRandomDisconnects elastic#138128
  [DiskBBQ] avoid EsAcceptDocs bug by calling cost before building iterator (elastic#138127)
  Log NOT_PREFERRED shard movements (elastic#138069)
  Improve bulk loading of binary doc values (elastic#137995)
  Add internal action for getting inference fields and inference results for those fields (elastic#137680)
  Address issue with DateFieldMapper#isFieldWithinQuery(...) (elastic#138032)
  WriteLoadConstraintDecider: Have separate rate limiting for canRemain and canAllocate decisions (elastic#138067)
  Adding NodeContext to TransportBroadcastByNodeAction (elastic#138057)
  Mute org.elasticsearch.simdvec.ESVectorUtilTests testSoarDistanceBulk elastic#138117
  Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#137909
  Backport batched_response_might_include_reduction_failure version to 8.19 (elastic#138046)
  Add summary metrics for tdigest fields (elastic#137982)
  Add gp-llm-v2 model ID and inference endpoint (elastic#138045)
  Various tracing fixes (elastic#137908)
  [ML] Fixing KDE evaluate() to return correct ValueAndMagnitude object (elastic#128602)
  Mute org.elasticsearch.xpack.shutdown.NodeShutdownIT testStalledShardMigrationProperlyDetected elastic#115697
  [ML] Fix Flaky Audit Message Assertion in testWithDatastream for RegressionIT and ClassificationIT (elastic#138065)
  [ML] Fix Non-Deterministic Training Set Selection in RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (elastic#138063)
  ...

# Conflicts:
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/200_dense_vector_docvalue_fields.yml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug external-contributor Pull request authored by a developer outside the Elasticsearch team :ml Machine learning Team:ML Meta label for the ML team v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ChangePointDetector doesn't detect a spike

8 participants