Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Reduce false positives for the periodic component test for anomaly detection #1177

Merged
merged 9 commits into from May 4, 2020

Conversation

tveasey
Copy link
Contributor

@tveasey tveasey commented Apr 28, 2020

While investigating changes in our QA suite from #1158, I noticed, particularly for sparse data, the periodicity test generates false positives.

These turn out to be because of how we account for piecewise scaling of the periodic component. We allow piecewise linear scaling of the periodic component in the time window we test. However, we didn't directly impose a limit on the minimum number of repeats we require per scale. This change requires (as with unscaled periodicity testing) that we have at least three repeats per scale selected. I also took the opportunity to slightly rework the combination of soft conditions we check to make it clearer what is happening.

The downside of detecting periodicity when none is present is that the model is generally less robust to anomalous behaviour. Also, the periodic component tends to be unstable. Both these effects tend to lead to poor estimation of anomaly severity: in particular, less significant anomalies are scored more highly than more significant anomalies by chance because of model instability.

@valeriy42 valeriy42 self-requested a review April 29, 2020 12:40
Copy link
Contributor

@valeriy42 valeriy42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @tveasey! I have just a single minor comment.

Comment on lines 1279 to 1283
double p{(softLessThan(v, 1.0, 0.2) &&
softGreaterThan(summary.s_R, summary.s_AutocorrelationThreshold, 0.1) &&
softGreaterThan(summary.s_DF / DFmin[0], 1.0, 0.2) &&
softLessThan(summary.s_TrendSegments, 0.0, 0.3) &&
softLessThan(summary.s_ScaleSegments, 0.0, 0.3))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👍

lib/maths/CPeriodicityHypothesisTests.cc Outdated Show resolved Hide resolved
@tveasey
Copy link
Contributor Author

tveasey commented May 4, 2020

retest

@tveasey tveasey merged commit c3e8dbe into elastic:master May 4, 2020
@tveasey tveasey deleted the periodicity-testing branch May 4, 2020 17:30
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 4, 2020
tveasey added a commit that referenced this pull request May 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants