Skip to content

Conversation

tveasey
Copy link
Contributor

@tveasey tveasey commented Jun 7, 2019

This improves the interaction between multi-bucket feature and the model we maintain attending specifically to periods flagged as anomalous. In particular, we were suppressing updates of this anomaly model if the bucket probability was high but continuing to adjust the overall probability if the multi-bucket feature probability was low. The upshot our scores were inappropriately high.

This change pegs the time used to compute the anomaly duration to the last update time for the anomaly model and also smoothly reduces the impact on our overall result if the bucket value itself is normal.

This should help with #477.

Copy link
Contributor

@edsavage edsavage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just one suggestion made inline

bool anomalyProbability(double& result) const;

//! Get the largest probability the model counts as anomalous.
double largestAnomalyProbability() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of a nit pick - feel free to disregard! - Rename this method to say twiceLargestSignificantProbability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll leave this. It is the probability which is deemed sufficiently anomalous to update this model and although it is currently twice the cutoff probability to output a result there is nothing that requires this, it was just a reasonable value.

inserter.insertValue(VERSION_6_5_TAG, "");
inserter.insertValue(VERSION_7_2_TAG, "");
if (m_Anomaly) {
inserter.insertLevel(ANOMALY_6_5_TAG, boost::bind(&CAnomaly::acceptPersistInserter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer restored, so is it worth persisting it? Or is it a mistake not to restore it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd deleted the restore code from the wrong branch. Good catch.

@droberts195
Copy link
Contributor

Also, I'm not convinced it's a good idea to put this into 7.2.1. This change affects results and 7.2.1 is going to be a patch release that people will upgrade to expecting no change to functionality. Potentially Cloud clusters could be force-upgraded to 7.2.1 if there turns out to be a security bug in 7.2.0. Technically this is a bug fix, but if there's some set of circumstances where the anomaly detection is clearly worse after this change then we'll still have a very annoyed user.

Therefore I would say that either we prioritise testing this next Monday and Tuesday and get it into 7.2.0 or else it waits for 7.3.0.

And if it waits for 7.3.0 then the 7.2 constant in the code should be changed.

@tveasey
Copy link
Contributor Author

tveasey commented Jun 7, 2019

Ok I agree this is on the fence regarding bug fix vs enhancement. I'm happy to delay until 7.3 and we can test more. I've updated the naming accordingly.

@droberts195 droberts195 removed the v7.2.1 label Jun 7, 2019
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

* Fix an edge case causing spurious anomalies (false positives) if the variance in the count of events
changed significantly throughout the period of a seasonal quantity. (See {ml-pull}489[#489].)
* Reduce false positives associated with the multi-bucket feature. (See {ml-pull}491[#491].)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this to the 7.3.0 section.

@Crazybus
Copy link

Crazybus commented Jun 7, 2019

jenkins test this please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants