Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report model change annotations via result stream #1247

Merged
merged 16 commits into from Jun 10, 2020

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented May 14, 2020

This PR makes model change annotations reported via result stream.
It is achieved the following way:

  • model classes from maths module report model changes via TModelChangeCallback functional type
  • concrete instantiation of TModelChangeCallback creates a CAnnotation object and puts it into in-memory collection m_CurrentBucketStats.s_Annotations
  • annotations from models' s_Annotations are collected by CAnomalyJob and written out to the stream using CAnnotationJsonWriter class

Relates elastic/elasticsearch#55781

@przemekwitek przemekwitek force-pushed the annotations branch 16 times, most recently from b67cf88 to c7b4443 Compare May 21, 2020 13:23
@przemekwitek przemekwitek changed the title [DRAFT] Report model change annotations via result stream Report model change annotations via result stream May 21, 2020
@przemekwitek przemekwitek removed the WIP label May 21, 2020
@przemekwitek przemekwitek marked this pull request as ready for review May 21, 2020 13:51
lib/maths/CModel.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory of m_CurrentBucketStats.s_Annotations needs to be accounted for. So everywhere where we have this:

bash-3.2$ grep 'mem += core::CMemory::dynamicSize(m_CurrentBucketStats.s_InterimCorrections);' *
CEventRateModel.cc:    mem += core::CMemory::dynamicSize(m_CurrentBucketStats.s_InterimCorrections);
CEventRatePopulationModel.cc:    mem += core::CMemory::dynamicSize(m_CurrentBucketStats.s_InterimCorrections);
CMetricModel.cc:    mem += core::CMemory::dynamicSize(m_CurrentBucketStats.s_InterimCorrections);
CMetricPopulationModel.cc:    mem += core::CMemory::dynamicSize(m_CurrentBucketStats.s_InterimCorrections);

and this:

bash-3.2$ grep 'core::CMemoryDebug::dynamicSize("m_CurrentBucketStats.s_InterimCorrections",' *
CEventRateModel.cc:    core::CMemoryDebug::dynamicSize("m_CurrentBucketStats.s_InterimCorrections",
CEventRatePopulationModel.cc:    core::CMemoryDebug::dynamicSize("m_CurrentBucketStats.s_InterimCorrections",
CMetricModel.cc:    core::CMemoryDebug::dynamicSize("m_CurrentBucketStats.s_InterimCorrections",
CMetricPopulationModel.cc:    core::CMemoryDebug::dynamicSize("m_CurrentBucketStats.s_InterimCorrections",

we need equivalent lines for m_CurrentBucketStats.s_Annotations.

TODO1: when to clear s_Annotations vector?

Am I correct that at the moment it's never cleared?

One option might be to have a method that allows the annotations to be taken instead of got as a const reference. Then after outputting what's been built up they are all gone - no danger of outputting duplicates and no memory build up as long as the output method is called reasonably frequently.

For example:

CEventRateModel::TAnnotationVec CEventRateModel::takeAnnotations() {
    CEventRateModel::TAnnotationVec result;
    result.swap(m_CurrentBucketStats.s_Annotations);
    return result;
 }

Maybe Tom has a better idea though.

lib/maths/CTimeSeriesDecompositionDetail.cc Outdated Show resolved Hide resolved
lib/maths/CTimeSeriesModel.cc Outdated Show resolved Hide resolved
lib/model/CAnnotation.cc Outdated Show resolved Hide resolved
lib/model/CEventRateModel.cc Outdated Show resolved Hide resolved
lib/model/CEventRatePopulationModel.cc Outdated Show resolved Hide resolved
lib/model/CMetricModel.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a first pass through. I've made a few suggestions. Aside from defaulting the add annotation callback better, my biggest suggested change is to rejig where the annotations for new decomposition components are written. Also, this should be capturing when a trend is detected and when calendar components are detected. It is probably easiest to discuss this offline.

include/maths/CModel.h Outdated Show resolved Hide resolved
include/maths/CTimeSeriesChangeDetector.h Outdated Show resolved Hide resolved
include/maths/CTimeSeriesDecomposition.h Outdated Show resolved Hide resolved
include/maths/CTimeSeriesDecompositionDetail.h Outdated Show resolved Hide resolved
include/maths/CTimeSeriesDecompositionInterface.h Outdated Show resolved Hide resolved
lib/maths/CTimeSeriesModel.cc Outdated Show resolved Hide resolved
lib/maths/unittest/CMathsMemoryTest.cc Outdated Show resolved Hide resolved
lib/maths/unittest/CTimeSeriesChangeDetectorTest.cc Outdated Show resolved Hide resolved
lib/model/CEventRatePopulationModel.cc Outdated Show resolved Hide resolved
@@ -113,6 +114,7 @@ const core_t::TTime HOUR = core::constants::HOUR;
const core_t::TTime DAY = core::constants::DAY;
const core_t::TTime WEEK = core::constants::WEEK;
const core_t::TTime YEAR = core::constants::YEAR;
const ml::maths::CModelAddSamplesParams::TModelChangeCallback NOOP;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to test, in at least one test, that the annotations are actually created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should I go about the unit testing in this case? Should I call addPoint method with a model annotation callback and verify it got called?
Could you point me at the test case which I can take as a base?

Copy link
Contributor Author

@przemekwitek przemekwitek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR needs a changelog entry to make the changelog PR check succeed.

Done.

lib/api/CAnnotationJsonWriter.cc Outdated Show resolved Hide resolved
lib/api/CAnnotationJsonWriter.cc Outdated Show resolved Hide resolved
lib/api/unittest/CAnnotationJsonWriterTest.cc Outdated Show resolved Hide resolved
@przemekwitek
Copy link
Contributor Author

The memory of m_CurrentBucketStats.s_Annotations needs to be accounted for.

Done.

Am I correct that at the moment it's never cleared?

I've added clearing in the beginning of the sample method in all the models.
By Tom's comment: "I think we should clear the m_CurrentBucketStats.s_Annotations at the start of this function. It'll get called exactly once per bucket and we'll write them out between calls."

IIUC this solution is equivalent to the one you outlined (using swap).

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @przemekwitek! This looks really close. I have some suggestions on naming: the current choices I think are easy to confuse with change point detection. Other than that I think this looks ready.

include/maths/CTimeSeriesDecompositionInterface.h Outdated Show resolved Hide resolved
include/maths/MathsTypes.h Outdated Show resolved Hide resolved
include/maths/CTimeSeriesDecompositionInterface.h Outdated Show resolved Hide resolved
lib/api/unittest/CAnnotationJsonWriterTest.cc Outdated Show resolved Hide resolved
include/maths/CModel.h Outdated Show resolved Hide resolved
@przemekwitek przemekwitek force-pushed the annotations branch 2 times, most recently from 8efbafe to f8466d1 Compare June 8, 2020 11:05
Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff @przemekwitek. Thanks for iterating on this! LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants