[ML] Improve and use periodic boundary condition in seasonal component #84
Looks good. Left some minor comments.
docs/CHANGELOG.asciidoc
=== Enhancements

Improve and use periodic boundary condition for seasonal component modeling ({pull}84[#84])
This is labelled to go in 6.4 so I think the release note should also be in the 6.4 section. Still getting my head around it though :-)
TDoubleVec f;
TTimeVec times;
TDoubleVec values;
//std::ofstream file;
Should these be removed?
These are useful to keep in case the results change. You can uncomment them and look at the modelling we do.
times.push_back(time);
values.push_back(trend);
f.push_back(prediction);
//times.push_back(time);
Should these be removed?
As above.
// are adjacent. Note that values need not be the same at the
// start and end of the period because the gradient can vary,
// but we expect them to be continuous.
for (std::size_t j = n - 1; j > 0; --j) {
Have you compared performance after this change?
This only happens very infrequently: we interpolate once per period, so typically once per day, week, etc. As a result these functions are never bottlenecks. For this specific code, n is small and the loop typically exits after the first iteration (it only carries on for sparse data). The unit tests didn't noticeably slow down, so I think this is probably safe from that perspective.
LGTM
#84) Use the predicted value in the next and previous period when computing endpoint values for interpolating a seasonal component under periodic boundary conditions and enable periodic boundary conditions by default.
We were previously not using periodic boundary conditions when fitting seasonal components. This is because the value need not be the same at the start and end of the period because of variation in the gradient within the component. A better strategy is to use the predicted value in the next and previous period, since we expect the component to be continuous. As a result, we can enable periodic boundary conditions by default.
This gives a smoother fit to periodic signals. As a result, we get lower maximum prediction errors for smooth signals, which is reflected in some of the tighter unit test thresholds and in some manual testing I've done. This will affect results for event rate and metric analysis of periodic signals.
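The endpoint strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `periodicEndpoints`, `SEndpoints`, `TPredictor` and `linearInterpolate` are hypothetical names, the knots are assumed to be sorted and to lie in [0, period), and plain linear interpolation stands in for whatever interpolation the seasonal component really uses. The value at the start of the period is obtained from the last knot of the previous period and the first knot of this one, and the value at the end from the last knot of this period and the first knot of the next.

```cpp
#include <functional>
#include <vector>

// Hypothetical types for illustration only.
using TDoubleVec = std::vector<double>;
using TPredictor = std::function<double(double)>;

// Values to use at the two ends of the period when interpolating.
struct SEndpoints {
    double s_Start;
    double s_End;
};

// Linear interpolation between (x0, y0) and (x1, y1) evaluated at x.
// Assumes x0 != x1.
double linearInterpolate(double x0, double y0, double x1, double y1, double x) {
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
}

// Assumes knots is non-empty, sorted and contained in [0, period), and that
// predict can be evaluated outside [0, period), i.e. in adjacent periods.
SEndpoints periodicEndpoints(const TDoubleVec& knots, double period, const TPredictor& predict) {
    double first = knots.front();
    double last = knots.back();

    // Start of the period: bracket t = 0 with the last knot shifted into
    // the previous period and the first knot of this period.
    double start = linearInterpolate(last - period, predict(last - period),
                                     first, predict(first), 0.0);

    // End of the period: bracket t = period with the last knot of this
    // period and the first knot shifted into the next period.
    double end = linearInterpolate(last, predict(last),
                                   first + period, predict(first + period), period);

    return {start, end};
}
```

With a predictor whose level drifts within the period, the two endpoint values differ, which matches the remark that the values need not be the same at the start and end of the period. For a strictly periodic predictor they coincide.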