New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exponentially decaying reservoir update #1135
Exponentially decaying reservoir update #1135
Conversation
WeightedSnapshot should behave after long inactive reservoir as newly created reservoir.
white spaces
white spaces
white spaces
white spaces
Thanks for the pull request! I will review today. |
Thank you! Let me know if you need anything to be changed . |
@@ -38,6 +38,12 @@ public WeightedSample(long value, double weight) { | |||
* @param values an unordered set of values in the reservoir | |||
*/ | |||
public WeightedSnapshot(Collection<WeightedSample> values) { | |||
if (values.size()==1 && Double.compare(values.iterator().next().weight,0D)==0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not very obvious why we perform this check in the constructor of the snapshot. I understand that the idea to avoid the situation when we have only one element in the sample with a 0 weight (because it provides bogus data), but I believe it should be performed in ExponentiallyDecayingReservoir
, where the logic of rescaling resides. What do you think, if we add a guard condition:
if (scalingFactor == 0) {
values.clear();
}
in the ExponentiallyDecayingReservoir#rescale
method? I think it could be a more clear solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi thanks, yes I will look into that today. I agree here it isn't the best to solve this problem.
Should be done, can you check please? |
Thank you very much for the contribution! |
Exponentially Decaying Reservoir was giving incorrect values in the snapshot if the inactive period was too long. This could result in wrong metrics aggregation.