[BUGFIX] tsdb/wlog.Checkpoint: Fix counting of histogram samples in stats. #13776
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Good catch!
@machine424 I was also thinking it's
Good question, I'm not familiar with that part of the code, I'll need to take a look.
Maybe @jesusvazquez can help with your questions?
I approve the fix, but let's hold back merging this until we have an idea if the stats can be ripped out entirely.
Big picture, I want more stats about checkpoint. Quite often when I look at big memory usage, especially in Agent mode, it seems associated with the TSDB checkpoint, but it's hard to see what happened. Checkpoints only happen every 2 hours, and usually it takes days for the data to build up. I guess I should try to write a PR that adds the information I would find useful.
(Hello from the bug scrub meeting!)
LGTM, and as per question if this should be removed, I think we have cases to utilize this more in the next PRs, so let's merge. Thanks! (also we need float histograms indeed)
@bwplotka I can open another PR to count also float histograms then.
Totally
I realized that histogram samples are wrongly counted in `tsdb/wlog.Checkpoint`, in that the float sample counts are added to `stats.TotalSamples` and `stats.DroppedSamples` respectively, instead of the histogram sample counts.
This PR rectifies the bug, and augments `TestCheckpoint` to verify that `stats.TotalSamples` and `stats.DroppedSamples` are correctly updated.
That said, I don't think any callers of `Checkpoint` use the `stats` return value. Should it be dropped?