osd: restoring timely collection of PG stats #55478
LGTM overall, does this "Fixes" https://tracker.ceph.com/issues/53342?
with_legacy: true
default: 5
How did we get away with 500 before #54491? Was it the removed usage of need_publish?
default configuration option for the manager collection of the OSD data.
Can you please share the option name?
How did we get away with 500 before #54491? Was it the removed usage of need_publish?
So it seems. I am still comparing logs for both versions.
Can you please share the option name?
mgr_stats_period
bool is_time_expired = cutoff_time > info.stats.last_fresh ? true : false;
cutoff_time -=
cct->_conf.get_val<int64_t>("osd_pg_stat_report_interval_max_seconds");
const bool is_time_expired = cutoff_time > info.stats.last_fresh;
Thanks for addressing 👍
Seems to. I do not think I've missed any other side effects.
Seems to. I do not think I've missed any other side effects.
Can you add "Fixes" to the commit message?
Sorry - I've answered the wrong question here.
e93278d to 1b8975f
500 seconds is way too long, e.g. when compared to the 5s default configuration option for the manager collection of the OSD data.

Fixes tracker issue 53342 note 5 (a specific scenario leading to 'not all pgs scrubbed')

Fixes: https://tracker.ceph.com/issues/53342 - partial fix
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Also - fixing review comments not addressed in the original PR.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
1b8975f to d706cec
jenkins test api
jenkins test windows
Merging based on my Teuthology runs
Fixing the regression introduced by PR #54491.
Also - fixing review comments that were not addressed in the original PR.
Fixes tracker issue 53342 note 5 (a specific scenario leading to 'not all pgs scrubbed' failures)
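For readers following the review thread, the expiry check in the diff above can be illustrated with a minimal standalone sketch. This is not the Ceph code: `Clock`, `TimePoint`, and the free function `is_time_expired()` are hypothetical stand-ins (the real code works with `info.stats.last_fresh` and reads `osd_pg_stat_report_interval_max_seconds` from the configuration), and `std::chrono` is substituted for Ceph's own time types. It only shows why a 500-second maximum interval delays reporting compared with the 5-second default chosen here.

```cpp
#include <chrono>

// Hypothetical stand-ins for illustration only; the actual OSD code uses
// Ceph's own time types and reads the interval from cct->_conf.
using Clock = std::chrono::system_clock;
using TimePoint = Clock::time_point;

// Returns true when the stats were last refreshed more than `max_interval`
// ago, i.e. when they are due to be reported again.
bool is_time_expired(TimePoint now,
                     TimePoint last_fresh,
                     std::chrono::seconds max_interval)
{
  // Mirrors the shape of the check in the diff: move the cutoff back by the
  // configured interval, then compare it with the last refresh time.
  const TimePoint cutoff_time = now - max_interval;
  return cutoff_time > last_fresh;
}
```

Under these assumptions, stats last refreshed 6 seconds ago count as expired with a 5-second interval, but not with a 500-second one, which matches the delay discussed in the thread.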