mon: don't set last_osd_report when the pg stats msg is ignored #12975

Merged
merged 1 commit into from Jan 22, 2017

@wonzhq
Contributor

wonzhq commented Jan 18, 2017

In some cases, this may lead to mon wrongly marking an osd down
because of no pg stats after a specified time period.

Signed-off-by: Zhiqiang Wang <zhiqiang@xsky.com>

@tchaikov tchaikov self-assigned this Jan 18, 2017

if (!stats->get_orig_source().is_osd() ||
    !mon->osdmon()->osdmap.is_up(from) ||
    stats->get_orig_source_inst() != mon->osdmon()->osdmap.get_inst(from)) {
  dout(1) << " ignoring stats from non-active osd." << dendl;
  return false;
}
last_osd_report[from] = ceph_clock_now();
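
The ordering in the snippet above (validate first, record the timestamp only afterwards) can be illustrated with a minimal, self-contained sketch. This is not the Ceph monitor code: `MiniMon`, `up_osds`, and `handle_pg_stats` are hypothetical stand-ins, and the only check modelled is "is this osd up".

```cpp
#include <chrono>
#include <map>
#include <set>

// Hypothetical stand-ins for the monitor state; not the real Ceph types.
using Clock = std::chrono::steady_clock;

struct MiniMon {
  std::set<int> up_osds;                             // osds the map considers up
  std::map<int, Clock::time_point> last_osd_report;  // last accepted report per osd

  // Returns true if the stats message was accepted.
  bool handle_pg_stats(int from, Clock::time_point now) {
    if (up_osds.count(from) == 0) {
      // Message is ignored, so last_osd_report must stay untouched.
      return false;
    }
    last_osd_report[from] = now;  // record only after the checks pass
    return true;
  }
};
```

With this ordering, a message from an osd that is not up leaves no entry behind in `last_osd_report`, whereas recording the timestamp before the check would.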

@tchaikov

tchaikov Jan 19, 2017

Contributor

"In some cases"

@wonzhq what exactly is the case? For example:

  1. an osd is marked down by the monitor, and then
  2. we receive a stray pg stats message from it;
  3. last_osd_report is set to the current timestamp;
  4. after a while, a new osd joins and the monitor assigns it the osd id of the previously marked-down osd (but the leader calls check_osd_map() after the map is committed, and check_osd_map() will clear last_osd_report for that osd);
  5. in tick(), the newly joined osd could be wrongly marked down? (But I doubt it, see above.)
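
To make step 5 of that hypothesised sequence concrete, here is a rough sketch of a timeout scan over recorded report times. It assumes a simplified model (timestamps as plain seconds, a hypothetical `find_timed_out` helper and `timeout` parameter) and does not reproduce the monitor's actual tick() logic.

```cpp
#include <chrono>
#include <map>
#include <vector>

using Seconds = std::chrono::seconds;

// Hypothetical model: osd id -> time of the last recorded report.
using ReportMap = std::map<int, Seconds>;

// Returns the ids whose last recorded report is older than `timeout`.
std::vector<int> find_timed_out(const ReportMap& reports, Seconds now,
                                Seconds timeout) {
  std::vector<int> timed_out;
  for (const auto& entry : reports) {
    if (now - entry.second > timeout) {
      timed_out.push_back(entry.first);
    }
  }
  return timed_out;
}
```

In this model, a stale entry left behind for a reused osd id is reported as timed out, while an id with no entry at all is never flagged, which is why it matters whether the entry gets recorded or cleared.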

@wonzhq

wonzhq Jan 19, 2017

Contributor

check_osd_map could return early without clearing last_osd_report if the osdmap is not readable or the pgmap is not writeable. This bug was fixed a long time ago and the log file has since been removed, so I can't verify it.
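
A minimal sketch of the early-return path described here, under stated assumptions: `MiniState` and `check_osd_map_sketch` are hypothetical names, and the two flags stand in for the "osdmap readable" and "pgmap writeable" conditions. The real check_osd_map() does considerably more.

```cpp
#include <map>

// Hypothetical, simplified monitor state.
struct MiniState {
  bool osdmap_readable = true;
  bool pgmap_writeable = true;
  std::map<int, long> last_osd_report;  // osd id -> timestamp (arbitrary units)
};

// Returns true if the entry for `osd` was cleared.
bool check_osd_map_sketch(MiniState& s, int osd) {
  if (!s.osdmap_readable || !s.pgmap_writeable) {
    return false;  // early return: any stale entry survives
  }
  s.last_osd_report.erase(osd);
  return true;
}
```

If the function bails out early, an entry written for an ignored message would remain in the map, which is the window the fix avoids by never writing that entry in the first place.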

@tchaikov

tchaikov Jan 19, 2017

Contributor

@wonzhq okay, just wanted to understand if we need to backport this fix or not.

@wonzhq

wonzhq Jan 20, 2017

Contributor

@tchaikov sure :)

@tchaikov tchaikov added the needs-qa label Jan 19, 2017

@liewegas liewegas merged commit 5dccac8 into ceph:master Jan 22, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.