jewel: mon: Upgrading 0.94.6 -> 0.94.9 saturating mon node networking #11679
@@ -3061,6 +3061,25 @@ void OSDMonitor::get_health(list<pair<health_status_t,string> >& summary,
     }
   }

   // warn about upgrade flags that can be set but are not.
   if ((osdmap.get_up_osd_features() & CEPH_FEATURE_SERVER_KRAKEN) &&
should remove the code referencing KRAKEN.
@tchaikov removed the reference to kraken in "mon/OSDMonitor: health warn if require_{jewel,kraken} flags aren't set" and "mon/OSDMonitor: encode canonical full osdmap based on osdmap flags".
I'm concerned by http://pulpito.ceph.com/loic-2016-10-31_09:24:48-rados-wip-17734-jewel-distro-basic-smithi/505937/ which looks like another form of http://tracker.ceph.com/issues/17728#note-6. The rest of the failures above are either fixed or known.
If the JEWEL or KRAKEN flags aren't set, encode the full map without those features. This ensures that older OSDs in the cluster will be able to correctly encode the full map with a matching CRC. At least, that is true as long as the encoding changes are guarded by those feature bits. That appears to be true currently, and we plan to ensure that it is true in the future as well.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5e0daf6)

Conflicts:
    src/mon/OSDMonitor.cc: removed reference to kraken

      if (!tmp.test_flag(CEPH_OSDMAP_REQUIRE_KRAKEN)) {
        dout(10) << __func__ << " encoding without feature SERVER_KRAKEN" << dendl;
        features &= ~CEPH_FEATURE_SERVER_KRAKEN;
      }
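A minimal sketch of the masking described in this commit message, reduced to the jewel case that remains after the backport. The constant names and bit values below are placeholders chosen for illustration (named by analogy with the KRAKEN identifiers quoted above), and `canonical_encode_features` is a hypothetical helper, not a function from the Ceph tree.

```cpp
#include <cstdint>

// Placeholder values: the real bit assignments live in the Ceph headers
// and are NOT reproduced here.
constexpr uint64_t FEATURE_SERVER_JEWEL = 1ull << 0;  // stand-in for the jewel server feature bit
constexpr uint32_t FLAG_REQUIRE_JEWEL   = 1u << 0;    // stand-in for the require_jewel_osds flag

// Hypothetical helper: return the feature set to use when encoding the
// full map. If the require flag is not set, the jewel feature bit is
// cleared so that every OSD encodes the same bytes.
uint64_t canonical_encode_features(uint64_t features, uint32_t osdmap_flags) {
  if (!(osdmap_flags & FLAG_REQUIRE_JEWEL)) {
    features &= ~FEATURE_SERVER_JEWEL;
  }
  return features;
}
```

With the flag unset, the jewel bit is dropped and any other bits pass through unchanged; with the flag set, the feature set is returned as-is.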
We want to prompt users to set these flags as soon as their upgrades complete.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 12e5083)

Conflicts:
    src/mon/OSDMonitor.cc: remove references to kraken

      if ((osdmap.get_up_osd_features() & CEPH_FEATURE_SERVER_KRAKEN) &&
          !osdmap.test_flag(CEPH_OSDMAP_REQUIRE_KRAKEN)) {
        string msg = "all OSDs are running kraken or later but the"
          " 'require_kraken_osds' osdmap flag is not set";
        summary.push_back(make_pair(HEALTH_WARN, msg));
        if (detail) {
          detail->push_back(make_pair(HEALTH_WARN, msg));
        }
      } else
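The jewel branch that survives the backport follows the same shape as the removed kraken branch. The following is a self-contained sketch of that check, assuming simplified stand-ins for the monitor types: `health_list`, the two constants, and `warn_if_require_jewel_unset` are hypothetical names for illustration, and the bit values are placeholders.

```cpp
#include <cstdint>
#include <list>
#include <string>
#include <utility>

// Simplified stand-in for the monitor's health status enum.
enum health_status_t { HEALTH_OK, HEALTH_WARN, HEALTH_ERR };

using health_list = std::list<std::pair<health_status_t, std::string>>;

// Placeholder bit values, not the real Ceph assignments.
constexpr uint64_t FEATURE_SERVER_JEWEL = 1ull << 0;
constexpr uint32_t FLAG_REQUIRE_JEWEL   = 1u << 0;

// Hypothetical helper: warn when every up OSD advertises the jewel
// feature but the osdmap flag has not been set by the admin.
void warn_if_require_jewel_unset(uint64_t up_osd_features,
                                 uint32_t osdmap_flags,
                                 health_list& summary,
                                 health_list* detail) {
  if ((up_osd_features & FEATURE_SERVER_JEWEL) &&
      !(osdmap_flags & FLAG_REQUIRE_JEWEL)) {
    const std::string msg =
        "all OSDs are running jewel or later but the"
        " 'require_jewel_osds' osdmap flag is not set";
    summary.push_back(std::make_pair(HEALTH_WARN, msg));
    if (detail) {
      detail->push_back(std::make_pair(HEALTH_WARN, msg));
    }
  }
}
```

Once the flag is set, or while any up OSD still lacks the feature, the helper adds nothing to either list.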
The Incremental encode stashes encode_features, which is what we use later to reencode the updated OSDMap. Use the same features so that the encoding will match!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 916ca6a)
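To illustrate why the features must match, here is a toy model rather than Ceph's encoder: a feature bit decides whether an extra field is emitted, so re-encoding with a different feature set produces different bytes and, in general, a different checksum. All names (`ToyMap`, `ToyIncremental`, `FEATURE_NEW_FIELD`, and the helpers) are hypothetical, and FNV-1a stands in for the real CRC.

```cpp
#include <cstdint>
#include <vector>

constexpr uint64_t FEATURE_NEW_FIELD = 1ull << 0;  // hypothetical feature bit

struct ToyMap {
  uint32_t epoch;
  uint32_t new_field;  // only encoded when FEATURE_NEW_FIELD is present
};

// Feature-dependent encoding: the byte stream changes with the features.
std::vector<uint8_t> encode(const ToyMap& m, uint64_t features) {
  std::vector<uint8_t> out;
  auto put32 = [&out](uint32_t v) {
    for (int i = 0; i < 4; ++i) {
      out.push_back(static_cast<uint8_t>(v >> (8 * i)));
    }
  };
  put32(m.epoch);
  if (features & FEATURE_NEW_FIELD) {
    put32(m.new_field);
  }
  return out;
}

// FNV-1a, used here only as a stand-in for the real CRC.
uint32_t checksum(const std::vector<uint8_t>& bytes) {
  uint32_t h = 2166136261u;
  for (uint8_t b : bytes) {
    h ^= b;
    h *= 16777619u;
  }
  return h;
}

// The incremental records which features were used and the resulting checksum.
struct ToyIncremental {
  uint64_t encode_features;
  uint32_t full_crc;
};

ToyIncremental make_incremental(const ToyMap& updated, uint64_t features) {
  return {features, checksum(encode(updated, features))};
}

// Re-encode the updated map and compare against the stashed checksum.
bool reencode_matches(const ToyMap& updated, const ToyIncremental& inc,
                      uint64_t reencode_features) {
  return checksum(encode(updated, reencode_features)) == inc.full_crc;
}
```

Calling `reencode_matches(m, inc, inc.encode_features)` always succeeds in this model, whereas passing some other feature set changes the encoded bytes and will generally fail the comparison.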
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 83ffc2b)
http://pulpito.ceph.com/loic-2016-11-07_06:19:54-rados-wip-17734-jewel-distro-basic-smithi/ has a clean run. It is still racy, but no more than master, and that shows the commits in this pull request do the right thing. http://tracker.ceph.com/issues/17808 was opened for this.
@tchaikov although some upgrade tests are still missing because VPS runs have had issues since last Thursday, I think there is enough proof that these commits do the right thing and that we're in no danger of a regression. What say you?
@dachary cool, let's merge it!
It appears that, as of 10.2.4, cluster admins will have to do "ceph osd set require_jewel_osds", otherwise the MONs will complain: "all OSDs are running jewel or later but the 'require_jewel_osds' osdmap flag is not set". Does this deserve a mention in the 10.2.4 release notes?
(answering my own question based on clarification provided by @liewegas and @athanatos on IRC) When the last hammer OSD in a cluster containing jewel MONs is upgraded to jewel, as of 10.2.4 the jewel MONs will issue this warning: "all OSDs are running jewel or later but the 'require_jewel_osds' osdmap flag is not set" and change the cluster health status to HEALTH_WARN. This is a signal for the admin to do "ceph osd set require_jewel_osds" - by doing this, the admin acknowledges that there is no downgrade path. (I propose that we add this text, or one like it, to the 10.2.4 release notes.)
+1
* Upgrading 0.94.6 -> 0.94.9 saturating mon node networking, http://tracker.ceph.com/issues/17694 ceph/ceph#11679
  patches:
  - mon-OSDMonitor-encode-canonical-full-osdmap-based-on.patch
  - mon-OSDMonitor-health-warn-if-require_-jewel-kraken-.patch
  - mon-OSDMonitor-encode-OSDMap-Incremental-with-same-f.patch
  - messages-MForward-fix-encoding-features.patch
  - messages-MForward-reencode-forwarded-message-if-targ.patch
  - msg-Message-fix-set_middle-vs-throttler.patch
  - msg-adjust-byte_throttler-from-Message-encode.patch
  - all-add-const-to-operator-param.patch
* mon: health does not report pgs stuck in more than one state, http://tracker.ceph.com/issues/17601 ceph/ceph#11660
  patches:
  - mon-PGMap-PGs-can-be-stuck-more-than-one-thing.patch

(cherry picked from commit f871303)
http://tracker.ceph.com/issues/17734
http://tracker.ceph.com/issues/17694