New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mon: enable luminous monmap feature on full quorum #13379
Conversation
Please note, I realized I forgot to add cli tests to the branch just as i was pushing it. That will be taken care of once morning comes. |
src/mon/mon_types.h
Outdated
FEATURE_NONE | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a new category for this? can't we just apply the full quorum rule for all persistent features? the fact that they're persistent seems like it implies we want to be careful about enabling them when they can't be disabled...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We can simply rely on the persistent features. Just thought that there could be some features we would want to set on a full quorum, others maybe not so, but yeah... I suppose all the persistent would be nice to set only when all the monitors belong to the quorum.
src/mon/MonCommands.h
Outdated
"mon", "r", "cli") | ||
COMMAND("mon features set " \ | ||
"name=feature_name,type=CephString " \ | ||
"name=feature_type,type=CephChoices,strings=persistent|optional,req=false " \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this interface doesn't make sense to me. any given feature is either persistent or not; having the user pass that as an argument is just an opportunity for them to get it wrong (the mon already knows which it is). (unless it's meant as a safety check?)
I'd suggest instead
ceph mon features set
ceph mon features set-persistent [--force]
Or perhaps 'feature' instead of 'features'. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in a way, I really dislike the idea of having users modifying the monmap features manually, so this was kind of a sledge hammer forcing the user to provide as much info as possible, even though 'persistent' and 'optional' are not required (defaults to optional).
In any case, I'm fine with adding a 'set-persistent' instead. If we don't like it, in the future, we can always refactor this bit - but I'm guessing it will be fine, as these should not be used often.
src/mon/MonCommands.h
Outdated
COMMAND("mon features set " \ | ||
"name=feature_name,type=CephString " \ | ||
"name=feature_type,type=CephChoices,strings=persistent|optional,req=false " \ | ||
"name=by_value,type=CephChoices,strings=--by-value,req=false " \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any reason to support setting a feature numerically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing unknown features. But I just realized it will only work on a cluster with one single monitor, because otherwise the monitors will nak each other on probing, and they will shutdown.
I don't really love the syntax here, but I'm not sure what would be better. Other parts of the system just use 'ceph osd set/unset ...' but those are called flags and not features. The mon features are sort of both in that we calculate the intersection across the quorum and require them.. but the are allowed to be unset, whereas for the osdmap the 'require_*_osds' flags are just coded so that they can't be unset (and aren't automatically set). So maybe the 'feature/features' part of the prefix makes sense, but singular is probably better? |
that was the rationale behind going with 'mon features set/unset/list' instead of 'mon set/unset/list' - that and that having a 'list' would need something more verbose to avoid confusion (e.g., 'list_features', and that's kinda ugly). As for the plural, sure, I'll change it. Just felt nicer wrt 'features list' instead of 'feature list'. Didn't think much about it and am definitely not attached to it ;) |
@jecluis @liewegas |
@wjwithagen these features are meant to make sure the monitors require them once all the monitors in the cluster support them, and they are not meant to ever be unset. So far we've been using them as upgrade checkpoints, but I can imagine a few scenarios in which we may rely on them for protocol updates as well, and in all these cases involving the user seems pointless - and I don't personally see a good reason to require an explicit administrator action to enable them. The cli will also allow the administrator to set and unset persistent features, but I would like to point out that this is highly discouraged unless "you-are-really-really-sure". OTOH, we also have 'optional' features, that require the administrator to set/unset, but those are meant to enable/disable certain features (which are not currently being used, but I can see a few use-cases in the nearby future). Also, the simple nature of the monitor cluster, backed by the monmap, allows to make these decisions about when to flip the switch on a given feature a lot simpler: the map contains N monitors, so we will flip the switch when all N monitors are in the quorum; not before. Although it annoys me to suggest this, I can imagine a scenario in which an administrator, wanting to postpone the switch to be flipped, would hold back a single monitor from being upgraded - keeping in mind all the possible cons of doing so, and checking the release notes' upgrade section beforehand would be imperative. |
Hmm, I think the thing that will makes this be less confusing is to drop
the distinction between persistent and non-persistent features. instead,
add a flags field for things that can be set/unset, and keep the
required_features for what it is.
|
We can do that, but internally they'll still be "features". It's easier that way, especially given we'll still have to compute the required quorum features, which will be the union of persistent and optional/flags. |
Maybe the other reason this isn't sitting right is that we don't
actually have any instances of the non-peristent features/flags. I'm
having a hard timing figuring out what they would be. Maybe
use-mgr-for-pg-stats? Even there, though, I'm not sure it makes sense to
treat a feature as a flag. It seems like what you really want is (1) a
flag that indicates what you want, and (2) some logic that says if
that flag is enabled you require a particular feature (in this case,
LUMINOUS)...
|
Using mgr for pg stats could be one instance; another would be a monitor-specific network (mon<->mon), having the monitors on both cluster and public networks (for osd<->mon over cluster network), or even an admin network (client.admin <-> mon). The flags/optional features/wtv would be means to convey to consumers of the monmap what they require to talk to the monitors (e.g., sending a message via the cluster network, instead of the typical public). Other ideas that have crossed my mind, although the usefulness may be disputable, would be to enable remote-site paxos sync for DR, for instance. We could map these flags to features, and require features depending on which flags we enable, but implementation-wise that would be pretty much the same logic we currently have, I think. |
c79e137
to
b6c1d8d
Compare
@liewegas pushed revised patches for two of the existing commits. Also, pushed an additional patch to be squashed against the others. This last patch removes all references to 'optional' features from any cli command (except 'list', which will still output the optional features on formatted output). This way we can wait for a bit for features leveraging the 'optional' features, 'flags' or whatever, before we change them or rip them out. If we don't put them to good use, we can simply rip them in a few releases time. What do you think? |
This seems reasonable.
Do we want to allow unsetting the persistent features at all though? At
the very least it should have a --yes-i-really-mean-it flag.. but I'm
not sure we should have even that. The osdmap require_*_osd flags can't
be unset, FWIW.
|
Also, I think the set command should have a check that the quorum
supports that feature.
|
Yeah, forgot to handle the '--yes-i-really-mean-it' flag, even though the command was specified with that. (Maybe removed it on an earlier patch?) Anyway, good point about unsetting them. We'll let the user do that via the monmap tool if they ever come across a situation in which that may be needed. |
sounds good!
|
ping |
b6c1d8d
to
77a0f45
Compare
@liewegas pushed. tests are passing locally, but let's wait for the checks to go green before merging. |
repushed. failures seems to be due to port collision on the test. |
latest failure is due to jq's version on jenkins being older than the one i was working on. This means my patch is using '--exit-status' to make decisions on whether we had success or not, but the build system is completely oblivious about the magic of not having to check return strings :( Adjusting the patch atm and will push once it passes locally. |
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
4e21f93
to
a3e9ca8
Compare
ok, now this is just getting silly. looks like i forgot to change a few commands away from 'jq -e'. sigh. |
@liewegas if everything seems fine to you, I'll just squash that last commit where it belongs and it should then be okay to merge. |
@jecluis yep, squash away! |
4557394
to
462f601
Compare
@liewegas done. |
src/mon/MonmapMonitor.cc
Outdated
pending_map.last_changed = ceph_clock_now(); | ||
propose = true; | ||
|
||
dout(1) << __func__ << ss << "; new features will be: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not compile. might want to use ss.str()
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait wat? how did that ';' get in there? /me checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nevermind - it's part of the string. why it's not compiling is a mystery to me though, because "it works for me (tm)", but yeah, I'll add a ss.str()
instead for safe measure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't figure out why it was compiling locally, but as with any other compile mystery I'll just blame it on ccache and move on. Pushing the fixed branch now.
Allows listing supported and currently set monmap features, as well as setting and unsetting them. Signed-off-by: Joao Eduardo Luis <joao@suse.de>
We will now only enable persistent features automatically when ALL the monitors in the monmap are in the quorum. #noMonitorLeftBehind Signed-off-by: Joao Eduardo Luis <joao@suse.de>
Takes optional timeout and desired quorum size Signed-off-by: Joao Eduardo Luis <joao@suse.de>
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
462f601
to
2374011
Compare
passed testing |
We will automatically enable the luminous persistent monmap feature
when we have a quorum composed of all the monitors in the monmap.
This patch series also drops the 'mon debug features' admin socket
interface, and adds instead a proper 'mon features ...' interface on the
MonmapMonitor, allowing listing, setting and unsetting features.
Signed-off-by: Joao Eduardo Luis <joao@suse.de>