New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: centralized config #20172

Merged
merged 144 commits into from Mar 6, 2018

Conversation

Projects
None yet
5 participants
@liewegas
Member

liewegas commented Jan 29, 2018

FUTURE TODO (later PR)

  • clean up tell|daemon config set output (gross json that makes no sense)
  • converge with mgr module config options
  • convert ConfigKeyService to a PaxosService (and unify with ConfigMonitor?)

TODO

  • use normal monclient auth for initial config
  • incorporate into global_init/common_init infrastructure
  • --no-mon-config argument
  • ceph config set WHO KEY VALUE
  • ceph config get WHO
  • ceph config get WHO KEY (includes default value)
  • ceph config dump
  • ceph config show WHO
  • ceph config show-with-defaults WHO
  • ceph config show WHO KEY
  • ceph config help
  • report mgr running config to mgr
  • annotate conf-only options (paxos_* maybe?)
  • make ConfigMonitor refresh more efficient than reparsing all config
  • show config option level in dump, get output
  • ceph config assimilate-conf -i old-ceph.conf -o new-ceph.conf
  • sort out ceph-conf behavior
  • 'option can be changed' in get and/or show (json?) output
  • docs
  • tests
@liewegas

This comment has been minimized.

Member

liewegas commented Jan 29, 2018

jenkins render docs

@liewegas liewegas referenced this pull request Jan 29, 2018

Closed

WIP: mon: centralized config #19289

20 of 21 tasks complete
@liewegas

This comment has been minimized.

Member

liewegas commented Jan 31, 2018

jenkins render docs

@@ -327,6 +329,9 @@ void MgrClient::send_report()
}
report->osd_health_metrics = std::move(osd_health_metrics);
cct->_conf->get_config_bl(&report->config_bl);

This comment has been minimized.

@jcsp

jcsp Feb 2, 2018

Contributor

It feels a bit much to send this on every report from every service: the value_bl part in md_config_t is already sorta-stateful and explicitly reset when changes are made, so get_config_bl could return a bool to indicate whether anything changed (i.e. whether value_bl was rewritten), and then this function would only include the config in the report in that situation.

" name=who,type=CephString" \
" name=name,type=CephString" \
" name=value,type=CephString", \
"Cet a configuration option for one or more entities",

This comment has been minimized.

@jcsp

jcsp Feb 2, 2018

Contributor

s/Cet/Set/

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 18, 2018

retest this please

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 19, 2018

@tchaikov @ktdreyer I may need some help here. The make check build is failing with a compilation error that I can't reproduce locally (on multiple xenial boxes). And the error doesn't make any sense to me--the code is identical to that in several other mon/*.cc files that build just fine. Any ideas?

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 19, 2018

Nevermind, I fixed it! Used cmdmap_t instead of the raw type and it seemed to resolve it (somehow one variant added a std::less<void> template arg).

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 19, 2018

jenkins render docs

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 19, 2018

@jcsp @jdurgin this is ready for a final look!

@jcsp

This comment has been minimized.

Contributor

jcsp commented Feb 26, 2018

We have a lot of Option::FLAG_NO_MON_UPDATE on settings that intuitively should be live-adjustable (things like mon_osd_full_ratio). I guess this is a plumbing thing, because of monitors not being MonClients to themselves, and we need another pathway to make "ceph config set..." work for these settings?

Settings like admin_socket, erasure_code_dir, plugin_dir, run_dir (i.e. strings without handlers) generate lots of spam (-1 set_mon_vals failed to set admin_socket =), including from the CLI itself, if they've been set in the central monitor config.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Feb 26, 2018

There's a weird bit of output where mon is both in SOURCE and in IGNORES:

 bin/ceph config show mds.a
NAME                                                       VALUE                                             SOURCE  OVERRIDES IGNORES 
admin_socket                                               /tmp/ceph-asok.GttqJR/$name.asok                  mon               mon  

I thought perhaps it meant it was ignoring a more general mon setting in favour of a more specific one, but it seems like it's set directly for the mds:

bin/ceph config dump | grep heartbeat_file
  mds               advanced heartbeat_file                                             /home/jspray/ceph.bravo/build/out/$name.heartbeat                                                             *  
  mgr               advanced heartbeat_file                                             /home/jspray/ceph.bravo/build/out/$name.heartbeat                                                             *  
  mon               advanced heartbeat_file                                             /home/jspray/ceph.bravo/build/out/$name.heartbeat                                                             *  
  osd               advanced heartbeat_file                                             /home/jspray/ceph.bravo/build/out/$name.heartbeat  

I wonder if the $name substitution is tripping up the detection of whether the setting is overridden?

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 26, 2018

mon_osd_full_ratio is only used during mkfs time, on the mon, so it doesn't make sense to pull them--that's why.

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 5, 2018

retest this please

liewegas added some commits Nov 9, 2017

mon/ConfigKeyService: make STORE_PREFIX public
Signed-off-by: Sage Weil <sage@redhat.com>
common/config: find_option
Signed-off-by: Sage Weil <sage@redhat.com>
common/options: convert bool safe to a flags field
Signed-off-by: Sage Weil <sage@redhat.com>
common: don't pass alt_def_args if we don't use them
Signed-off-by: Sage Weil <sage@redhat.com>
common/config: record is_daemon
Signed-off-by: Sage Weil <sage@redhat.com>

liewegas added some commits Feb 8, 2018

ceph-disk: --no-mon-config
Some of these cases make sense to fetch mon configs, but we are deprecating
ceph-disk anyway, and the tests currently make use of ceph-disk in places
that do not have a mon_host defined via a ceph.conf or other environment.
This avoids breaking those test cases without any real impact on users
(which will either use ceph-volume or presumably remain in a legacy config
environment).

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MonClient: tolerate pre-mimic mons
Return success if the mons are pre-mimic (and thus have no config).

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MonClient: apply timeout while fetching config
The normal timeouts automatically apply during the authenticate() stage,
but not to the explicit wait for a config.  If we don't get that quickly
we shoudl retry another monitor because it is possible we will connect to
an out-of-quorum (or otherwise unresponsive) mon.

Signed-off-by: Sage Weil <sage@redhat.com>
test/cli/ceph-conf: fix test
Signed-off-by: Sage Weil <sage@redhat.com>
mon/ConfigMonitor: add missing #include, tweak types
Seems to resolve a build error on some compilers!  Meh.

Signed-off-by: Sage Weil <sage@redhat.com>
common/config: check against raw value (no meta) to detect unchanged …
…option

If we are looking at a new value from the mon and comparing it to what we
already have active, compare the non-meta-substituted form.  This way a
value from the mon that can't update at runtime but we have already set to
the same value will not be falsely flagged as ignored.

Signed-off-by: Sage Weil <sage@redhat.com>
librados: fix common_init_finish timing
Common_init_finish does start_service_thread and does
set_safe_to_start_threads() on the cct, which switches us to 'runtime'
mode where we can't accept many config options. Do that *after* we fetch
our config from the mon so that we can accept+set runtime options (and
not complain to stderr about it).

Signed-off-by: Sage Weil <sage@redhat.com>
common/config: intercept "keyfile", translate into "key"
The keyfile arg might be - (stdin), which we can only read once.  Ensure
that we consume it once by intercepting the CLI value early and inserting
the value into the 'key' option.

This robs future code of the knowledge that the key came from --keyfile
and not --key, but avoids the issue of multiple users (notably, KeyRing.cc
and the OSD mkfs code).

Remove the - special case from OSD at the same time, since it can no
longer be reached (unless something other than the CLI specified '-', but
neither ceph.conf nor the mon config make sense here).

Signed-off-by: Sage Weil <sage@redhat.com>
common/common_init: use unique admin_socket path for unprivileged dae…
…mons

These qualify as 'daemon', but their path is usually not unique.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/ConfigMonitor: fix dump when by_type is missing an item
Signed-off-by: Sage Weil <sage@redhat.com>
mon/MonClient: fix auth timeout vs error race
It's possible that we successfully set active_con *and* time out the
cond WaitUntil.  Only set the error if we don't have a connection; if we
set it *and* time out then let's call it success.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/ConfigMonitor: process subs from update_from_paxos
This is what OSDMonitor and MDSMonitor do.

Signed-off-by: Sage Weil <sage@redhat.com>
vstart.sh: -c to ceph cli
Signed-off-by: Sage Weil <sage@redhat.com>
common/config: normalize key name for get_val (external) variants
Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas merged commit d7692a2 into ceph:master Mar 6, 2018

3 of 5 checks passed

make check running make check
Details
make check (arm64) running make check
Details
Docs: build check OK - docs built
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details

@liewegas liewegas deleted the liewegas:wip-config branch Mar 6, 2018

@@ -906,6 +842,67 @@ fi
if [ $CEPH_NUM_MON -gt 0 ]; then
start_mon
echo Populating config ...
cat <<EOF | $CEPH_BIN/ceph -c ceph.conf config assimilate-conf -i -

This comment has been minimized.

@dillaman

dillaman Mar 9, 2018

Contributor

Note: this breaks clusters started w/ mstart.sh since you assume the cluster config is located at ceph.conf instead of $conf_fn, which in turn breaks qa/workunits/rbd/rbd_mirror.sh on a dev box.

OSD_SECRET=$($CEPH_BIN/ceph-authtool --gen-print-key)
echo "{\"cephx_secret\": \"$OSD_SECRET\"}" > dev/osd$osd/new.json

This comment has been minimized.

@dillaman

dillaman Mar 9, 2018

Contributor

... also broken here w/ the hard-coded dev instead of $CEPH_DEV_DIR

@leseb

This comment has been minimized.

Contributor

leseb commented on src/common/common_init.cc in 59ee2e8 Jul 30, 2018

Is there any reason for using $ccitd here? Doesn't $pid make the instance unique already?
This new line makes it hard for monitoring tools to predict the socket name, the same goes when restarting a container service and looking for a socket to validate a successful restart.
Thanks!

This comment has been minimized.

Member

liewegas replied Jul 30, 2018

You can have multiple librados instances in a the same process. Key example is qemu, which has one per volume, IIRC.

This comment has been minimized.

Contributor

leseb replied Jul 30, 2018

Yeah but what about rbd-mirror and rgw?

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Sep 10, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Sep 18, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Sep 26, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Sep 26, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Oct 1, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Oct 17, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Oct 26, 2018

mgr: update python interface for store vs. config
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit bd80cf9)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

As Centralized Config feature (PR ceph#20172) has not been backported to Luminous,
   `set_store` and `get_store` methods are just wrappers around
   `set_config` and `get_config`, respectively.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment