New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common: Revamp config option definitions #16211

Merged
merged 44 commits into from Jul 21, 2017

Conversation

Projects
None yet
5 participants
@jcsp
Contributor

jcsp commented Jul 7, 2017

No description provided.

@jcsp jcsp added the common label Jul 7, 2017

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 7, 2017

This compiles and works.

The daemon_default stuff isn't respected yet (md_config_t doesn't know at load time what type of executable it's running in).

Still want to slightly tweak the definition of Option to have a component field separate to tags

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 7, 2017

This looks good! Simple and relatively painless. We should probably fix the validators, squash this down, and get it merged asap before we grow conflicts with new/changed config options.

Also add 'config help' or 'config dump' or similar that includes all the config option metadata? And write a script that generates rst for docs...

@@ -203,10 +203,10 @@ int main(int argc, const char **argv)
// We need to specify some default values that may be overridden by the
// user, that are specific to the monitor. The options we are overriding
// are also used on the OSD (or in any other component that uses leveldb),
// so changing them directly in common/config_opts.h is not an option.
// so changing the global defaults is not an optino.

This comment has been minimized.

@dillaman

dillaman Jul 7, 2017

Contributor

Nit: typo

}
};
// TODO: reinstate corner case logic for these RBD settings

This comment has been minimized.

@dillaman

dillaman Jul 7, 2017

Contributor

Perhaps just convert those three RBD configuration options to the new-style and add a new Option::set_validator method that can take a lambda.

}
template<typename T>
Option& set_nondaemon_default(const T& v) {
return set_value(daemon_value, v);

This comment has been minimized.

@dillaman

dillaman Jul 7, 2017

Contributor

nondaemon_value

@jcsp jcsp added this to the luminous milestone Jul 10, 2017

@tchaikov tchaikov self-requested a review Jul 11, 2017

@jcsp jcsp changed the title from [WIP] Revamp config option definitions to Revamp config option definitions Jul 12, 2017

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 12, 2017

This should be good to go. Includes adding "config set" to use instead of injectargs, as a bonus.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 12, 2017

@liewegas this is all rebased now

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 13, 2017

Added a commit to fix the compile failures in tools/rbd

@@ -530,9 +551,20 @@ void CephContext::do_command(std::string command, cmdmap_t& cmdmap,
}
static bool is_daemon(uint32_t module_type)

This comment has been minimized.

@dillaman

dillaman Jul 13, 2017

Contributor

Nit: shouldn't this just come from CODE_ENVIRONMENT_DAEMON? There are lots of other daemons like rbd-mirror that should also probably get the daemon default overrides.

This comment has been minimized.

@tchaikov

tchaikov Jul 13, 2017

Contributor

yeah, i feel the same. probably, we should check g_code_env here.

}
Option &set_safe() {
safe = true;

This comment has been minimized.

@dillaman

dillaman Jul 13, 2017

Contributor

Nit: is this used? should it be included in is_safe below? Of course, I guess once the few config options that use the set_safe call are converted, this can all be removed.

This comment has been minimized.

@jcsp

jcsp Jul 13, 2017

Contributor

Yes, this was meant to respected in is_safe, but it wasn't -fixed

// These C members are consumed by code that was written before
// the new options.cc infrastructure: all newer code should
// be consume options via explicit get() rather than C members.
#define OPTION_OPT_INT(name) int64_t name;

This comment has been minimized.

@dillaman

dillaman Jul 13, 2017

Contributor

We are OK with removing the (fake) const-ness of the config options during this transition period?

@@ -840,7 +840,7 @@ namespace rgw {
/* max events to gc in one cycle */
uint32_t max_ev =
std::max(1, get_context()->_conf->rgw_nfs_max_gc);
std::max(int64_t(1), get_context()->_conf->rgw_nfs_max_gc);

This comment has been minimized.

@dillaman

dillaman Jul 13, 2017

Contributor

Nit: perhaps do the same set_min(1) for these few keys?

This comment has been minimized.

@jcsp

jcsp Jul 13, 2017

Contributor

Done.

@@ -530,9 +551,20 @@ void CephContext::do_command(std::string command, cmdmap_t& cmdmap,
}
static bool is_daemon(uint32_t module_type)

This comment has been minimized.

@tchaikov

tchaikov Jul 13, 2017

Contributor

yeah, i feel the same. probably, we should check g_code_env here.

static inline void apply(const char *val, float &x, std::string &err) {
x = strict_strtof(val, &err);
// Generic validation: min
if (!boost::get<boost::blank>(&(opt.min))) {

This comment has been minimized.

@tchaikov

tchaikov Jul 13, 2017

Contributor

can we have some simple tests exercising these validators?

This comment has been minimized.

@jcsp

jcsp Jul 13, 2017

Contributor

Sure, done.

#define OPTION_OPT_FLOAT(name) double name;
#define OPTION_OPT_BOOL(name) bool name;
#define OPTION_OPT_ADDR(name) entity_addr_t name;
#define OPTION_OPT_U32(name) uint64_t name;

This comment has been minimized.

@tchaikov

tchaikov Jul 13, 2017

Contributor

why are we keeping the U32 alias?

This comment has been minimized.

@jcsp

jcsp Jul 13, 2017

Contributor

All the options in legacy_config_opts are just silently mapping to new types, rather than modifying their old OPTION() definitions. This can go away when they're converted.

@tchaikov tchaikov added the feature label Jul 13, 2017

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 13, 2017

I'm going to do the g_code_env change in a followup PR because it seems like that's going to involve changing the constructor prototype of CephContext (a bit invasive)

@dillaman

This comment has been minimized.

Contributor

dillaman commented Jul 13, 2017

@jcsp wrt the daemon change, that's fine -- but can you restore the defaults for run_dir, admin_socket, and log_file back so that things like rbd-mirror will work properly by default? Those variables are already cleared in common_preinit for non-daemons.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 13, 2017

@dillaman will do: I'm finding in testing that I need to also work out how to handle cases where ceph-conf (which is of course a non-daemon) outputs what it thinks the options are for OSDs.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 14, 2017

retest this please

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 14, 2017

jenkins stuff barfing on erasure code unit tests

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 14, 2017

I've temporarily removed all the set_daemon_default stuff, http://tracker.ceph.com/issues/20627 is for adding it back.

@idryomov

Did you mean to leave convert.cc in?

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Jul 14, 2017

retest this please.

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 14, 2017

http://pulpito.ceph.com/sage-2017-07-14_16:49:10-rados-wip-options-jcsp-distro-basic-smithi/

most of teh failures seem to be because injectargs is returning errors now when it didn't before. for example,

2017-07-14T16:57:13.144 INFO:teuthology.orchestra.run.smithi060:Running: "sudo ceph --cluster ceph tell 'mon.*' injectargs -- --no-mon-health-to-clog"
2017-07-14T16:57:13.231 INFO:teuthology.orchestra.run.smithi060.stderr:2017-07-14 16:57:13.232357 7fca63337700 -1 WARNING: all dangerous and experimental features are enabled.
2017-07-14T16:57:13.247 INFO:teuthology.orchestra.run.smithi060.stderr:2017-07-14 16:57:13.248736 7fca63337700 -1 WARNING: all dangerous and experimental features are enabled.
2017-07-14T16:57:13.363 INFO:teuthology.orchestra.run.smithi060.stderr:injectargs:mon_health_to_clog = 'false'
2017-07-14T16:57:13.374 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/contextutil.py", line 32, in nested
    yield vars
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-options-jcsp/qa/tasks/ceph.py", line 1627, in task
    yield
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 86, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 65, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-options-jcsp/qa/tasks/divergent_priors.py", line 158, in task
    assert exit_status is 0
AssertionError

and

2017-07-14T16:54:56.949 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:48: expect_false:  ceph tell osd.0 injectargs --osd-scrub-auto-repair-num-errors -1
2017-07-14T16:54:56.952 INFO:tasks.workunit.client.0.smithi158.stderr:Error ENOSYS: You cannot change osd_scrub_auto_repair_num_errors using injectargs.
2017-07-14T16:54:56.956 INFO:tasks.workunit.client.0.smithi158.stderr:
2017-07-14T16:54:56.959 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:48: expect_false:  return 0
2017-07-14T16:54:56.962 INFO:tasks.workunit.client.0.smithi158.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:121: check_response:  exit 1

@liewegas liewegas changed the title from Revamp config option definitions to common: Revamp config option definitions Jul 18, 2017

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 20, 2017

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 20, 2017

In the rerun we had one failure on the objectstore tool tests, which the latest commit fixes, and one timed out because an OSD was OOM killed, presumably unrelated to this.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 20, 2017

retest this please

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 20, 2017

was make check passing?

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 20, 2017

make check had a failure that the daemon_config.cc change was there to fix, then on the last run it had some unhappy EC plugin tests, hence the rerun.

jcsp added some commits Jul 12, 2017

common: pass up error strings from set_val
This is so that we can use it as a replacment
for the ugly injectargs.

Signed-off-by: John Spray <john.spray@redhat.com>
osd: implement `config set`
This is a friendlier replacement for injectargs.

Signed-off-by: John Spray <john.spray@redhat.com>
mds: implement `config set`
Signed-off-by: John Spray <john.spray@redhat.com>
mon: implement `config set`
Signed-off-by: John Spray <john.spray@redhat.com>
common: update options.cc for latest additions
Signed-off-by: John Spray <john.spray@redhat.com>
include/uuid: give uuid_d a < comparator
This is only there so that a variant<> containing
a uuid (amongst other types) can have operator< acting on it.

Signed-off-by: John Spray <john.spray@redhat.com>
common: enforce min/max/enum on config options
Signed-off-by: John Spray <john.spray@redhat.com>
common/options: separate "service" from "tags"
Signed-off-by: John Spray <john.spray@redhat.com>
tools: avoid max() calls on rbd config options
These were awkward for typing of the '1' literal vs.
the int64_t settings.  The whole max() thing is also
unnecessary now, if we set proper bounds on the option
definitions.

Signed-off-by: John Spray <john.spray@redhat.com>
common: revert public_addr setting to a string
This was a string in the old schema, and tests
depended on that -- if we want to change its type
let's do that separately to the infrastructure changes.

Signed-off-by: John Spray <john.spray@redhat.com>
test: update md_config_t unit test
Signed-off-by: John Spray <john.spray@redhat.com>
common: move validation in Option and add a test
Signed-off-by: John Spray <john.spray@redhat.com>
common: update options.cc for master
Signed-off-by: John Spray <john.spray@redhat.com>
common: fix Option::is_safe
Signed-off-by: John Spray <john.spray@redhat.com>
rgw: set mins on options to avoid unneeded max()
Signed-off-by: John Spray <john.spray@redhat.com>
common: run validator on all defaults
RBD relies on this behaviour to get the int-ized
form for rbd_default_features.

Signed-off-by: John Spray <john.spray@redhat.com>
common: remove usage of set_daemon_default for now
The code in common_preinit is still there to override
these settings as appropriate.

The set_daemon_default stuff was breaking ceph-conf tests (because
you would get the client-side defaults when asking about an OSD's
settings), and md_config_t isn't properly identifying daemons
using code_env yet.

Ticket to add it back in:
http://tracker.ceph.com/issues/20627

Signed-off-by: John Spray <john.spray@redhat.com>
common: update options.cc for latest master
Signed-off-by: John Spray <john.spray@redhat.com>
test: update daemon_config.cc for conf changes
This was only partially updated in previous commits
for --num-clients, --num-open-files.

Also update int validation test to reflect that values
are now 64 bit internally.

Signed-off-by: John Spray <john.spray@redhat.com>
common/options: fix overflowing 64 bit literals
This manifested as a failure in objectstore tool test_fuse.sh

Signed-off-by: John Spray <john.spray@redhat.com>
test: use config set_val_or_die instead of set_val
...in places that the return code was not being checked.

This fixes cases where an error in the config schema or the
value being passed in would cause weird failures beacuse the
set_val had not taken effect.

Signed-off-by: John Spray <john.spray@redhat.com>
common: fix erasure_code_dir definition
This was missing its `safe` flag, causing some attempts
to set it during testing to fail.

Signed-off-by: John Spray <john.spray@redhat.com>
common/options: update for latest added
Signed-off-by: John Spray <john.spray@redhat.com>
@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Jul 21, 2017

retest this please

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 21, 2017

I was watching the output of the previous try (https://jenkins.ceph.com/job/ceph-pull-requests/29141/console), it was all passing with one test still running.

@tchaikov was it stuck? How could you tell?

@jcsp

This comment has been minimized.

Contributor

jcsp commented Jul 21, 2017

Green tick! Woohoo!

@jcsp jcsp merged commit 223c8ce into ceph:master Jul 21, 2017

3 of 4 checks passed

make check (arm64) running make check
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details

@jcsp jcsp deleted the jcsp:wip-options-jcsp branch Jul 21, 2017

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Jul 21, 2017

@jcsp i just found it stuck for a long time, but didn't check it closer. anyway, YEAH!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment