New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon,mgr: tag some commands for ceph-mgr #13617

Merged
merged 74 commits into from Mar 30, 2017

Conversation

Projects
None yet
7 participants
@liewegas
Member

liewegas commented Feb 23, 2017

No description provided.

@jcsp

This comment has been minimized.

Contributor

jcsp commented Feb 23, 2017

Is the idea that MonCommands.h would define an overall list of commands, and the CLI piece would work out where to send things by checking the flag? It would rely on having an equally new ceph-mon to call any new commands added in ceph-mgr.

I guess in practice the two services are unlikely to be different versions, but it feels a bit weird to have the mon defining what commands the mgr can handle -- I had been thinking about having the CLI talk to both the mgr and the mon, to try matching commands against one before falling back to the other, though obviously that's not hugely slick either. Hmm.

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Feb 23, 2017

The monitor and manager are at present pretty likely to match versions. ;)

That said, I'd prefer having one endpoint queried by the client and having it relay what the other tells it. That way partial upgrades don't mess up the available commands.

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 23, 2017

@dmick

This comment has been minimized.

Member

dmick commented Feb 23, 2017

How is startup handled? Is there an implied "the mgr needs a mon connection before it starts servicing requests" today? (wondering about when the mon can be expected to know the mgr commands, and/or whether it asks the mgr on-demand, and when it retries)

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 24, 2017

The mgr has to be actively heartbeating with the mon to stay the current active mgr. Probably the simplest thing would be for it to send it's commands when it starts up; the mon could persist that alongside the MgrMap and include it in the get_command_descriptions results.

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 24, 2017

What I like about 2 above is that the client gets all the commands from one place, all at once, from someone it is already connect to (the mon), which is fast and efficient and we don't have to consider the case where some command-processing things are available and some are not. It also means that the client behavior is independent of how we decide to do the mgr command registration/negotiation between the mon and mgr. We can start with the trivial thing and hard-code them to get this up and running and move to the mgr registering it's supported commands whenever we're ready (hopefully soon!).

@jecluis

This comment has been minimized.

Member

jecluis commented Feb 24, 2017

I also like the idea of having the monitor compiling and sharing that info with the clients, even if the clients then talk to the mgr directly. It will certainly help with those clients that do not know what a mgr is, as they'll get the command descriptions from the place they usually get their command descriptions (i.e., the monitor).

Also, if there happens to be any functionality (now or in the future) that the mon provides in case a mgr is not available (unless we make the mgr a hard requirement of clusters), then this makes even more sense, because then the monitor can simply state "I handle this command X", but also allowing us to override those commands when a mgr is available and say "mgr.foo handles command X".

The annoying part is that we'd have to propose the commands across the quorum - or, at least, have the leader compiling the supported commands from mons and mgrs and making sure the quorum offers those commands. The logic should be similar to what we do for when the monitors diverge in command sets, except that we would not need to forward all commands to the leader for handling (as long as the individual monitors support the offered mon commands). We could probably even piggy-back on that logic.

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 24, 2017

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Feb 25, 2017

Again, just skimming this. But I think we'll be a lot happier in the long run if we take this opportunity to set up good interfaces and keep stuff as modular as we can. There's no need for the pg stat commands to be embedded in DaemonServer and we're going to build up a fair bit of infrastructure around pg stats that should probably be separated from the basic message dispatching logic.

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 25, 2017

@gregsfortytwo yeah... and it turns out it's all just a function of const PGMap and const OSDMap. Moved it into a helper in PGMap.cc and it's much better! Only the scrub stuff gets copied since it uses a different interface to send.

Included as far as I got with the MgrClient. Right now it fails to authenticate because there's no authorizer.

@liewegas

This comment has been minimized.

Member

liewegas commented Feb 26, 2017

@gregsfortytwo fix the auth thing and got command proxying working. easy way to test is to comment out these lines in ceph_argparse.py:

    if flags & 8:
        d['target'] = ('mgr','')

@liewegas liewegas changed the title from WIP: mon: tag some commands for ceph-mgr to mon,mgr: tag some commands for ceph-mgr Feb 26, 2017

@@ -982,6 +982,8 @@ def validate(args, signature, partial=False):
print(save_exception[0], 'not valid: ', save_exception[1], file=sys.stderr)
raise ArgumentError("unused arguments: " + str(myargs))
if flags & 8:

This comment has been minimized.

@dmick

dmick Feb 27, 2017

Member

needs to be symbolic

@@ -7645,6 +7645,18 @@ bool OSD::require_mon_peer(const Message *m)
return true;
}
bool OSD::require_mon_or_mgr_peer(const Message *m)

This comment has been minimized.

@tchaikov

tchaikov Feb 28, 2017

Contributor

this method could be a const.

@@ -66,7 +66,6 @@ class MgrClient : public Dispatcher
void wait_on_list(list<Cond*>& ls);
void signal_cond_list(list<Cond*>& ls);

This comment has been minimized.

@tchaikov

tchaikov Feb 28, 2017

Contributor

so we can remove these two methods?

@@ -710,6 +710,12 @@ int main(int argc, const char **argv)
prefork.exit(1);
}
Messenger *mgr_msgr = Messenger::create(g_ceph_context, public_msgr_type,

This comment has been minimized.

@tchaikov

tchaikov Feb 28, 2017

Contributor

can we put mgr_msgr in a scoped_ptr or a unique_ptr, so we don't leak it.

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

👍 to Kefu's comment.
And this works for getting on with, but I'm not sure we want to keep a separate messenger long term. Although it does simplify the Manager's behavior to be able to treat the monitor client session as a normal client, so maybe we will.

This comment has been minimized.

@liewegas

liewegas Mar 7, 2017

Member

we have to have a separate messenger because it's a different policy. one is a lossy client and one a lossy server. note that the only "overhead" from that is a separate dispatch thread. i didn't use a unique_ptr because none of the other msgrs do and wanted to keep it symmetric

@@ -23,6 +23,9 @@
import uuid
public static final FLAG_MGR = 8; # command is intended for mgr

This comment has been minimized.

@tchaikov

tchaikov Mar 3, 2017

Contributor

no, i must be reading java !

This comment has been minimized.

@liewegas

liewegas Mar 3, 2017

Member

ha! i googled 'python integer constant' and cut and pasted the stackoverflow questin instead of the answer :)

@gregsfortytwo

Few issues I noticed.

}
// fall back to registered python handlers
if (r == -EOPNOTSUPP) {

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

A little concerned about trying to optimize this bit, although it at least gets them all in line.

// Leaving fsid argument null because it isn't used.
MCommand *m = op.get_message({});
assert(session);
assert(session->con);

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

These asserts can go away now it's in the if-block.

This comment has been minimized.

@liewegas

liewegas Mar 7, 2017

Member

fixing

if (!session) {
lderr(cct) << "no session, waiting" << dendl;
wait_on_list(waiting_for_session);
}

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

I don't get it. Why do we want to silently throw out commands instead of waiting for a session? Was this necessary for making the mon<->mgr auth work properly or something?

This comment has been minimized.

@liewegas

liewegas Mar 7, 2017

Member

we don't throw them out; we queue them and they get sent later when teh connection happens. the problem si that start_command() used to block, which made it uncallable from the handle_message context.

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

What queues them up for later sending? In this function we're calling send_message() if there's a session->con and otherwise just returning. I don't see anything in Monitor::handle_command() either.

This comment has been minimized.

@tchaikov

tchaikov Mar 14, 2017

Contributor

CommandTable queues the commands up, and a command is removed from queue only when it is acked by handle_command_reply().

// mgr
ret = key_server.build_session_auth_info(service_id, auth_ticket_info, info);
if (ret < 0) {
derr << __func__ << " failed to build service session_auth_info" << ret

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

Should probably say "mgr service" here and "mon session" above to disambiguate more strongly.

This comment has been minimized.

@liewegas

liewegas Mar 7, 2017

Member

good call, cleaningup all these msgs

@@ -710,6 +710,12 @@ int main(int argc, const char **argv)
prefork.exit(1);
}
Messenger *mgr_msgr = Messenger::create(g_ceph_context, public_msgr_type,

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

👍 to Kefu's comment.
And this works for getting on with, but I'm not sure we want to keep a separate messenger long term. Although it does simplify the Manager's behavior to be able to treat the monitor client session as a normal client, so maybe we will.

if (mgr_proxy_bytes + size > max) {
dout(10) << __func__ << " current mgr proxy bytes " << mgr_proxy_bytes
<< " + " << size << " > max " << max << dendl;
reply_command(op, -EAGAIN, "hit limit on proxied mgr commands", rdata, 0);

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 7, 2017

Member

We may want to also test against the mgr client throttle settings, in case of dramatic variance.

Also I'm a bit concerned about the different client behavior they can see for no apparent reason. But we probably can't do much about that. :/

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 7, 2017

@@ -23,6 +23,9 @@
import uuid
FLAG_MGR = 8 # command is intended for mgr

This comment has been minimized.

@dmick

dmick Mar 7, 2017

Member

already some flags in ceph.in; we should find a better place if they need sharing

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 10, 2017

Now passes make check!

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 10, 2017

@gregsfortytwo @tchaikov Please re-review. Eager to get this in since it has a lot of pieces the other mgr work needs! Will run through rados suite this afternoon.

@dmick

This comment has been minimized.

Member

dmick commented Mar 10, 2017

I broke the submodule check for this one

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Mar 11, 2017

@liewegas, did you amend any of the existing commits or just add new ones?

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 13, 2017

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Mar 13, 2017

i will finish the review tomorrow morning.

@gregsfortytwo

This looks pretty good. More visibly noting a few things you already flagged.

Mostly, I'm still concerned about the way we see to be defaulting everything into DaemonServer (I'm doing it too). It does have a pretty random assortment to start with but let's pull pieces into more appropriate places as they grow, instead of co tinting to mash it.

MgrSession() : RefCountedObject(0) {}
~MgrSession() override {}
};

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 14, 2017

Member

This struct looks fine but I don't think it belongs in DaemonServer. (Lets try to avoid the few-huge-files-and-classes problem). Probably Mgr?

This comment has been minimized.

@liewegas

liewegas Mar 14, 2017

Member

moving to MgrSession.h

param_str_map, mgr_cmd)) {
dout(1) << __func__ << " access denied" << dendl;
#if 0
// FIXME: audit_clog?

This comment has been minimized.

@gregsfortytwo

This comment has been minimized.

@liewegas

liewegas Mar 14, 2017

Member

added clsuter logging infrastructure

profile_grants.push_back(MonCapGrant("config-key exists", "key",
StringConstraint("", prefix)));
profile_grants.push_back(MonCapGrant("config-key delete", "key",
StringConstraint("", prefix)));

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 14, 2017

Member

We should probably restrict manager put/delete manager to a prefix space?

This comment has been minimized.

@liewegas

liewegas Mar 14, 2017

Member

that's what the second prefix arg to StringConstraint does

}
}
ENCODE_FINISH(report->packed);
ldout(cct, 20) << by_path.size() << " counters, of which "
<< report->declare_types.size() << " new" << dendl;

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Mar 14, 2017

Member

@jcsp, does the rest of the manager care if perfcpunters stop showing up?

profile_grants.push_back(MonCapGrant("mon", MON_CAP_R)); // read monmap
profile_grants.push_back(MonCapGrant("osd", MON_CAP_R)); // read osdmap
profile_grants.push_back(MonCapGrant("mon getmap"));
profile_grants.push_back(MonCapGrant("auth get-or-create")); // FIXME: this can expose other mgr keys

This comment has been minimized.

@gregsfortytwo

This comment has been minimized.

@liewegas

liewegas Mar 14, 2017

Member

this matches every other bootstrap key; it's a lot of work to fix.

// TODO: invent some caps suitable for ceph-mgr
if (is_valid) {
if (caps_info.allow_all)
if (caps_info.allow_all) {

This comment has been minimized.

@tchaikov

tchaikov Mar 14, 2017

Contributor

i don't quite understand this.

This comment has been minimized.

@liewegas

liewegas Mar 14, 2017

Member

typo :)

if (!session) {
lderr(cct) << "no session, waiting" << dendl;
wait_on_list(waiting_for_session);
}

This comment has been minimized.

@tchaikov

tchaikov Mar 14, 2017

Contributor

CommandTable queues the commands up, and a command is removed from queue only when it is acked by handle_command_reply().

@@ -51,6 +49,10 @@ void MgrClient::shutdown()
{
Mutex::Locker l(lock);
// forget about in-flight commands if we are prematurely shut down
// (e.g., by control-C)
command_table.clear();

This comment has been minimized.

@tchaikov

tchaikov Mar 16, 2017

Contributor

maybe we can just remove the assert in CommandTable 's dtor.

liewegas added some commits Mar 13, 2017

vstart.sh: fix mgr caps
Signed-off-by: Sage Weil <sage@redhat.com>
systemd/ceph-mgr@.service: fix mgr mon cap
Signed-off-by: Sage Weil <sage@redhat.com>
mon/AuthMonitor: fix mgr mon caps to 'allow profile mgr'
Signed-off-by: Sage Weil <sage@redhat.com>
mgr: move MgrSession to its own header
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/Session: keep inst, not addr
For parity with log messages from other daemons etc.

Signed-off-by: Sage Weil <sage@redhat.com>
mgr/MgrClient: do not assert on control-c
Signed-off-by: Sage Weil <sage@redhat.com>
mgr: do not crash if FSMap doesn't have 'addr' metadata
Signed-off-by: Sage Weil <sage@redhat.com>
qa/suites/upgrade/jewel-x: don't initially start mgr daemons
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/MgrClient: cope with disappearing perf_counters
Perfcounters can go away; deal with it.  This collapses
into a single loop, but it means that the mgr might
stop getting certain counters without explicit
notification.

Signed-off-by: Sage Weil <sage@redhat.com>
mgr/DaemonServer: log commands to cluster + audit logs
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/DaemonServer: whitespace
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/DaemonServer: debug on shutdown
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/MgrStandby: add LogClient
Signed-off-by: Sage Weil <sage@redhat.com>
qa/tasks/dump_stuck.py: stop making assertions about 'health' report
Health comes from teh mon, while the pg stats come from teh mgr, so they
may be out of sync.

Signed-off-by: Sage Weil <sage@redhat.com>
librados: return error from mgr start_command
Signed-off-by: Sage Weil <sage@redhat.com>
mgr/MgrClient: assume missing MgrMap means no acces to mgr at all
If we get as far as authenticating and have no MgrMap that implies the
mon didn't provide us one (despite our request) and we have no access to
the mgr at all.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MgrMonitor: do not tick() if not active or leader
Signed-off-by: Sage Weil <sage@redhat.com>
mon/PGMap: set initial pg timestamp as osdmap modified stamp
Currently the PGMap may register the new pg in multiple places: the
mgr or mon, and the timestamp is when the mon or mgr gets around to
recording it into its PGMap.  Make this deterministic by using the OSDMap
mtime (which is when the PG was *actually* created).

This fixes a problem where the pg create command has one timestamp (from
the mon) and the pgmap on the mgr has another, so an observer will see the
last_scrub_stamp (and other timestamps) change when nothing has actually
happened.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas added some commits Mar 30, 2017

mon/MgrMonitor: health warn/err if no active mgr
Start warning once mons are luminous; start erroring once
require_luminous is set in osdmap.  Allow a grace period for
mgr to restart or standby to take over before we turn a warning
into an error.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MgrMonitor: print MgrMap to log on each change
The other services do this.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MgrMap: implement dump, add to ceph-dencoder
Signed-off-by: Sage Weil <sage@redhat.com>
mon/MgrMonitor: show delta
This is mostly because I couldn't debug a weird state where
it kept updating but it appeared the addr was not changing
(always '-').

Signed-off-by: Sage Weil <sage@redhat.com>
mon/ConfigKeyService: wait for quorum
Among other things, this prevents a mgr activation stall if it
happens to request config-keys from a mon that is out of
quorum.

Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas

This comment has been minimized.

Member

liewegas commented Mar 30, 2017

http://pulpito.ceph.com/sage-2017-03-30_18:06:27-rados-wip-mgr-commands---basic-smithi/

failures are:

  • (2x) bluestore bug http://tracker.ceph.com/issues/19379
  • (2x) broken mon rebuild test (kefu is working on a fix)
  • upgrade test (i got the warning condition a bit wrong; fixed in final branch)

@liewegas liewegas merged commit 578b0f7 into ceph:master Mar 30, 2017

2 of 3 checks passed

default Build started sha1 is merged.
Details
Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details

@liewegas liewegas deleted the liewegas:wip-mgr-commands branch Mar 30, 2017

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Mar 31, 2017

@liewegas the broken test of mon rebuild test is tracked by http://tracker.ceph.com/issues/19439

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment