mgr/iostat: implement 'ceph iostat' as a mgr plugin #20100

Merged
merged 8 commits into from Apr 15, 2018

mogeb
Contributor

@mogeb mogeb commented Jan 24, 2018

Requires pg_dump to include the PGMap::Incremental object, which holds
the IO activity recently completed by the OSDs.

Signed-off-by: Mohamad Gebai <mgebai@suse.com>

@mogeb
Contributor Author

mogeb commented Jan 24, 2018

IO statistics given by ceph -s:

$> while true; do ./bin/ceph -s 2>&1 | grep client; sleep 1; done
    client:   21488 kB/s wr, 0 op/s rd, 10 op/s wr
    client:   21488 kB/s wr, 0 op/s rd, 10 op/s wr
    client:   21489 kB/s wr, 0 op/s rd, 10 op/s wr
    client:   74705 kB/s wr, 0 op/s rd, 36 op/s wr
    client:   74709 kB/s wr, 0 op/s rd, 36 op/s wr
    client:   74712 kB/s wr, 0 op/s rd, 36 op/s wr
    client:   74712 kB/s wr, 0 op/s rd, 36 op/s wr
    client:   120 MB/s wr, 0 op/s rd, 60 op/s wr
    client:   120 MB/s wr, 0 op/s rd, 60 op/s wr
    client:   102005 kB/s wr, 0 op/s rd, 49 op/s wr
    client:   148 MB/s wr, 0 op/s rd, 74 op/s wr

ceph iostat running simultaneously:

$> ./bin/ceph iostat
wr: 21488 kB/s, rd: 0 kB/s, iops: 10
wr: 21489 kB/s, rd: 0 kB/s, iops: 10
wr: 21489 kB/s, rd: 0 kB/s, iops: 10
wr: 74705 kB/s, rd: 0 kB/s, iops: 36
wr: 74709 kB/s, rd: 0 kB/s, iops: 36
wr: 74709 kB/s, rd: 0 kB/s, iops: 36
wr: 74712 kB/s, rd: 0 kB/s, iops: 36
wr: 123501 kB/s, rd: 0 kB/s, iops: 60
wr: 123501 kB/s, rd: 0 kB/s, iops: 60
wr: 102005 kB/s, rd: 0 kB/s, iops: 49
wr: 152495 kB/s, rd: 0 kB/s, iops: 74
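
The two listings agree once units are aligned: the rows where ceph -s switches to MB/s line up with the 123501 kB/s and 152495 kB/s rows from ceph iostat. A quick check, assuming 1 MB = 1024 kB and truncation to a whole number (an assumption that fits these two samples, not a statement about how Ceph formats values):

```python
def kb_to_mb_truncated(kb_per_s):
    """Convert kB/s to whole MB/s, assuming 1 MB = 1024 kB and truncation."""
    return kb_per_s // 1024

# The two iostat samples whose ceph -s counterparts were printed in MB/s.
iostat_samples_kb = [123501, 152495]
converted = [kb_to_mb_truncated(v) for v in iostat_samples_kb]
print(converted)
```

The helper name kb_to_mb_truncated is made up for this illustration.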

src/ceph.in Outdated
if len(childargs) > 0 and childargs[0] == 'iostat':
    def call_iostat():
        while 1:
            subprocess.call([sys.argv[0], 'mgr', 'iostat'])
Contributor Author

@mogeb mogeb Jan 24, 2018

As of now, it doesn't seem like there's a way to keep an open connection between a mgr plugin and the client. The alternative, which is implemented here, is having the client invoke the iostat plugin periodically.

Contributor

@badone added a blocking mode to the scrub command in #19793.
Maybe we could do this the same way.

Contributor Author

@liuchang0812 Thanks for the comments. The issue is that (to my knowledge) there's no way for the plugin to continuously communicate with the client. Adding a blocking mode for the client wouldn't solve that.

src/mon/PGMap.cc Outdated
@@ -1367,6 +1367,8 @@ void PGMap::dump(Formatter *f) const
dump_pg_stats(f, false);
dump_pool_stats(f);
dump_osd_stats(f);
pg_sum_delta.dump(f);
Contributor

We already have the PGMapDigest::overall_client_io_rate_summary function, so we could implement the iostat command by adding a python_get('client_io') interface, like 6272c26.

Contributor Author

@mogeb mogeb Jan 24, 2018

Yes, pg_sum_delta (dumped here) is indeed what PGMapDigest::overall_client_io_rate_summary uses. I added 'io_rate' as a way to fetch it from a mgr plugin.

@mogeb
Contributor Author

mogeb commented Jan 25, 2018

@jcsp
Contributor

jcsp commented Feb 12, 2018

I see three reasonable-ish options here:

  1. (What this PR currently does) Add special cases to the CLI client, like we have for "ceph -w" and "ceph daemonperf"
  2. Extend the blocking command mode to enable sending interim output. Probably quite awkward to pass all the way up the python stack with callbacks etc for the interim output.
  3. Extend the command description format to identify specific commands as "polling" so that the CLI knows to block and call the command repeatedly until the user ctrl-c's it.

I think my instinct is for option 3, anyone else?

@liuchang0812
Contributor

> I think my instinct is for option 3, anyone else?

+1

@mogeb
Contributor Author

mogeb commented Feb 13, 2018

@jcsp In option 3, there won't really be any polling though, correct? There will still be repeated calls from the client, just like in this PR. In any case, I agree with you for option 3 - I'll update this PR accordingly.

@jcsp
Contributor

jcsp commented Feb 13, 2018

@mogeb right, the basic behaviour (sending an MCommand repeatedly) would be the same, it would just be a more generic mechanism so that the CLI doesn't have to be hardcoded with the "magic" commands.

I guess you'll probably want a convention for passing a "first=true" or similar argument, so that the server can generate the table header on the first call, then just output table lines later. Perhaps also have the CLI send the terminal width as a parameter of the command too.
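
A minimal sketch of the generic mechanism discussed here: the client re-sends the same command whenever the command description marks it as polling, and tells the server when a call is the first one. The "poll" key, the `first` argument, and every function name below are assumptions for illustration, not the merged implementation; `max_calls` exists only so the sketch terminates, whereas a real CLI would loop until the user hits ctrl-c.

```python
import time

def run_polling(send, description, period=1.0, max_calls=3, sleep=time.sleep):
    """Call send() once, or repeatedly if the description marks it as polling."""
    # Hypothetical convention: the description carries poll="true".
    polling = str(description.get('poll', 'false')).lower() == 'true'
    lines = []
    call = 0
    while True:
        # first=True lets the server emit a table header on the first call only.
        lines.append(send(first=(call == 0)))
        call += 1
        if not polling or call >= max_calls:
            break
        sleep(period)
    return lines

def fake_iostat(first):
    """Stand-in for the server side: header on the first call, rows after."""
    row = 'wr: 0 kB/s, rd: 0 kB/s, iops: 0'
    return ('--- io stats ---\n' + row) if first else row

lines = run_polling(fake_iostat, {'poll': 'true'}, sleep=lambda s: None)
single = run_polling(fake_iostat, {}, sleep=lambda s: None)
```

With this shape the client needs no knowledge of which commands are "magic": a non-polling description produces exactly one call.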

@mogeb
Contributor Author

mogeb commented Feb 13, 2018

@jcsp I was hoping to keep the formatting in the client. Maybe the plugin can expose a get header command, which the client calls when needed (on the first call, and when the header is out of sight). But, as you said, the terminal width would still need to be sent.

What do you think about formatting the returned values in JSON, and having the client unpack it and print it however it wants? This adds more flexibility too, so one can easily fetch the current throughput/iops. Or is this extra conversion something we want to avoid?

@jcsp
Contributor

jcsp commented Feb 13, 2018

The trouble is that the client doesn't know anything about the data it's printing, so it generally can't do anything but pass through strings (and if we're just passing through strings, why not just send the whole line?).

For example, some plugins might want to selectively drop columns (like we do with "daemonperf") depending on the width of the terminal, and the client doesn't know the rules for that. Colour coding is also simpler if the plugin can just output its colourized lines rather than having to invent a scheme for the plugin to hint to the client about colours.

@mogeb
Contributor Author

mogeb commented Feb 13, 2018

@jcsp Ah, I understand - you want to get completely rid of all command-specific code on the client. That part went over my head. Ok, I'll update this PR then.

@mogeb
Contributor Author

mogeb commented Feb 26, 2018

@jcsp Ok, here are a few points regarding option 3:

  1. daemonperf calls perf dump on the admin socket, and the results are in JSON, so there isn't really much consolidation we can do

  2. The log watch (-w) is only invoked once, but it keeps an open connection with the monitor (which is what I was looking to do initially with iostat, but we can't keep an open connection with a plugin). Its behavior is different from iostat, no consolidation possible here either

  3. At the moment, there are no other ceph commands that have the same behavior as iostat, but I think we can expect more mgr plugin commands to have this behavior, which leads to the final point:

  4. We currently don't have any command description format we can extend to add the polling flag. All subcommands are hardcoded in the monstrous ceph.in file. We could add a command description format but, IMO, it would be better to just refactor that file completely at this point, maybe using python's ArgumentParser's sub_parsers. I suggest we start by getting iostat in, then we can refactor ceph.in independently (or vice versa).

If that makes sense, I'll go ahead and update the output of iostat (table header, etc.) as we discussed.

@jcsp
Contributor

jcsp commented Feb 26, 2018

> We currently don't have any command description format we can extend to add the polling flag. All subcommands are hardcoded in the monstrous ceph.in file. We could add a command description format but, IMO, it would be better to just refactor that file completely at this point, maybe using python's ArgumentParser's sub_parsers. I suggest we start by getting iostat in, then we can refactor ceph.in independently (or vice versa).

Fortunately, the commands are definitely not hardcoded in ceph.in (although a few special ones are).

The command descriptions are present in the COMMANDS member of a python module, or in mon/MonCommands.h for the C++ ones. These are passed to clients (such as the CLI) by the special "get_command_descriptions" command, and interpreted by parse_json_funcsigs and similar code in src/pybind/ceph_argparse.py

@mogeb mogeb force-pushed the iostat-plugin branch 6 times, most recently from 7ee547f to 2fa6386 Compare March 1, 2018 17:53
@mogeb
Contributor Author

mogeb commented Mar 1, 2018

@jcsp I've updated (and rebased) the PR. I've added a poll flag that can be defined either in a mgr module or in MonCommands.h. What are your thoughts on this version?

src/ceph.in Outdated
print(outs, file=sys.stderr)
sleep(1)
else:
ret, outbuf, outs = json_command(cluster_handle, target=target, argdict=valid_dict,
Member

This could be something like

while True:
    # run the command
    if 'poll' not in valid_dict or not valid_dict['poll']:
        break

no?

Contributor Author

Sure, that's clearer. Added.

@@ -24,7 +24,7 @@


FLAG_MGR = 8 # command is intended for mgr

FLAG_POLL = 16
Member

mind adding a brief comment telling us where the magic number comes from (i.e., mon/MonCommands.h's command flag)? :)

Contributor Author

Right :) Done.

src/ceph.in Outdated
print(outbuf)
if outs:
print(outs, file=sys.stderr)
sleep(1)
Member

It would be nice if the polling interval were configurable, defaulting to some value (be it 1 or something else). Maybe an option passed to the CLI tool?

Contributor Author

Right, I was thinking of having it in the command description itself, but I think a flag to the cli tool is just as valid and definitely easier. Done.
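
A minimal sketch of a client-side interval option, assuming an argparse-based front end. The option name `--period`, its `-p` short form, and the helper name are assumptions for illustration; the thread does not state what the flag is called.

```python
import argparse

def parse_poll_args(argv):
    """Split a polling period (seconds) from the remaining command words."""
    parser = argparse.ArgumentParser(add_help=False)
    # Hypothetical option name; defaults to one second between calls.
    parser.add_argument('--period', '-p', type=float, default=1.0,
                        help='seconds between polls (polling commands only)')
    # parse_known_args leaves the command words untouched for later parsing.
    parsed, childargs = parser.parse_known_args(argv)
    return parsed.period, childargs

period, rest = parse_poll_args(['-p', '2.5', 'iostat'])
default_period, default_rest = parse_poll_args(['iostat'])
```

The returned period would replace the hardcoded sleep(1) in the polling loop.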

@mogeb mogeb force-pushed the iostat-plugin branch 2 times, most recently from 1110767 to e1c2976 Compare March 1, 2018 20:02
@jcsp jcsp requested a review from tchaikov March 5, 2018 15:22
@jcsp
Contributor

jcsp commented Mar 5, 2018

I like this. Since we're extending the command definition language, let's take the opportunity to make it futureproof and add a way for polling commands to supply a title and have some knowledge of the terminal size to do neat formatting (I'm thinking of commands that have a more table-like format).

Here's what I think we should do for futureproofing:

  • the CLI client always passes a "title" boolean the first time it calls the command (this could later be extended to do periodic title output depending on terminal height)
  • pass the terminal width as a command argument (see Termsize in ceph_daemon.py for how to get it).
  • pass a boolean for whether color should be used (see Daemonwatcher.supports_color)

We can do that in a followup PR as long as we land it before Mimic, or do it in this PR: what do you think @mogeb ?
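
A rough sketch of what the client could attach to each polling call under this proposal. The argument names `title`, `width`, and `color` and the helper itself are hypothetical; it uses the standard library's `shutil.get_terminal_size` rather than the Termsize and supports_color helpers mentioned above, and the tty check is only a crude stand-in for real colour detection.

```python
import shutil
import sys

def polling_call_args(call_index, stream=sys.stdout):
    """Build the extra arguments a CLI might attach to each polling call."""
    size = shutil.get_terminal_size(fallback=(80, 24))
    isatty = getattr(stream, 'isatty', lambda: False)
    return {
        'title': call_index == 0,   # header only on the first call
        'width': size.columns,      # lets the server drop or pad columns
        'color': bool(isatty()),    # crude stand-in for a colour check
    }

first_call = polling_call_args(0)
later_call = polling_call_args(1)
```

Keeping these as ordinary command arguments means the server-side plugin stays in charge of formatting, which is the point made earlier in the thread.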

Contributor

@tchaikov tchaikov left a comment

2018-04-11T03:23:37.356 INFO:tasks.workunit.client.0.mira101.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1873: test_mon_pg:  ceph pg 1.0 query
2018-04-11T03:23:37.665 INFO:tasks.workunit.client.0.mira101.stderr:no valid command found; 10 closest matches:
2018-04-11T03:23:37.665 INFO:tasks.workunit.client.0.mira101.stderr:pg scrub <pgid>
2018-04-11T03:23:37.665 INFO:tasks.workunit.client.0.mira101.stderr:pg debug unfound_objects_exist|degraded_pgs_exist
2018-04-11T03:23:37.665 INFO:tasks.workunit.client.0.mira101.stderr:pg ls {<int>} {<states> [<states>...]}
2018-04-11T03:23:37.665 INFO:tasks.workunit.client.0.mira101.stderr:pg dump_stuck {inactive|unclean|stale|undersized|degraded [inactive|unclean|stale|undersized|degraded...]} {<int>}
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg ls-by-primary <osdname (id|osd.id)> {<int>} {<states> [<states>...]}
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg ls-by-osd <osdname (id|osd.id)> {<int>} {<states> [<states>...]}
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg dump_pools_json
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg ls-by-pool <poolstr> {<states> [<states>...]}
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg dump_json {all|summary|sum|pools|osds|pgs [all|summary|sum|pools|osds|pgs...]}
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:pg map <pgid>
2018-04-11T03:23:37.666 INFO:tasks.workunit.client.0.mira101.stderr:Error EINVAL: invalid command

this change breaks "ceph pg 1.0 query".

@liewegas liewegas added this to the mimic milestone Apr 11, 2018
@mogeb mogeb force-pushed the iostat-plugin branch 2 times, most recently from 8d09e5e to d729c9d Compare April 11, 2018 16:19
@mogeb
Contributor Author

mogeb commented Apr 11, 2018

@tchaikov fixed

@mogeb
Contributor Author

mogeb commented Apr 11, 2018

retest this please

@@ -0,0 +1 @@
from module import Module
Contributor

@mogeb should be from .module import Module

@tchaikov
Contributor

@mogeb could you highlight the changes you made in the latest version? For instance, how did you fix the 'this change breaks "ceph pg 1.0 query"' issue?

@@ -999,6 +1001,9 @@ def validate(args, signature, flags=0, partial=False):
if flags & FLAG_MGR:
d['target'] = ('mgr','')

if flags & FLAG_POLL:
d['poll'] = True
Contributor Author

@tchaikov The problem was that ceph.in looks at len(valid_dict) (ie the number of parameters) to decide whether it's a pg command or not. See here. Setting valid_dict['poll'] = False (previously done here) broke that logic. The fix is to keep valid_dict['poll'] unset, unless the command server-side expects it. This is OK because the length of valid_dict isn't used for the plugin. There's already a check that 'poll' in valid_dict whenever we want to access it, so having it unset is not an issue.
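
The failure mode can be reproduced in a few lines. This is a sketch under assumptions: `looks_like_pg_command` is a hypothetical stand-in for the length-based check in ceph.in (its exact condition is not shown in this thread), and `build_valid_dict` only mimics the part of validate() that sets 'poll'.

```python
FLAG_POLL = 16  # value taken from the diff above

def build_valid_dict(params, flags, always_set_poll=False):
    """Mimic how 'poll' ends up in the validated argument dict."""
    d = dict(params)
    if flags & FLAG_POLL:
        d['poll'] = True
    elif always_set_poll:
        d['poll'] = False  # the earlier behaviour that broke the length check
    return d

def looks_like_pg_command(valid_dict):
    # Hypothetical stand-in for the check in ceph.in that counts parameters.
    return len(valid_dict) == 2 and 'pgid' in valid_dict

params = {'prefix': 'query', 'pgid': '1.0'}
broken = build_valid_dict(params, flags=0, always_set_poll=True)
fixed = build_valid_dict(params, flags=0)
print(looks_like_pg_command(broken), looks_like_pg_command(fixed))
```

Reading the flag with `valid_dict.get('poll', False)` or an explicit `'poll' in valid_dict` test keeps the unset case safe, which is why leaving the key out does not affect the polling loop.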

Contributor

ahh, thanks for the explanation! actually, i was debugging this yesterday.

@mogeb
Contributor Author

mogeb commented Apr 12, 2018

@tchaikov sorry, I hesitated to do that :) I added a comment that explains the problem.

src/ceph.in Outdated
file=sys.stderr)
break
if outbuf:
print(outbuf)
Contributor

If outbuf is human-readable content, we should print(outbuf.decode('utf-8')); this is important in py3.
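
A small illustration of why the decode matters, assuming outbuf arrives as a bytes object under Python 3 (the helper name is made up for this sketch): printing bytes directly shows the b'...' representation instead of the text.

```python
def printable(outbuf):
    """Return text suitable for print(), decoding bytes as UTF-8."""
    if isinstance(outbuf, bytes):
        return outbuf.decode('utf-8')
    return outbuf

raw = b'wr: 21488 kB/s, rd: 0 kB/s, iops: 10'
line = printable(raw)
print(line)        # the plain text line
print(str(raw))    # what print(raw) would show without decoding
```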

src/ceph.in Outdated
if valid_dict:
if parsed_args.output_format:
valid_dict['format'] = parsed_args.output_format

Contributor

please drop this empty line.

src/ceph.in Outdated
while True:
ret, outbuf, outs = json_command(cluster_handle, target=target, argdict=valid_dict,
inbuf=inbuf)

Contributor

please drop this empty line.

src/ceph.in Outdated
if 'poll' not in valid_dict or not valid_dict['poll']:
# Don't print here if it's not a polling command
break

Contributor

please drop this empty line.

item.polling = false;
PyObject *pPoll = PyDict_GetItemString(command, "poll");
if (pPoll) {
std::string polling = PyString_AsString(pPoll);
Contributor

could just use boost::iequals(polling, "true")

@@ -787,6 +788,7 @@ def parse_json_funcsigs(s, consumer):
cmd['sig'] = parse_funcsig(cmd['sig'])
# just take everything else as given
sigdict[cmdtag] = cmd

Contributor

drop this empty line.

Allow a mgr module to fetch 'io_rate' to access pg_sum_delta,
which holds the IO activity recently completed by the OSDs.

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
@mogeb
Contributor Author

mogeb commented Apr 12, 2018

@tchaikov sorry for the added iteration. Should be all done.

@tchaikov
Contributor

no worries. thanks for your persistence!

@tchaikov tchaikov merged commit 5f65683 into ceph:master Apr 15, 2018
7 participants