Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LTTng based code profiling; focused on the path in ceph-osd for a write request to an object. #2877

Closed
wants to merge 3 commits into from

Conversation

andreas-bluemle
Copy link

Added 13 tracepoints for ceph-osd daemon to provide timing analysis of functions in the path for primary and secondary write operations. These tracepoints collect timestamps within each of the instrumented functions and allow to track a complete transaction, i.e. the contributions to the overall latency for a write request.
A primary write starts when it arrives at the socket of ceph-osd and ends when the acknowledgement is sent back to the client.
At the layer of the Messenger, only the "simple" Messenger is instrumented; corresponding tracepoints for the "async" Messenger are pending.

@ghost ghost added feature core labels Nov 7, 2014
@yuyuyu101
Copy link
Member

I'm not sure, is it better to wrap "tp_stamps" alike statements to macro and make the original logic code path more clear? It's really ugly to me now:-(.

@andreas-bluemle
Copy link
Author

totally agreed.

Regards

Andreas

On Sat, 08 Nov 2014 04:55:02 -0800
Haomai Wang notifications@github.com wrote:

I'm not sure, is it better to wrap "tp_stamps" alike statements to
macro and make the original logic code path more clear? It's really
ugly to me now:-(.


Reply to this email directly or view it on GitHub:
#2877 (comment)

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
Heinrich Boell Strasse 88 Phone: (+49) 89 4317582
D-81829 Muenchen (Germany) Mobil: (+49) 177 522 0151

@@ -8257,6 +8288,7 @@ void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb )

// osd:opwq_process marks the point at which an operation has been dequeued
// and will begin to be handled by a worker thread.
#if 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to remove this? If so, we should just do it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Sam,

I removed the "old" tracepoints (two of them) in
commit ea631dc

I have also fixed compilation issues if WITH_LTTNG is not set:
commit 34e0a45

Regards

Andreas Bluemle

On Tue, 11 Nov 2014 10:47:51 -0800
Samuel Just notifications@github.com wrote:

@@ -8257,6 +8288,7 @@ void OSD::ShardedOpWQ::_process(uint32_t
thread_index, heartbeat_handle_d *hb )
// osd:opwq_process marks the point at which an operation has
been dequeued // and will begin to be handled by a worker thread.
+#if 0

Do you want to remove this? If so, we should just do it.


Reply to this email directly or view it on GitHub:
https://github.com/ceph/ceph/pull/2877/files#r20170661

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@agshew agshew mentioned this pull request Nov 14, 2014
@yuandong1222
Copy link
Contributor

Seems tp_stamps only keep usec of ceph_clock_now, is that enough?

@andreas-bluemle
Copy link
Author

I think that microseconds is good enough, when I look at overall latencies in the order of about 400 microseconds for a write to secondary OSD and more than 800 microseconds for a write to primary OSD.

@yuandong1222
Copy link
Contributor

Maybe we can use utime_t.to_nsec() / 1000, which also include the utime_t.tv_sec in the result, it may be more friendly with shell script. But I am not quite sure about it.

@andreas-bluemle
Copy link
Author

That may make sense; I'll give it a try.

@johnkeepmoving
Copy link
Contributor

Hi, andreas-bluemle:
I tried your patch, but it got something wrong when "lttng view". Here is the console output.
10:52:41-root@server-185:~$ lttng view
Trace directory: /root/lttng-traces/auto-20141126-105121
[error] ctf_struct_type_declarators_visit: duplicate field _tid in struct
[error] ctf_visitor_construct_metadata: event declaration error
[error] Error in CTF metadata constructor -1
[warning] Unable to open trace metadata for path "/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit".
[warning] [Context] Cannot open_trace of format ctf at path /root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit.
[warning] [Context] cannot open trace "/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit" from /root/lttng-traces/auto-20141126-105121 for reading.
[error] Cannot open any trace for reading.
[error] opening trace "/root/lttng-traces/auto-20141126-105121" for reading.
[error] none of the specified trace paths could be opened.

My environment is: on the ceph master branch, update to the latest commit (5b500fa), "lttng list -u" can see all the tracepoints.
And before I apply your patch, I can "lttng view" the trace results on master branch.
Is there something I forget? Or is there something different between our environment?

Thanks!

@andreas-bluemle
Copy link
Author

Hi,

I had not been using "lttng view" to look at results,
but always used babeltrace to do so.

I will crosscheck in my environment.

Regards

Andreas

On Tue, 25 Nov 2014 19:15:02 -0800
Heng Jiang notifications@github.com wrote:

Hi, andreas-bluemle:
I tried your patch, but it got something wrong when "lttng
view". Here is the console output. 10:52:41-root@server-185:~$ lttng
view Trace directory: /root/lttng-traces/auto-20141126-105121
[error] ctf_struct_type_declarators_visit: duplicate field _tid in
struct [error] ctf_visitor_construct_metadata: event declaration error
[error] Error in CTF metadata constructor -1
[warning] Unable to open trace metadata for path
"/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit". [warning]
[Context] Cannot open_trace of format ctf at
path /root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit.
[warning] [Context] cannot open trace
"/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit"
from /root/lttng-traces/auto-20141126-105121 for reading. [error]
Cannot open any trace for reading. [error] opening trace
"/root/lttng-traces/auto-20141126-105121" for reading. [error] none
of the specified trace paths could be opened.

My environment is: on the ceph master branch, update to the latest
commit (5b500fa), "lttng list -u" can see all the tracepoints.
And before I apply your patch, I can "lttng view" the trace results
on master branch. Is there something I forget? Or is there something
different between our environment?

Thanks!


Reply to this email directly or view it on GitHub:
#2877 (comment)

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@johnkeepmoving
Copy link
Contributor

Hi,
I also used the babeltrace, but it still have this problem.

@andreas-bluemle
Copy link
Author

Hi,

I just executed "lttng view" in my test environemnt;
it does not complain and the output looks very much like
the output produced by babeltrace.

I am using lttng-tools and lttng-ust version 2.4.1 on
a CentOS 6.5 installation.

I am running my tests in an environment where I just replace
a single ceph-osd daemon. In my case, I have a ceph-v0.86 base
installation, stop on of the ceph-osd daemons and start one
with the tracepoints enabled. I am running this instance of
ceph-osd in the foreground (option -f); if I start it in the
background then I cannot successfully setup the lttng tracing.

Regards

Andreas

On Tue, 25 Nov 2014 19:15:02 -0800
Heng Jiang notifications@github.com wrote:

Hi, andreas-bluemle:
I tried your patch, but it got something wrong when "lttng
view". Here is the console output. 10:52:41-root@server-185:~$ lttng
view Trace directory: /root/lttng-traces/auto-20141126-105121
[error] ctf_struct_type_declarators_visit: duplicate field _tid in
struct [error] ctf_visitor_construct_metadata: event declaration error
[error] Error in CTF metadata constructor -1
[warning] Unable to open trace metadata for path
"/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit". [warning]
[Context] Cannot open_trace of format ctf at
path /root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit.
[warning] [Context] cannot open trace
"/root/lttng-traces/auto-20141126-105121/ust/uid/0/64-bit"
from /root/lttng-traces/auto-20141126-105121 for reading. [error]
Cannot open any trace for reading. [error] opening trace
"/root/lttng-traces/auto-20141126-105121" for reading. [error] none
of the specified trace paths could be opened.

My environment is: on the ceph master branch, update to the latest
commit (5b500fa), "lttng list -u" can see all the tracepoints.
And before I apply your patch, I can "lttng view" the trace results
on master branch. Is there something I forget? Or is there something
different between our environment?

Thanks!


Reply to this email directly or view it on GitHub:
#2877 (comment)

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@andreas-bluemle
Copy link
Author

Hi,

if you run "babeltrace -d -v" then it will produce output
about the events which are subject to tracing.

This output could help to identify the tracepoint which
with the duplicate _tid field.

The traces I take are limited to the events
filestore:queue_transactions
osd:ms_fast_dispatch
osd:opwq_process
osd:rpg_do_op
osd:rpg_execute_ctx
osd:handle_op
osd:handle_replica_op
osd:rbe_submit_transaction
osd:rbe_issue_op
osd:eval_repop
filejournal:prep_single_write
pipe:reader
pipe:writer
pipe:read_message
pipe:do_sendmsg

Regards

Andreas

On Tue, 25 Nov 2014 23:22:55 -0800
Heng Jiang notifications@github.com wrote:

Hi,
I also used the babeltrace, but it still have this problem.


Reply to this email directly or view it on GitHub:
#2877 (comment)

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@johnkeepmoving
Copy link
Contributor

Hi, I have solved it. It's a bug in event "pg:queue_op", which contains in src/tracing/pg.tp
In TP_FIELDS, tid appeared twice.
So, I delete one, and everything is going well now.
Thanks.

ctf_integer(uint8_t, type, type)
ctf_integer(int64_t, num, num)
ctf_integer(int64_t, reqnum, reqnum)
ctf_integer(uint64_t, tid, tid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should delete.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

removed in extra tid in commit
4f8a469

Thanks

On Wed, 26 Nov 2014 04:12:36 -0800
Heng Jiang notifications@github.com wrote:

     ctf_integer(uint8_t, type, type)
  •    ctf_integer(int64_t, num, num)
    
  •    ctf_integer(int64_t, reqnum, reqnum)
     ctf_integer(uint64_t, tid, tid)
    

This line should delete.


Reply to this email directly or view it on GitHub:
https://github.com/ceph/ceph/pull/2877/files#r20931387

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@ghost
Copy link

ghost commented Jan 29, 2015

@andreas-bluemle FYI this needs rebasing

@andreas-bluemle
Copy link
Author

Hi Loic,

thanks for the hint.

Question: what should I rebase against?
In my test environment, I am running with 0.91 already, but I
guess that the pull request should be against master, shouldn't it?

Maybe related to all this: the lttng tracepoints only provide a
startpoint for analysis. Evaluation of the data produced by the
tracing is a second step. To support this, I am using a couple of
scripts (bash, awk and python) which may not be perfect but which
at least helped me a lot.

Would it make sense to publish such scripts and how could I do this?

Regards

Andreas Bluemle

On Thu, 29 Jan 2015 09:35:33 -0800
Loic Dachary notifications@github.com wrote:

@andreas-bluemle FYI this needs rebasing


Reply to this email directly or view it on GitHub:
#2877 (comment)

Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

@ghost
Copy link

ghost commented Jan 30, 2015

guess that the pull request should be against master, shouldn't it?

right

Would it make sense to publish such scripts and how could I do this?

If it's a few scripts why not add them in a lttn directory in ceph ? And document their existence (if not their usage) in some place a developer is most likely to find when trying to use. I'm speaking from my personal point of view : that's what I would find convenient :-)

@loic-bot
Copy link

loic-bot commented Feb 2, 2015

SUCCESS: the output of run-make-check.sh on centos-centos7 for 53cb871 is http://paste2.org/_MGj3aKBp

:octocat: Sent from GH.

Andreas Bluemle added 3 commits February 3, 2015 08:43
write request to an object.

Cleanup: Refine and use macros to set tp_stamps instead of using the
ifdef WITH_LTTNG all over the place.

OSD::ShardedOpWQ::_process(): remove disabled code (old tracepoints)

fix compile if WITH_LTTNG is not set.

Missed in previous commit: tp_stamps array increased for tracepoint
pipe:writer.

Added code-profiling tracepoint to AsyncMessenger.

Addititional tracepoints for OSD::handle_op() and
Replicatedbackend::submit_transaction()

move to using absolute microsecond timestamps including the seconds;
use utime_t:to_nsec()/1000 to do so.

fine granularity check of Pipe::read_message()

fix double reference to tid in tracepoint pg:queue_op

Added two more tracepoints for sub_op_modify and sub_op_modify_reply.
Finer granularity in some other places.

Added two more cod profiling tracepoints.

refine some timestamp tracepoints.
Add timestamp tracepoints to FileStore transaction handling.

fixes due to backporting.
@loic-bot
Copy link

loic-bot commented Feb 4, 2015

FAIL: the output of run-make-check.sh on centos-centos7 for d74cf5d is http://paste2.org/_dwszBABH

:octocat: Sent from GH.

@liewegas liewegas closed this Feb 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants