LTTng based code profiling; focused on the path in ceph-osd for a write request to an object. #2877
I'm not sure, but would it be better to wrap the "tp_stamps"-like statements in a macro, so that the original logic code path stays clear? It looks really ugly to me now :-(
Totally agreed.
Regards
Andreas
On Sat, 08 Nov 2014 04:55:02 -0800
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
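For illustration, a minimal sketch of what such a macro wrapper could look like. The names TP_STAMPS_DECL, TP_STAMP, TP_EMIT, now_usec() and emit_stamps() are hypothetical and are not taken from this PR; emit_stamps() merely stands in for the real LTTng tracepoint call. The idea is that the instrumented function contains no #ifdef WITH_LTTNG blocks, and the macros expand to nothing when WITH_LTTNG is not set.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Defined here so that the sketch exercises the instrumented path;
// a real build would set this from the build system instead.
#define WITH_LTTNG 1

// Hypothetical clock helper: microseconds since the epoch.
inline uint64_t now_usec() {
  using namespace std::chrono;
  return static_cast<uint64_t>(
      duration_cast<microseconds>(system_clock::now().time_since_epoch()).count());
}

// Hypothetical stand-in for the tracepoint call: counts emitted timestamps.
static std::size_t g_emitted = 0;
inline void emit_stamps(const uint64_t *stamps, std::size_t n) {
  (void)stamps;
  g_emitted += n;
}

#ifdef WITH_LTTNG
#define TP_STAMPS_DECL(name, n) uint64_t name[(n)] = {0}
#define TP_STAMP(name, idx)     do { (name)[(idx)] = now_usec(); } while (0)
#define TP_EMIT(name, n)        do { emit_stamps((name), (n)); } while (0)
#else
// Without LTTng every macro compiles away to an empty statement.
#define TP_STAMPS_DECL(name, n) do { } while (0)
#define TP_STAMP(name, idx)     do { } while (0)
#define TP_EMIT(name, n)        do { } while (0)
#endif

// Instrumented function: the logic path stays free of #ifdef blocks.
int handle_request(int payload) {
  TP_STAMPS_DECL(tp_stamps, 3);
  TP_STAMP(tp_stamps, 0);     // entry
  int result = payload * 2;   // placeholder for the real work
  TP_STAMP(tp_stamps, 1);     // after processing
  TP_STAMP(tp_stamps, 2);     // just before returning
  TP_EMIT(tp_stamps, 3);      // hand all timestamps to the tracepoint
  return result;
}
```

With WITH_LTTNG removed, handle_request() still compiles and returns the same result, which is the property the "fix compile if WITH_LTTNG is not set" change is after.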
@@ -8257,6 +8288,7 @@ void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb )

 // osd:opwq_process marks the point at which an operation has been dequeued
 // and will begin to be handled by a worker thread.
+#if 0
Do you want to remove this? If so, we should just do it.
Hi Sam,
I removed the "old" tracepoints (two of them) in
commit ea631dc
I have also fixed compilation issues if WITH_LTTNG is not set:
commit 34e0a45
Regards
Andreas Bluemle
On Tue, 11 Nov 2014 10:47:51 -0800
Samuel Just notifications@github.com wrote:
> @@ -8257,6 +8288,7 @@ void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb )
> // osd:opwq_process marks the point at which an operation has been dequeued
> // and will begin to be handled by a worker thread.
> +#if 0
>
> Do you want to remove this? If so, we should just do it.
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910
Company details: http://www.itxperts.de/imprint.htm
It seems tp_stamps only keeps the usec part of ceph_clock_now; is that enough?
I think that microseconds are good enough, given that I see overall latencies on the order of about 400 microseconds for a write to a secondary OSD and more than 800 microseconds for a write to the primary OSD.
Maybe we can use utime_t.to_nsec() / 1000, which also includes utime_t.tv_sec in the result; it may be friendlier to shell scripts. But I am not quite sure about it.
That may make sense; I'll give it a try.
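To make the difference concrete, here is a small sketch under stated assumptions: Stamp is a hypothetical stand-in for utime_t (seconds plus nanoseconds), not the real Ceph class. It contrasts a delta computed from the sub-second microsecond part alone with one computed from the absolute value to_nsec() / 1000.

```cpp
#include <cstdint>

// Hypothetical stand-in for utime_t: whole seconds plus nanoseconds.
struct Stamp {
  uint32_t sec;
  uint32_t nsec;

  // Absolute time in nanoseconds, including the seconds.
  uint64_t to_nsec() const {
    return static_cast<uint64_t>(sec) * 1000000000ull + nsec;
  }
  // Sub-second part only, in microseconds (0..999999).
  uint32_t usec() const { return nsec / 1000; }
};

// Delta from the sub-second part only: wrong whenever a and b
// fall into different seconds.
inline int64_t delta_usec_only(const Stamp &a, const Stamp &b) {
  return static_cast<int64_t>(b.usec()) - static_cast<int64_t>(a.usec());
}

// Delta from absolute microseconds: unaffected by second boundaries.
inline int64_t delta_absolute(const Stamp &a, const Stamp &b) {
  return static_cast<int64_t>(b.to_nsec() / 1000) -
         static_cast<int64_t>(a.to_nsec() / 1000);
}
```

For two stamps that are 400 microseconds apart but straddle a second boundary, e.g. {100, 999800000} and {101, 200000}, delta_usec_only() yields -999600 while delta_absolute() yields 400. A post-processing script then only needs a plain subtraction, without special handling for the wrap.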
Hi, andreas-bluemle: My environment: the ceph master branch, updated to the latest commit (5b500fa); "lttng list -u" shows all the tracepoints. Thanks!
Hi, I had not been using "lttng view" to look at results; I will cross-check in my environment.
Regards
Andreas
On Tue, 25 Nov 2014 19:15:02 -0800
Hi,
Hi, I just executed "lttng view" in my test environment; I am using lttng-tools and lttng-ust version 2.4.1 on
I am running my tests in an environment where I just replace
Regards
Andreas
On Tue, 25 Nov 2014 19:15:02 -0800
Hi, if you run "babeltrace -d -v" then it will produce output
This output could help to identify the tracepoint which
The traces I take are limited to the events
Regards
Andreas
On Tue, 25 Nov 2014 23:22:55 -0800
Hi, I have solved it. It's a bug in the event "pg:queue_op", which is contained in src/tracing/pg.tp
ctf_integer(uint8_t, type, type)
ctf_integer(int64_t, num, num)
ctf_integer(int64_t, reqnum, reqnum)
ctf_integer(uint64_t, tid, tid)
This line should be deleted.
Hi,
removed the extra tid in commit
4f8a469
Thanks
On Wed, 26 Nov 2014 04:12:36 -0800
Heng Jiang notifications@github.com wrote:
> ctf_integer(uint8_t, type, type)
> ctf_integer(int64_t, num, num)
> ctf_integer(int64_t, reqnum, reqnum)
> ctf_integer(uint64_t, tid, tid)
>
> This line should delete.
@andreas-bluemle FYI this needs rebasing
Hi Loic, thanks for the hint. Question: what should I rebase against?
Maybe related to all this: the lttng tracepoints only provide a
Would it make sense to publish such scripts and how could I do this?
Regards
Andreas Bluemle
On Thu, 29 Jan 2015 09:35:33 -0800
right
If it's a few scripts, why not add them in an lttng directory in ceph? And document their existence (if not their usage) in a place a developer is most likely to find when trying to use them. I'm speaking from my personal point of view: that's what I would find convenient :-)
SUCCESS: the output of run-make-check.sh on centos-centos7 for 53cb871 is http://paste2.org/_MGj3aKBp
write request to an object.
Cleanup: Refine and use macros to set tp_stamps instead of using the ifdef WITH_LTTNG all over the place.
OSD::ShardedOpWQ::_process(): remove disabled code (old tracepoints)
fix compile if WITH_LTTNG is not set.
Missed in previous commit: tp_stamps array increased for tracepoint pipe:writer.
Added code-profiling tracepoint to AsyncMessenger.
Additional tracepoints for OSD::handle_op() and ReplicatedBackend::submit_transaction()
move to using absolute microsecond timestamps including the seconds; use utime_t::to_nsec()/1000 to do so.
fine granularity check of Pipe::read_message()
fix double reference to tid in tracepoint pg:queue_op
Added two more tracepoints for sub_op_modify and sub_op_modify_reply.
Finer granularity in some other places.
Added two more code profiling tracepoints.
refine some timestamp tracepoints.
Add timestamp tracepoints to FileStore transaction handling.
fixes due to backporting.
09d4e3b to 23e78f9
FAIL: the output of run-make-check.sh on centos-centos7 for d74cf5d is http://paste2.org/_dwszBABH
Added 13 tracepoints to the ceph-osd daemon to provide timing analysis of the functions in the path for primary and secondary write operations. These tracepoints collect timestamps within each of the instrumented functions and make it possible to track a complete transaction, i.e. each function's contribution to the overall latency of a write request.
A primary write starts when it arrives at the socket of ceph-osd and ends when the acknowledgement is sent back to the client.
At the layer of the Messenger, only the "simple" Messenger is instrumented; corresponding tracepoints for the "async" Messenger are pending.
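As a rough illustration of how the collected timestamps can be turned into per-stage contributions, the following sketch computes the time spent between consecutive trace points and the overall latency. StagePoint, breakdown() and total_latency() are hypothetical names for this example only; the real analysis would work on the output of the LTTng tracepoints.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One recorded point on the write path: a label and an absolute
// timestamp in microseconds. Points are assumed to be in time order.
struct StagePoint {
  std::string name;
  uint64_t usec;
};

// Time spent between two consecutive points.
struct StageLatency {
  std::string name;
  uint64_t usec;
};

// Per-stage contributions: the difference between each pair of
// consecutive timestamps.
std::vector<StageLatency> breakdown(const std::vector<StagePoint> &pts) {
  std::vector<StageLatency> out;
  for (std::size_t i = 1; i < pts.size(); ++i) {
    out.push_back({pts[i - 1].name + " -> " + pts[i].name,
                   pts[i].usec - pts[i - 1].usec});
  }
  return out;
}

// Overall latency: last timestamp minus first timestamp.
uint64_t total_latency(const std::vector<StagePoint> &pts) {
  return pts.empty() ? 0 : pts.back().usec - pts.front().usec;
}
```

With made-up points such as socket_read at 1000, op_queued at 1120, op_dequeued at 1300 and ack_sent at 1820 (labels and values invented for the example), breakdown() yields 120, 180 and 520 microseconds, and total_latency() yields 820 microseconds, the sum of the individual contributions.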