-
Notifications
You must be signed in to change notification settings - Fork 84
About cost of TRACE_EVENT when tracing is not enabled #494
Comments
From jamescook@chromium.org on October 21, 2013 17:19:34 nat, it looks like my implementation of TRACE_EVENT_MEMORY might create a function call into the ScopedTraceMemory constructor to see if memory tracing is enabled (or a bunch more instructions if inlined). This will happen for TRACE_EVENT{0,1,2}, basically anything that adds a scoped trace event. It might be possible to avoid this by changing the ScopedTraceMemory constructor to be inline in the header file and do something like: ScopedTraceMemory(foo, bar) { OTOH, I haven't yet used this system to diagnose any real-world problems, so maybe I should revert it until I have time to work on it some more. Thoughts? |
From wangxianzhu@chromium.org on October 21, 2013 18:03:47 @jamescook: https://codereview.chromium.org/32823005/ :) Omitting TRACE_MEMORY, on x86, TRACE_EVENT0 will generate the following instructions in the disabled path: Begin of scope: mov ...(%rip), %rbx ; category_group_enabled = NoBarrier_Load(&atomic); End of scope: mov %rbx,...(%rsp) That is, on begin of scope, the number of instructions is twice of that of simple-variable-checking; on end of scope, the number of instructions seems already minimum. Thought of a method using a global variable indicating if tracing is enabled: In this way, the numbers of overhead instructions in different cases are:
Though the benefit seems not that much, it makes the overhead comparable with simple-variable-checking when tracing is not enabled (making https://codereview.chromium.org/29963002/ feasible). Another benefit is that TraceLog::GetCategoryGroupEnabled() won't be called until tracing is enabled, which may reduce the trace_event overhead during chrome startup. In addition, the benefit looks bigger considering TRACE_MEMORY because extra checking in it can be eliminated. Status: Assigned |
From wangxianzhu@chromium.org on October 21, 2013 18:11:17 Forgot to mention, the above "current" data is before my X Event change or after the extra constructor issue fixed by ( https://codereview.chromium.org/33423006/ ). The extra constructors cause 3 more mov instructions on begin of the scope. |
From wangxianzhu@chromium.org on October 22, 2013 09:44:36 The method in #2 would cause the difficulty of TraceLog::GetKnownCategoryGroups(). Perhaps we could collect the known categories with a build step which generates the categories in a 'static const char*[]' array. wdyt? |
From wangxianzhu@chromium.org on October 22, 2013 14:17:08 Measured the cost of TraceLog::GetCategoryGroupEnabled() during chrome startup with https://codereview.chromium.org/30253003 . Linux: Android: The cost looks trivial, but I'm wondering how much cost during startup could be considered as significant. However, static category_groups would be required by the method in #2, and reducing startup cost would be a good side-effect of it. |
From piman@chromium.org on October 22, 2013 14:22:38 @#5: 1us (if that number is correct) isn't trivial at all for https://codereview.chromium.org/29963002/ because heavy pages can do 10k to 100k GL calls per frame, i.e. 10-100ms per frame. Or 40-400ms on Android. |
From wangxianzhu@chromium.org on October 22, 2013 14:26:57 The 1us cost is only for the first call of TRACE_EVENT at each place. #5 is about the impact of the first calls chrome startup performance. The normal cost should still be as low as 3~6 instructions as shown in first half of #2. |
From wangxianzhu@chromium.org on October 22, 2013 15:08:51 piman@, do you think the overhead acceptable for CommandParser::ProcessCommand(): 6 (or 9 if TRACE_MEMORY is enabled) instructions on scope begin and 3 instructions on scope end, after https://codereview.chromium.org/33423006/ and https://codereview.chromium.org/32823005/? Currently with trace_gl_commands_ the overhead is 3/3 on scope begin/end. Would the 3- or 6-instruction extra cost be worth it considering the benefits of https://codereview.chromium.org/29963002/? - it allows the category to be selected at runtime in chrome://tracing;
|
From piman@chromium.org on October 22, 2013 15:38:57 What is the cost in real time, on a real world benchmark? Instruction cound can be highly misleading. |
From wangxianzhu@chromium.org on October 22, 2013 15:44:25 I think the cost is so trivial and not actually measurable (less than several ns), so just use number of instructions. |
From piman@chromium.org on October 22, 2013 16:21:42 In the past, we reverted a similar patch because, even though it was thought it would have no measurable impact, it actually significantly regressed performance. So excuse me for being extra careful. |
From wangxianzhu@chromium.org on October 22, 2013 17:09:01 Here are the measurement results. Test method:
Without the change: With the change: I knew the change ( https://chromiumcodereview.appspot.com/10442090 ) to disable gl tracing by default when I made the change. I believe the reason of it was not the overhead of the not-enabled path, but because at that time we didn't have DISABLED_BY_DEFAULT tracing category, so "cb_command" category would be enabled by default during tracing, and the too many events made tracing unusable for normal use. Forgot to mention another biggest benefit of my CL: it can reduce the number of gl tracing events by half :) |
From piman@chromium.org on October 22, 2013 17:19:58 Ok, thanks, that looks good. Please keep an eye on possible perf regressions on the bots. |
From wangxianzhu@chromium.org on October 23, 2013 11:57:45 https://codereview.chromium.org/33423006/ Remove constructors of TraceEventHandle https://codereview.chromium.org/32823005/ Reduce overhead of TRACE_MEMORY when disable The overhead of initialization and not-enabled path seem acceptable now. Status: Fixed |
From wangxianzhu@chromium.org on October 21, 2013 16:39:13
The cost of TRACE_EVENT when tracing is not enabled is supposed to be very trivial just like checking the value of a single boolean variable in memory. Perhaps it should cost just like the following three instructions:
load variable -> register
compare register, 0
jmp_if_zero
However, now it seems to generate much more instructions in the disabled-path, especially when TRACE_MEMORY is enabled.
Original issue: http://code.google.com/p/trace-viewer/issues/detail?id=489
The text was updated successfully, but these errors were encountered: