New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ui_tests -no_track_allocs 22s vs plain DR 13s: mostly from -show_threads #714
Comments
From bruen...@google.com on December 08, 2011 12:38:14 a better comparison is to DR using drmem's options: => 2.5x slowdown for drmem -no_track_allocs! |
From bruen...@google.com on December 08, 2011 12:39:07 Owner: bruen...@google.com |
From bruen...@google.com on December 09, 2011 09:55:00 the majority of this time is not the bb event: it seems to be ntdll Nt* processing, but it doesn't make sense: on laptop: but how can the ntdll export processing take 5-6s on ui_tests when the duplicated on windesk: yet debug build says: ??? |
From bruen...@google.com on December 09, 2011 11:20:37 neither get_proc_addr nor hashtable_add showed up in the sampling (below) remove hashtable_add if..else from add_syscall_entry: xref issue #691 . |
From bruen...@google.com on December 09, 2011 13:10:55 Summary: ui_tests -no_track_allocs 22s vs plain DR 13s: mostly from -show_threads |
From bruen...@google.com on December 15, 2011 08:19:38 the time is all from symbolization: address lookups in the giant chrome.dll pdb. one solution is to record mod+offs but not symbolize until later, just like |
From derek.br...@gmail.com on December 15, 2011 19:20:33 This issue was closed by revision r682 . Status: Fixed |
From bruen...@google.com on December 08, 2011 15:26:58
xref issue #681 xref issue #460 xref DRi#622, DRi#606
xref numbers from comment 10 of issue #460 , pasted here, where drmem doing
essentially nothing useful is 65% or so slower than plain DR:
DR
[----------] 1 test from NPAPITesterBase (12905 ms total)
[----------] 1 test from NPAPITesterBase (12950 ms total)
[----------] 1 test from NPAPITesterBase (13154 ms total)
-leaks_only -no_count_leaks -no_track_allocs
[----------] 1 test from NPAPITesterBase (21472 ms total)
[----------] 1 test from NPAPITesterBase (21593 ms total)
[----------] 1 test from NPAPITesterBase (21742 ms total)
presumably this is all from instrument_bb doing analysis, routine
replacement, repstr unrolling, etc.? worth investigating whether we can
reduce the overhead.
here's -leaks_only -no_count_leaks -no_track_allocs in CodeAnalyst:
Process Name 64-bit Timer samples
dynamorio.dll 59.22
unknown module pid (388) 1 15.33
ntoskrnl.exe 1 10.32
dbghelp.dll 7.98
drmemorylib.dll 2.44
ntdll.dll 1.89
Parity.sys 1 0.65
wow64cpu.dll 1 0.59
fltmgr.sys 1 0.33
wow64.dll 1 0.29
win32k.sys 1 0.28
11 modules, Total: 24757 samples, 99.32% of shown samples
CS:EIP Symbol + Offset Timer samples
0x7037d460 mutex_lock 13.99
0x7037b890 read_lock 3.16
0x70367fd0 hashtable_fragment_lookup 2.87
0x703a9df0 instr_is_cti 2.32
0x7037b740 mutex_trylock 2.15
0x703824c0 common_heap_alloc 1.98
0x70367d90 hashtable_ibl_lookup_for_removal 1.93
0x7037b9f0 read_unlock 1.88
0x70398d70 read_instruction 1.72
0x703715c0 find_free_list_slot 1.69
0x703b2d30 build_bb_ilist 1.61
0x703a9bf0 instr_is_cbr 1.37
0x7037b780 mutex_unlock 1.29
0x703998f0 decode_operand 1.29
0x7037ab60 emit_fragment_common 1.19
0x7039c0a0 encoding_possible_pass 1.18
0x703a9d90 instr_is_ubr 1.13
0x7039e100 instr_encode_common 1.09
0x7039ace0 opnd_type_ok 1.06
0x7039a0b0 decode_common 1
0x703a9c60 instr_is_mbr 0.97
0x703ac3f0 instr_is_exit_cti 0.88
0x70399270 decode_modrm 0.87
0x703d2430 is_intercepted_app_pc 0.79
0x7037ba40 write_unlock 0.77
0x70386fc0 vmvector_overlap 0.77
0x7036f9f0 hashtable_fragment_add 0.72
0x70373960 fcache_return_extra_space 0.72
0x703b8170 mangle 0.7
0x70382750 common_heap_free 0.67
0x70397800 get_ibl_routine_type_ex 0.67
0x70371450 add_to_free_list 0.64
0x703a9a90 instr_is_call 0.64
0x7038c1d0 check_thread_vm_area 0.62
0x7037a8d0 set_linkstub_fields 0.61
0x70386e10 lookup_addr 0.61
0x703985e0 resolve_variable_size 0.59
0x703b2a60 client_process_bb 0.58
0x7036c740 fragment_prepare_for_removal 0.57
0x703789b0 dispatch_exit_fcache_stats 0.54
0x70398970 read_operand 0.54
0x703aac10 instr_get_src 0.54
0x703ac4f0 instr_is_wow64_syscall 0.54
0x7037b830 release_recursive_lock 0.53
0x70368d80 fragment_lookup_type 0.52
0x703af1a0 instr_is_syscall 0.51
0x703aab00 instr_get_opcode 0.5
0x7036c690 fragment_prepare_for_removal_from_table 0.48
0x703a8620 instr_get_next 0.47
0x703aac80 instr_get_dst 0.47
0x703aae70 instr_length 0.47
0x703c72a0 get_thread_private_dcontext 0.46
0x703a6040 nop_pad_ilist 0.43
0x7039a740 size_ok 0.42
0x7039eda0 patch_branch 0.41
0x7037a360 dispatch 0.39
0x703ae4d0 instr_is_nop 0.39
0x7039d800 encode_operand 0.37
0x703d6ad0 is_wow64_process 0.36
0x7036e750 hashtable_fragment_reset 0.35
0x70379ed0 dispatch_enter_dynamorio 0.35
0x70388190 remove_fraglist_entry 0.35
0x703b3be0 build_basic_block_fragment 0.35
63 functions, 1933 instructions, Total: 10338 samples, 70.04% of shown samples, 11.65% of total session samples
CS:EIP Symbol + Offset Timer samples
0x735d7840 instrument_bb 46.14
0x735e5c80 instr_writes_esp 8.05
0x735f7a30 alloc_instrument 6.73
0x735d41b0 opnd_uses_nonignorable_memory 5.75
0x735f63a0 alloc_replace_instrument 4.76
0x73607050 symcache_symbol_add 3.78
0x735d2b50 reg_is_gpr 2.79
0x735f2b30 is_alloc_sysroutine 2.46
0x735d2bc0 opc_is_push 1.97
0x7360ca50 instr_num_dsts 1.97
0x7360ca68 opnd_is_reg 1.81
0x735d2c00 opc_is_pop 1.48
0x735d52b0 app_to_app_transformations 1.48
0x7360ca5c opnd_is_memory_reference 1.31
14 functions, 227 instructions, Total: 551 samples, 90.48% of shown samples, 0.62% of total session samples
Original issue: http://code.google.com/p/drmemory/issues/detail?id=714
The text was updated successfully, but these errors were encountered: