Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#6104 multi-window: Refresh timestamps in a new window #6165

Merged
merged 59 commits into from
Aug 26, 2023
Merged

Conversation

bete0
Copy link
Contributor

@bete0 bete0 commented Jun 22, 2023

Updates the timestamp in the first buffer of a new window in multi-window traces
by refreshing the timestamp when re-starting tracing. Previously, the timestamp in
the first buffer was coming from the end of a prior window causing low activity at the
beginning of a new window.

Adds a unit test that fails without the fix in this change but passes with the change.

Fixes an issue where multi-window tracing fails to write the last window raw file.

Fixes #6104
Fixes #6245

@bete0
Copy link
Contributor Author

bete0 commented Jul 26, 2023

Got the following error for Windows tests (https://github.com/DynamoRIO/dynamorio/actions/runs/5664479853/job/15347856021):

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----        7/26/2023   3:59 AM                install                                                               
Exception calling "DownloadFile" with "2" argument(s): "The remote server returned an error: (504) Gateway Timeout."
At D:\a\_temp\2ec[14](https://github.com/DynamoRIO/dynamorio/actions/runs/5664479853/job/15347856021#step:5:15)538-3f2a-4097-b196-8d018e891f69.ps1:4 char:1
+ (New-Object System.Net.WebClient).DownloadFile("https://sourceforge.n ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : WebException
 
Error: Process completed with exit code 1.

Rerunning test.

@bete0
Copy link
Contributor Author

bete0 commented Jul 26, 2023

code_api|api.rseq test failed on ci-x86/x86-64-ubuntu22.

@bete0 bete0 requested a review from derekbruening July 26, 2023 07:28
Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Biggest issue is that the new test is never added as a test so it is never run.

clients/drcachesim/CMakeLists.txt Show resolved Hide resolved
clients/drcachesim/CMakeLists.txt Show resolved Hide resolved
clients/drcachesim/CMakeLists.txt Outdated Show resolved Hide resolved
clients/drcachesim/CMakeLists.txt Outdated Show resolved Hide resolved
clients/drcachesim/tests/window_test.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tracer/instru.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tracer/instru.h Outdated Show resolved Hide resolved
clients/drcachesim/tests/window_test.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tracer/output.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tracer/instr_counter.cpp Outdated Show resolved Hide resolved
  - Add anonymous namespace
  - Put all system headers together
  - Remove temporary disgnostic logs
  - Add comments on what the window test does
  - Clean-up variable names
  - Add comment explaining the test approach
  - Rename first_timestamp to prior_timetsamp
  - Remove unused headers
  - Remove unused function param out_subdir
@bete0 bete0 requested a review from derekbruening July 28, 2023 21:43
@derekbruening
Copy link
Contributor

Best to get the test passing before requesting review: I see it failing on 4 different test jobs.

Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to get the test passing before requesting review: I see it failing on 4 different test jobs.

@bete0
Copy link
Contributor Author

bete0 commented Jul 28, 2023

The new windows-timestamps test is failing on Windows (vs2019-64 suite)

63: 
263:   Hit retrace threshold: enabling tracing for window #137.
263: 
263:   Hit tracing window #137 limit: disabling tracing.
263: 
263:   4Hit retrace threshold: enabling tracing for window #138.
263: 
263:   Hit tracing window #138 limit: disabling tracing.
263: 
263:    seconds
263: 
263:   Hit retrace threshold: enabling tracing for window #139.
263: 
263:   Hit tracing window #139 limit: disabling tracing.
263: 
263:   Hit retrace threshold: enabling tracing for window #140.
263: 
263:   Hit tracing window #140 limit: disabling tracing.
263: 
263:   Fatal error: failed to create trace file
263:   .\drmemtrace.tool.drcacheoff.windows-timestamps.exe.05792.5808.dir\raw\window.0141\drmemtrace.tool.drcacheoff.windows-timestamps.exe.05628.2643.raw
263: 

Also the test app results in 100+ windowed traces in Windows vs <20 in Linux.

@bete0 bete0 enabled auto-merge August 24, 2023 11:14
@bete0 bete0 added this pull request to the merge queue Aug 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 24, 2023
@bete0 bete0 added this pull request to the merge queue Aug 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 24, 2023
@bete0 bete0 enabled auto-merge August 24, 2023 17:15
@bete0 bete0 added this pull request to the merge queue Aug 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 24, 2023
@bete0 bete0 added this pull request to the merge queue Aug 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 24, 2023
@bete0 bete0 added this pull request to the merge queue Aug 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 24, 2023
@derekbruening
Copy link
Contributor

derekbruening commented Aug 25, 2023

I pulled your branch onto a Windows machine and reproduced a popup which shows that popups are not disabled and that it is the debug libc:

00 00ef4b68 773c4954 win32u!NtUserWaitMessage+0xc
01 00ef4b98 77421b3b user32!DialogBoxIndirectParamAorW+0x114
02 00ef4c64 774208d1 user32!SoftModalMessageBox+0x72b
03 00ef4dc0 77421397 user32!DrawStateA+0x1421
04 00ef4e4c 77421405 user32!MessageBoxTimeoutW+0x197
05 00ef4e6c 00568cf8 user32!MessageBoxW+0x45
06 00ef4e8c 00566162 tool_drcacheoff_windows_timestamps!__acrt_MessageBoxW+0x38 [minkernel\crts\ucrt\src\appcrt\internal\winapi_thunks.cpp @ 696] 
07 00ef4ea4 00566139 tool_drcacheoff_windows_timestamps!__crt_char_traits<wchar_t>::message_box<HWND__ *,wchar_t const * const &,wchar_t const * const &,unsigned int const &>+0x22 [minkernel\crts\ucrt\inc\corecrt_internal_traits.h @ 124] 
08 00ef4ed4 005661a6 tool_drcacheoff_windows_timestamps!common_show_message_box<wchar_t>+0x109 [minkernel\crts\ucrt\src\appcrt\misc\crtmbox.cpp @ 75] 
09 00ef4ee8 00542f38 tool_drcacheoff_windows_timestamps!__acrt_show_wide_message_box+0x16 [minkernel\crts\ucrt\src\appcrt\misc\crtmbox.cpp @ 93] 
0a 00ef7174 005430d2 tool_drcacheoff_windows_timestamps!common_message_window<wchar_t>+0x488 [minkernel\crts\ucrt\src\appcrt\misc\dbgrpt.cpp @ 409] 
0b 00ef7194 00567d14 tool_drcacheoff_windows_timestamps!__acrt_MessageWindowW+0x22 [minkernel\crts\ucrt\src\appcrt\misc\dbgrpt.cpp @ 464] 
0c 00eff24c 00543064 tool_drcacheoff_windows_timestamps!_VCrtDbgReportW+0x964 [minkernel\crts\ucrt\src\appcrt\misc\dbgrptt.cpp @ 673] 
0d 00eff278 0055ec2c tool_drcacheoff_windows_timestamps!_CrtDbgReportW+0x34 [minkernel\crts\ucrt\src\appcrt\misc\dbgrpt.cpp @ 278] 
0e 00eff29c 0055edd1 tool_drcacheoff_windows_timestamps!issue_debug_notification+0x1c [minkernel\crts\ucrt\src\appcrt\internal\report_runtime_error.cpp @ 25] 
0f 00eff2b4 00545f3a tool_drcacheoff_windows_timestamps!__acrt_report_runtime_error+0x11 [minkernel\crts\ucrt\src\appcrt\internal\report_runtime_error.cpp @ 154] 
10 00eff2c4 00541774 tool_drcacheoff_windows_timestamps!abort+0x1a [minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 51] 
11 00eff2d8 00540a5a tool_drcacheoff_windows_timestamps!common_assert_to_stderr<wchar_t>+0xa4 [minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 187] 
12 00eff2f0 00541cda tool_drcacheoff_windows_timestamps!common_assert<wchar_t>+0x2a [minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 415] 
13 00eff308 005256d9 tool_drcacheoff_windows_timestamps!_wassert+0x1a [minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 443] 
14 00eff75c 0052586a tool_drcacheoff_windows_timestamps!dynamorio::drmemtrace::post_process+0x349 [D:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp @ 109] 
15 00eff79c 005258e6 tool_drcacheoff_windows_timestamps!dynamorio::drmemtrace::gather_trace+0xfa [D:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp @ 132] 
16 00eff96c 008fa655 tool_drcacheoff_windows_timestamps!dynamorio::drmemtrace::test_main+0x36 [D:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp @ 138] 
17 00eff97c 0053a7d3 tool_drcacheoff_windows_timestamps!main+0x15

0:000> .frame 14
14 00eff75c 0052586a tool_drcacheoff_windows_timestamps!dynamorio::drmemtrace::post_process+0x349 [D:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp @ 109]
0:000> dv error
          error = "Failed to process file for thread 10352: memref entry found outside of bb"

disable_popups() needs DEBUG set but that is not set:

build clients\drcachesim\CMakeFiles\tool.drcacheoff.windows-timestamps.dir\tests\window_test.cpp.obj: CXX_COMPILER__tool.2edrcacheoff.2ewindows-timestamps D$:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp || cmake_object_order_depends_target_tool.drcacheoff.windows-timestamps
  DEFINES = -DDR_APP_EXPORTS -DDR_HOST_X86 -DWINDOWS -DX86_32 -D_CRT_SECURE_NO_WARNINGS -D_X86_

Looks like it's because you've disabled adding DEBUG:

      set(tool.drcacheoff.windows-timestamps_uses_configure ON) # Avoid dup DEBUG.

I don't see a configure.h being included in your test: so not sure why you have that line?

Getting DEBUG there by removing that line should get rid of the popups and avoid the timeouts.
There is still the assert itself for the raw2trace failure.

@derekbruening
Copy link
Contributor

282:   [drmemtrace]: Appending 3 instrs in bb 0A032AB0 in mod 5 +0x72ab0 =
282:   C:\WINDOWS\SYSTEM32\ntdll.dll
282:
282:   [drmemtrace]: Appending delayed branch tagalong entry type write (1) for
282:   thread 0
282:
282:   [drmemtrace]: Appending delayed branch tagalong entry type read (0) for
282:   thread 0
282:
282:   [drmemtrace]: Inserted indirect branch target 77802AB0
282:
282:   [drmemtrace]: Appending delayed branch type=15 pc=00508CE3 decode=09498CE3
282:   target=00000000 for thread 0
282:
282:   [drmemtrace]: Appending delayed branch tagalong entry type encoding (47)
282:   for thread 0
282:
282:   [drmemtrace]: Appending delayed branch tagalong entry type encoding (47)
282:   for thread 0
282:
282:   [drmemtrace]: Chunk instr count is now 0
282:
282:   [drmemtrace]: Appended encoding entry for 0A032AB0 sz=4 0x1a0008b8...
282:
282:   [drmemtrace]: Appended encoding entry for 0A032AB0 sz=1 0x00000000...
282:
282:   [drmemtrace]: Appended instr fetch for original 77802AB0
282:
282:   [drmemtrace]: Chunk instr count is now 0
282:
282:   [drmemtrace]: Appended encoding entry for 0A032AB5 sz=4 0x048a20ba...
282:
282:   [drmemtrace]: Appended encoding entry for 0A032AB5 sz=1 0x0000000a...
282:
282:   [drmemtrace]: Appended instr fetch for original 77802AB5
282:
282:   [drmemtrace]: Chunk instr count is now 0
282:
282:   [drmemtrace]: Omitting syscall instr without subsequent number marker.
282:
282:   [drmemtrace]: extra memref entry: 00BDF1AC
282:
282:   raw2trace failed: Failed to process file for thread 13192: memref entry
282:   found outside of bb
282:
282:   Assertion failed: false, file
282:   D:\derek\dr\git\src\clients\drcachesim\tests\window_test.cpp, line 109

0:000> U ntdll+72ab0
ntdll!NtWriteFile:
77802ab0 b808001a00      mov     eax,1A0008h
77802ab5 ba208a8177      mov     edx,offset ntdll!RtlInterlockedCompareExchange64+0x170 (77818a20)
77802aba ffd2            call    edx
77802abc c22400          ret     24h
77802abf 90              nop

Looks like that assert is from removing what DR treats as a syscall instr on WOW64 which is a call which has a store and the store is left dangling.
Filed #6280.

derekbruening added a commit that referenced this pull request Aug 25, 2023
Due to complexities in identifying and removing them, omitting false
wow64 syscalls is disabled.

Tested: Ran the windows-test from PR #6165 which was failing every
time and confirmed it now passes.

Issue: #6280, #6131
Fixes #6280
derekbruening added a commit that referenced this pull request Aug 25, 2023
Due to complexities in identifying and removing them, omitting false
wow64 syscalls is disabled.

Tested: Ran the windows-test from PR #6165 which was failing every time
and confirmed it now passes.

Issue: #6280, #6131
Fixes #6280
@bete0 bete0 enabled auto-merge August 26, 2023 00:14
@bete0 bete0 added this pull request to the merge queue Aug 26, 2023
Merged via the queue into master with commit 4b7245f Aug 26, 2023
15 checks passed
@bete0 bete0 deleted the i6104-window-fix branch August 26, 2023 01:44
derekbruening added a commit that referenced this pull request Sep 8, 2023
Adds a timestamp+cpuid pair at the end of normal buffers, to help
separate trace output i/o time.

Adds a timestamp+cpuid pair before and after each application syscall.
It appears around the syscall marker. The trace buffer is no longer
output pre-syscall (this addresses #3113) except for i-filtered traces
or small-window traces where we need a frequent trace size check.

Adds a timestamp+cpuid pair when a kernel transfer event occurs. The
trace buffer is no longer output here.

Fixes the existing behavior where the post-syscall-buffer's timestamp
contains the pre-syscall time.

Bumps the trace version for this change as the contents of post-syscall
timestamps are now different.

Augments the documentation around timestamps and adds a missing entry on
syscall markers.

Adds some invariant checks, and improve existing syscall marker checks.
Further checks are more difficult as the post-syscall time is not
present when there is no post-syscall event, such as with exit or
sigreturn. Did some manual testing with the view tool.

Applies several auxiliary changes required to get all modes to work with
these additional timestamps:
+ Removes the timestamp+cpuid from the split-pipe header, as that
timestamp is delayed by i/o and results in misleading out-of-order
timestamps.
+ Prevents pipe-splitting before a cpuid, to avoid a confusing sequence.
+ Fixes a bug where a syscall marker is missing when the prior block
wrote out its buffer (#6291).
+ Fixes #6245 by re-instating the footer for non-split window files and
removing the template `.*` from PR #6165 which was hiding the warning
(this one could possibly be split into its own PR).
+ Adjusts the PC discontinuity relaxation for online unfetched instrs
(#4915) to check the size now that there's no timestamp.

Issue: #4915
Fixes #6289
Fixes #3113
Fixes #6291 
Fixes #6245
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants