New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replay divergence when replaying wine without syscallbuf enabled #1708

Closed
Keno opened this Issue May 21, 2016 · 14 comments

Comments

Projects
None yet
2 participants
@Keno
Contributor

Keno commented May 21, 2016

~/rr-vanilla-build/bin/rr record -n wine64 ./usr/bin/julia-debug.exe -e 'bytestring(Ptr{UInt8}(1))'
rr: Saving the execution of `wine64' to trace directory `/home/kfischer/.local/share/rr/wine64-9'.
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol <0x00000000>
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol <0x00000000>
fixme:ntdll:NtQueryVolumeInformationFile 0x118: volume info not supported

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7fc2fd54caea -- Assertion failed!

Program: Z:\home\kfischer\julia-win64\usr\bin\julia-debug.exe
File: /home/kfischer/julia-win64/src/debuginfo.cpp, Line 966

Expression: (*obj)->isCOFF()

abnormal program termination

signal (22): SIGABRT
while loading no file, in expression starting on line 0
crt_sig_handler at /home/kfischer/julia-win64/src\.\signals-win.c:87
MSVCRT_raise at /home/kfischer/wine-build/dlls/msvcrt/../../../wine/dlls/msvcrt/except.c:261
Allocations: 866423 (Pool: 865501; Big: 922); GC: 1
kfischer2@nanosoldier1:~/julia-win64$ ~/rr-vanilla-build/bin/rr replay -a
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol <0x00000000>
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol <0x00000000>
[ERROR /home/kfischer/rr-vanilla/src/Registers.cc:303:maybe_print_reg_mismatch() errno: SUCCESS] rdx 0 != 0xff (replaying vs. recorded)
[FATAL /home/kfischer/rr-vanilla/src/Registers.cc:423:compare_register_files() errno: SUCCESS]
 (task 24036 (rec:24003) at time 1132365)
 -> Assertion `!bail_error || match' failed to hold. Fatal register mismatch (ticks/rec:4473519809/4473519809)
Launch gdb with
  gdb /home/kfischer/.local/share/rr/latest-trace/mmap_hardlink_23_wine64
and attach to the rr debug server with:
  target remote :24036

The relevant backtrace is:

#0  0x00007fc2fd898400 in __close_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fc2fae80d7d in getIPAddrTable (ppIpAddrTable=0xc5eb68,
    heap=<optimized out>, flags=<optimized out>)
    at ../../../wine/dlls/iphlpapi/ifenum.c:433
#2  0x00007fc2fae812d5 in v4addressesFromIndex (index=1, addrs=0xc5ebf8,
    num_addrs=0xc5ebe4, masks=0xc5ec00)
    at ../../../wine/dlls/iphlpapi/iphlpapi_main.c:802
#3  0x00007fc2fae816db in adapterAddressesFromIndex (family=<optimized out>,
    flags=30, index=1, aa=0x0, size=0xc5ed00)
    at ../../../wine/dlls/iphlpapi/iphlpapi_main.c:925
#4  0x00007fc2fae84071 in GetAdaptersAddresses (family=0, flags=30,
    reserved=<optimized out>, aa=0x0, buflen=0xc5ee54)
    at ../../../wine/dlls/iphlpapi/iphlpapi_main.c:1440

so it looks like the close syscall is changing the registers.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

What's the tail of rr dump -b -m -p?

It may be helpful to use rr replay -g to replay up to the previous event and then singlestep through to find the last instruction setting rdx.

Member

rocallahan commented May 22, 2016

What's the tail of rr dump -b -m -p?

It may be helpful to use rr replay -g to replay up to the previous event and then singlestep through to find the last instruction setting rdx.

@Keno

This comment has been minimized.

Show comment
Hide comment
@Keno

Keno May 22, 2016

Contributor

The log around that event is

{
  real_time:20234337.075895 global_time:1132364, event:`SYSCALL: ioctl' (state:EXITING_SYSCALL) tid:24003, ticks:4473519807
 rax:0x0 rbx:0x7d186e68 rcx:0xffffffffffffffff rdx:0xc5ead0 rsi:0x891b rdi:0x36 rbp:0x36 rsp:0xc5ea78 r8:0x0 r9:0x4 r10:0x2 r11:0x246 r12:0x7d186f0c r13:0xc5eb68 r14:0x7ffffe078c94 r15:0x0 rip:0x7fc2fd5b5337 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x10 fs_base:0x7fc2fe285780 gs_base:0x7fffff7e8000
}
{
  real_time:20234337.075913 global_time:1132365, event:`SYSCALL: close' (state:ENTERING_SYSCALL) tid:24003, ticks:4473519809
 rax:0xffffffffffffffda rbx:0x7d186e68 rcx:0xffffffffffffffff rdx:0xff rsi:0x891b rdi:0x36 rbp:0x36 rsp:0xc5ea78 r8:0x0 r9:0x4 r10:0x2 r11:0x246 r12:0x7d186f0c r13:0xc5eb68 r14:0x7ffffe078c94 r15:0x0 rip:0x7fc2fd898400 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3 fs_base:0x7fc2fe285780 gs_base:0x7fffff7e8000
}

will try going to the event before and single stepping.

Contributor

Keno commented May 22, 2016

The log around that event is

{
  real_time:20234337.075895 global_time:1132364, event:`SYSCALL: ioctl' (state:EXITING_SYSCALL) tid:24003, ticks:4473519807
 rax:0x0 rbx:0x7d186e68 rcx:0xffffffffffffffff rdx:0xc5ead0 rsi:0x891b rdi:0x36 rbp:0x36 rsp:0xc5ea78 r8:0x0 r9:0x4 r10:0x2 r11:0x246 r12:0x7d186f0c r13:0xc5eb68 r14:0x7ffffe078c94 r15:0x0 rip:0x7fc2fd5b5337 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x10 fs_base:0x7fc2fe285780 gs_base:0x7fffff7e8000
}
{
  real_time:20234337.075913 global_time:1132365, event:`SYSCALL: close' (state:ENTERING_SYSCALL) tid:24003, ticks:4473519809
 rax:0xffffffffffffffda rbx:0x7d186e68 rcx:0xffffffffffffffff rdx:0xff rsi:0x891b rdi:0x36 rbp:0x36 rsp:0xc5ea78 r8:0x0 r9:0x4 r10:0x2 r11:0x246 r12:0x7d186f0c r13:0xc5eb68 r14:0x7ffffe078c94 r15:0x0 rip:0x7fc2fd898400 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3 fs_base:0x7fc2fe285780 gs_base:0x7fffff7e8000
}

will try going to the event before and single stepping.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

This should be easy since there were only two conditional branches retired between the ioctl and the close and rdx must have obtained its incorrect value in between.

Member

rocallahan commented May 22, 2016

This should be easy since there were only two conditional branches retired between the ioctl and the close and rdx must have obtained its incorrect value in between.

@Keno

This comment has been minimized.

Show comment
Hide comment
@Keno

Keno May 22, 2016

Contributor

Yes, I believe I found it, but I haven't yet made sense of what's happening:

   0x00007fc2fae80d61 <+449>:   callq  0x7fc2fae7d4f0 <ioctl@plt>
   0x00007fc2fae80d66 <+454>:   mov    $0xffffffff,%edx
=> 0x00007fc2fae80d6b <+459>:   test   %eax,%eax
   0x00007fc2fae80d6d <+461>:   cmove  0x64(%rsp),%edx
   0x00007fc2fae80d72 <+466>:   mov    %ebp,%edi
   0x00007fc2fae80d74 <+468>:   mov    %edx,0x34(%rsp)
   0x00007fc2fae80d78 <+472>:   callq  0x7fc2fae7d510 <close@plt>
Contributor

Keno commented May 22, 2016

Yes, I believe I found it, but I haven't yet made sense of what's happening:

   0x00007fc2fae80d61 <+449>:   callq  0x7fc2fae7d4f0 <ioctl@plt>
   0x00007fc2fae80d66 <+454>:   mov    $0xffffffff,%edx
=> 0x00007fc2fae80d6b <+459>:   test   %eax,%eax
   0x00007fc2fae80d6d <+461>:   cmove  0x64(%rsp),%edx
   0x00007fc2fae80d72 <+466>:   mov    %ebp,%edi
   0x00007fc2fae80d74 <+468>:   mov    %edx,0x34(%rsp)
   0x00007fc2fae80d78 <+472>:   callq  0x7fc2fae7d510 <close@plt>
@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

I guess the divergence is because that cmove afterwards is firing during replay but not during recording.

Member

rocallahan commented May 22, 2016

I guess the divergence is because that cmove afterwards is firing during replay but not during recording.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

Or, hmm, eax must be the same or we'd have asserted for the ioctl, so maybe 0x64(%rsp) is different between replay and recording?

Member

rocallahan commented May 22, 2016

Or, hmm, eax must be the same or we'd have asserted for the ioctl, so maybe 0x64(%rsp) is different between replay and recording?

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

Looking that the source for whatever code this is might give us a clue.

Member

rocallahan commented May 22, 2016

Looking that the source for whatever code this is might give us a clue.

@Keno

This comment has been minimized.

Show comment
Hide comment
@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

OK, I'm pretty sure the bug will be in rr's handling of that ioctl, not recording enough of the modified memory.

Member

rocallahan commented May 22, 2016

OK, I'm pretty sure the bug will be in rr's handling of that ioctl, not recording enough of the modified memory.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

Ah, these SIO* ioctl codes are nonstandard and don't encode the RW bits, and process_ioctl lets them through hoping that they're safe.

Member

rocallahan commented May 22, 2016

Ah, these SIO* ioctl codes are nonstandard and don't encode the RW bits, and process_ioctl lets them through hoping that they're safe.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

So let me know if you're patching this, or I am :-)

Member

rocallahan commented May 22, 2016

So let me know if you're patching this, or I am :-)

@Keno

This comment has been minimized.

Show comment
Hide comment
@Keno

Keno May 22, 2016

Contributor

I'm not sure I know exactly what to do yet. I'm sure I can figure it out, but if you already know what and where, please go right ahead.

Contributor

Keno commented May 22, 2016

I'm not sure I know exactly what to do yet. I'm sure I can figure it out, but if you already know what and where, please go right ahead.

@rocallahan

This comment has been minimized.

Show comment
Hide comment
@rocallahan

rocallahan May 22, 2016

Member

SUre, I'll do it.

Member

rocallahan commented May 22, 2016

SUre, I'll do it.

@Keno

This comment has been minimized.

Show comment
Hide comment
@Keno

Keno May 23, 2016

Contributor

Confirmed fixed, thanks!

Contributor

Keno commented May 23, 2016

Confirmed fixed, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment