Replay fails when watchpoint is set #2238

Closed
bgamari opened this issue Aug 12, 2018 · 10 comments

@bgamari
Contributor

bgamari commented Aug 12, 2018

I'm seeing the following failure on Linux 4.17.12 (running NixOS) whenever a watchpoint is enabled:

[FATAL /build/source/src/fast_forward.cc:308:fast_forward_through_instruction()] 
 (task 4080 (rec:25236) at time 1381)
 -> Assertion `ok' failed to hold. Can't even handle one watchpoint???
=== Start rr backtrace:
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr13dump_rr_stackEv+0x41)[0x558ce1]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr9GdbServer15emergency_debugEPNS_4TaskE+0x5c5)[0x48b225]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr21EmergencyDebugOstreamD1Ev+0x11b)[0x49856b]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr32fast_forward_through_instructionEPNS_4TaskENS_13ResumeRequestERKSt6vectorIPKNS_9RegistersESaIS6_EE+0x9e3)[0x461ca3]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr13ReplaySession20emulate_async_signalEPNS_10ReplayTaskERKNS0_15StepConstraintsEl+0x7a2)[0x504542]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr13ReplaySession18try_one_trace_stepEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0x142)[0x504f82]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr13ReplaySession11replay_stepERKNS0_15StepConstraintsE+0x128)[0x505b98]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr14ReplayTimeline33run_forward_to_intermediate_pointERKNS0_4MarkENS0_13ForceProgressE+0x57e)[0x5188fe]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr14ReplayTimeline16reverse_continueERKSt8functionIFbPNS_10ReplayTaskEEERKS1_IFbvEE+0x1340)[0x51e4b0]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr9GdbServer14debug_one_stepERNS_10GdbRequestE+0xd3b)[0x48937b]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr9GdbServer12serve_replayERKNS0_15ConnectionFlagsE+0x6f3)[0x48a983]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr[0x4fc1af]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_ZN2rr13ReplayCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x574)[0x4fd314]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(main+0x200)[0x56140f]
/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib/libc.so.6(__libc_start_main+0xf0)[0x7fc80645a020]
/opt/exp/ghc/ghc-utils/gdb/result/bin/rr(_start+0x2a)[0x43cc9a]
=== End rr backtrace
Launch gdb with
  gdb '-l' '10000' '-ex' 'set sysroot /' '-ex' 'target extended-remote 127.0.0.1:4080' /home/ben/.local/share/rr/spellcheck-31/mmap_hardlink_3_spellcheck

Any idea what might be going on here?

@khuey
Collaborator

khuey commented Aug 12, 2018

Are you running in a VM? Do watchpoints work in GDB?

@bgamari
Contributor Author

bgamari commented Aug 12, 2018

> Are you running in a VM? Do watchpoints work in GDB?

I am not; this is running directly on my Skylake laptop. Yes, watchpoints work under GDB.

Looking more carefully, this appears to be somewhat address-specific. For a given replay, setting a watchpoint on some addresses works, whereas on others it fails with the above error (regardless of whether the program is running forward or backward).

@rocallahan
Collaborator

Weird. Can you characterize which addresses work and which don't?

@bgamari
Contributor Author

bgamari commented Aug 17, 2018

I'm still trying to sort that out; I'm debugging a language runtime (namely that of the Glasgow Haskell Compiler), and the issue has been hard to reproduce since I first saw it.

@bgamari
Contributor Author

bgamari commented Aug 18, 2018

I am once again seeing this; which addresses fail seems to be quite random. However, they all share the property that they reside in the Haskell heap. @rocallahan, any hints on debugging this? I'd be happy to dig in a bit as I'm quite reliant on rr to debug my current project.

@bgamari bgamari closed this as completed Aug 18, 2018
@bgamari bgamari reopened this Aug 18, 2018
@rocallahan
Collaborator

The simplest approach is probably to run rr pack on your trace, and put that somewhere I can download it, along with a reproducible test case in the form of a list of gdb commands to run.

@bgamari
Contributor Author

bgamari commented Oct 9, 2018

Alright, I've uploaded the packed trace here. The following will hopefully reproduce the issue:

$ rr replay
>>> cont
>>> watch *(void**) 0x4200105788
>>> reverse-cont

This fails with:

[FATAL /build/source/src/fast_forward.cc:309:fast_forward_through_instruction()] 
 (task 12713 (rec:6155) at time 430)
 -> Assertion `ok' failed to hold. Can't even handle one watchpoint???
=== Start rr backtrace:
rr(_ZN2rr13dump_rr_stackEv+0x41)[0x561b51]
rr(_ZN2rr9GdbServer15emergency_debugEPNS_4TaskE+0x5c5)[0x4902b5]
rr(_ZN2rr21EmergencyDebugOstreamD2Ev+0x11b)[0x49e03b]
rr(_ZN2rr32fast_forward_through_instructionEPNS_4TaskENS_13ResumeRequestERKSt6vectorIPKNS_9RegistersESaIS6_EE+0xa07)[0x466ac7]
rr(_ZN2rr13ReplaySession21cont_syscall_boundaryEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0x149)[0x507ae9]
rr(_ZN2rr13ReplaySession18patch_next_syscallEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0x2b)[0x50aa5b]
rr(_ZN2rr13ReplaySession18try_one_trace_stepEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0xfb)[0x50d32b]
rr(_ZN2rr13ReplaySession11replay_stepERKNS0_15StepConstraintsE+0x128)[0x50df88]
rr(_ZN2rr14ReplayTimeline31fix_watchpoint_coalescing_quirkERNS_12ReplayResultERKNS0_9ProtoMarkE+0x322)[0x521e52]
rr(_ZN2rr14ReplayTimeline40update_strategy_and_fix_watchpoint_quirkERNS0_24ReplayStepToMarkStrategyERKNS_13ReplaySession15StepConstraintsERNS_12ReplayResultERKNS0_9ProtoMarkE+0x1f)[0x5222cf]
rr(_ZN2rr14ReplayTimeline19replay_step_to_markERKNS0_4MarkERNS0_24ReplayStepToMarkStrategyE+0x1dd)[0x5224bd]
rr(_ZN2rr14ReplayTimeline16reverse_continueERKSt8functionIFbPNS_10ReplayTaskEEERKS1_IFbvEE+0xbdd)[0x52618d]
rr(_ZN2rr9GdbServer14debug_one_stepERNS_10GdbRequestE+0xd3b)[0x48e40b]
rr(_ZN2rr9GdbServer12serve_replayERKNS0_15ConnectionFlagsE+0x6f3)[0x48fa13]
rr[0x5041bf]
rr(_ZN2rr13ReplayCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x574)[0x505324]
rr(main+0x200)[0x56a297]
/nix/store/fg4yq8i8wd08xg3fy58l6q73cjy8hjr2-glibc-2.27/lib/libc.so.6(__libc_start_main+0xee)[0x7f43971d3b8e]
rr(_start+0x2a)[0x44307a]

with rr-5.2.0.

Thanks for your help!
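
For anyone re-running this later, the gdb steps above can be kept in a command file so the same sequence is used each time. This is only a minimal sketch: the file name `repro.gdb` is made up for illustration, `continue` and `reverse-continue` are the long forms of the `cont` and `reverse-cont` commands above, and the watch address is the one from this report, so it is only meaningful for this particular trace.

```shell
# Write the reproduction steps from this report into a gdb command file.
# "repro.gdb" is a hypothetical name chosen for this sketch.
cat > repro.gdb <<'EOF'
continue
watch *(void**) 0x4200105788
reverse-continue
EOF

# Print the file so it can be pasted into a bug report alongside the trace.
cat repro.gdb
```

If your rr version accepts a gdb command file on `rr replay` (I believe this is the `-x` option, but that is an assumption worth checking against the help output of your build), the file could be passed as `rr replay -x repro.gdb`. Otherwise the three commands can simply be typed at the gdb prompt as shown above.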

@bgamari
Contributor Author

bgamari commented Oct 9, 2018

Hmm, I'm actually having trouble reproducing this on master. Perhaps it was accidentally fixed?

@bgamari
Contributor Author

bgamari commented Oct 9, 2018

Alright, I'm going to close this until I see it again with master.

@bgamari bgamari closed this as completed Oct 9, 2018
@rocallahan
Collaborator

Might have been fixed by 3858729.
