Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious failure "ESRCH: No such process" #207

Closed
elinorbgr opened this issue Feb 3, 2019 · 7 comments
Closed

Spurious failure "ESRCH: No such process" #207

elinorbgr opened this issue Feb 3, 2019 · 7 comments
Labels
bug Instrumentation Issues relating to ptrace and parsing of DWARF tables

Comments

@elinorbgr
Copy link
Contributor

Rarely, it happens that when running tarpaulin on a test suite, a test fails with the following error:

[ERROR tarpaulin] Failed to get test coverage! Error: Failed to run tests!
Error: Error occurred when handling ptrace event: ESRCH: No such process

I didn't really find a way to consistently reproduce it, but it happens from time to time.

@xd009642
Copy link
Owner

xd009642 commented Feb 3, 2019

Interesting... I'm going to add a new --debug flag to tarpaulin to print out a lot more information to aid in debugging tracing based issues. I'll let you know when it's ready

@xd009642 xd009642 added bug Instrumentation Issues relating to ptrace and parsing of DWARF tables labels Feb 3, 2019
@elinorbgr
Copy link
Contributor Author

Here is a debug output I managed to reproduce, if that helps.

The test code is as follows, with a dependency on futures-executor = "0.2.1"

#[test]
pub fn a() {
    futures_executor::ThreadPool::new();
}

#[test]
pub fn b() {
    futures_executor::ThreadPool::new();
}

and the debug log is:

[TRACE tarpaulin] Debug mode activated
[INFO tarpaulin] Running Tarpaulin
[INFO tarpaulin] Building project
[DEBUG tarpaulin] Processing tarpotest
[INFO tarpaulin] Launching test
[INFO tarpaulin] running /home/levans/test/tarpotest/target/debug/deps/tarpotest-a3b55d59815e497a
[TRACE tarpaulin] Adding trace at address 0x403650 in src/lib.rs:2
[TRACE tarpaulin] Adding trace at address 0x403654 in src/lib.rs:3
[TRACE tarpaulin] Adding trace at address 0x403674 in src/lib.rs:8
[TRACE tarpaulin] Adding trace at address 0x403670 in src/lib.rs:7
[TRACE tarpaulin] Adding trace at address 0x403b10 in src/lib.rs:2
[TRACE tarpaulin] Adding trace at address 0x403b30 in src/lib.rs:7
[TRACE tarpaulin] Test PID is 13759
[TRACE tarpaulin] Caught inferior transitioning to Initialise state
[TRACE tarpaulin] Initialised inferior, transitioning to wait state

running 2 tests
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b10
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b10
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403650
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403653
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403654
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403656
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13760
[TRACE tarpaulin] Caught signal Stopped(Pid(13760), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13761
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13762
[TRACE tarpaulin] Caught signal Stopped(Pid(13762), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13763
.[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b30
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403b30
[TRACE tarpaulin] Caught signal Stopped(Pid(13763), SIGSTOP)
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403670
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403673
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13763), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403674
[TRACE tarpaulin] Incrementing hit count for trace
[TRACE tarpaulin] Caught signal Stopped(Pid(13759), SIGTRAP)
[TRACE tarpaulin] Hit address 0x403676
[TRACE tarpaulin] Caught signal Exited(Pid(13763), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13762), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Stopped(Pid(13761), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13760), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13762), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13761), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13761), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13764
[TRACE tarpaulin] Caught signal Stopped(Pid(13764), SIGSTOP)
[TRACE tarpaulin] Caught signal Exited(Pid(13760), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13765
[TRACE tarpaulin] Caught signal Stopped(Pid(13765), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13766
[TRACE tarpaulin] Caught signal Stopped(Pid(13766), SIGSTOP)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13759), SIGTRAP, 3)
[TRACE tarpaulin] New thread spawned 13767
[TRACE tarpaulin] Caught signal Stopped(Pid(13767), SIGSTOP)
.
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13764), SIGTRAP, 6)
test result: [TRACE tarpaulin] Child exiting
ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13765), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[TRACE tarpaulin] Caught signal Exited(Pid(13764), 0)
[TRACE tarpaulin] Caught signal Exited(Pid(13765), 0)
[TRACE tarpaulin] Caught signal PtraceEvent(Pid(13766), SIGTRAP, 6)
[TRACE tarpaulin] Child exiting
[ERROR tarpaulin] Failed to get test coverage! Error: Failed to run tests! Error: Error occurred when handling ptrace event: ESRCH: No such process

@xd009642
Copy link
Owner

xd009642 commented Feb 6, 2019

That potentially helps... For some reason 13767 is being killed off without me catching it.

If I had to hazard a guess - ptrace is poorly designed in some areas and one of them is that PTRACE_CONTINUE sends a continue command to every thread in the process. Given the threadpool seems to spawn 3 threads that immediately exit this might cause a race condition in some instances where one PTRACE_CONTINUE continues the exit on 2+ threads before I handle the signal for the second.

That sounds somewhat feasible to me, I'll add some more debug printouts through the coming days and try some things out. Hopefully, this will lead to a solution

xd009642 added a commit that referenced this issue Feb 6, 2019
The two issues seem related so grouping all work here. From what I've found there's some issues with bad jumps or addresses during the run which can cause these errors either consistently or inconsistently
@xd009642
Copy link
Owner

@vberger could you see if this issue still persists with the latest version on the develop branch?

@elinorbgr
Copy link
Contributor Author

I can still reproduce it, though it seems to occur more rarely (occurred once in around 30 runs).

xd009642 added a commit that referenced this issue Jul 28, 2019
Added the minimal example from #207 and had the coverage test run
it 10 times to deal with the more spurious nature of this test
@xd009642
Copy link
Owner

xd009642 commented Oct 3, 2019

@vberger I'm pretty sure this is solved by the fix in #190, I'm currently releasing 0.9.0 if you could try to recreate and let me know if I can close this. Otherwise I'll close it in a few days and reopen it if there are any updates

@elinorbgr
Copy link
Contributor Author

I cannot seem to reproduce it any more. 🎉

I'll reopen if I ever see it again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Instrumentation Issues relating to ptrace and parsing of DWARF tables
Projects
None yet
Development

No branches or pull requests

2 participants