Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: fix sporadic deadlock in SIGUSR1 handler #5904

Merged
merged 1 commit into from
May 10, 2016

Conversation

bnoordhuis
Copy link
Member

Pull Request check-list

  • Does make -j8 test (UNIX) or vcbuild test nosign (Windows) pass with
    this change (including linting)?
  • Is the commit message formatted according to [CONTRIBUTING.md][0]?
  • If this change fixes a bug (or a performance problem), is a regression
    test (or a benchmark) included?
  • Is a documentation update included (if this change modifies
    existing APIs, or introduces new ones)?

Affected core subsystem(s)

debugger, test

Description of change

Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex. Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Fixes: #5368

CI: https://ci.nodejs.org/job/node-test-pull-request/2066/

@bnoordhuis bnoordhuis added debugger test Issues and PRs related to the tests. labels Mar 25, 2016
@jasnell
Copy link
Member

jasnell commented Mar 26, 2016

Failure in CI looks related.

@bnoordhuis
Copy link
Member Author

The freebsd102-64 one anyway:

not ok 176 test-debug-signal-cluster.js
# got pids [12787,12788,12789]
# 
# assert.js:89
#   throw new assert.AssertionError({
#   ^
# AssertionError: test timed out.
#     at Timeout.testTimedOut [as _onTimeout] (/usr/home/iojs/build/workspace/node-test-commit-freebsd/nodes/freebsd102-64/test/parallel/test-debug-signal-cluster.js:53:3)
#     at Timer.unrefdHandle (timers.js:454:14)
# > all workers are running
# > Starting debugger agent.
# > Debugger listening on port 12388
  ---
  duration_ms: 4.294

I'm going to run the CI again because that test has been flaky in the past: https://ci.nodejs.org/job/node-test-pull-request/2073/

@jasnell
Copy link
Member

jasnell commented Apr 1, 2016

Two failures in CI on freebsd again.

@jasnell
Copy link
Member

jasnell commented Apr 1, 2016

is this relevant to v4, btw?

@bnoordhuis
Copy link
Member Author

Yes, v4 is affected. I speculate the freebsd failures are due to libc's sem_post() not being async signal-safe, contrary to what POSIX mandates. It's probably possible to work around that by picking up an alternative implementation from libthr but I need to build up the courage to go in and actually do it.

@MylesBorins
Copy link
Contributor

@bnoordhuis If something scares you then I probably wouldn't know where to start, but if you want to pair on it or if there is something I can do to help let me know

@bnoordhuis
Copy link
Member Author

Slightly different take that doesn't use atomics: https://ci.nodejs.org/job/node-test-pull-request/2524/

If that doesn't fix FreeBSD, I'm marking the test flaky on that platform.

@addaleax
Copy link
Member

addaleax commented May 8, 2016

@bnoordhuis https://lists.freebsd.org/pipermail/freebsd-current/2014-March/048885.html

Tried this PR out in a FreeBSD VM, and removing the PTHREAD_STACK_MIN line fixes the test failure.

@bnoordhuis
Copy link
Member Author

@addaleax Good catch, although that attitude is frustrating: why make writing cross-platform software easier when you can be pedantically correct in your interpretation of the spec?

Anyway, I added a workaround, let's see how it fares: https://ci.nodejs.org/job/node-test-pull-request/2537/

@addaleax
Copy link
Member

addaleax commented May 8, 2016

@bnoordhuis I don’t know, they effectively made the minimum stack provided by the pthreads implementation useless for anything but setting up a new stack and seem to know that. Wouldn’t be a fan of that either if I were a regular FreeBSD user.

CI is green up to general hickups and the code changes here LGTM.

Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: nodejs#5368
PR-URL: nodejs#5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@bnoordhuis bnoordhuis closed this May 10, 2016
@bnoordhuis bnoordhuis deleted the fix5368 branch May 10, 2016 09:27
@bnoordhuis bnoordhuis merged commit 844f0a9 into nodejs:master May 10, 2016
evanlucas pushed a commit that referenced this pull request May 17, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: #5368
PR-URL: #5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@MylesBorins
Copy link
Contributor

@bnoordhuis do you think this is safe to land in v4.x?

@bnoordhuis
Copy link
Member Author

I'd give it a little longer. It's only been released in v6.2.0 so far.

@MylesBorins
Copy link
Contributor

@bnoordhuis thought on backporting now?

@bnoordhuis
Copy link
Member Author

I think it should be good to go now.

@MylesBorins
Copy link
Contributor

@bnoordhuis just tried and it didn't land cleanly, would you be willing to backport?

@MylesBorins
Copy link
Contributor

ping @bnoordhuis for a backport

@bnoordhuis
Copy link
Member Author

#7675

bnoordhuis added a commit to bnoordhuis/io.js that referenced this pull request Jul 12, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: nodejs#5368
PR-URL: nodejs#5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: #5368
PR-URL: #5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this pull request Jul 12, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: #5368
PR-URL: #5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@MylesBorins MylesBorins mentioned this pull request Jul 12, 2016
MylesBorins pushed a commit that referenced this pull request Jul 14, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: #5368
PR-URL: #5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this pull request Jul 14, 2016
Calling v8::Debug::DebugBreak() directly from the signal handler is
unsafe because said function tries to grab a mutex.  Work around that
by starting a watchdog thread that is woken up from the signal handler,
which then calls v8::Debug::DebugBreak().

Using a watchdog thread also removes the need to use atomic operations
so this commit does away with that.

Fixes: #5368
PR-URL: #5904
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flaky test-debug-port-from-cmdline
4 participants