Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8260485: Simplify and unify handler vectors in Posix signal code #2251

Closed

Conversation

tstuefe
Copy link
Member

@tstuefe tstuefe commented Jan 27, 2021

In signal handling code, we have code sections which save signal handler state into vectors of sigaction structures, or of integers (if only flags are saved). All these code sections can be unified, disentangled and the using code simplified.

There are three places where we do this:

  1. When installing hotspot signal handlers, should we find a handler in place and signal chaining is enabled, we save the original handler inside a sigaction array and a corresponding sigset:

    struct sigaction sigact[NSIG];

    static struct sigaction* get_preinstalled_handler(int sig) {

  2. if diagnostics are enabled with -Xcheck:jni, we periodically check if our hotspot signal handlers had been replaced (static void check_signal_handler(int sig)):

    static void check_signal_handler(int sig) {

    To do that, we store information about the handlers we installed and we expect to be intact; in this case we only store the sigaction flags (int sigflags[NSIG];) and deduce the handler address from context.

  3. There is a complicated dance between VMError and the posix signal handler code: If a fatal error happens, we enter error reporting and install the secondary handler (VMError::install_secondary_signal_handler()). Before doing that, we store the handler we replace in yet another array, in this case one array for the handler address, one for the flag:

    static void save_signal(int idx, int sig)

    I believe the purpose of this is to - when printing signal handlers as part of error reporting - print the original signal handler instead of the secondary crash handler (see PosixSignals::print_signal_handler()):
    address rh = VMError::get_resetted_sighandler(sig);

    and additionally to not trip this warning here:
    ", flags was changed from " PTR32_FORMAT ", consider using jsig library",


Changes in this patch:

  • I added some convenience macros to check if a handler matches a given function (HANDLER_IS), check if a handler is set to ignore or default or both (HANDLER_IS_IGN, HANDLER_IS_DFL, HANDLER_IS_IGN_OR_DFL). Makes code more readable.

  • I added convenience class SavedSignalHandlers to keep a vector of handler information by signal number.

  • I used that class to cover cases (1)..(3):

    • chained_handlers contains all information of chained handlers
    • expected_handlers contains a copy of the handlers the hotspot installed
    • replaced_handlers contains information about replaced handlers
  • about (1): I store the chained signal handler information in chained_handlers when installing a hotspot handler, UseSignalChaining is 1, and a non-default handler was encountered.

  • about (2): I simplified the signal checking mechanism quite a bit: it compares the handler (address and flags) it finds present with expectations. Before this patch, the expected handler address was deduced in a hard-wired way, now, we just compare the active sigaction structure with the one we installed on VM start.

  • about (3): when installing any handler (hotspot as well as user defined via java), I store the handler it replaced in replaced_handlers. I use that to print which handler had been replaced in PosixSignals::print_signal_handler. I simplified PosixSignals::print_signal_handler such that it does not retain any knowledge about hotspot signal handlers. Now, it just prints out the currently established handlers. In addition to that, it prints out chaining information and which handlers had been replaced. I removed the associated coding from VMError.

Output Before:

663 Signal Handlers:
664    SIGSEGV: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
665     SIGBUS: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
666     SIGFPE: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
667    SIGPIPE: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
668    SIGXFSZ: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
669     SIGILL: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
670    SIGUSR2: SR_handler in libjvm.so, sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO
671     SIGHUP: UserHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
672     SIGINT: UserHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
673    SIGTERM: UserHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
674    SIGQUIT: UserHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
675    SIGTRAP: javaSignalHandler in libjvm.so, sa_mask[0]=11100100010111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO

Now:

 Signal Handlers:
    SIGSEGV: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   replaced:    SIGSEGV: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
     SIGBUS: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   replaced:     SIGBUS: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
     SIGFPE: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   replaced:     SIGFPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGPIPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGXFSZ: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
     SIGILL: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   replaced:     SIGILL: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGUSR2: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO
     SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
     SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGTRAP: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO

Tests: GA, and the patch has been tested in our nighlies for over a month now. I manually executed the runtime/jni/checked tests too.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8260485: Simplify and unify handler vectors in Posix signal code

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2251/head:pull/2251
$ git checkout pull/2251

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 27, 2021

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 27, 2021

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jan 27, 2021
@tstuefe tstuefe force-pushed the JDK-8260485-signal-handler-improvements branch from ae8a2c4 to 9476779 Compare January 27, 2021 10:28
@tstuefe tstuefe force-pushed the JDK-8260485-signal-handler-improvements branch from 9476779 to 016d35c Compare January 27, 2021 10:35
@tstuefe tstuefe marked this pull request as ready for review January 27, 2021 10:39
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 27, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 27, 2021

Webrevs

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Thomas,

I've taken a first pass at this. I like parts of it, perhaps most of it, but am having trouble seeing the before/after picture clearly.

A few comments below.

Thanks,
David

Comment on lines 98 to 99
assert(sig > 0 || sig < NSIG, "invalid signal number %d", sig);
return sig > 0 || sig < NSIG;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely && not ||

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this was an oversight.

// SavedSignalHandlers is a helper class for those cases, keeping an array of sigaction
// structures and membership information.
class SavedSignalHandlers {
struct sigaction _v[NSIG];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why _v ??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v as in vector... I changed it to _sa.

Comment on lines 102 to 105
bool is_set(int sig) const { return sigismember(&_set, sig); }
void mark_as_set(int sig) { sigaddset(&_set, sig); }
void mark_as_clear(int sig) { sigdelset(&_set, sig); }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't really pay for themselves IMO. They are private and only used once each, and are a single line of code. The extra abstraction layer really doesn't add anything in terms of clarity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I removed the whole sigset. See my comment below.

src/hotspot/os/posix/signals_posix.cpp Outdated Show resolved Hide resolved
Comment on lines 842 to 844
if (sig == SHUTDOWN2_SIGNAL && !isatty(fileno(stdin))) { // Flags?
tty->print_cr("Running in non-interactive shell, %s handler is replaced by shell",
os::exception_name(sig, buf, O_BUFLEN));
os::exception_name(sig, buf, O_BUFLEN)); // When comparing, ignore the SA_RESTORER flag on Linux
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the flag comments in this block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a typo.

@tstuefe
Copy link
Member Author

tstuefe commented Feb 2, 2021

Hi Thomas,

I've taken a first pass at this. I like parts of it, perhaps most of it, but am having trouble seeing the before/after picture clearly.

A few comments below.

Thanks,
David

Thanks David.

I reduced the complexity of the patch somewhat by removing case (3) completely. I found that it adds very little useful functionality. So all that is left now is (1) the simplified handling of chained handlers and (2) the new logic for implementing the signal handler jni check.

I also changed/simplified SavedSignalHandlers somewhat. NSIG can be largeish (I believe 1024 on AIX) and that would have made the array-of-sigaction structures large too. Not that it really matters much. But I changed the class into a pointer array, with the sigaction structures living in C-Heap. Since this array is sparsely populated, that saves space. Also removes the need for the membership sigset, since a NULL slot would indicate its not set.

Further answers inline. Feel free to ask questions if something is unclear.

Thanks, Thomas

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Thomas,

Still a few minor comments and queries, but overall it is looking good.

Thanks,
David


// For signal-chaining:
// if chaining is active, chained_handlers contains all handlers which we
// did replace with our own and to which we must delegate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/which we did replace/we replaced/

// Running under non-interactive shell, SHUTDOWN2_SIGNAL will be reassigned SIG_IGN
if (sig == SHUTDOWN2_SIGNAL && !isatty(fileno(stdin))) {
tty->print_cr("Running in non-interactive shell, %s handler is replaced by shell",
os::exception_name(sig, buf, O_BUFLEN));
os::exception_name(sig, buf, O_BUFLEN)); // When comparing, ignore the SA_RESTORER flag on Linux
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean in this context ???

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing. This, like the "flags" comment before, must have been accidental paste errors. I went through the patch again, but I hope this was the last one.

}
st->print(", flags=");
int flags = act->sa_flags;
// On Linux, hide the SA_RESTORE flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: SA_RESTORE -> SA_RESTORER

}
// Print established signal handler for this signal.
// - if this signal handler was installed by us and is chained to a pre-established user handler
// it did replace, print that one too.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/it did replace/it replaced/

Comment on lines -1389 to -1392
if ((int)sa.sa_flags != get_our_sigflags(sig)) {
st->print(
", flags was changed from " PTR32_FORMAT ", consider using jsig library",
get_our_sigflags(sig));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me that this check still remains somewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this check since I did not think it very useful. We have the checking code with Xcheck:jni, which does the same check.

But I am now reconsidering. If we run without Xcheck, it still is useful. I'll re-add this test.

Comment on lines -59 to -60
static int resettedSigflags[NUM_SIGNALS];
static address resettedSighandler[NUM_SIGNALS];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear what happened to these ?? (I'm not clear how they were used in the first place. :( )

Copy link
Member Author

@tstuefe tstuefe Feb 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they were used for path (3). When printing signal handlers, we do this little test to check - similar to what we do in Xcheck:jni - to print out if someone changed the handler under us. See:

address rh = VMError::get_resetted_sighandler(sig);

and
if ((int)sa.sa_flags != get_our_sigflags(sig)) {

.
But when printing signal handlers as part of a hs-err file, we replaced the hotspot signal handlers with the secondary crash handler and would trigger this test. I think this whole logic is needed just to prevent this.

But I actually think this whole exercise was pointless. Since resettedHandler and resettedFlags are set before the secondary crash handler is installed. And that only happens if the primary hotspot hanlder had been invoked and we do error reporting. So I think resettedHandler can only ever have one of two values: the address of our primary handler if we are in error handling. Or NULL outside of error handling.

@tstuefe
Copy link
Member Author

tstuefe commented Feb 3, 2021

I added a printout to print_signal_handler to print out a message if the expected handler changed. This is the same logic, somewhat abridged, applied at Xcheck:jni.

Only here, it fires whenever we print signal handlers independently from Xcheck:jni. So it gets also displayed in hs-err files or as part of a VM.info jcmd.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for the updates.

One query for further improvement below. :)

Thanks,
David

src/hotspot/os/posix/signals_posix.cpp Outdated Show resolved Hide resolved
src/hotspot/os/posix/signals_posix.cpp Outdated Show resolved Hide resolved
@openjdk
Copy link

openjdk bot commented Feb 5, 2021

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8260485: Simplify and unify handler vectors in Posix signal code

Reviewed-by: dholmes, gziemski

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 29 new commits pushed to the master branch:

  • 9cf4f90: 8261473: Shenandoah: Add breakpoint support
  • c4664e6: 8261940: Fix references to IOException in BigDecimal javadoc
  • 0e9c5ae: 8075909: [TEST_BUG] The regression-swing case failed as it does not have the 'Open' button when select 'subdir' folder with NimbusLAF
  • e9f3aab: 8261912: Code IfNode::fold_compares_helper more defensively
  • fd098e7: 8261838: Shenandoah: reconsider heap region iterators memory ordering
  • f94a845: 8261600: NMT: Relax memory order for updating MemoryCounter and fix racy updating of peak values
  • 1a7adc8: 8260416: Remove unused method ReferenceProcessor::is_mt_processing_set_up()
  • 3a21e1d: 8260653: Unreachable nodes keep speculative types alive
  • b695c7e: 8261925: ProblemList com/sun/jdi/AfterThreadDeathTest.java on Linux
  • 97e1657: 8261846: [JVMCI] c2v_iterateFrames can get out of sync with the StackFrameStream
  • ... and 19 more: https://git.openjdk.java.net/jdk/compare/d547e1a84723e94813fc600c884c9cfee0ed8f57...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 5, 2021
@tstuefe
Copy link
Member Author

tstuefe commented Feb 5, 2021

Looks good. Thanks for the updates.

One query for further improvement below. :)

Thanks,
David

Thanks for approval, David. Unfortunately I still have a strange build problem on one of our s390 boxes, so this is not done yet.

@tstuefe
Copy link
Member Author

tstuefe commented Feb 5, 2021

Hi David,

sorry, its not yet done.

Changes in this version:

  1. I found that on zlinux, sigaction.sa_flags is actually a long unsigned int. Which surprised me since this is clearly not posix compatible. Therefore I rewrote the code, adding a new function get_sanitized_sa_flags, which hides the correct casting and also contains the stripping away of unwanted flags set by the kernel (SA_RESTORER). As a side effect this cleanly factors out the SA_RESTORER handling which is nice.

  2. You requested that print_single_signal_handler should print the flag difference when spotting a change. I agree and expanded on this: now print_single_signal_handler will print out the full expected handler information if it spots a difference (see below for examples). This contains more information that just the flags, and is actually less code.

  3. Then I found that further code could be folded: We have a detailed printout for Xcheck:jni in check_signal_handler, but then we also print all signal handlers as part of that printout, which now (2) contains a verbose description of the found differences. Therefore the detailed printout in check_signal_handler was not needed anymore and could be removed. That also allows me to factor out the "compare sigaction structures semantically" logic into one function, called compare_handler_info. That again safes some code.

Now the logic is like this:

  • If Xcheck:jni is not set, we still store the expected signal handler info. When printing signal handlers, e.g. as part of a hs_err file, in os::print_signal_handlers, we print a description of any changed handlers to nudge the supporter into the right direction.

  • If Xcheck:jni is active, we will do our periodic checks; upon finding a difference, we do not elaborate but call os::print_signal_handlers() which will elaborate for us.

The final adjustment I had to make was to separate the "expected handler" info from the "check when Xcheck:jni is active" logic. That is because when Xcheck:jni is active and we find a mismatch, we want to disable checking for that signal in check_signal_handler but still see the difference in future printouts when calling os::print_signal_handlers, e.g. if later we crash and write a hs-err file.

I am really sorry for the added review work, but I think its a worthwhile change and complexity gets further reduced.

I thought about adding tests but testing this is quite difficult. So I tested manually by overwriting some of the hotspot signal handlers and check how the VM reacts.

First a hs-err file, good case:

659 Signal Handlers:
660    SIGSEGV: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
661     SIGBUS: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
662     SIGFPE: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
663    SIGPIPE: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
664    SIGXFSZ: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
665     SIGILL: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
666    SIGUSR2: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO
667     SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
668     SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
669    SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
670    SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
671    SIGTRAP: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO

Bad cases (I changed handlers for SIGFPE, SIGPIPE, SIGXFSZ and SIGUSR2):

hs-err file:

658 Signal Handlers:
659    SIGSEGV: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
660     SIGBUS: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
661     SIGFPE: badwolf in libjvm.so, mask=10000000000000000000000000000000, flags=none
662   *** Handler was modified!
663   *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
664    SIGPIPE: badwolf in libjvm.so, mask=10000000000000000000000000000000, flags=none
665   *** Handler was modified!
666   *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
667    SIGXFSZ: badwolf in libjvm.so, mask=10000000000000000000000000000000, flags=none
668   *** Handler was modified!
669   *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
670     SIGILL: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
671    SIGUSR2: badwolf in libjvm.so, mask=10000000000000000000000000000000, flags=none
672   *** Handler was modified!
673   *** Expected: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO
674     SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
675     SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
676    SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
677    SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
678    SIGTRAP: crash_handler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO

jcmd VM.info:

...
Signal Handlers:
   SIGSEGV: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGBUS: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGFPE: badwolf in libjvm.so, mask=00000111001010000101001100010001, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGPIPE: badwolf in libjvm.so, mask=00000111001010000101001100010001, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGXFSZ: badwolf in libjvm.so, mask=00000111001010000101001100010001, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGILL: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGUSR2: badwolf in libjvm.so, mask=00000111001010000101001100010001, flags=none
  *** Handler was modified!
  *** Expected: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO
    SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGTRAP: SIG_DFL, mask=00000000000000000000000000000000, flags=none

-Xcheck:jni output:

Warning: SIGFPE handler modified!                                                                                                                                                                                                                                                    
Signal Handlers:
   SIGSEGV: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGBUS: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGFPE: badwolf in libjvm.so, mask=11100100010111111101111111111110, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGPIPE: badwolf in libjvm.so, mask=11100100010111111101111111111110, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGXFSZ: badwolf in libjvm.so, mask=11100100010111111101111111111110, flags=none
  *** Handler was modified!
  *** Expected: javaSignalHandler in libjvm.so, mask=11100100110111111111111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGILL: javaSignalHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGUSR2: badwolf in libjvm.so, mask=11100100010111111101111111111110, flags=none
  *** Handler was modified!
  *** Expected: SR_handler in libjvm.so, mask=00000000000000000000000000000000, flags=SA_RESTART|SA_SIGINFO
    SIGHUP: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
    SIGINT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGTERM: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGQUIT: UserHandler in libjvm.so, mask=11100100010111111101111111111110, flags=SA_RESTART|SA_SIGINFO
   SIGTRAP: SIG_DFL, mask=00000000000000000000000000000000, flags=none
Consider using jsig library.
...

Thanks, Thomas

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still seems okay.

One query below.

Thanks,
David

src/hotspot/os/posix/signals_posix.cpp Outdated Show resolved Hide resolved
@tstuefe
Copy link
Member Author

tstuefe commented Feb 9, 2021

Still seems okay.

One query below.

Thanks,
David

Thank you David.

@tstuefe
Copy link
Member Author

tstuefe commented Feb 10, 2021

Gentle ping..

@gerard-ziemski
Copy link

hi Thomas, I'm interested in reviewing your fix and will work on it as soon as I'm done with #2403

(tomorrow?)

@gerard-ziemski
Copy link

I'm starting the review...

Copy link

@gerard-ziemski gerard-ziemski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not get as far as I hoped today, I'll review more tomorrow.

Comment on lines +112 to +115
bool check_signal_number(int sig) const {
assert(sig > 0 && sig < NSIG, "invalid signal number %d", sig);
return sig > 0 && sig < NSIG;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have

 assert(sig > 0 && sig < NSIG, "invalid signal number %d", sig);
 return sig > 0 && sig < NSIG;

here in SavedSignalHandlers::check_signal_number() and

#if defined(__APPLE__)
  return sig >= 1 && sig < NSIG;
#else

in is_valid_signal().

Can we combine those, something like:

// Returns true if signal number is valid.
static bool is_valid_signal(int sig) {
  assert(sig > 0 && sig < NSIG, "invalid signal number %d", sig);
  if (sig > 0 && sig < NSIG) {
    // Use sigaddset to check for signal validity.
    sigset_t set;
    sigemptyset(&set);
    if (sigaddset(&set, sig) == -1 && errno == EINVAL) {
      return false;
    } else {
      return true;
    }
  } else {
    return false;
  }
}

so then we can drop SavedSignalHandlers::check_signal_number() altogether?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep this. SavedSignalHandlers::check_signal_number() is just a simple bounds check to prevent memory overwriters, with an added assert for goot measure in debug builds. Its proximity to the array it guards makes this clear. SavedSignalHandlers gets only fed with know values in this file, so an assert is sufficient.

is_valid_signal() OTOH does a runtime check for signal validity. It is more expensive but only used in analysis situations. It pulls its signal number from a sigaction structure, which may be corrupt and contain whatever, so the runtime check makes more sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, SavedSignalHandler::_sa[NSIG] could be reduced in size to something more sensible, since we only ever feed "normal" signals into it. Same reason as NUM_IMPORTANT_SIGS. But we only talk a few kbyte here, so I did not bother.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred to have a single function to check whether the signal is valid, otherwise we have 2 separate specialized API that almost do the same thing. Any future reader of the code will probably wonder again.

For code readability I'd prefer to have just one API, or at least refactor the more complex one in terms of the simpler one, and give them meaningful names to differentiate them, but OK.

@@ -357,7 +411,7 @@ struct sigaction* get_chained_signal_action(int sig) {
}
if (actp == NULL) {
// Retrieve the preinstalled signal handler from jvm
actp = get_preinstalled_handler(sig);
actp = const_cast<struct sigaction*>(chained_handlers.get(sig));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must SavedSignalHandlers::get() have the const struct in const struct sigaction* get(int sig) const signature?

If it was just struct sigaction* get(int sig) const then we wouldn't need this awkward cast.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a stickler to const since its very expressive and prevents stupid errors. const struct sigaction* get() const makes perfect sense since at no time you are supposed to modify the underlying sigaction structure. Once it is saved, it should only be used to read from (eg to print, or to extract the handler address and call it in call_chained_handler().

But old code was not const correct. The way to start being const correct without having to rewrite everything is to make new code const correct and cauterize at the interface between new and old code with these ugly but at least expressive casts.

Side note, even the sigaction function is const correct: https://pubs.opengroup.org/onlinepubs/007904875/functions/sigaction.html
see the act input parameter designating the new handler, its const. Only the output parameter oact is nonconst.

I will see if I can make the rest of the code const correct too wrt the sigaction structures. That way we can get rid of the cast and have better code as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can do that, then that would be ideal.

@@ -1203,16 +1217,17 @@ void set_signal_handler(int sig) {
struct sigaction oldAct;
sigaction(sig, (struct sigaction*)NULL, &oldAct);

// Query the current signal handler. Needs to be a separate operation
// from installing a new handler since we need to honor AllowUserSignalHandlers.
void* oldhand = get_signal_handler(&oldAct);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a pre-existing issue, but I dislike the "old" in oldhand and oldAct. It makes it sound like it is a cached or previous value, which is especially confusing when in the comment we refer to it as the current one. Any chance we can clean this up in this fix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it stems from the Posix interface calling the parameters act and oact and ppl are used to that naming:

int sigaction(int sig, const struct sigaction *restrict act,
       struct sigaction *restrict oact);

That said, we use a bunch of different names, like oact, oldAct or oldSigAct, which I disliked. I wanted to unify those namings but refrained since I wanted to keep the patch small. Its difficult enough as it is.

Since this would be an easy change in itself, can we keep it for a different RFE? As for name, lets say currentAct?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love if it we could follow up on it.

Comment on lines 816 to 817
return this_handler == expected_handler &&
this_flags == expected_flags;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use brackets here like:

  return ((this_handler == expected_handler) &&
           (this_flags == expected_flags));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add the inside brackets, but don't see the need for the outside ones, there is nothing to bracket off against.

@gerard-ziemski
Copy link

General comment: I get weird results when I try to run java -XX:ErrorHandlerTest=2 -version

With the proposed changes, I get:

oracle@dhcp-10-154-126-49 jdk % ./build/xcode/build/jdk/bin/java -XX:ErrorHandlerTest=2 -version
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/vmError.cpp:1785
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  


[Too many errors, abort]
zsh: abort      ./build/xcode/build/jdk/bin/java -XX:ErrorHandlerTest=2 -version

with several empty lines, which I deleted for compactness, whereas the original code produces:

oracle@Oracles-MacBook-Pro-16 jdk_orig % ./build/xcode/build/jdk/bin/java -XX:ErrorHandlerTest=2 -version
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/vmError.cpp:1785
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/Volumes/Work/review/2251/jdk_orig/src/hotspot/share/utilities/vmError.cpp:1785), pid=69509, tid=7427
#  guarantee(how == 0) failed: test guarantee
#
# JRE version: OpenJDK Runtime Environment (17.0) (fastdebug build 17-internal+0-adhoc.oracle.jdkorig)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 17-internal+0-adhoc.oracle.jdkorig, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Volumes/Work/review/2251/jdk_orig/hs_err_pid69509.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
zsh: abort      ./build/xcode/build/jdk/bin/java -XX:ErrorHandlerTest=2 -version

so something is off.

@tstuefe
Copy link
Member Author

tstuefe commented Feb 17, 2021

java -XX:ErrorHandlerTest=2 -version

I cannot reproduce it (macos, fastdebug). What build is this? The error is weird since "too many errors, abort" should have been preceded by any number of secondary error printouts.

These kind of tests (let it crash and check if error handling works) are also executed as part of tier1 hotspot (runtime/ErrorHandling) and seem to run fine in GA actions.

@gerard-ziemski
Copy link

java -XX:ErrorHandlerTest=2 -version

I cannot reproduce it (macos, fastdebug). What build is this? The error is weird since "too many errors, abort" should have been preceded by any number of secondary error printouts.

These kind of tests (let it crash and check if error handling works) are also executed as part of tier1 hotspot (runtime/ErrorHandling) and seem to run fine in GA actions.

I'll look into why I see this and report back...

@tstuefe
Copy link
Member Author

tstuefe commented Feb 17, 2021

java -XX:ErrorHandlerTest=2 -version

I cannot reproduce it (macos, fastdebug). What build is this? The error is weird since "too many errors, abort" should have been preceded by any number of secondary error printouts.
These kind of tests (let it crash and check if error handling works) are also executed as part of tier1 hotspot (runtime/ErrorHandling) and seem to run fine in GA actions.

I'll look into why I see this and report back...

Okay, thank you. Note that I'm off the rest of the week, so I won't be able to reply until Monday.

@gerard-ziemski
Copy link

It looks like I see the issue only with Xcode build hotspot (using the actual Xcode IDE), the build produced using the command line make/compiler works fine. Digging in deeper...

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still good to me.

Thanks,
David

@tstuefe
Copy link
Member Author

tstuefe commented Feb 18, 2021

Still good to me.

Thanks,
David

Thanks, David

@gerard-ziemski
Copy link

I'm going to give thumbs up and not hold this up any longer, even though I haven't figured out yet why I see the issue with my Xcode built hotspot.

Whatever the issue, which is most likely due to how I build it, it can be handled either on my side, or if it's JDK then we can fix it in a followup.

cheers

@tstuefe
Copy link
Member Author

tstuefe commented Feb 19, 2021

Thanks Gerard and David!

About const correctness, I looked into this. Looked mostly straightforward, apart from the one fact that inside call_chained_handler, we modify the handed over sigaction structure. Which is really yucky. Especially if you consider that this function is called also for the libjsig case, where the sigaction structure is handed in from the libjsig, so we essentially modify memory in libjsig.

We do this for two reasons, one is to just have a holder for a temporary sigset, which would be trivial to fix with a temporary local variable. And then, to switch off chaining for SA_NODEFER (mimicking one shot semantics). And while looking into that, I saw that one shot semantics can never have worked reliably. I guess its like David said in other places, its a best effort thing.

The whole investigation turned out too deep for this RFE, so I leave that for another day (https://bugs.openjdk.java.net/browse/JDK-8262006).

@gerard-ziemski : I also created https://bugs.openjdk.java.net/browse/JDK-8262007 to track the renaming you wanted. Left it unassigned, feel free to grab that one.

Cheers, Thomas

/integrate

@openjdk openjdk bot closed this Feb 19, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 19, 2021
@openjdk
Copy link

openjdk bot commented Feb 19, 2021

@tstuefe Since your change was applied there have been 36 commits pushed to the master branch:

  • c99eeb0: 8260858: Implementation specific property xsltcIsStandalone for XSLTC Serializer
  • 5caf686: 8261644: NMT: Simplifications and cleanups
  • ed93bc9: 8196301: java/awt/print/PrinterJob/Margins.java times out
  • 7e78c77: 8261905: Move implementation of OopStorage num_dead related functions
  • 78cde64: 8261860: Crash caused by lambda proxy class loaded in Shutdown hook
  • c158413: 8261098: Add clhsdb "findsym" command
  • 0c31d5b: 8261977: Fix comment for getPrefixed() in canonicalize_md.c
  • 9cf4f90: 8261473: Shenandoah: Add breakpoint support
  • c4664e6: 8261940: Fix references to IOException in BigDecimal javadoc
  • 0e9c5ae: 8075909: [TEST_BUG] The regression-swing case failed as it does not have the 'Open' button when select 'subdir' folder with NimbusLAF
  • ... and 26 more: https://git.openjdk.java.net/jdk/compare/d547e1a84723e94813fc600c884c9cfee0ed8f57...master

Your commit was automatically rebased without conflicts.

Pushed as commit 7e2c909.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tstuefe tstuefe deleted the JDK-8260485-signal-handler-improvements branch March 6, 2021 05:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
3 participants