Skip to content

JDK-8257828: SafeFetch may crash if invoked in non-JavaThreads #1695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tstuefe
Copy link
Member

@tstuefe tstuefe commented Dec 8, 2020

In our primary hotspot signal handlers, SafeFetch handling is limited to JavaThread objects:

  JavaThread* thread = NULL;
...
  if(t->is_Java_thread()) {
    thread = (JavaThread*)t;
  }
...
  if (info != NULL && uc != NULL && thread != NULL) {
    pc = (address) os::Linux::ucontext_get_pc(uc);
    if (StubRoutines::is_safefetch_fault(pc)) {

As a result of this, using SafeFetch may crash non-JavaThreads if the location is invalid. E.g. using SafeFetch inside a VMOperation may crash the VM.

This is unfortunate since SafeFetch is used for os::is_readable_pointer() which explicitly promises to not crash. It is used e.g. in os::print_hex_dump(). There is also no reason why SafeFetch would not work for non-JavaThreads. In fact, SafeFetch handling for the secondary signal handler works just fine for all threads.


The patch makes handling of SafeFetch faults independent on whether the crashing thread is a JavaThread (indeed, whether we have a current Thread at all). This had been the case for AIX and Linux ppc, s390 before, since we already fixed this issue for our platform, so we know this works.

I also hauled the SafeFetch handling out of the platform dependent part of the signal handler into the generic signal handler. This removes some duplicate coding.

To be consistent, I moved the SafeFetch handling for Zero up into the generic signal handler too. Zero did not have a problem, but this reduces code.

I added a gtest which reproduces the issue and used that to check that the patch works.

Thanks, Thomas


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257828: SafeFetch may crash if invoked in non-JavaThreads

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1695/head:pull/1695
$ git checkout pull/1695

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 8, 2020

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 8, 2020

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Dec 8, 2020
@tstuefe tstuefe marked this pull request as ready for review December 8, 2020 14:38
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 8, 2020
@mlbridge
Copy link

mlbridge bot commented Dec 8, 2020

Webrevs

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing it!
Good that we have signals_posix.cpp, now.
Seems like os::min_page_size() is an address which is not readable on any OS we have. So test looks also good to me.

@openjdk
Copy link

openjdk bot commented Dec 9, 2020

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8257828: SafeFetch may crash if invoked in non-JavaThreads

Reviewed-by: mdoerr, kbarrett, coleenp, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 79 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 9, 2020
Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (info != NULL && uc != NULL && thread != NULL) {

So the cause of the problem is the thread != NULL in that test. And just
removing all of those would fix the problem.

But I agree that eliminating all the (near) copied variants and merging this
functionality into signals_posix is good.

This is unfortunate since SafeFetch is used [...]

Another place this hits is OopStorage, which uses SafeFetchN in two places.
Correct code shouldn't have a problem with one of them because the access
won't be invalid (if code is correct). But the other could be a problem,
though I think not seen in practice.

Might there be any similar problem on Windows?

This might be kind of related?
https://bugs.openjdk.java.net/browse/JDK-8185734

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Moving this to the posix signal handler is so nice.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Thomas,
This looks good - I like the common handling (though I dislike the need for the zero case :( ).
Thanks,
David

@tstuefe
Copy link
Member Author

tstuefe commented Dec 12, 2020

Hi,

I made this work on Windows too (thanks @kimbarrett for reminding me). This involved fixing HandleException, which is
roughly the pendant of ucontext_set_pc() for Windows. I also combined dumplicate coding for the three Windows architectures.

I excluded the os_safefetch_negative gtest for Windows, that won't work as long as we miss SEH for gtests, see https://bugs.openjdk.java.net/browse/JDK-8185734. This is not a big deal though since the os_safefetch_negative_at_safepoint gtest still works, which tests SafeFetch in the VMThread. I experimented with locally adding a SEH catcher around the SafeFetch call in the test, but I removed that coding because it introduced too much complexity for little gain.

I also fixed a small issue for AIX where the zero page is readable so we need another guaranteed-to-be-invalid address. Since we need invalid addresses in a number of places in text code, I plan to centralize this definition at some point.

Thanks, Thomas

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New Windows changes look okay to me, but I'm not an expert in that area.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One query, but changes can go in as-is.
Thanks,
David

Comment on lines +2257 to +2258
if (thread != nullptr && thread->is_Java_thread()) {
thread->as_Java_thread()->set_saved_exception_pc((address)(DWORD_PTR)exceptionInfo->ContextRecord->PC_NAME);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a full fix? If the thread is not a JavaThread we no longer hit a problem in as_Java_thread() but is doing nothing for a non-JavaThread actually the right thing to do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The saved_exception_pc mechanism is only implemented for JavaThreads. But all threads go through the code below, where we change the pc in the context structure.

@mlbridge
Copy link

mlbridge bot commented Dec 14, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 14/12/2020 3:24 pm, Thomas Stuefe wrote:

On Mon, 14 Dec 2020 04:44:43 GMT, David Holmes <dholmes at openjdk.org> wrote:

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

Make SafeFetch work on Windows + AIX fix

src/hotspot/os/windows/os_windows.cpp line 2258:

2256: // Save pc in thread
2257: if (thread != nullptr && thread->is_Java_thread()) {
2258: thread->as_Java_thread()->set_saved_exception_pc((address)(DWORD_PTR)exceptionInfo->ContextRecord->PC_NAME);

Is this a full fix? If the thread is not a JavaThread we no longer hit a problem in as_Java_thread() but is doing nothing for a non-JavaThread actually the right thing to do?

The saved_exception_pc mechanism is only implemented for JavaThreads. But all threads go through the code below, where we change the pc in the context structure.

My query is whether it makes sense for only JavaThreads to have this
saved (C++?) exception pc? What role does it play and why would it not
be applicable to all threads? (Sorry bit of a digression but trying to
understand the bigger picture.)

Thanks,
David

@tstuefe
Copy link
Member Author

tstuefe commented Dec 14, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 14/12/2020 3:24 pm, Thomas Stuefe wrote:

On Mon, 14 Dec 2020 04:44:43 GMT, David Holmes wrote:

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
Make SafeFetch work on Windows + AIX fix

src/hotspot/os/windows/os_windows.cpp line 2258:

2256: // Save pc in thread
2257: if (thread != nullptr && thread->is_Java_thread()) {
2258: thread->as_Java_thread()->set_saved_exception_pc((address)(DWORD_PTR)exceptionInfo->ContextRecord->PC_NAME);

Is this a full fix? If the thread is not a JavaThread we no longer hit a problem in as_Java_thread() but is doing nothing for a non-JavaThread actually the right thing to do?

The saved_exception_pc mechanism is only implemented for JavaThreads. But all threads go through the code below, where we change the pc in the context structure.

My query is whether it makes sense for only JavaThreads to have this
saved (C++?) exception pc? What role does it play and why would it not
be applicable to all threads? (Sorry bit of a digression but trying to
understand the bigger picture.)

Thanks,
David

IIUC, JavaThread::_saved_exception_pc contains the pc from which a safepoint was triggered, and to which we may want to return once the safepoint was handled. Safepoint handling only works for java threads, so limiting this to JavaThread makes sense.

@mlbridge
Copy link

mlbridge bot commented Dec 14, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 14/12/2020 6:34 pm, Thomas Stuefe wrote:

On Mon, 14 Dec 2020 04:46:25 GMT, David Holmes <dholmes at openjdk.org> wrote:

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

Make SafeFetch work on Windows + AIX fix

One query, but changes can go in as-is.
Thanks,
David

_Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_

On 14/12/2020 3:24 pm, Thomas Stuefe wrote:

On Mon, 14 Dec 2020 04:44:43 GMT, David Holmes <dholmes at openjdk.org> wrote:

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
Make SafeFetch work on Windows + AIX fix

src/hotspot/os/windows/os_windows.cpp line 2258:

2256: // Save pc in thread
2257: if (thread != nullptr && thread->is_Java_thread()) {
2258: thread->as_Java_thread()->set_saved_exception_pc((address)(DWORD_PTR)exceptionInfo->ContextRecord->PC_NAME);

Is this a full fix? If the thread is not a JavaThread we no longer hit a problem in as_Java_thread() but is doing nothing for a non-JavaThread actually the right thing to do?

The saved_exception_pc mechanism is only implemented for JavaThreads. But all threads go through the code below, where we change the pc in the context structure.

My query is whether it makes sense for only JavaThreads to have this
saved (C++?) exception pc? What role does it play and why would it not
be applicable to all threads? (Sorry bit of a digression but trying to
understand the bigger picture.)

Thanks,
David

IIUC, JavaThread::_saved_exception_pc contains the pc from which a safepoint was triggered, and to which we may want to return once the safepoint was handled. Safepoint handling only works for java threads, so limiting this to JavaThread makes sense.

I see now that it relates to the safepoint polling page exception.

Thanks,
David
-----

@tstuefe
Copy link
Member Author

tstuefe commented Dec 15, 2020

/integrate

@openjdk openjdk bot closed this Dec 15, 2020
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 15, 2020
@openjdk
Copy link

openjdk bot commented Dec 15, 2020

@tstuefe Since your change was applied there have been 99 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

Pushed as commit 3ab1dfe.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tstuefe tstuefe deleted the JDK-8257828-safefetch-in-non-java-threads branch January 15, 2021 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

5 participants