Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8260341: CDS dump VM init code does not check exceptions #2494

Conversation

iklam
Copy link
Member

@iklam iklam commented Feb 10, 2021

When CDS dumping is enabled, special initialization happens during VM init. However, many of these calls do not properly check for exception. Instead, they rely on the implicit knowledge that metaspace::allocate() will exit the VM when allocation fails during CDS dumping. This makes the code hard to understand and tightly coupled to metaspace::allocate().

The fix is: all code that makes allocation should be using CHECK macros, so each block of code can be individually understood without considering the behavior of metaspace::allocate().

I added TRAPS to a bunch of CDS-related functions that are called during VM init. In some cases, I changed Thread* THREAD to TRAPS. This also eliminated a few Thread* THREAD = Thread::current() calls.

The "root" of these calls, such as MetaspaceShared::prepare_for_dumping(), now follows this pattern:

  EXCEPTION_MARK;
  ClassLoader::initialize_shared_path(THREAD);
  if (HAS_PENDING_EXCEPTION) {
    java_lang_Throwable::print(PENDING_EXCEPTION, tty);
    vm_exit_during_initialization("ClassLoader::initialize_shared_path() failed unexpectedly");
  }

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8260341: CDS dump VM init code does not check exceptions

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2494/head:pull/2494
$ git checkout pull/2494

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 10, 2021

👋 Welcome back iklam! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 10, 2021

@iklam The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Feb 10, 2021
@iklam iklam marked this pull request as ready for review February 10, 2021 05:14
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 10, 2021
@mlbridge
Copy link

mlbridge bot commented Feb 10, 2021

Webrevs

@hseigel
Copy link
Member

hseigel commented Feb 10, 2021

Hi Ioi,
Do you avoid using CHECK if it's the last line of a function? For example, why is THREAD used instead of CHECK at line 1506?
Thanks, Harold

1503 void ClassLoader::initialize_module_path(TRAPS) {
1504 if (Arguments::is_dumping_archive()) {
1505 ClassLoaderExt::setup_module_paths(CHECK);
1506 FileMapInfo::allocate_shared_path_table(THREAD);
1507 }
1508 }

@iklam
Copy link
Member Author

iklam commented Feb 10, 2021

Hi Ioi,
Do you avoid using CHECK if it's the last line of a function? For example, why is THREAD used instead of CHECK at line 1506?
Thanks, Harold

1503 void ClassLoader::initialize_module_path(TRAPS) {
1504 if (Arguments::is_dumping_archive()) {
1505 ClassLoaderExt::setup_module_paths(CHECK);
1506 FileMapInfo::allocate_shared_path_table(THREAD);
1507 }
1508 }

I thought it was a commonly used coding convention, but I could only find a few cases where the code wasn't written by me :-(

I will go back to CHECK, since the C++ compiler will elide the last CHECK anyway: in both cases, gcc compiles the last call to a direct branch to FileMapInfo::allocate_shared_path_table (i.e., a tail call).

Using CHECK makes the code easier to maintain (if you add new code below it).

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Harold said. We use THREAD if the call is the last call in the function because I thought there was a tail call problem with one of the compilers once. I think this was the bug I was thinking about and it's only in the return statement:
https://bugs.openjdk.java.net/browse/JDK-6889002
If you verified that the HAS_PENDING_EXCEPTION check evaporates, that's fine then, and better to have CHECK. As you say, someone might add some code after it.

@openjdk
Copy link

openjdk bot commented Feb 10, 2021

@iklam This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8260341: CDS dump VM init code does not check exceptions

Reviewed-by: coleenp, hseigel

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 10, 2021
Copy link
Member

@hseigel hseigel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good. Thanks!
Harold

@iklam
Copy link
Member Author

iklam commented Feb 10, 2021

What Harold said. We use THREAD if the call is the last call in the function because I thought there was a tail call problem with one of the compilers once. I think this was the bug I was thinking about and it's only in the return statement:
https://bugs.openjdk.java.net/browse/JDK-6889002
If you verified that the HAS_PENDING_EXCEPTION check evaporates, that's fine then, and better to have CHECK. As you say, someone might add some code after it.

I think the problem in JDK-6889002 is only with using CHECK in a return statement like this that produces unreachable source code.

    return foobar(1, 2, 3, CHECK_0);

->

    return foobar(1, 2, 3, THREAD);
    if (HAS_PENDING_EXCEPTION) return 0;

The cases that I have changed in this PR are functions with void returns.

void f(TRAPS)  {
    g(1, 2, 3, CHECK);
}

would be expanded to

void f(TRAPS)  {
    g(1, 2, 3, THREAD);
    if (HAS_PENDING_EXCEPTION) return;
}

I suppose any C++ compiler that can compile HotSpot will elide the if line.

@iklam
Copy link
Member Author

iklam commented Feb 11, 2021

Thanks @hseigel and @coleenp for the review!
/integrate

@openjdk openjdk bot closed this Feb 11, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 11, 2021
@openjdk
Copy link

openjdk bot commented Feb 11, 2021

@iklam Pushed as commit adca84c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tstuefe
Copy link
Member

tstuefe commented Feb 11, 2021

Could we then get rid of the special CDS handling in Metaspace::allocate():

if (result == NULL) {
if (DumpSharedSpaces) {
// CDS dumping keeps loading classes, so if we hit an OOM we probably will keep hitting OOM.
// We should abort to avoid generating a potentially bad archive.
vm_exit_during_cds_dumping(err_msg("Failed allocating metaspace object type %s of size " SIZE_FORMAT ". CDS dump aborted.",
MetaspaceObj::type_name(type), word_size * BytesPerWord),
err_msg("Please increase MaxMetaspaceSize (currently " SIZE_FORMAT " bytes).", MaxMetaspaceSize));
}
report_metadata_oome(loader_data, word_size, type, mdtype, THREAD);
assert(HAS_PENDING_EXCEPTION, "sanity");
return NULL;
}

?

@iklam
Copy link
Member Author

iklam commented Feb 11, 2021

Could we then get rid of the special CDS handling in Metaspace::allocate():

if (result == NULL) {
if (DumpSharedSpaces) {
// CDS dumping keeps loading classes, so if we hit an OOM we probably will keep hitting OOM.
// We should abort to avoid generating a potentially bad archive.
vm_exit_during_cds_dumping(err_msg("Failed allocating metaspace object type %s of size " SIZE_FORMAT ". CDS dump aborted.",
MetaspaceObj::type_name(type), word_size * BytesPerWord),
err_msg("Please increase MaxMetaspaceSize (currently " SIZE_FORMAT " bytes).", MaxMetaspaceSize));
}
report_metadata_oome(loader_data, word_size, type, mdtype, THREAD);
assert(HAS_PENDING_EXCEPTION, "sanity");
return NULL;
}

I filed https://bugs.openjdk.java.net/browse/JDK-8261551 . We need to fix JDK-8261480 first, since we still have some inappropriate use of THREAD during dumping. I split the work into several steps so that the review can be easier.

@tstuefe
Copy link
Member

tstuefe commented Feb 11, 2021

Could we then get rid of the special CDS handling in Metaspace::allocate():

if (result == NULL) {
if (DumpSharedSpaces) {
// CDS dumping keeps loading classes, so if we hit an OOM we probably will keep hitting OOM.
// We should abort to avoid generating a potentially bad archive.
vm_exit_during_cds_dumping(err_msg("Failed allocating metaspace object type %s of size " SIZE_FORMAT ". CDS dump aborted.",
MetaspaceObj::type_name(type), word_size * BytesPerWord),
err_msg("Please increase MaxMetaspaceSize (currently " SIZE_FORMAT " bytes).", MaxMetaspaceSize));
}
report_metadata_oome(loader_data, word_size, type, mdtype, THREAD);
assert(HAS_PENDING_EXCEPTION, "sanity");
return NULL;
}

I filed https://bugs.openjdk.java.net/browse/JDK-8261551 . We need to fix JDK-8261480 first, since we still have some inappropriate use of THREAD during dumping. I split the work into several steps so that the review can be easier.

Okay, thanks.

@mlbridge
Copy link

mlbridge bot commented Feb 17, 2021

Mailing list message from David Holmes on hotspot-dev:

On 11/02/2021 4:13 am, Ioi Lam wrote:

On Wed, 10 Feb 2021 14:33:50 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

Hi Ioi,
Do you avoid using CHECK if it's the last line of a function? For example, why is THREAD used instead of CHECK at line 1506?
Thanks, Harold

1503 void ClassLoader::initialize_module_path(TRAPS) {
1504 if (Arguments::is_dumping_archive()) {
1505 ClassLoaderExt::setup_module_paths(CHECK);
1506 FileMapInfo::allocate_shared_path_table(THREAD);
1507 }
1508 }

I thought it was a commonly used coding convention, but I could only find a few cases where the code wasn't written by me :-(

It is something we have been gradually fixing. I'm made a number of
requests to remove CHECK from code that will return anyway.

- https://github.com/openjdk/jdk/blame/b9d4211bc1aa92e257ddfe86c7a2b4e4e60598a0/src/hotspot/share/prims/jvm.cpp#L311
- https://github.com/openjdk/jdk/blame/f03e839e481f905358ce7d95a5d1f5179e7f46fe/src/hotspot/share/classfile/javaClasses.cpp#L2415

I will go back to `CHECK`, since the C++ compiler will elide the last `CHECK` anyway: in both cases, gcc compiles the last call to a direct branch to FileMapInfo::allocate_shared_path_table (i.e., a tail call).

Do all our C++ compilers do this? If they do that is great, but if they
only may perhaps sometimes do then not so great.

Using `CHECK` makes the code easier to maintain (if you add new code below it).

I prefer not to rely on the C++ compiler to remove the exception
checking logic and use THREAD in such cases.

David
-----

@mlbridge
Copy link

mlbridge bot commented Feb 17, 2021

Mailing list message from Ioi Lam on hotspot-dev:

Converting this from a PR discussion
(https://git.openjdk.java.net/jdk/pull/2494) to a regular mail.

What are people's opinion of:

??? void bar(TRAPS);

??? void foo(TRAPS) {
? ?? ? bar(CHECK);
??? }

vs

??? void foo(TRAPS) {
? ?? ? bar(THREAD);
??? }

There's no mention of this in
https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md

Advantage of CHECK:

- More readable -- you don't need to ask yourself:
? does the callee need a THREAD, or is the callee a TRAPS function?

- More maintainable. You don't accidentally miss a check if you add new
? code below

??? void foo(TRAPS) {
? ?? ? bar(THREAD);
? ? ?? baz();?????? // adding a new call ....
??? }

Note that we MUST use THREAD when returning a value (see
https://bugs.openjdk.java.net/browse/JDK-6889002)

??? int x(TRAPS);

??? int y(TRAPS) {
?????? return x(THREAD);
??? }

so there's inconsistency. However, the compiler will given an error if
you add code below the THREAD. So we don't have the maintenance issue as
void functions:

??? int Y(TRAPS) {
?????? return X(THREAD);
?????? baz();
??? }

Disadvantage of CHECK:

- It's not guaranteed that the C compiler will elide it. The code gets
pre-processed to

??? inlined bool ThreadShadow::has_pending_exception() const {
?? ?? ? return _pending_exception != NULL;
??? }

??? void foo(Thread*? __the_thread__) {
? ????? bar(__the_thread__);
??????? if (((ThreadShadow*)__the_thread__)->has_pending_exception())
return;
??? }

Is it safe to assume that any C compiler that can efficiently compile
HotSpot will always elide the "if" line?

I am a little worried about the maintenance issue. If we really want to
avoid the CHECK, I would prefer to have a new macro like:

??? void foo(TRAPS) {
?????? bar(CHECK_AT_RETURN);
??? }

which will be preprocessed to

??? void foo(....) {
?????? bar(_thread__); return;
??? }

So you can't accidentally add code below it.

Thanks
?-Ioi

@mlbridge
Copy link

mlbridge bot commented Feb 18, 2021

Mailing list message from David Holmes on hotspot-dev:

Hi Ioi,

CHECK at the end of a void function

This isn't really about void functions, nor check at the end. It is
about using CHECK/CHECK_* on a call that is immediately followed by a
return from the current function as it degenerates to:

if (EXCEPTION_OCCURRED)
return;
else
return;

On 18/02/2021 8:16 am, Ioi Lam wrote:

Thanks for doing that.

What are people's opinion of:

??? void bar(TRAPS);

??? void foo(TRAPS) {
? ?? ? bar(CHECK);
??? }

vs

??? void foo(TRAPS) {
? ?? ? bar(THREAD);
??? }

There's no mention of this in
https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md

Advantage of CHECK:

- More readable -- you don't need to ask yourself:
? does the callee need a THREAD, or is the callee a TRAPS function?

But you also don't need to ask yourself that because it doesn't matter
when the next action is to return anyway.

- More maintainable. You don't accidentally miss a check if you add new
? code below

??? void foo(TRAPS) {
? ?? ? bar(THREAD);
? ? ?? baz();?????? // adding a new call ....
??? }

True but I would argue that you need to think about the behaviour of bar
when adding baz() regardless. This might be wrong:

bar(CHECK);
baz(); // <= critical code that must always be executed no matter what!

Note that we MUST use THREAD when returning a value (see
https://bugs.openjdk.java.net/browse/JDK-6889002)

??? int x(TRAPS);

??? int y(TRAPS) {
?????? return x(THREAD);
??? }

I think this is an anti-pattern and we should prefer the more explicit:

RetType ret = x(CHECK_*);
return ret;

if we want to emphasize use of CHECK.

so there's inconsistency. However, the compiler will given an error if
you add code below the THREAD. So we don't have the maintenance issue as
void functions:

??? int Y(TRAPS) {
?????? return X(THREAD);
?????? baz();
??? }

Don't quite follow that as you wouldn't write anything after a return
statement anyway.

Disadvantage of CHECK:

- It's not guaranteed that the C compiler will elide it. The code gets
pre-processed to

??? inlined bool ThreadShadow::has_pending_exception() const {
?? ?? ? return _pending_exception != NULL;
??? }

??? void foo(Thread*? __the_thread__) {
? ????? bar(__the_thread__);
??????? if (((ThreadShadow*)__the_thread__)->has_pending_exception())
return;
??? }

Is it safe to assume that any C compiler that can efficiently compile
HotSpot will always elide the "if" line?

I've no idea, but if we can rely on it then I'm okay with always using
CHECK. It was the redundant code execution that was my concern.

I am a little worried about the maintenance issue. If we really want to
avoid the CHECK, I would prefer to have a new macro like:

??? void foo(TRAPS) {
?????? bar(CHECK_AT_RETURN);
??? }

which will be preprocessed to

??? void foo(....) {
?????? bar(_thread__); return;
??? }

So you can't accidentally add code below it.

I agree that a new macro might be preferable than just using THREAD.
Coming up with a short but meaningful name will be a problem. :) The
point is we are not actually checking anything in this case.

Thanks,
David

Thanks
?-Ioi

@mlbridge
Copy link

mlbridge bot commented Feb 18, 2021

Mailing list message from Coleen Phillimore on hotspot-dev:

My preference is to keep THREAD as an argument if you were going to use
CHECK for the last statement of a function.? You have to do this if
you're returning with a function that takes TRAPS. ie:
? return my_fun(THREAD);

For the most part, I don't think it matters if the compiler can optimize
it away.? Do our compilers optimize away the extra check for pending
exception?? I don't know if you answered this.

Lastly, please no, I don't want to see yet another macro for this
special case.

Coleen

On 2/17/21 8:55 PM, David Holmes wrote:

Hi Ioi,

CHECK at the end of a void function

This isn't really about void functions, nor check at the end. It is
about using CHECK/CHECK_* on a call that is immediately followed by a
return from the current function as it degenerates to:

if (EXCEPTION_OCCURRED)
? return;
else
? return;

On 18/02/2021 8:16 am, Ioi Lam wrote:

Thanks for doing that.

What are people's opinion of:

???? void bar(TRAPS);

???? void foo(TRAPS) {
?? ?? ? bar(CHECK);
???? }

vs

???? void foo(TRAPS) {
?? ?? ? bar(THREAD);
???? }

There's no mention of this in
https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md

Advantage of CHECK:

- More readable -- you don't need to ask yourself:
?? does the callee need a THREAD, or is the callee a TRAPS function?

But you also don't need to ask yourself that because it doesn't matter
when the next action is to return anyway.

- More maintainable. You don't accidentally miss a check if you add new
?? code below

???? void foo(TRAPS) {
?? ?? ? bar(THREAD);
?? ? ?? baz();?????? // adding a new call ....
???? }

True but I would argue that you need to think about the behaviour of
bar when adding baz() regardless. This might be wrong:

bar(CHECK);
baz(); // <= critical code that must always be executed no matter what!

Note that we MUST use THREAD when returning a value (see
https://bugs.openjdk.java.net/browse/JDK-6889002)

???? int x(TRAPS);

???? int y(TRAPS) {
??????? return x(THREAD);
???? }

I think this is an anti-pattern and we should prefer the more explicit:

RetType ret = x(CHECK_*);
return ret;

if we want to emphasize use of CHECK.

so there's inconsistency. However, the compiler will given an error
if you add code below the THREAD. So we don't have the maintenance
issue as void functions:

???? int Y(TRAPS) {
??????? return X(THREAD);
??????? baz();
???? }

Don't quite follow that as you wouldn't write anything after a return
statement anyway.

Disadvantage of CHECK:

- It's not guaranteed that the C compiler will elide it. The code
gets pre-processed to

???? inlined bool ThreadShadow::has_pending_exception() const {
??? ?? ? return _pending_exception != NULL;
???? }

???? void foo(Thread*? __the_thread__) {
?? ????? bar(__the_thread__);
???????? if
(((ThreadShadow*)__the_thread__)->has_pending_exception()) return;
???? }

Is it safe to assume that any C compiler that can efficiently compile
HotSpot will always elide the "if" line?

I've no idea, but if we can rely on it then I'm okay with always using
CHECK. It was the redundant code execution that was my concern.

I am a little worried about the maintenance issue. If we really want
to avoid the CHECK, I would prefer to have a new macro like:

???? void foo(TRAPS) {
??????? bar(CHECK_AT_RETURN);
???? }

which will be preprocessed to

???? void foo(....) {
??????? bar(_thread__); return;
???? }

So you can't accidentally add code below it.

I agree that a new macro might be preferable than just using THREAD.
Coming up with a short but meaningful name will be a problem. :) The
point is we are not actually checking anything in this case.

Thanks,
David

Thanks
??-Ioi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants