Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8257140: Crash in JvmtiTagMap::flush_object_free_events() #1539

Closed
wants to merge 3 commits into from

Conversation

@coleenp
Copy link
Contributor

@coleenp coleenp commented Dec 1, 2020

The JvmtiTagMap can be deleted while it is posting. The JvmtiEnv is still on the list but the env is "disposed" of and the tag_map is deleted to save memory until the JvmtiEnv can be reclaimed at a safepoint and taken off the list.

There's a check JvmtiEnvBase::check_for_periodic_clean_up() that determines whether the JvmtiEnv can be taken off the list and that check looks for whether threads are in the middle of JvmtiEnvIterator which sets Thread::jvmti_env_iteration_count. So this part is safe for the JvmtiTagMap changes.

The fix is to clear the JvmtiTagMapTable of the entries inside the table lock to save memory but leave JvmtiTagMap intact so that we can load a valid pointer when iterating through JvmtiEnvIterator. We could also not do this at all and let the JvmtiTagMap destructor delete the table when the JvmtiEnv is also deleted, but we don't know what customer situation led to this code in the first place.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257140: Crash in JvmtiTagMap::flush_object_free_events()

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1539/head:pull/1539
$ git checkout pull/1539

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Dec 1, 2020

👋 Welcome back coleenp! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Dec 1, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Dec 1, 2020

@coleenp The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@coleenp
Copy link
Contributor Author

@coleenp coleenp commented Dec 1, 2020

Test passed tier1-3 with no failures.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Dec 1, 2020

Webrevs

@dcubed-ojdk
Copy link
Member

@dcubed-ojdk dcubed-ojdk commented Dec 1, 2020

Test passed tier1-3 with no failures.

Tier5 and Tier6 have more JPDA testing (JVM/TI, JDWP, JDI, ...)

@coleenp
Copy link
Contributor Author

@coleenp coleenp commented Dec 1, 2020

tier 4,5,6 pass with no failures.

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Hi Coleen,

The JVMTI environment can be disposed with DisposeEnvironment:
https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#DisposeEnvironment
It seems to me that this operation is better to wait until posting is finished. Otherwise, the information is going to be lost in the report. Of course, we could blame the agent to call the DisposeEnvironment too early but then the agent needs a way to check if posting is finished.

Thanks,
Serguei

@mlbridge
Copy link

@mlbridge mlbridge bot commented Dec 2, 2020

Mailing list message from Coleen Phillimore on hotspot-dev:

On 12/1/20 7:12 PM, Serguei Spitsyn wrote:

On Tue, 1 Dec 2020 14:19:15 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

The JvmtiTagMap can be deleted while it is posting. The JvmtiEnv is still on the list but the env is "disposed" of and the tag_map is deleted to save memory until the JvmtiEnv can be reclaimed at a safepoint and taken off the list.

There's a check JvmtiEnvBase::check_for_periodic_clean_up() that determines whether the JvmtiEnv can be taken off the list and that check looks for whether threads are in the middle of JvmtiEnvIterator which sets Thread::jvmti_env_iteration_count. So this part is safe for the JvmtiTagMap changes.

The fix is to clear the JvmtiTagMapTable of the entries inside the table lock to save memory but leave JvmtiTagMap intact so that we can load a valid pointer when iterating through JvmtiEnvIterator. We could also not do this at all and let the JvmtiTagMap destructor delete the table when the JvmtiEnv is also deleted, but we don't know what customer situation led to this code in the first place.
Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

- Merge branch 'master' into more-jvmti-table
- Rename JvmtiTagMap::clear() and fix JvmtiTagMapTable::clear() to null out deleted entries.
- 8257140: Crash in JvmtiTagMap::flush_object_free_events()
Hi Coleen,

The JVMTI environment can be disposed with DisposeEnvironment:
https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#DisposeEnvironment
It seems to me that this operation is better to wait until posting is finished. Otherwise, the information is going to be lost in the report. Of course, we could blame the agent to call the DisposeEnvironment too early but then the agent needs a way to check if posting is finished.

When the event is disposed, it calls set_event_callbacks(), which calls
flush_object_free_events, which will flush the remaining object free
events for this environment.? So the events aren't lost.? The bug was
that after that, the JvmtiEnv is marked as DISPOSED but it's still on
the list of JvmtiEnvIterator events that we walk until the next
safepoint.? When the JvmtiEnv is disposed, it was deleting the tag_map
but leaving it on the list, so later calls to flush_object_free_events
was getting a bad tag_map pointer in the JvmtiEnv.

Coleen

@mlbridge
Copy link

@mlbridge mlbridge bot commented Dec 2, 2020

Mailing list message from Coleen Phillimore on hotspot-dev:

Including serviceability-dev.

On 12/1/20 9:03 PM, Coleen Phillimore wrote:

On 12/1/20 7:12 PM, Serguei Spitsyn wrote:

On Tue, 1 Dec 2020 14:19:15 GMT, Coleen Phillimore
<coleenp at openjdk.org> wrote:

The JvmtiTagMap can be deleted while it is posting.? The JvmtiEnv
is still on the list but the env is "disposed" of and the tag_map
is deleted to save memory until the JvmtiEnv can be reclaimed at a
safepoint and taken off the list.

There's a check JvmtiEnvBase::check_for_periodic_clean_up() that
determines whether the JvmtiEnv can be taken off the list and that
check looks for whether threads are in the middle of
JvmtiEnvIterator which sets Thread::jvmti_env_iteration_count.? So
this part is safe for the JvmtiTagMap changes.

The fix is to clear the JvmtiTagMapTable of the entries inside the
table lock to save memory but leave JvmtiTagMap intact so that we
can load a valid pointer when iterating through JvmtiEnvIterator.?
We could also not do this at all and let the JvmtiTagMap destructor
delete the table when the JvmtiEnv is also deleted, but we don't
know what customer situation led to this code in the first place.
Coleen Phillimore has updated the pull request with a new target
base due to a merge or a rebase. The incremental webrev excludes the
unrelated changes brought in by the merge/rebase. The pull request
contains three additional commits since the last revision:

? - Merge branch 'master' into more-jvmti-table
? - Rename JvmtiTagMap::clear() and fix JvmtiTagMapTable::clear() to
null out deleted entries.
? - 8257140: Crash in JvmtiTagMap::flush_object_free_events()
Hi Coleen,

The JVMTI environment can be disposed with DisposeEnvironment:
https://docs.oracle.com/en/java/javase/15/docs/specs/jvmti.html#DisposeEnvironment
It seems to me that this operation is better to wait until posting is
finished. Otherwise, the information is going to be lost in the
report. Of course, we could blame the agent to call the
DisposeEnvironment too early but then the agent needs a way to check
if posting is finished.

When the event is disposed, it calls set_event_callbacks(), which
calls flush_object_free_events, which will flush the remaining object
free events for this environment.? So the events aren't lost.? The bug
was that after that, the JvmtiEnv is marked as DISPOSED but it's still
on the list of JvmtiEnvIterator events that we walk until the next
safepoint.? When the JvmtiEnv is disposed, it was deleting the tag_map
but leaving it on the list, so later calls to flush_object_free_events
was getting a bad tag_map pointer in the JvmtiEnv.

Coleen

Copy link
Contributor

@sspitsyn sspitsyn left a comment

This is reasonable. Thank you for the explanation!
The fix looks good to me.

When the event is disposed, it calls set_event_callbacks()...
You, probably meant "environment" is disposed, not "event".
This correction is for other readers. :)

@openjdk
Copy link

@openjdk openjdk bot commented Dec 2, 2020

@coleenp This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8257140: Crash in JvmtiTagMap::flush_object_free_events()

Reviewed-by: sspitsyn, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 26 new commits pushed to the master branch:

  • cfb50a9: 8253916: ResourceExhausted/resexhausted001 crashes on Linux-x64
  • 287b829: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing
  • 1fd0ea7: 8256382: Use try_lock for hs_err EventLog printing
  • bff68f1: 8257533: legacy-jre-image includes jpackage and jlink tools
  • 9a60413: 8248736: [aarch64] runtime/signal/TestSigpoll.java failed "fatal error: not an ldr (literal) instruction."
  • e7ca0c4: 8257224: JDK-8251549 didn't update building.html
  • 7e37c7c: 8257471: fatal error: Fatal exception in JVMCI: Exception during JVMCI compiler initialization
  • 3e3745c: 8256008: UL does not report anything if disk writing fails
  • fb139cf: 8257467: [TESTBUG] -Wdeprecated-declarations is reported at sigset() in exesigtest.c
  • 9de283b: 8257505: nsk/share/test/StressOptions stressTime is scaled in getter but not when printed
  • ... and 16 more: https://git.openjdk.java.net/jdk/compare/eaf4db6b8b883b1ab5731ac62f590e39a7b67210...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Dec 2, 2020
@coleenp
Copy link
Contributor Author

@coleenp coleenp commented Dec 2, 2020

Thanks Serguei and Kim for the reviews.
/integrate

@openjdk openjdk bot closed this Dec 2, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Dec 2, 2020
@coleenp coleenp deleted the more-jvmti-table branch Dec 2, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Dec 2, 2020

@coleenp Since your change was applied there have been 26 commits pushed to the master branch:

  • cfb50a9: 8253916: ResourceExhausted/resexhausted001 crashes on Linux-x64
  • 287b829: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing
  • 1fd0ea7: 8256382: Use try_lock for hs_err EventLog printing
  • bff68f1: 8257533: legacy-jre-image includes jpackage and jlink tools
  • 9a60413: 8248736: [aarch64] runtime/signal/TestSigpoll.java failed "fatal error: not an ldr (literal) instruction."
  • e7ca0c4: 8257224: JDK-8251549 didn't update building.html
  • 7e37c7c: 8257471: fatal error: Fatal exception in JVMCI: Exception during JVMCI compiler initialization
  • 3e3745c: 8256008: UL does not report anything if disk writing fails
  • fb139cf: 8257467: [TESTBUG] -Wdeprecated-declarations is reported at sigset() in exesigtest.c
  • 9de283b: 8257505: nsk/share/test/StressOptions stressTime is scaled in getter but not when printed
  • ... and 16 more: https://git.openjdk.java.net/jdk/compare/eaf4db6b8b883b1ab5731ac62f590e39a7b67210...master

Your commit was automatically rebased without conflicts.

Pushed as commit 2508bc7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
4 participants