Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8277919: OldObjectSample event causing bloat in the class constant pool in JFR recording #6801

Closed
wants to merge 3 commits into from

Conversation

mgronlun
Copy link

@mgronlun mgronlun commented Dec 10, 2021

Greetings,

Allocation heavy applications with OldObjectSample with stacktraces enabled will end up storing many duplicates of the same klass artefact, creating a considerable bloat in the class area of the recording constant pool. This is because of JDK-8233705, which can lead to multiple klass entries enqueued via the load barrier, in combination with an insufficient filter mechanism for the leak profiler artefacts. Leak profiler artefacts range over the entire set of artefacts enqueued in the previous epoch. A filtering mechanism similar to the one used for regular artefacts becomes necessary to avoid duplicates.

Thanks to @jbachorik for reporting and doing the initial analysis.

Markus


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8277919: OldObjectSample event causing bloat in the class constant pool in JFR recording

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6801/head:pull/6801
$ git checkout pull/6801

Update a local copy of the PR:
$ git checkout pull/6801
$ git pull https://git.openjdk.java.net/jdk pull/6801/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6801

View PR using the GUI difftool:
$ git pr show -t 6801

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6801.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 10, 2021

👋 Welcome back mgronlun! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 10, 2021

@mgronlun The following label will be automatically applied to this pull request:

  • hotspot-jfr

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-jfr hotspot-jfr-dev@openjdk.org label Dec 10, 2021
@mgronlun mgronlun marked this pull request as ready for review Dec 10, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 10, 2021
@mlbridge
Copy link

mlbridge bot commented Dec 10, 2021

Webrevs

@@ -202,7 +202,7 @@ static void prepare_for_resolution() {

static bool stack_trace_precondition(const ObjectSample* sample) {
assert(sample != NULL, "invariant");
return sample->has_stack_trace_id() && !sample->is_dead();
return sample->has_stack_trace_id() && !sample->is_dead() && !sample->stacktrace().valid();
Copy link

@jbachorik jbachorik Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious - why is !sample->stacktrace().valid() required here?
Is it because of the change from writer.copy() to writer.move()?

Copy link
Author

@mgronlun mgronlun Dec 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Jaroslav, so this is another optimization to reduce the size:

The previous writer.copy() usage was because there needed to be a mechanism to ensure that the newest sample candidates also had their stacktraces serialized in the current chunk. Object samples are emitted (as part of recording dump/stop) before on_rotation(). The old object events would be emitted, but their stacktraces would be serialized in the next on_rotation() call. These "new candidate" stacktraces are not really needed in every chunk but only those that emit object samples. A way to reduce the number of stacktraces written in every chunk is to let the ObjectSampleCheckpoint::write() first create the needed stacktrace blobs right there before serializing them. Then, it will only lay down the blobs for the needed candidates, and the automatic serialization of object sample stacktraces (via writer.copy() which installs the data into the checkpoint queue) can be avoided (turned into writer.move()). With this change, the upcoming on_rotation() will have to navigate state correctly since some of these new candidates already have stacktrace blobs associated (if there was a previous object sample emit call), and the system would assert. Hence the need to check if a stacktrace blob has already been created via the .valid() call.

Copy link
Author

@mgronlun mgronlun Dec 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to get this fixed in JDK18, so I will withdraw this PR and recreate it for JDK18 instead. Stay tuned.

@openjdk
Copy link

openjdk bot commented Dec 12, 2021

@mgronlun This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8277919: OldObjectSample event causing bloat in the class constant pool in JFR recording

Reviewed-by: jbachorik

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 23 new commits pushed to the master branch:

  • db68a0c: 8276766: Enable jar and jmod to produce deterministic timestamped content
  • 6eb6ec0: 8278525: Additional -Wnonnull errors happen with GCC 11
  • 81c56c7: 8278456: Define jtreg jdk_desktop test group time-based sub-tasks for use by headful testing.
  • 61736f8: Merge
  • 0602f4c: 8277621: ARM32: multiple fastdebug failures with "bad AD file" after JDK-8276162
  • 3df8dc4: 8278538: Test langtools/jdk/javadoc/tool/CheckManPageOptions.java fails after the manpage was updated
  • ed5d53a: 8273179: Update nroff pages in JDK 18 before RC
  • afd065b: 8278415: [TESTBUG] vmTestbase/nsk/stress/stack/stack018.java fails with "java.lang.Error: TEST_RFE"
  • 4f594e6: 8278381: [GCC 11] Address::make_raw() does not initialize rspec
  • 8eb453b: 8277072: ObjectStreamClass caches keep ClassLoaders alive
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/6dae52f8e3993d529033147de8e34ad1e7d48c53...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 12, 2021
@mgronlun mgronlun closed this Dec 13, 2021
@mgronlun
Copy link
Author

mgronlun commented Dec 13, 2021

Withdrawn in favor of openjdk/jdk18#15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-jfr hotspot-jfr-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review
2 participants