Skip to content

Conversation

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Apr 29, 2021

There is a lot of duplicate coding when it comes to the consumption of Metaspace Statistics.

Metaspace offers statistical APIs via MetaspaceUtils::(reserved|committed|used)_(words|bytes) for either the whole metaspace or non-class/class space separately. But many callers need some sort of combination of these values, which they keep in data holder structures which all are named "Metaspace" something and all do a very similar job. In particular, we have:

  • MetaspaceSizesSnapshot (used for gc logs)
  • MetaspaceSnapshot (used in NMT)
  • MetaspaceSizes, MetaspaceSummary, JFRMetaspaceSummary (JFR)
  • MetaspaceCounters, CompressedClassSpaceCounters, MetaspacePerfCounters (jstat performance counters)
  • CompressedKlassSpacePool and MetaspacePool (used for MXBeans)

As much as possible coding should be unified.

In addition to that, all these callers share a common problem, in that retrieving individual statistical values via MetaspaceUtils::(reserved|committed|used)_(words|bytes) may yield inconsistent values. For example, "reserved" < "committed", or "committed" < "used". This is the cause of a variety of rare intermittent test errors in different areas, e.g. Performance counters (JDK-8163413, JDK-8153323), the gc log, the MemoryPoolMXBean and NMT (JDK-8237872).


This patch introduces two new data holder structures:

  • MetaspaceStats holds reserved/committed/used counter
  • MetaspaceCombinedStats holds an expression of these counters for class- and non-class metaspace each, as well as total counter.

Furthermore, the patch introduces two new APIs in MetaspaceUtils:

  • MetaspaceStats get_statistics(type)
  • MetaspaceCombinedStats get_combined_statistics()
    The former is used to get statistics for either non-class-space or class space; the latter gets statistics for the whole, including total values. Both APIs guarantee consistent values - reserved >= committed >= used - and never lock.

The patch uses these new data holders and these APIs to consolidate, clean up and simplify a lot of caller code, in addition to making that code resistant against inconsistent statistics:

  • GC Log: class MetaspaceSizesSnapshot in metaspace/metaspaceSizesSnapshot.cpp/hpp had been used to hold metaspace statistics. Its default constructor sneakishly queried all values. MetaspaceSizesSnapshot has neem removed, caller code rewritten for MetaspaceCombinedStats and explicit value querying.

  • NMT had class MetaspaceSnapshot to keep metaspace statistics and to print out baseline diffs.

    • MetaspaceSizesSnapshot was removed and replaced with MetaspaceCombinedStats.
    • MemSummaryDiffReporter::print_metaspace_diff() has been modified: fixed a potential div-by-zero, removed the "free" statistics which is meaningless, and massaged the code a bit.
    • Similar fixes have been done in MemSummaryReporter::report_metadata() (which, if backported, should fix JDK-8237872).
  • jstat & co: Metaspace performance counters are realized by MetaspaceCounters and CompressedClassSpaceCounters, implementation resides in MetaspacePerfCounters (metaspace/metaspaceCounters.hpp/cpp).

    • I completely removed CompressedClassSpaceCounters since there is no need for two APIs, class space is part of metaspace.
    • I simplified the counter coding a lot. I think the complexity was not needed.
    • The Metaspace counters are now retrieved in a consistent manner. This should take care of JDK-8163413, JDK-8153323 and similar bugs.
  • MetaspaceMemoryPool, CompressedClassSpaceMemoryPool used in MxBeans: I changed the implementation to return consistent values.

  • JFR reports metaspace allocations (which, confusingly, uses a different terminology: "data_space" == non-class portion of metaspace). It used MetaspaceSummary to hold the statistics, which were composed of MetaspaceSizes.

    • I completely removed MetaspaceSizes
    • I rewrote MetaspaceSummary to use MetaspaceCombinedStats
  • MetaspaceUtils::print_on() has been used to print metaspace statistics (by GCs). Function has been corrected to print consistent values.

  • Added a simple gtest for the new APIs

Tests:

  • manually executed hotspot tier1 tests
  • explicitly tested runtime/Metaspace, gc/metaspace, runtime/NMT
  • SAP tests ran for several weeks on all our platforms

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3786/head:pull/3786
$ git checkout pull/3786

Update a local copy of the PR:
$ git checkout pull/3786
$ git pull https://git.openjdk.java.net/jdk pull/3786/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3786

View PR using the GUI difftool:
$ git pr show -t 3786

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3786.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 29, 2021

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 29, 2021

@tstuefe The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Apr 29, 2021
@tstuefe tstuefe force-pushed the JDK-8251392-consolidate-metaspace-statistics branch 10 times, most recently from 37e1f19 to decf5c3 Compare May 3, 2021 11:40
@tstuefe tstuefe force-pushed the JDK-8251392-consolidate-metaspace-statistics branch from decf5c3 to e758d29 Compare May 3, 2021 12:51
@tstuefe tstuefe marked this pull request as ready for review May 4, 2021 11:22
@tstuefe
Copy link
Member Author

tstuefe commented May 4, 2021

/cc hotspot-gc

@openjdk openjdk bot added rfr Pull request is ready for review hotspot-gc hotspot-gc-dev@openjdk.org labels May 4, 2021
@openjdk
Copy link

openjdk bot commented May 4, 2021

@tstuefe
The hotspot-gc label was successfully added.

@tstuefe
Copy link
Member Author

tstuefe commented May 4, 2021

@zhengyu123 : could you take a look at the NMT part of this change? Thanks!

@mlbridge
Copy link

mlbridge bot commented May 4, 2021

Webrevs

@tstuefe
Copy link
Member Author

tstuefe commented May 14, 2021

Gentle ping.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Nice cleanup removing duplicated metaspace counting code!

@openjdk
Copy link

openjdk bot commented May 14, 2021

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8251392: Consolidate Metaspace Statistics

Reviewed-by: coleenp, zgu

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 274 new commits pushed to the master branch:

  • a29612e: 8255661: TestHeapDumpOnOutOfMemoryError fails with EOFException
  • a555fd8: 8264734: Some SA classes could use better hashCode() implementation
  • 2313a21: 8266637: CHT: Add insert_and_get method
  • 7b736ec: 8266489: Enable G1 to use large pages on Windows when region size is larger than 2m
  • f422787: 8266073: Regression ~2% in Derby after 8261804
  • 02f895c: 8252685: APIs that require JavaThread should take JavaThread arguments
  • 2066f49: 8266764: [REDO] JDK-8255493 Support for pre-generated java.lang.invoke classes in CDS dynamic archive
  • 8c71144: 8265153: add time based test for ThreadMXBean.getThreadInfo() and ThreadInfo.getLockOwnerName()
  • 10cafd2: 8267153: Problemlist jdk/jfr/event/gc/collection/TestG1ParallelPhases.java to remove the noise from CI
  • f3fb5a4: 8266942: gtest/GTestWrapper.java os.iso8601_time_vm failed
  • ... and 264 more: https://git.openjdk.java.net/jdk/compare/343a4a76f273078f78897e8fb7e695bc2c111536...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 14, 2021
@tstuefe
Copy link
Member Author

tstuefe commented May 14, 2021

This looks good to me. Nice cleanup removing duplicated metaspace counting code!

Thank you Coleen!

Copy link
Contributor

@zhengyu123 zhengyu123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@tstuefe
Copy link
Member Author

tstuefe commented May 17, 2021

Looks good.

Thanks Zhengyu!

@tstuefe
Copy link
Member Author

tstuefe commented May 18, 2021

Okay, I waited but no more reviews are forthcoming. Our nightlies show no errors attributable to this change.

/integrate

@openjdk openjdk bot closed this May 18, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 18, 2021
@openjdk
Copy link

openjdk bot commented May 18, 2021

@tstuefe Since your change was applied there have been 287 commits pushed to the master branch:

  • 3e97b07: 8267116: Lanai: Incorrect AlphaComposite for VolatileImage graphics
  • cd1c17c: 8266404: Fatal error report generated with -XX:+CrashOnOutOfMemoryError should not contain suggestion to submit a bug report
  • 2effdd1: 8267112: JVMCI compiler modules should be kept upgradable
  • da4dfde: 8264777: Overload optimized FileInputStream::readAllBytes
  • 3b11d81: 8266742: Check W^X state on possible safepoint
  • 79b3944: 8266520: Revert to OpenGL as the default 2D rendering pipeline for macOS
  • 3c010a7: 8265705: aarch64: KlassDecodeMovk mode broken
  • cf97252: 8264561: javap get NegativeArraySizeException on bad instruction
  • b8856b1: 8263614: javac allows local variables to be accessed from a static context
  • ea36836: 8267236: Versioned platform link in TestMemberSummary.java
  • ... and 277 more: https://git.openjdk.java.net/jdk/compare/343a4a76f273078f78897e8fb7e695bc2c111536...master

Your commit was automatically rebased without conflicts.

Pushed as commit 554caf3.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tstuefe tstuefe deleted the JDK-8251392-consolidate-metaspace-statistics branch June 29, 2021 05:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

3 participants