8264136: Active processor count may be underreported #3177

Closed

@jbachorik jbachorik commented Mar 24, 2021

Current situation

In cgroups environments, the available CPU resources are described by a minimal guaranteed amount and a maximal allowed amount (see, e.g., this post).

The current active_processor_count computation assumes that the minimal guaranteed amount of CPU resources translates to the number of available CPUs reported by the container. Unfortunately, this is not entirely accurate: a container is free to use whatever CPUs are available, so the actual CPU usage can be higher than the reported number of available CPUs.

For the record, the algorithm is a bit more involved: it computes both values, one based on the minimal guaranteed amount (if specified) and one based on the maximal allowed amount (again, if specified), and then takes the lesser of the two. In practice, when both are set, the minimal guaranteed amount will always be less than or equal to the maximal allowed amount, so, as a simplification, we can consider the minimal guaranteed amount to be the basis for the available CPU count calculation whenever it is set.
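To make the described computation concrete, here is a minimal sketch in Java. It follows the description above, not the actual HotSpot source: the class name, the method name, and the use of whole-CPU `OptionalInt` inputs are all assumptions made for illustration, as is the clamp to the host CPU count and to a minimum of 1.

```java
import java.util.OptionalInt;

public class CpuCountSketch {
    // Hypothetical helper, not a HotSpot function: takes the lesser of the
    // guaranteed-based and max-allowed-based counts, when they are specified.
    static int activeProcessorCount(int hostCpus,
                                    OptionalInt guaranteedCpus,
                                    OptionalInt maxAllowedCpus) {
        int limit = hostCpus; // assumption: never report more than the host has
        if (guaranteedCpus.isPresent()) {
            limit = Math.min(limit, guaranteedCpus.getAsInt());
        }
        if (maxAllowedCpus.isPresent()) {
            limit = Math.min(limit, maxAllowedCpus.getAsInt());
        }
        return Math.max(1, limit); // assumption: always report at least one CPU
    }

    public static void main(String[] args) {
        // Elastic setup: 2 CPUs guaranteed, up to 8 allowed, on a 16-CPU host.
        // prints 2
        System.out.println(activeProcessorCount(16, OptionalInt.of(2), OptionalInt.of(8)));
    }
}
```

In this example the container may burst up to 8 CPUs, yet the reported count is 2, which is the underreporting this PR is about.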

Problematic behavior

For systems with an 'elastic' setup, where the minimal guaranteed amount and the maximal allowed amount are not equal, this definition of available CPUs can lead to misconfiguration of anything that relies on the reported number of cores, e.g. the number of GC threads, the number of compiler threads, or the fork-join pool size.
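One way to observe the effect from inside a container is to print the reported CPU count next to a value that is sized from it. The snippet below is a small diagnostic sketch; the class and method names are made up for this example, and only standard `Runtime` and `ForkJoinPool` calls are used. Unless overridden by a system property, the common pool's parallelism is derived from the reported processor count.

```java
import java.util.concurrent.ForkJoinPool;

public class ReportedCpuDependents {
    // Hypothetical helper: returns {reported CPU count, common pool parallelism}.
    static int[] snapshot() {
        int cpus = Runtime.getRuntime().availableProcessors();
        int parallelism = ForkJoinPool.commonPool().getParallelism();
        return new int[] { cpus, parallelism };
    }

    public static void main(String[] args) {
        int[] s = snapshot();
        System.out.println("availableProcessors = " + s[0]);
        System.out.println("commonPool parallelism = " + s[1]);
    }
}
```

Running this in a container whose minimal guaranteed amount is lower than its maximal allowed amount should show whether the pool is sized for the smaller value.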

Proposed fix

The proposed fix is to disregard the minimal guaranteed amount in the calculation when the PreferContainerQuotaForCPUCount JVM flag is set to true (currently the default). The original calculation based on the minimal guaranteed amount would remain available as a fallback by specifying -XX:-PreferContainerQuotaForCPUCount.
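The proposed behavior can be sketched as follows. This is an illustration of the description above under stated assumptions, not the patch itself: the class, the method, and the `preferQuota` parameter (standing in for PreferContainerQuotaForCPUCount) are hypothetical, and inputs are simplified to whole CPUs.

```java
import java.util.OptionalInt;

public class ProposedCpuCountSketch {
    // Hypothetical helper: when preferQuota is true, the minimal guaranteed
    // amount is ignored and only the maximal allowed amount limits the count.
    static int activeProcessorCount(int hostCpus,
                                    OptionalInt guaranteedCpus,
                                    OptionalInt maxAllowedCpus,
                                    boolean preferQuota) {
        int limit = hostCpus; // assumption: never report more than the host has
        if (!preferQuota && guaranteedCpus.isPresent()) {
            limit = Math.min(limit, guaranteedCpus.getAsInt());
        }
        if (maxAllowedCpus.isPresent()) {
            limit = Math.min(limit, maxAllowedCpus.getAsInt());
        }
        return Math.max(1, limit); // assumption: always report at least one CPU
    }

    public static void main(String[] args) {
        OptionalInt guaranteed = OptionalInt.of(2);
        OptionalInt maxAllowed = OptionalInt.of(8);
        // Flag on (default in this sketch): prints 8
        System.out.println(activeProcessorCount(16, guaranteed, maxAllowed, true));
        // Flag off, as with -XX:-PreferContainerQuotaForCPUCount: prints 2
        System.out.println(activeProcessorCount(16, guaranteed, maxAllowed, false));
    }
}
```

Note that in this sketch a container with only a minimal guaranteed amount configured would report the host CPU count when the flag is on.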


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8264136: Active processor count may be underreported

Download

To check out this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3177/head:pull/3177
$ git checkout pull/3177

To update a local copy of the PR:
$ git checkout pull/3177
$ git pull https://git.openjdk.java.net/jdk pull/3177/head

@bridgekeeper bridgekeeper bot commented Mar 24, 2021

👋 Welcome back jbachorik! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot commented Mar 24, 2021

@jbachorik The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@jbachorik jbachorik force-pushed the DataDog:jb/cgroups_active_cpu_count branch from c87311c to f36ad0e Mar 24, 2021
@jbachorik jbachorik changed the title Initial attempt at fixing reported cpu count in cgroups 8264136: Active processor count may be underreported Mar 24, 2021
@jbachorik jbachorik marked this pull request as ready for review Mar 25, 2021
@openjdk openjdk bot added the rfr label Mar 25, 2021

@mlbridge mlbridge bot commented Mar 25, 2021

Webrevs

@jbachorik jbachorik closed this Apr 21, 2021
@jbachorik jbachorik deleted the DataDog:jb/cgroups_active_cpu_count branch Apr 21, 2021
2 participants