Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow #1489

Closed
wants to merge 8 commits into from

Conversation

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Nov 28, 2020

Hi all,

SIGFPE was observed by running:

java -XX:G1ConcRefinementThresholdStep=16G -XX:G1UpdateBufferSize=1G -version

The reason is that buffers_to_cards [1] returns 0 for 'step' due to overflow.
It would be better to add overflow check logic is it.

Testing:

  • tier1 on Linux/x64

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp#L235


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1489/head:pull/1489
$ git checkout pull/1489

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Nov 28, 2020

/issue add JDK-8257228
/test
/label add hotspot-gc
/cc hotspot-gc

Loading

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Nov 28, 2020

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Loading

@openjdk openjdk bot added the rfr label Nov 28, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool This issue is referenced in the PR title - it will now be updated.

Loading

@openjdk openjdk bot added the hotspot-gc label Nov 28, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool
The hotspot-gc label was successfully added.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool The hotspot-gc label was already applied.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 28, 2020

Loading

Copy link

@kimbarrett kimbarrett left a comment

This is only a problem when using values for some command line options that
are far out of the "normal" range. That's just a general problem; we have
far too many options, and some of them interact in interesting ways, so that
it's pretty much impossible to fully test or check for problem cases. And we
don't want to set artificially low limit values for individual options
because it's hard to know what some application might find useful. In this
case, it does look like we can reasonably do more checking though.

Loading

if (value == 0) return 0;
size_t res = value * G1UpdateBufferSize;

if (res < value || res < G1UpdateBufferSize) { // Check overflow

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a correct multiplication overflow check.

Loading

if (res < value || res < G1UpdateBufferSize) { // Check overflow
vm_exit_during_initialization("buffers_to_cards overflow!");
}
return res;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffers_to_cards is only used with command line options as arguments. So
it's not inappropriate to use vm_exit_during_initialization, even though
this function isn't obviously about initialization. Maybe it should be
changed to

// Convert configuration values in units of buffers to number of cards.
static size_t configuration_buffers_to_cards(size_t value, const char* value_name) {
... use value_name in the error message
}

An alternative would be to move the checking to constraint functions for
the command line options. That puts the checking far from the code where
it matters though, but maybe a comment in the conversion function referring
to the option constraints would be sufficient. This might provide better
reporting?

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Nov 30, 2020

/test

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 30, 2020

@DamonFool you need to get approval to run the tests in tier1 for commits up until 76dec20

Loading

@openjdk openjdk bot added the test-request label Nov 30, 2020
@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Nov 30, 2020

Hi @kimbarrett ,

Thanks for your review and comments.

The mul-overflow checking had been fixed.
The error msg had been refined.

For the sake of maintenance, I didn't use constraint function.

There are quite a few flags involved in it such as:

G1UpdateBufferSize
ParallelGCThreads
G1ConcRefinementThresholdStep
G1ConcRefinementGreenZone
G1ConcRefinementYellowZone
G1ConcRefinementRedZone

Maybe, you can add the constraint function for each of them this time.
But, when a new flag is used in the future, the constraint function may be missed.
Also, it seems a little strange to add this kind of constraints to flags like ParallelGCThreads.
So I prefer to fixing it in buffers_to_cards.

What do you think?

Thanks.
Best regards,
Jie

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Nov 30, 2020

/test

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 30, 2020

@DamonFool you need to get approval to run the tests in tier1 for commits up until e3ea5de

Loading


public class TestBuffersToCardsOverflow {
public static void main(String... args) throws Exception {
ProcessTools.executeTestJava("-XX:G1ConcRefinementThresholdStep=16G",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this will fail to start on a 32bit platform, because the threshold step exceeds the possible range.

Loading

Copy link
Member Author

@DamonFool DamonFool Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
Fixed. Thanks.

Loading

@kimbarrett
Copy link

@kimbarrett kimbarrett commented Nov 30, 2020

The mul-overflow checking had been fixed.
The error msg had been refined.

These updates look okay.

For the sake of maintenance, I didn't use constraint function.

There are quite a few flags involved in it such as:

[...]

Maybe, you can add the constraint function for each of them this time.
But, when a new flag is used in the future, the constraint function may be missed.
Also, it seems a little strange to add this kind of constraints to flags like ParallelGCThreads.
So I prefer to fixing it in buffers_to_cards.

The constraint function could be on G1UpdateBufferSize, checking each of
those other values for being not too large for its given value. But I agree
that a constraint function is far away, so the improved buffers_to_cards
does seem better.

Loading

Copy link
Contributor

@tschatzl tschatzl left a comment

Lgtm

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 30, 2020

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow

Reviewed-by: kbarrett, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the master branch:

  • fe5cccc: 8254631: Better support ALPN byte wire values in SunJSSE

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Loading

@openjdk openjdk bot added the ready label Nov 30, 2020
@tschatzl
Copy link
Contributor

@tschatzl tschatzl commented Nov 30, 2020

(Just trying out what happens if I try to issue a test request)

/test

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 2, 2020

Lgtm

Thanks @tschatzl for your review.

@kimbarrett , are you OK with this change?
Thanks.

Loading

@@ -243,15 +252,17 @@ static size_t calc_min_yellow_zone_size() {

static size_t calc_init_green_zone() {
size_t green = G1ConcRefinementGreenZone;
char* name = (char*) "G1ConcRefinementGreenZone";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the type of name to const char* and eliminate the cast here and on line 258.

Loading

Copy link
Member Author

@DamonFool DamonFool Dec 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing!
It gets compiled without the casts by just adding 'const'.

Updated. Thanks.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 2, 2020

/test

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Dec 2, 2020

@DamonFool you need to get approval to run the tests in tier1 for commits up until efa0946

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 2, 2020

/integrate

Loading

@openjdk openjdk bot closed this Dec 2, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Dec 2, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Dec 2, 2020

@DamonFool Since your change was applied there has been 1 commit pushed to the master branch:

  • fe5cccc: 8254631: Better support ALPN byte wire values in SunJSSE

Your commit was automatically rebased without conflicts.

Pushed as commit f2a0988.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Loading

@DamonFool DamonFool deleted the JDK-8257228 branch Dec 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants