Conversation

@jianyesun

@jianyesun jianyesun commented Sep 19, 2023

Hi all,

This pull request contains a backport of commit b256989eb34a32c8f03be448c0645baeb5192a01 from the openjdk/jdk11u-dev repository.

As reported in https://bugs.openjdk.org/browse/JDK-8316278, the indexing of PtrQueue's buf is incorrect when an integer of type size_t is cast to int before calling PtrQueue::byte_index_to_index.
The key problem is this usage:

size_t i = 0;
_buf[byte_index_to_index((int)i)] = NULL;

A variable i of type size_t cannot safely be converted directly to int. In addition, the return value of byte_index_to_index is an index into the array _buf, so it must be non-negative and should therefore be of type size_t.
So far we have found two issues related to this problem, https://bugs.openjdk.org/browse/JDK-8308169 and https://bugs.openjdk.org/browse/JDK-8303961. Both are triggered by particular buffer sizes, such as '-XX:G1UpdateBufferSize=512M' or '-XX:G1SATBBufferSize=500m'.
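
To make the overflow concrete, here is a minimal sketch, not the HotSpot code. It assumes a 64-bit size_t, 8-byte buffer entries, and that the flag value counts entries, so a 500m buffer spans 500 * 1024 * 1024 * 8 = 4194304000 bytes, which exceeds INT_MAX. The names index_via_int, index_via_size_t, buffer_bytes and kEntrySize are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// Illustrative stand-ins only; these are NOT the HotSpot definitions.
static_assert(sizeof(std::size_t) == 8, "sketch assumes a 64-bit size_t");

// Assumed entry size; plays the role that oopSize plays in the discussion.
const std::size_t kEntrySize = sizeof(void*);

// Byte size of a buffer holding `entries` pointer-sized slots.
inline std::size_t buffer_bytes(std::size_t entries) {
  return entries * kEntrySize;
}

// Old shape: the byte index is narrowed to int before the division.
// For values above INT_MAX the narrowing drops the high bits
// (implementation-defined before C++20, modular wrap-around on
// mainstream compilers), so the result can be negative or zero.
inline int index_via_int(std::size_t byte_index) {
  int narrowed = static_cast<int>(byte_index);
  return narrowed / static_cast<int>(kEntrySize);
}

// New shape: the computation stays in size_t, so the index is never negative.
inline std::size_t index_via_size_t(std::size_t byte_index) {
  return byte_index / kEntrySize;
}
```

Under these assumptions, index_via_int(buffer_bytes(500 * 1024 * 1024)) comes out negative on mainstream compilers, and a 512M buffer (exactly 2^32 bytes) narrows to 0, whereas index_via_size_t returns the expected entry count in both cases.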
We also added a test case.
Please review this PR. Thanks.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • JDK-6899049 needs maintainer approval

Issue

  • JDK-6899049: G1: Clean up code in ptrQueue.[ch]pp and ptrQueue.inline.hpp (Enhancement - P5)

Contributors

  • Yulong Liu <liuyulong35@huawei.com>
  • Kun Wang <wangkun49@huawei.com>
  • Jianye Sun <sunjianye@huawei.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk8u-dev.git pull/374/head:pull/374
$ git checkout pull/374

Update a local copy of the PR:
$ git checkout pull/374
$ git pull https://git.openjdk.org/jdk8u-dev.git pull/374/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 374

View PR using the GUI difftool:
$ git pr show -t 374

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk8u-dev/pull/374.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 19, 2023

👋 Welcome back jianyesun! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@jianyesun
Author

It triggers the jcheck error again... It shows this:

> The commit message does not reference any issue. To add an issue reference to this PR, edit the title to be of the format issue number: message.

This PR's title seems ok, do you know why? @jerboaa

@jerboaa
Contributor

jerboaa commented Sep 19, 2023

@jianyesun PR title needs to be Backport b256989eb34a32c8f03be448c0645baeb5192a01 (provided the sha is correct).

@jianyesun jianyesun changed the title from "Backport <b256989eb34a32c8f03be448c0645baeb5192a01> 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow" to "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow" Sep 19, 2023
@jianyesun jianyesun changed the title from "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow" to "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278 : Fix the indexing method of PtrQueue's buf when a large integer value overflow" Sep 19, 2023
@jianyesun jianyesun changed the title from "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278 : Fix the indexing method of PtrQueue's buf when a large integer value overflow" to "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow" Sep 19, 2023
@jerboaa
Contributor

jerboaa commented Sep 19, 2023

@jianyesun You are changing the title to something like Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow. It should be Backport b256989eb34a32c8f03be448c0645baeb5192a01. I.e. drop 8316278: ... from the title.

The bots will do the rest.

@jianyesun jianyesun changed the title from "Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow" to "Backport b256989eb34a32c8f03be448c0645baeb5192a01" Sep 19, 2023
@openjdk openjdk bot changed the title from "Backport b256989eb34a32c8f03be448c0645baeb5192a01" to "6899049: G1: Clean up code in ptrQueue.[ch]pp and ptrQueue.inline.hpp" Sep 19, 2023
@openjdk

openjdk bot commented Sep 19, 2023

This backport pull request has now been updated with issue and summary from the original commit.

@openjdk openjdk bot added the backport (Port of a pull request already in a different code base) and rfr (Pull request is ready for review) labels Sep 19, 2023
@jianyesun
Author

> @jianyesun You are changing the title to something like Backport b256989eb34a32c8f03be448c0645baeb5192a01 8316278: Fix the indexing method of PtrQueue's buf when a large integer value overflow. It should be Backport b256989eb34a32c8f03be448c0645baeb5192a01. I.e. drop 8316278: ... from the title.
>
> The bots will do the rest.

Finally done. Thank you for your patience and guidance, bro~

@mlbridge

mlbridge bot commented Sep 19, 2023

Webrevs

@openjdk

openjdk bot commented Sep 19, 2023

@jianyesun Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@gnu-andrew
Member

Looks like the new test is failing.

@gnu-andrew
Member

Where does this test come from? I don't see it in the patch being backported.

@jianyesun
Author

jianyesun commented Sep 20, 2023

> Where does this test come from? I don't see it in the patch being backported.

We added it to check whether the problem is solved. I saw errors such as Insufficient Memory Error or "could not reserve enough space for 2097152KB object heap". Perhaps the resources of the CI environment are limited and less than 4G of memory is available when the JVM starts with -Xmx4096m. When I change it to -Xmx2048m, it seems that the JVM either cannot start or runs out of memory (OOM).

Do you think it is necessary to add this test case? If not, I will delete it. Or should I set the test not to execute in these three scenarios (Linux x86/Windows x86/Windows x64)?

@gnu-andrew
Member

> > Where does this test come from? I don't see it in the patch being backported.
>
> We added it to check whether the problem is solved. I saw errors such as Insufficient Memory Error or "could not reserve enough space for 2097152KB object heap". Perhaps the resources of the CI environment are limited and less than 4G of memory is available when the JVM starts with -Xmx4096m. When I change it to -Xmx2048m, it seems that the JVM either cannot start or runs out of memory (OOM).
>
> Do you think it is necessary to add this test case? If not, I will delete it. Or should I set the test not to execute in these three scenarios (Linux x86/Windows x86/Windows x64)?

Even with a working test, this is not the place to include something new unless it is specific to 8u. If you do want to include it, it needs to be separated from your backport and proposed to https://github.com/openjdk/jdk under its own bug ID. Once included there, it can be backported to 8u, as you have with JDK-6899049 here. Not only do we not want tests being unique to the 8u repository, but changes to the main repository get attention from those who are experts in this field. Stable update trees are generally expected to get fixes that have already been reviewed, but might need some minor modification to work on an older version, and so don't get the same reviewer coverage.

As to the test case itself, it's not clear to me what it's trying to test. Is it the command-line options? Or the actual allocation? I see other cases in the HotSpot tests where a 2GB heap is used, but they only run the VM with -version (e.g. https://github.com/openjdk/jdk/blob/HEAD/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java )

The failure isn't architecture-specific, so the appropriate exclusion would be to check the available memory before running the VM process. If you only want to check the arguments work, then running with -version should be enough.

@jianyesun
Author

> As to the test case itself, it's not clear to me what it's trying to test. Is it the command-line options? Or the actual allocation?

Well, as I described in issue JDK-8316278, the crash happens when looking up an element at a particular index in PtrQueue's buf during G1 GC SATB processing. Therefore, the value of G1SATBBufferSize is related to the size of the heap or the size of the remaining heap space. We added this test, which was already reported by another reporter (see JDK-8308169), just to check whether the problem can still be triggered on Linux x64. Maybe the test doesn't quite fit.
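
For readers unfamiliar with the buffer layout being discussed, the following is a rough sketch of a pointer queue that tracks its position as a byte index and converts it to an array index on every access. It is a simplified model under stated assumptions, not the HotSpot class: MiniPtrQueue and all of its members are hypothetical names, the byte index is assumed to start at the buffer's byte size and count down as entries are added, and filter merely nulls out rejected entries rather than reproducing the real SATB filtering.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical miniature of a pointer queue; NOT the HotSpot PtrQueue.
class MiniPtrQueue {
 public:
  explicit MiniPtrQueue(std::size_t entries)
      : _buf(entries, nullptr),
        _sz(entries * sizeof(void*)),  // buffer size in bytes
        _index(_sz) {}                 // byte index; equal to _sz when empty

  // Stores p in the next free slot; returns false when the buffer is full.
  bool enqueue(void* p) {
    if (_index == 0) return false;
    _index -= sizeof(void*);
    _buf[byte_index_to_index(_index)] = p;
    return true;
  }

  // Number of slots filled so far.
  std::size_t size() const { return byte_index_to_index(_sz - _index); }

  // Nulls out every filled entry that `keep` rejects; returns how many were dropped.
  template <typename Pred>
  std::size_t filter(Pred keep) {
    std::size_t dropped = 0;
    for (std::size_t i = _index; i < _sz; i += sizeof(void*)) {
      void*& slot = _buf[byte_index_to_index(i)];
      if (slot != nullptr && !keep(slot)) {
        slot = nullptr;
        ++dropped;
      }
    }
    return dropped;
  }

 private:
  // The conversion stays in size_t, so no narrowing can occur for large buffers.
  static std::size_t byte_index_to_index(std::size_t byte_index) {
    return byte_index / sizeof(void*);
  }

  std::vector<void*> _buf;
  std::size_t _sz;
  std::size_t _index;
};
```

Because every access goes through the byte-to-index conversion, a conversion that narrows to int would affect each enqueue and each step of the filtering loop once the byte size exceeds INT_MAX.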

> Even with a working test, this is not the place to include something new unless it is specific to 8u. If you do want to include it, it needs to be separated from your backport and proposed to https://github.com/openjdk/jdk under its own bug ID. Once included there, it can be backported to 8u, as you have with JDK-6899049 here. Not only do we not want tests being unique to the 8u repository, but changes to the main repository get attention from those who are experts in this field. Stable update trees are generally expected to get fixes that have already been reviewed, but might need some minor modification to work on an older version, and so don't get the same reviewer coverage.

OK, I understand what you mean. Version management of the code really should be considered. I have decided to delete the test.
By the way, this backport is clean, and it does remove the risk in the index type conversion, so I think it is valuable. What do you think?

We will keep looking into why the other platforms failed. Thanks.

@jianyesun
Author

jianyesun commented Sep 22, 2023

/contributor add Yulong Liu liuyulong35@huawei.com
/contributor add Kun Wang wangkun49@huawei.com
/contributor add Jianye Sun sunjianye@huawei.com

@openjdk

openjdk bot commented Sep 22, 2023

@jianyesun
Contributor Yulong Liu <liuyulong35@huawei.com> successfully added.

@openjdk

openjdk bot commented Sep 22, 2023

@jianyesun
Contributor Kun Wang <wangkun49@huawei.com> successfully added.

@openjdk

openjdk bot commented Sep 22, 2023

@jianyesun
Contributor Jianye Sun <sunjianye@huawei.com> successfully added.

Member

@gnu-andrew gnu-andrew left a comment

A few minor unwanted differences have crept into the backport, but otherwise now looks mostly clean.

As to the test, if you do think it is worthwhile, please consider submitting it to openjdk/jdk.

DirtyCardQueue& dcq = t->dirty_card_queue();
if (dcq.size() != 0) {
- void **buf = t->dirty_card_queue().get_buf();
+ void **buf = dcq.get_buf();
Member

The 11u version changes this to void** buf. Was there a reason for the difference here?

Author

I think the return type of get_buf, as first defined, is inappropriate (see here; first committed here).
The 11u version changes the buf allocation via BufferNode::make_buffer_from_node (see here); however, jdk8 does not do that.
I can change it to void** buf. Should I do that?

// Initialize this queue to contain a null buffer, and be part of the
// given PtrQueueSet.
- PtrQueue(PtrQueueSet* qset, bool perm = false, bool active = false);
+ PtrQueue(PtrQueueSet* qset, bool permanent = false, bool active = false);
Member

An extra space has sneaked in here between permanent and the = sign.

Author

Thanks, I will fix it.

@jianyesun
Author

Hi Andrew, would you please review the code changes again? Also, are there any other committers who can help review this PR? Please help me, thx~ @gnu-andrew

Member

@phohensee phohensee left a comment

White space mismatch in dirtyCardQueue.cpp, lines 209 and 210. dirtyCardQueue.hpp line 72 also doesn't match, but Andrew allowed it so I will too. Other than that, lgtm.

@jianyesun
Author

> White space mismatch in dirtyCardQueue.cpp, lines 209 and 210. dirtyCardQueue.hpp line 72 also doesn't match, but Andrew allowed it so I will too. Other than that, lgtm.

Sorry, something interrupted me. I have fixed it as you suggested, thank you very much.

@phohensee
Member

linux-x86 looks to have a couple of tier1 test failures. The windows-x64 failure looks like an infrastructure issue.

@bridgekeeper

bridgekeeper bot commented Dec 5, 2023

@jianyesun This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented Jan 3, 2024

@jianyesun This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jan 3, 2024

Labels

backport: Port of a pull request already in a different code base
rfr: Pull request is ready for review

4 participants