Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8253037: G1: Improve check for string dedup #981

Closed
wants to merge 4 commits into from

Conversation

kimbarrett
Copy link

@kimbarrett kimbarrett commented Nov 1, 2020

Please review this change to G1 evacuation's checking for string
deduplication. The old code would go out of line to the G1StringDedup
support code for every object when deduplication is enabled, only to usually
discover the object isn't a string. Instead we now have a simpl inline
test for the combination of dedup enabled and object is a string, and only
call out to the dedup support code when that's true. This eliminates some
work for every non-string (non-array) object when dedup is enabled.

The performance impact seems to be pretty small and hard to measure, since
enabling deduplication has other costs that seem to overwhelm the cost
here. I'm hoping to improve that with JDK-8254598.

Testing:
tier1 on Oracle supported platforms.
Performance testing with deduplication enabled.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/981/head:pull/981
$ git checkout pull/981

@kimbarrett kimbarrett changed the title new check 8253037: G1: Improve check for string dedupnew check Nov 1, 2020
@kimbarrett kimbarrett changed the title 8253037: G1: Improve check for string dedupnew check 8253037: G1: Improve check for string dedup Nov 1, 2020
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Nov 1, 2020

👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Nov 1, 2020

@kimbarrett The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label Nov 1, 2020
@kimbarrett
Copy link
Author

@kimbarrett kimbarrett commented Nov 1, 2020

/summary Combine dedup enabled and is_string into a single test, using the already in-hand klass of the object.

@kimbarrett kimbarrett marked this pull request as ready for review Nov 1, 2020
@openjdk openjdk bot added the rfr label Nov 1, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 1, 2020

@kimbarrett Setting summary to Combine dedup enabled and is_string into a single test, using the already in-hand klass of the object.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 1, 2020

Webrevs

Copy link
Member

@albertnetymk albertnetymk left a comment

Nit: maybe mentioning java_string in the method name could make the precondition more explicit, sth like enqueue_java_string_from_evacuation.

@kimbarrett
Copy link
Author

@kimbarrett kimbarrett commented Nov 1, 2020

Nit: maybe mentioning java_string in the method name could make the precondition more explicit, sth like enqueue_java_string_from_evacuation.

That seems redundant, since this function is in StringDedup.

@albertnetymk
Copy link
Member

@albertnetymk albertnetymk commented Nov 1, 2020

That seems redundant, since this function is in StringDedup.

OK, then the fact that its neighbor, enqueue_from_mark, doesn't have such precondition (must be java_string), but shares the same naming schema could be a bit surprising. Or, maybe it's better to add the same precondition for enqueue_from_mark as well?

_num_optional_regions(optional_cset_length),
_numa(g1h->numa()),
_obj_alloc_stat(NULL)
{
// Verify klass comparison with _string_klass_or_null is sufficient
Copy link
Contributor

@tschatzl tschatzl Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the correct place to check this property: it needs not only hold during evacuation but basically anywhere. Is it possible to change finality of a java class?

What would be the impact of java.lang.String not passing this check? From what I understand only that these subclasses of j.l.String are not deduplicated. This does not seem to be a breaking (but surprising) problem, only that dedup for all subclasses of j.l.String does not work.

I would probably move that check into initialization, printing a warning.

Copy link
Author

@kimbarrett kimbarrett Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe finality can be changed in a relevant way. (Not counting changing the source code and recompiling, or perhaps patching the jar file.) If some project were to try to change String, I think they'd very quickly trip this assert and know something needed to be done. Depending on what the change was, it could effectively disable deduplication (for example, if String were changed to be abstract).

I don't see a good way to check this during initialization that doesn't put the check far away from the code that cares, which makes it much less helpful. I thought about putting the assert at the point of use, but that seemd a little excessive.

@@ -83,6 +84,8 @@ class G1ParScanThreadState : public CHeapObj<mtGC> {
// Size (in elements) of a partial objArray task chunk.
int _partial_objarray_chunk_size;
PartialArrayTaskStepper _partial_array_stepper;
// Used to check whether string dedup should be applied to an object.
Klass* _string_klass_or_null;
Copy link
Contributor

@tschatzl tschatzl Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a note that we replicate java_lang_String::is_instance_inlined() here for performance reasons (skipping the NULL check), so that when it changes, maybe a grep will find this here?

Copy link
Author

@kimbarrett kimbarrett Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think such a comment is better placed at the point of use, which is what I intend to do.

@@ -75,7 +75,8 @@ class G1StringDedup : public StringDedup {
// Enqueues a deduplication candidate for later processing by the deduplication
// thread. Before enqueuing, these functions apply the appropriate candidate
// selection policy to filters out non-candidates.
static void enqueue_from_mark(oop java_string, uint worker_id);
static void enqueue_from_mark(oop obj, uint worker_id);
Copy link
Contributor

@tschatzl tschatzl Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Albert mentioned, it would be nice if the caller of this would also get this optimization as it seems straightforward to implement there as well for consistency.

While the caller does not get the Klass* directly, it already reads the mark oop, so getting the Klass there should be practically free too.

Copy link
Author

@kimbarrett kimbarrett Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of compressed klasses, getting the klass is never completely trivial. However, since effectively the first thing done by enqueue_from_mark is to test for the obj being a String, hoisting that out to fullgc's mark_object seems reasonable. And that gives the consistency between the enqueue functions that Albert asked for. I'll add that to the PR after re-running tests.

Copy link
Contributor

@tschatzl tschatzl left a comment

(Not necessarily requesting changes, but feedback to my comments)

@openjdk
Copy link

@openjdk openjdk bot commented Nov 2, 2020

@kimbarrett This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8253037: G1: Improve check for string dedup

Combine dedup enabled and is_string into a single test, using the already in-hand klass of the object.

Reviewed-by: ayang, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Nov 2, 2020
@albertnetymk
Copy link
Member

@albertnetymk albertnetymk commented Nov 2, 2020

Thank you for the change, Kim.

@kimbarrett
Copy link
Author

@kimbarrett kimbarrett commented Nov 4, 2020

/integrate

@openjdk openjdk bot closed this Nov 4, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Nov 4, 2020
@kimbarrett kimbarrett deleted the string_dedup_check branch Nov 4, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 4, 2020

@kimbarrett Pushed as commit 4b88119.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk-notifier bot referenced this issue Nov 4, 2020
Combine dedup enabled and is_string into a single test, using the already in-hand klass of the object.

Reviewed-by: ayang, tschatzl
@kimbarrett
Copy link
Author

@kimbarrett kimbarrett commented Nov 4, 2020

Thanks for reviewing @albertnetymk and @tschatzl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc integrated
3 participants