
8324969: C2: prevent elimination of unbalanced coarsened locking regions #17697

Closed
wants to merge 10 commits into master

Conversation

@vnkozlov (Contributor) commented Feb 3, 2024

The issue is that when we do nested locks elimination we don't check that a synchronized region has both Lock and Unlock nodes.
This can happen if the locks coarsening optimization eliminated a pair of Unlock/Lock nodes from adjacent locking regions before we check for nested locks.

Consider this code (all locks/unlocks use the same object):

    lock1(obj) { // outer synchronized region
         lock2(obj) {
             // nested synchronized region 1
         } unlock2(obj);
         lock3(obj) {
             // nested synchronized region 2
         } unlock3(obj);
    } unlock1(obj);
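In Java source terms, the pattern above corresponds to nested `synchronized` blocks on the same monitor object. A minimal illustrative sketch (class and method names are hypothetical, not taken from the actual regression test):

```java
// Hypothetical sketch: an outer synchronized region containing two
// adjacent nested synchronized regions on the same monitor object.
public class NestedSync {
    static int run(Object obj) {
        int counter = 0;
        synchronized (obj) {            // lock1 / unlock1: outer region
            synchronized (obj) {        // lock2 / unlock2: nested region 1
                counter++;
            }
            synchronized (obj) {        // lock3 / unlock3: nested region 2
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(new Object()));
    }
}
```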

If lock3 directly follows unlock2 (no branches or safepoints), the locks coarsening optimization will remove them:

    lock1(obj) { // outer synchronized region
         lock2(obj) {
             // nested synchronized region 1
         };
         {
             // nested synchronized region 2
         } unlock3(obj);
    } unlock1(obj);

The nested locks elimination code checks only the Lock node of a region to find out whether it is nested (inside another locking region that uses the same object) and then eliminates it. So we end up with a non-eliminated Unlock node in the second nested region.

Why didn't we hit this issue before? Normally nested locks elimination is executed first, and only then do we do locks coarsening elimination. In the example above we eliminate all nested Lock and Unlock nodes, leaving only the outer Lock and Unlock.

The additional factors that lead to the failure are a fully unrolled loop around the nested sync regions and an allocation to trigger Escape Analysis:

    lock1(obj) { // outer synchronized region
         Test var = new Test(); // Triggers EA
         for (i = 0; i < 3; i++) { // small iteration count to fully unroll
             lock2(obj) {
                 // nested synchronized region 1
             } unlock2(obj);
             lock3(obj) {
                 // nested synchronized region 2
             } unlock3(obj);
         }
    } unlock1(obj);
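As a hedged Java sketch of this trigger (illustrative only; the actual regression test added by the patch may differ), the small loop bound lets C2 fully unroll the loop and the allocation gives Escape Analysis something to eliminate:

```java
// Hypothetical sketch of the failure pattern: outer lock, a non-escaping
// allocation to trigger Escape Analysis, and a short loop (fully unrolled
// by C2) around two adjacent nested synchronized regions.
public class CoarsenTrigger {
    static class Payload {      // assumed helper class, not from the patch
        int x = 1;
    }

    static int run(Object obj) {
        int sum = 0;
        synchronized (obj) {                 // outer synchronized region
            Payload p = new Payload();       // triggers EA
            for (int i = 0; i < 3; i++) {    // small bound: fully unrolled
                synchronized (obj) {         // nested synchronized region 1
                    sum += p.x;
                }
                synchronized (obj) {         // nested synchronized region 2
                    sum += p.x;
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(new Object()));
    }
}
```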

Before executing Escape Analysis we run loop optimizations to simplify the graph: compile.cpp#L2332
We also allow fully unrolling short loops (LoopOptsMaxUnroll) to remove merges from the graph. This helps EA eliminate allocations.
Such unrolling creates several Lock and Unlock nodes per synchronized region. But nested locks elimination looks for a region with only one unique Lock node: callnode.cpp#L2117. So we skip the first round of nested locks elimination and execute the locks coarsening optimization, which eliminates all inner Lock and Unlock nodes, leaving only the first Lock and the last Unlock. On the next round of macro node expansion we execute nested locks elimination and remove the first Lock node, leaving the Unlock.

The fix is to use the existing list of coarsened Lock/Unlock node pairs to flag synchronized regions (identified by BoxLockNode nodes) as Unbalanced, so that any further locks elimination in these regions is skipped.

A new regression test was added. It has two test methods: one is based on the original test from the bug report; the other I wrote to show the issue clearly. Both trigger the assert.

Tested tier1-5, xcomp, stress


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8324969: C2: prevent elimination of unbalanced coarsened locking regions (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17697/head:pull/17697
$ git checkout pull/17697

Update a local copy of the PR:
$ git checkout pull/17697
$ git pull https://git.openjdk.org/jdk.git pull/17697/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17697

View PR using the GUI difftool:
$ git pr show -t 17697

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17697.diff

Webrev

Link to Webrev Comment

@bridgekeeper bot commented Feb 3, 2024

👋 Welcome back kvn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 3, 2024
@openjdk bot commented Feb 3, 2024

@vnkozlov The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Feb 3, 2024
@mlbridge bot commented Feb 3, 2024

@dean-long (Member)

Doesn't this problem also go away if lock coarsening turns the two regions into a single region?

    // Unbalanced locking region.
    // Can happen when locks coarsening optimization eliminated
    // pair of Unlock/Lock nodes from adjacent locking regions.
    return false;
Contributor

Is the assumption here that each region has its own BoxLockNode, i.e. we have at most one lock and at most one unlock per box? Or can regions share a box?
If it is shared: could we have an imbalance over multiple regions that share the BoxLock, but overall we have at least one lock and one unlock, and so we pass this test?

Do I understand this right: returning false here just means that we don't do the optimization of removing the lock/unlock from the region? I think it is in PhaseMacroExpand::mark_eliminated_box that we would eliminate the box.

Contributor Author

Is the assumption here that each region has its own BoxLockNode, i.e. we have at most one lock and at most one unlock per box? Or can regions share a box? If it is shared: could we have an imbalance over multiple regions that share the BoxLock, but overall we have at least one lock and one unlock, and so we pass this test?

It is controlled by the EliminateNestedLocks flag. By default the flag is ON and each synchronized region has its own BoxLockNode associated with it. A region can have one Lock node (before C2 transforms the graph) and multiple Unlock nodes (one for each exit path). That is why I can't check for matching numbers. I think it is possible to construct bytecode (jasm) with multiple Lock nodes for one region, but in practice I have never seen such a case. We can have multiple Lock/Unlock nodes per region after C2 graph transformations (loop unrolling/peeling), as in this case.

With the EliminateNestedLocks flag switched off we don't do nested locks elimination, and regions can share the same BoxLockNode after the Parse phase. We do only the locks coarsening optimization and eliminate locks for non-escaped objects.

Do I understand this right: returning false here just means that we don't do the optimization of removing the lock/unlock from the region? I think it is in PhaseMacroExpand::mark_eliminated_box that we would eliminate the box.

Actually the code which marks the box for nested locks is in mark_eliminated_locking_nodes() under the EliminateNestedLocks flag. is_nested_lock_region() calls is_simple_lock_region() and will skip elimination for this case.

But yes, it also affects the code in mark_eliminated_box, which is called for locks on non-escaped objects. So we will not eliminate those if we see an unbalanced region (combined or not).

@eme64 (Contributor) commented Feb 7, 2024

Doesn't this problem also go away if lock coarsening turns the two regions into a single region?

If that works that would probably be a more elegant solution. Could there be an invariant that a box always has a lock and an unlock?

@vnkozlov (Contributor, Author) commented Feb 7, 2024

Doesn't this problem also go away if lock coarsening turns the two regions into a single region?

If that works that would probably be a more elegant solution. Could there be an invariant that a box always has a lock and an unlock?

Yes, this is a good suggestion. Let me investigate how to implement it. It would allow eliminating all lock/unlock nodes in the combined nested region. My only concern is that I have to backport the fix into all releases, and if it is too complex I would prefer to go with the current "band-aid" fix and file a separate RFE to implement this suggestion.

@vnkozlov (Contributor, Author) commented Feb 7, 2024

After looking at the current code I think this would introduce more issues. In the current state we assume that with the EliminateNestedLocks flag ON we have only "one" Lock node per region. Depending on when we start merging adjacent regions, that assumption would break.

But I share @eme64's concern that the current fix may not work if coarsening eliminates an Unlock node on one path but leaves Unlock nodes on other paths. The check would pass but we would have unbalanced lock/unlock on the first path.

I think I should just disable nested and non-escaped object lock eliminations if a coarsening elimination happens before them.

@vnkozlov (Contributor, Author) commented Feb 9, 2024

@eme64 and @dean-long, I pushed a new implementation. I used the existing list of coarsened Lock/Unlock node pairs to flag synchronized regions (identified by BoxLockNode nodes) as unbalanced, to skip any further locks elimination in these regions. Please take a look.

@vnkozlov (Contributor, Author) commented Feb 9, 2024

I renamed BoxLockNode::Coarsened -> BoxLockNode::Unbalanced.
I plan to use it for JDK-8322743 to mark BoxLockNodes coming from an OSR entry.

@eme64 (Contributor) commented Feb 13, 2024

The fix is to add check for unbalanced synchronized region (locks or unlocks) are missing.

You should probably update the PR description now.

@eme64 (Contributor) left a comment

The approach looks better now :)
Though it would be good if someone who has a deeper understanding of the locking code would have a look as well.

    // Mark locking regions (identified by BoxLockNode) as coarsened
    // if locks coarsening optimization removed lock/unlock from them.
    // Such regions become unbalanced and we can't execute other
    // locks elimination optimization on them.
Contributor

The comment suggests you mark boxes as "coarsened" if ...
But really you mark them as "unbalanced", right? Because you have the condition that the locks have to already be "coarsened".

Contributor Author

I updated the comment.

    if (size > 0) {
      AbstractLockNode* alock = locks_list->at(0)->as_AbstractLock();
      BoxLockNode* box = alock->box_node()->as_BoxLock();
      if (alock->is_coarsened() && !box->is_unbalanced()) { // Not marked already
Contributor

Is it possible that the position 0 box is already marked as unbalanced, but there are other this_box for positions j=1.. that are not yet marked but should be?
Is it really ok that we are not checking all-with-all but only first-with-all?

Contributor Author

You are right. I removed the !box->is_unbalanced() check.

src/hotspot/share/opto/compile.cpp (resolved)
@vnkozlov (Contributor, Author)

@eme64 I updated changes based on your comments.

@iwanowww (Contributor) left a comment

Looks good.

@openjdk bot commented Feb 13, 2024

@vnkozlov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8324969: C2: prevent elimination of unbalanced coarsened locking regions

Reviewed-by: epeter, vlivanov, dlong

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 5 new commits pushed to the master branch:

  • 4dd6c44: 8326529: JFR: Test for CompilerCompile events fails due to time out
  • 33f2382: 8325807: Shenandoah: Refactor full gc in preparation for generational mode changes
  • 419191c: 8325680: Uninitialised memory in deleteGSSCB of GSSLibStub.c:179
  • 349df0a: 8326726: Problem list Exhaustiveness.java due to 8326616
  • 552411f: 8326824: Test: remove redundant test in compiler/vectorapi/reshape/utils/TestCastMethods.java

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 13, 2024

Member

Doesn't this end up marking even the simplest lock coarsening cases as unbalanced?
Is there a reason the above check can't be done as part of coarsened_locks_consistent()?

Contributor Author

Doesn't this end up marking even the simplest lock coarsening cases as unbalanced?

Yes, the simplest lock coarsening makes locking regions unbalanced.
See the second code example in the PR description. Coarsening eliminates the Unlock in region 1 and the Lock in region 2. Both regions become unbalanced.

Is there a reason the above check can't be done as part of coarsened_locks_consistent()?

"Unbalanced" in coarsened_locks_consistent() is different from "Unbalanced" in mark_coarsened_boxes().
Or do you mean to recompile without lock coarsening when regions are unbalanced? But that would disable all coarsening, even valid cases (for example, when there is no outer locking in my examples).

Member

Is there a reason the above check can't be done as part of coarsened_locks_consistent()?

Unbalanced in coarsened_locks_consistent() is different from Unbalanced in mark_coarsened_boxes() Or do you mean to recompile without lock coarsening when regions are unbalanced? But that will disable all coarsening, even valid one (for example, when no outer locking in my examples).

If we are simply making sure that all nodes in a group have the same BoxLock, then that check could be done earlier, in coarsened_locks_consistent(), or even as the nodes are added to the list. But I guess it's possible to get a premature answer if nodes are removed from the list.

Contributor Author

I don't want to mix these two methods. They do different things:

  • coarsened_locks_consistent() checks that graph is correct otherwise bailout compilation
  • mark_coarsened_boxes() marks BoxLock nodes to prevent further locks optimizations

It could be done, but it would complicate the code in coarsened_locks_consistent() and you would still need 2 iterations:

  • first, you need to check that all alocks on locks_list are marked as coarsened and also set a flag if they reference different Boxes
  • only then do you do a second round over them to mark all referenced Boxes as unbalanced

And if we bail out of compilation, the Box marking will be useless work.

@dean-long (Member)

Nested locks elimination code checks only Lock node in one region to find if it is nested (inside other lock region which use the same object) and then eliminate it. So we end up with not eliminated Unlock node in second nested region.

If lock coarsening just marked the Lock node as coarsened, but did not remove it, then Nested locks elimination would still find the Lock node and work correctly, right? Can't we just remove coarsened locks at a later time, after locks elimination?

@vnkozlov (Contributor, Author)

Running with the test from JDK-8322743 I found that I missed a box->is_unbalanced() check in Lock/UnlockNode::Ideal(), where we also check for a non-escaped locked object.

@vnkozlov (Contributor, Author)

Nested locks elimination code checks only Lock node in one region to find if it is nested (inside other lock region which use the same object) and then eliminate it. So we end up with not eliminated Unlock node in second nested region.

If lock coarsening just marked the Lock node as coarsened, but did not remove it, then Nested locks elimination would still find the Lock node and work correctly, right? Can't we just remove coarsened locks at a later time, after locks elimination?

I have an old RFE for that, JDK-8268571:

To avoid that I suggest to execute locks coarsening after EA and nested locks optimizations.

The issue here is that we don't know when we are done with EA and nested locks optimizations.
The current case shows that:

  • after loop unrolling we can't do the nested lock optimization because it is limited to regions with only one lock
  • we execute coarsening and it eliminates all but one lock in one region and one unlock in the other region
  • the nested lock optimization now sees only one lock in the region and eliminates it, which leads to this issue

My fix simply prevents further locks optimizations if coarsening happened.

@vnkozlov (Contributor, Author)

Yes, we can do nested locks elimination before loop unrolling, which would help in this case but may not in others.
And that is a separate RFE.

@vnkozlov (Contributor, Author)

Thank you, @eme64 and @iwanowww, for the reviews.
@dean-long, did I answer your questions, and do you have other comments?

@dean-long (Member)

My concern is that a group of unbalanced locks is added to coarsened_locks (they have different BoxLockNodes), but later some of those locks are removed by remove_useless_coarsened_locks or remove_coarsened_lock. Then when mark_unbalanced_boxes finally runs, it incorrectly sees the group as balanced.

It is easy to check for unbalanced in add_coarsened_locks as the locks are added. I think this is the most conservative approach.

@vnkozlov (Contributor, Author)

@shipilev people have different philosophies about that, it seems. Is there a "style guide" for that? I prefer "what is being done" too. Makes it easier when browsing through the "git blame" later.

Suggestion: C2: prevent elimination of unbalanced coarsened lock boxes

Good suggestion. I replaced "boxes" with "regions". That is what I used in the code comments.

@vnkozlov vnkozlov changed the title 8324969: assert(false) failed: Non-balanced monitor enter/exit! 8324969: C2: prevent elimination of unbalanced coarsened locking regions Feb 14, 2024
@vnkozlov (Contributor, Author)

My concern is that a group of unbalanced locks is added to coarsened_locks (they have different BoxLockNodes), but later some of those locks are removed by remove_useless_coarsened_locks or remove_coarsened_lock. Then when mark_unbalanced_boxes finally runs, it incorrectly sees the group as balanced.

It is easy to check for unbalanced in add_coarsened_locks as the locks are added. I think this is the most conservative approach.

I think it is premature and too conservative.

If a locking region (related lock/unlock nodes with another BoxLockNode) is removed because it is on a dead path, it can affect the analysis of the rest of the graph. I think that is normal. Yes, a group could become balanced if the nodes that are left point to the same BoxLockNode. I am not sure why that is an issue - the part of the graph which made it unbalanced is gone before we check consistency, mark BoxLocks, and eliminate locks.

That is why I am doing the BoxLock node marking after verification of locks_list group consistency and before we actually remove the marked Lock/Unlock nodes from the graph. At this point the graph should be correct.

@iwanowww suggested that in a debug VM, before eliminating coarsened Lock/Unlock nodes, we can additionally verify that the information in coarsened_locks is correct: all paths from each Lock node have a corresponding Unlock node which references the same object and stack slot, but may have a different BoxLock. We can do it as a separate RFE.

@vnkozlov (Contributor, Author)

Looking at the cases for locks coarsening (callnode.cpp#L1667) I don't see how it can become balanced. In all cases we have separate synchronization regions (s()) with different BoxLock nodes. Even if one branch of code is gone, the rest will still have separate regions.

I can do one more experiment: do the BoxLock marking in add_coarsened_locks() as you suggested and compare it in mark_unbalanced_boxes().

@vnkozlov (Contributor, Author)

@dean-long, I added verification code as described in my previous comment. Please take a look.
I am running tier1-5, xcomp, stress testing with it. So far it has not hit new asserts.

@dean-long (Member)

Verification code looks correct.

@vnkozlov (Contributor, Author)

Thank you, Dean

      }
    }
    assert(box->is_unbalanced() == box->is_marked_unbalanced(), "inconsistency");
Contributor

Could you even move this a scope out, i.e. a line down, so this check is even run with alock->is_coarsened() == false?

Contributor Author

Right. It will verify that we did not miss an is_unbalanced() check when doing EA or nested elimination to avoid them.

I will also move the assert at line 4984 under (box != this_box) to avoid duplicating the check when the box is the same.

Contributor Author

Moving the assert down does not work because nested lock elimination and EA can overwrite the Coarsened status of a BoxLock before we run mark_unbalanced_boxes(). Such a BoxLock will not be marked as Unbalanced and the assert will fail.
I would like to keep the current assert in place but add another assert where we change the status of a BoxLock. I am working on it.

Contributor

Right, sounds good :)

@vnkozlov (Contributor, Author)

New changes:

  • moved the test to the compiler/locks directory and renamed it because there was already a test there with the same name
  • introduced states for the BoxLock node (and the corresponding locking region)
  • added an assert to verify that a state change is allowed
  • added BoxLockNode::Identity() to correctly set the state when a node is "value numbered" by IGVN with the EliminateNestedLocks flag off. When the flag is off, C2 assumes that we have one BoxLock node per monitor's stack slot - we can't have merges (Phis). So I can't modify BoxLockNode::hash() and cmp(); I decided to add Identity() to handle that. Note, when the flag is off we have only the Regular, Coarsened and Unbalanced states (Local can be set during the macro nodes elimination phase just before it is marked as Eliminated, so it is not visible outside a small scope of code).
  • I also tested the OSR case (setting its BoxLock to the Unbalanced state). I will use it for #17331 after I push these changes.
  • I ran tier1-4, xcomp, stress (and am now running tier5-7) - no new failures. I also ran performance testing, which shows no significant changes.
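The states mentioned in the list above can be pictured with a small sketch (a Java enum for illustration only; the real implementation is a C++ enum in BoxLockNode, and the exact transition rules live in the patch). Only the Unbalanced property is modeled: once a region is marked Unbalanced, no further lock elimination is attempted on it.

```java
// Illustrative sketch of the BoxLock region states named in this PR.
// Hypothetical Java rendering; not the actual HotSpot C++ code.
public class BoxState {
    enum Kind { Regular, Local, Nested, Coarsened, Unbalanced, Eliminated }

    // Unbalanced regions are skipped by all further lock eliminations.
    static boolean eliminationAllowed(Kind kind) {
        return kind != Kind.Unbalanced;
    }

    public static void main(String[] args) {
        System.out.println(eliminationAllowed(Kind.Coarsened));   // true
        System.out.println(eliminationAllowed(Kind.Unbalanced));  // false
    }
}
```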

@eme64 (Contributor) left a comment

I like the _kind states, with the state transitions. Makes it a bit clearer :)

Would it make sense to add the _kind string to the node dump?
That would possibly allow for IR matching with regex, on a few examples.
And that would make me feel a bit safer about test coverage ;)

    // corresponding pair of Lock/Unlock nodes - they are balanced.
    void Compile::mark_unbalanced_boxes() {
      int count = coarsened_count();
      for (int i = 0; i < count; i++) {
Contributor

Just out of curiosity: Do you often move out the limit from the loop?
I guess this could help with performance. But coarsened_count is const, so I guess it would not make a change?

Another question: could you make mark_unbalanced_boxes const?

Contributor Author

Just out of curiosity: Do you often move out the limit from the loop?

Not always. Mostly when I have several usages or it is not constant. Here it is an indirect load: compile.hpp#L726

could you make mark_unbalanced_boxes const?

Yes, I can.

    if (EliminateNestedLocks) {
      // We can mark whole locking region as Local only when only
      // one object is used for locking.
      box->set_local();
Contributor

Hmm. The name of the method suggests that there are no side-effects. Not sure what would be a better alternative though.

Contributor

Could the box be nested at this point? If so, the assert inside set_local could trigger, right?

Contributor Author

Could the box be nested at this point? If so, the assert inside set_local could trigger, right?

No, it can't be nested.
The check for nesting and its elimination is done in one place, from which it does not escape: macro.cpp#L2047
(the Nested state is set only when is_nested_lock_region() returns true in that check).
And it will be overwritten immediately by box_node->set_eliminated().

Contributor Author

Also, a BoxLock can't be Local when we check for nesting because of the (!alock->is_non_esc_obj()) check above.

      }
      return old_box;
    }
    return this;
Contributor

For EliminateNestedLocks we now never have a hash, are only equal with this, and don't common nodes in Ideal either. Is that intended?

Are there any IR tests that verify that we common nodes in the cases where we expect it to common?

Contributor Author

For EliminateNestedLocks we now never have a hash, are only equal with this, and don't common nodes in Ideal either. Is that intended?

It has been this way since JDK 8. I only corrected the style there: added {} and replaced _is_eliminated with an accessor call.

It is intended for nested locks elimination - to have only one Object per locking region: JDK-7125896@locknode.cpp

Are there any IR tests that verify that we common nodes in the cases where we expect it to common?

These optimizations were implemented before we had IR tests. We have some regression tests. During my experiments I added _kind to hash(), and a lot of tests in tier1 (not just compiler) failed when run with -XX:-EliminateNestedLocks:

#  Internal Error (/workspace/open/src/hotspot/share/opto/parse1.cpp:2030), pid=71294, tid=28931
#  assert(!nocreate) failed: Cannot build a phi for a block already parsed.

So we do have verification that we common these nodes.

@vnkozlov (Contributor, Author) left a comment

@eme64 Thank you for your comments. I hope I answered all your questions.

The only thing I may still need to do is add const to mark_unbalanced_boxes().

@eme64 (Contributor) left a comment

Sounds good, thanks for the explanations :)

@vnkozlov (Contributor, Author)

Thank you, @eme64, for the review.

I will merge the latest JDK sources and do additional testing with it before integration.

@vnkozlov (Contributor, Author)

New tier1-5 testing passed.

@vnkozlov (Contributor, Author)

/integrate

@openjdk bot commented Feb 28, 2024

Going to push as commit b938a5c.
Since your change was applied there have been 15 commits pushed to the master branch:

  • a93605f: 8326763: Consolidate print methods in ContiguousSpace
  • 41242cb: 8325762: Use PassFailJFrame.Builder.splitUI() in PrintLatinCJKTest.java
  • 5db50ac: 8326892: Remove unused PSAdaptiveSizePolicyResizeVirtualSpaceAlot develop flag
  • eb4b6fa: 8326590: Improve description of MarkStackSize[Max] flags
  • e7e8083: 8326781: G1ConcurrentMark::top_at_rebuild_start() should take a HeapRegion* not an uint
  • e6b3bda: 8326509: Clean up JNIEXPORT in Hotspot after JDK-8017234
  • 1ab6bd4: 8326135: Enhance adlc to report unused operands
  • 3b90ddf: 8326685: Linux builds not reproducible if two builds configured in different build folders
  • 9b1f1e5: 8326389: [test] improve assertEquals failure output
  • 6cad07c: 8325746: Refactor Loop Unswitching code
  • ... and 5 more: https://git.openjdk.org/jdk/compare/9f0e7da64e21237322e55ca4f0e3639fa5d1c4ed...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 28, 2024
@openjdk openjdk bot closed this Feb 28, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 28, 2024
@openjdk bot commented Feb 28, 2024

@vnkozlov Pushed as commit b938a5c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated

5 participants