-
Notifications
You must be signed in to change notification settings - Fork 415
Generate instanceOf checks in Block Versioner for index field objects and introduce throttling heuristics #4598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@vijaysun-omr - This is the Block Versioner fix we talked about earlier. |
|
I would like a review from @andrewcraik for this change. |
|
I would suggest making sure we have more than 3 bound checks being versioned against in the case when we add one instanceof (and maybe more than 4 bound checks in the case when we add 2 instanceofs). I have a hard time believing the block versioning tranformation would be a net win when replacing a bound check with an instanceof. |
One issue in implementing that suggestion is the fact that we rely on a future VP pass to remove unnecessary instanceof checks that this code is adding to the IL. So, we don't know how many of the new instanceof checks are going to survive VP and therefore can't make an accurate decision on when to abandon the opt for fear of making performance worse. |
|
While it's certainly possible that some change in the IL trees means that only a future VP pass is able to derive some information necessary for eliminating the instanceof, it is also the case that the checking could be done in the pass of VP that is doing block versioner transformation i.e. if we know that the instanceof is not needed, then why emit it in the first place (i.e. in most cases, we ought to have enough information in the current pass of VP. |
|
|
||
| temp.add(nextComparisonNode); | ||
|
|
||
| if (arrayIndex->_baseNode && arrayIndex->_instanceOfClass && !comp()->compileRelocatableCode() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The !compileRelocatableCode seems suboptimal - if we can relocate the class we can do the opt. @dsouzai is there an easy way with the new infrastructure to allow this to work under AOT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// InstanceOf check for the object were we load the array index from
TR::Node *objectRefChild = arrayIndex->_baseNode->getFirstChild();
In order for this tree to have been created, I'm guessing we must have made a frontend query to know the instanceof relationship. If that's true, then the validation would've already been created for AOT. If using non-SVM AOT, then the frontend query would've returned NULL, in which case the compiler would not know any instanceof information.
As such, I'm not sure if the AOT guard is needed. That said, if there really is a concern, you can always make a call to the frontend to do the instanceof check, and if it passes, then go down the code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reply to this @klangman ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The instanceOf-check generated below (one that existed before I changed anything here) has this AOT check as well. I can't say I fully understand why, it seems to me that the instanceOf-checks are needed for correctness. I would be happy to disable this for AOT, but I guess we might need someone from AOT land to help understand why we can get away without the instanceOf-check in the case of AOT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - if the check is for correctness I can't see why they can be omitted for AOT without risking violating the correctness. Would you agree @dsouzai ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a phone call with @klangman to understand the full picture here, and we both agreed that the old code that didn't emit the instanceof checks under AOT was wrong. That said, I know why the code was guarded:
TR::Node *duplicateClassPtr = TR::Node::createWithSymRef(objectRefChild, TR::loadaddr, 0, comp()->getSymRefTab()->findOrCreateClassSymbol(objectRef->getSymbolReference()->getOwningMethodSymbol(comp()), -1, arrayLength->_instanceOfClass, false));
passes in a cpIndex of -1, which non SVM AOT can't handle. My guess is that whoever implemented this saw weird crashes under AOT and did the usual "I'm just gonna disable this under AOT".
Technically there should be no problem in SVM AOT; however, to keep the scope of this PR from getting bigger and bigger, both Kevin and I agreed that the best course for now is to prevent versioning if AOT && instanceOf checks were added.
In a future PR, we can enable versioning in AOT when the SVM is enabled.
|
Given the higher cost of an instanceof some kind of heuristic should be applied to make sure the benefit of versioning is high enough to make the instanceof worth it overall - as Vijay suggested. We should also have reasonable confidence that the test should pass at runtime when we create it. There should be tracing to document this choice and why. The name |
@andrewcraik It's the class of the object where the index is being loaded from. Maybe "indexObjInstanceOfClass" would be less confusing? |
|
how about |
|
@klangman any update on a rename so we can keep this moving forward since the issue it is addressing is real... |
@andrewcraik Other issues getting in my way. Hope to get back to this in a few days. |
|
@klangman any further updates on this? |
Still looking to find the time to get back to this one. |
|
@vijaysun-omr @andrewcraik Pushed changes to address your comments (I hope). It now does compile-time instanceOf checks, does not generate redundant instanceOf checks and aborts versioning of blocks where the cost of the required tests is to high. |
|
Thanks @klangman could you please update the commit and PR messages to reflect the changes that have been added in ? i.e. around the heuristic for when to allow the versioning to proceed. This was not part of the original change in this PR and so we do not have a description of it I believe. |
Done! |
|
Thanks, please update the commit message too. |
19d70e7 to
1c5c04a
Compare
Better now? |
|
@genie-omr build all |
|
I will wait for a review from @andrewcraik to see if his earlier comments have been addressed (and the overall logic change). |
|
friendly reminder, @andrewcraik to review this |
| TR::Node *instanceofNode = TR::Node::createWithSymRef(TR::instanceof, 2, 2, objectRefChild->duplicateTree(), duplicateClassPtr, comp()->getSymRefTab()->findOrCreateInstanceOfSymbolRef(comp()->getMethodSymbol())); | ||
| TR::Node *ificmpeqNode = TR::Node::createif(TR::ificmpeq, instanceofNode, TR::Node::create(objectRef, TR::iconst, 0, 0)); | ||
| comparisonNodes->add(ificmpeqNode); | ||
| requestOpt(OMR::localCSE, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be requested before we commit to doing the transform below?
| TR::Node *instanceofNode = TR::Node::createWithSymRef(TR::instanceof, 2, 2, indexObjectRefChild->duplicateTree(), duplicateClassPtr, comp()->getSymRefTab()->findOrCreateInstanceOfSymbolRef(comp()->getMethodSymbol())); | ||
| TR::Node *ificmpeqNode = TR::Node::createif(TR::ificmpeq, instanceofNode, TR::Node::create(arrayIndex->_baseNode, TR::iconst, 0, 0)); | ||
| temp.add(ificmpeqNode); | ||
| requestOpt(OMR::localCSE, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same should these be requested before we commit below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if requesting the opts are redundant now that this code will only add instanceOf checks when a compile time instanceOf is not possible or fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they were only added to clean up needless instanceofs then they may well be able to be omitted...
|
I think we should have a debug envvar to disable these additional tests for performance debugging purposes in case we see any regressions from these additional tests. It would also be good to have the scaling factor for the number of instanceof checks and number of null tests pulled out into static local variables which can be overridden with an env var if necessary. |
|
@klangman any update since I know you were keen to try and get this merged? |
I will move the opt requests so they they are only requested when we decide not to abort the compile. But what do you think we should do about the AOT case? Do we more data to decide? |
|
@genie-omr build all |
|
The question about AOT is not a case of performance - if the instance of test is not done we could end up reading past the end of an object/array which could a) cause the VM to crash if the object were allocated at the end of a page of memory without the next page being allocated or b) cause the VM to read past the end of an object/array which could be a security vulnerability. I can't think of a reason why we would need an instance of check when compiling in process and not need the same check when running AOT code - AOT needs more checks not less. |
Yes, I think Irwin and I agree on that. We are just suggesting that we abort versioning of the block when doing an AOT compile ( compileRelocatableCode() ) |
|
I am fine with that since anything performance critical should get recompiled. Any concerns @vijaysun-omr ? |
|
That sounds okay to me, since there is a decent chance of recompilation out of AOT as Andrew suggests but even otherwise, I feel we could take the approach of waiting for an actual opportunity (that I feel will be somewhat unlikely) where this optimization is important to do in AOT compilations before doing the work to make that combination work. |
|
OK.. Sounds like we are all on the same page regarding AOT. I'll make the changes (hopefully today). |
|
Oh... To be clear, the change will be, if we need instanceOf checks and compileRelocatableCode() then abort versioning the block. Without instanceOf checks there are no AOT issues to worry about. |
When versioning blocks, it's possible for the generated checks to load index variables from objects by duplicating code from within blocks that are protected by inlining guards. Therefore the object might not be of the expected type possibly resulting in loads that reach past the end of the object. In rare cases this load can result in a SEGV when the object is allocated at the very end of the heap. In addition, this change will abort versioning of blocks when the estimated cost of executing the required tests is higher then the savings from removing bound-checks. Signed-off-by: Kevin Langman <langman@ca.ibm.com>
|
@andrewcraik @vijaysun-omr Pushed the changes to abort versioning a block when it's an AOT compile and we would have needed to add an instanceOf check. |
|
I think that makes sense - over to @vijaysun-omr for final review |
|
@genie-omr build all |
1 similar comment
|
@genie-omr build all |
|
Looks comprehensive to me after all the reviews. Merging. |
When versioning blocks, it's possible for the generated checks to load
index variables from objects by duplicating code from within blocks that
are protected by inlining guards. Therefore the object might not be of
the expected type possibly resulting in loads that reach past the end
of the object. In rare cases this load can result in a SEGV when the
object is allocated at the very end of the heap. In addition, this change
will abort versioning of blocks when the estimated cost of executing the
required tests is higher then the savings from removing bound-checks.
Signed-off-by: Kevin Langman langman@ca.ibm.com