8283726: x86 intrinsics for compare method in Integer and Long #7975
👋 Welcome back vamsi-parasa! A progress list of the required criteria for merging this PR into
@vamsi-parasa The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
This is both complicated and inefficient, I would suggest building the intrinsic in the IR graph so that the compiler can simplify
do_intrinsic(_compare_i, java_lang_Integer, compare_name, int2_int_signature, F_S) \
do_intrinsic(_compare_l, java_lang_Long, compare_name, long2_int_signature, F_S) \
do_intrinsic(_compareUnsigned_i, java_lang_Integer, compare_unsigned_name, int2_int_signature, F_S) \
do_name( compare_unsigned_name, "compareUnsigned") \
do_intrinsic(_compareUnsigned_l, java_lang_Long, compare_unsigned_name, long2_int_signature, F_S) \
Making these methods intrinsics wraps the underlying comparison logic in a box, which prevents the regular constant folding that could otherwise have optimized out certain control paths. I would suggest handling constant folding for the newly added nodes in their associated Value routines.
class CompareSignedINode : public CompareNode {
 public:
  CompareSignedINode(Node* in1, Node* in2) : CompareNode(in1, in2) {}
  virtual int Opcode() const;
};

//---------- CompareSignedLNode -----------------------------------------------------
class CompareSignedLNode : public CompareNode {
 public:
  CompareSignedLNode(Node* in1, Node* in2) : CompareNode(in1, in2) {}
  virtual int Opcode() const;
};

//---------- CompareUnsignedINode -----------------------------------------------------
class CompareUnsignedINode : public CompareNode {
 public:
  CompareUnsignedINode(Node* in1, Node* in2) : CompareNode(in1, in2) {}
  virtual int Opcode() const;
};

//---------- CompareUnsignedLNode -----------------------------------------------------
class CompareUnsignedLNode : public CompareNode {
 public:
  CompareUnsignedLNode(Node* in1, Node* in2) : CompareNode(in1, in2) {}
  virtual int Opcode() const;
};
The intent here seems to be to enable further auto-vectorization of the newly created IR nodes.
Yes, that is the intention.
for (int i = 0; i < BUFFER_SIZE; i++) {
    input1[i] = rng.nextInt();
    if (mode.equals("equal")) {
        input2[i] = input1[i];
        continue;
    }
    else input2[i] = rng.nextInt();

    if (!mode.equals("mixed")) {
        boolean doSwap = (mode.equals("lessThanEqual") && input1[i] > input2[i]) ||
                         (mode.equals("greaterThanEqual") && input1[i] < input2[i]);
        if (doSwap) {
            int tmp = input1[i];
            input1[i] = input2[i];
            input2[i] = tmp;
        }
    }
Logic reorganization suggestion:
for (int i = 0; i < BUFFER_SIZE; i++) {
    input1[i] = rng.nextLong();
}
if (mode.equals("equal")) {
    GRADIANT = 0;
} else if (mode.equals("greaterThanEqual")) {
    GRADIANT = 1;
} else {
    assert mode.equals("lessThanEqual");
    GRADIANT = -1;
}
for (int i = 0; i < BUFFER_SIZE; i++) {
    input2[i] = input1[i] + i * GRADIANT;
}
The suggested refactoring is definitely elegant but one rare possibility is overflow due to the addition/subtraction. The swap logic doesn't have that problem.
Given that BUFFER_SIZE is 1024, can we not circumvent this problem by using rng.nextLong(4096) in the initialization loop?
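The overflow concern and the bounded-range workaround discussed above can be illustrated with a small sketch. It is not the benchmark's code: the class name, the helper methods, and the use of ThreadLocalRandom as a stand-in for rng.nextLong(4096) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class GradientOverflowSketch {
    static final int BUFFER_SIZE = 1024;

    // Derives input2 from input1 as in the suggestion: input2[i] = input1[i] + i * gradient.
    static long[] derive(long[] input1, int gradient) {
        long[] input2 = new long[input1.length];
        for (int i = 0; i < input1.length; i++) {
            input2[i] = input1[i] + (long) i * gradient; // wraps silently on overflow
        }
        return input2;
    }

    // True if every derived value is >= its source value (the ordering a gradient of +1 intends).
    static boolean allGreaterOrEqual(long[] input1, long[] input2) {
        for (int i = 0; i < input1.length; i++) {
            if (input2[i] < input1[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Worst case for an unbounded nextLong(): values at the top of the long range.
        long[] huge = new long[BUFFER_SIZE];
        Arrays.fill(huge, Long.MAX_VALUE);
        System.out.println("unbounded inputs keep ordering: "
                + allGreaterOrEqual(huge, derive(huge, 1)));

        // Bounded inputs (hypothetical stand-in for rng.nextLong(4096)) leave ample headroom.
        long[] small = new long[BUFFER_SIZE];
        for (int i = 0; i < BUFFER_SIZE; i++) {
            small[i] = ThreadLocalRandom.current().nextLong(4096);
        }
        System.out.println("bounded inputs keep ordering: "
                + allGreaterOrEqual(small, derive(small, 1)));
    }
}
```

With inputs below 4096 and offsets below 1024, the sum stays far from the limits of long, so the intended ordering holds without the swap logic.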
/*
 * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
We can unify this benchmark with the Integer compare micro.
Sure, will do the unification.
//----------Intrinsics for Compare function------------------------------------

instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI tmp, rFlagsReg cr)
%{
  match(Set dst (CompareSignedI op1 op2));
Please also include these patterns in x86_32.ad
Will update x86_32.ad as well.
Thank you for the suggestion! This intrinsic uses one cmp instruction instead of two and shows ~10% improvement due to better branch prediction. Even without the intrinsic, the compiler is currently able to reduce it to x u< y, but it still generates two (unsigned) cmp instructions because Integer.compareUnsigned(x, y) is implemented as x u< y ? -1 : (x == y ? 0 : 1).
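For readers unfamiliar with the two-comparison shape mentioned above, here is a minimal sketch of it in plain Java, checked against Integer.compareUnsigned. The class and method names are hypothetical, and expressing the unsigned "less than" by flipping the sign bit is an assumption made for illustration, not a statement about what the JIT emits.

```java
public class CompareUnsignedSketch {
    // Two-comparison form: x u< y ? -1 : (x == y ? 0 : 1).
    // Adding Integer.MIN_VALUE flips the sign bit, so a signed '<' then orders the values as unsigned.
    static int twoCompare(int x, int y) {
        boolean unsignedLess = (x + Integer.MIN_VALUE) < (y + Integer.MIN_VALUE);
        return unsignedLess ? -1 : (x == y ? 0 : 1);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                // signum normalizes the library result to -1, 0, or 1 before comparing.
                int expected = Integer.signum(Integer.compareUnsigned(x, y));
                if (twoCompare(x, y) != expected) {
                    throw new AssertionError("mismatch for " + x + " and " + y);
                }
            }
        }
        System.out.println("two-comparison form agrees with Integer.compareUnsigned");
    }
}
```

The sketch makes the two separate tests visible (one for "less than", one for equality), which is the pair of comparisons the intrinsic aims to collapse into a single cmp.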
@vamsi-parasa But normally the result of the
It is because the compiler can recognise the pattern Thanks.
I agree. What @merykitty is proposing is to add another transformation on top of the newly generated IR nodes, which can reduce the emitted JIT sequence. Along with this, constant operand handling for the newly created IR nodes through Value routines will also be useful.
@vamsi-parasa This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
@vamsi-parasa This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the
Implements x86 intrinsics for the compare() method in java.lang.Integer and java.lang.Long.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7975/head:pull/7975
$ git checkout pull/7975
Update a local copy of the PR:
$ git checkout pull/7975
$ git pull https://git.openjdk.java.net/jdk pull/7975/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7975
View PR using the GUI difftool:
$ git pr show -t 7975
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7975.diff