-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8351994: Enable Extended EVEX to REX2/REX demotion when src and dst are the same #24431
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back sparasa! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@vamsi-parasa The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
bool Assembler::is_demotable(bool no_flags, int dst_enc, int nds_enc) { | ||
return (!no_flags && dst_enc == nds_enc); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vamsi-parasa , We are missing a case where dst_enc can be equal to src_enc; in that case, we can still demote EVEX to REX/REX2 encoding, along with a change in primary opcode if needed.
This will apply to all the commutative operations (ADD/ AND / OR / XOR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the proposal. This feature will be enabled in another PR using the JBS entry https://bugs.openjdk.org/browse/JDK-8354348.
@@ -13738,6 +13815,14 @@ int Assembler::simd_prefix_and_encode(XMMRegister dst, XMMRegister nds, XMMRegis | |||
} | |||
} | |||
|
|||
bool Assembler::is_demotable(bool no_flags, int dst_enc, int nds_enc, int src_enc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src_enc is not being used in this method so could be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the src_enc removed in the updated code.
@@ -17059,7 +17182,7 @@ void Assembler::esetzucc(Condition cc, Register dst) { | |||
assert(0 <= cc && cc < 16, "illegal cc"); | |||
InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false); | |||
// Encoding Format : eevex_prefix (4 bytes) | opcode_cc | modrm | |||
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes); | |||
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes); //TODO: check this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be demoted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the demotion disabled for esetzucc instruction in the updated code.
InstructionAttr *attributes, bool no_flags, bool use_prefixq) { | ||
// Demote RegRegReg instructions | ||
if (!no_flags && dst_enc == nds_enc) { | ||
return use_prefixq? prefixq_and_encode(dst_enc, src_enc) : prefix_and_encode(dst_enc, src_enc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit pick, need space before ? as below:
use_prefixq ? prefixq_and_encode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the space fixed in the updated code.
InstructionAttr *attributes, bool no_flags) { | ||
InstructionAttr *attributes, bool no_flags, bool use_prefixq) { | ||
// Demote RegRegReg instructions | ||
if (!no_flags && dst_enc == nds_enc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be replaced by call to is_demotable().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the updated code refactored to use is_demotable().
int Assembler::evex_prefix_and_encode_ndd(int dst_enc, int nds_enc, VexSimdPrefix pre, VexOpcode opc, | ||
InstructionAttr *attributes, bool no_flags, bool use_prefixq) { | ||
// Demote RegReg and RegRegImm instructions | ||
if (!no_flags && dst_enc == nds_enc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be replaced by call to is_demotable().
@@ -17182,7 +17178,7 @@ void Assembler::esetzucc(Condition cc, Register dst) { | |||
assert(0 <= cc && cc < 16, "illegal cc"); | |||
InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false); | |||
// Encoding Format : eevex_prefix (4 bytes) | opcode_cc | modrm | |||
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes); //TODO: check this | |||
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes, false, false, false); // demotion disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding the demote flag, the better way to do this is by adding a single register evex_prefix_and encode_ndd:
evex_prefix_and_encode_ndd(dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes);
The current scheme for Intel APX NDD code generation favors the emission of NDD instruction on APX-enabled targets, even if destination and source registers are the same. To prevent this, this PR extends the assembler layer to demote EEVEX to REX encoding if dst matches with source operands.
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24431/head:pull/24431
$ git checkout pull/24431
Update a local copy of the PR:
$ git checkout pull/24431
$ git pull https://git.openjdk.org/jdk.git pull/24431/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24431
View PR using the GUI difftool:
$ git pr show -t 24431
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24431.diff
Using Webrev
Link to Webrev Comment