Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8351994: Enable Extended EVEX to REX2/REX demotion when src and dst are the same #24431

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vamsi-parasa
Copy link
Contributor

@vamsi-parasa vamsi-parasa commented Apr 4, 2025

The current scheme for Intel APX NDD code generation favors the emission of NDD instruction on APX-enabled targets, even if destination and source registers are the same. To prevent this, this PR extends the assembler layer to demote EEVEX to REX encoding if dst matches with source operands.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8351994: Enable Extended EVEX to REX2/REX demotion when src and dst are the same (Sub-task - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24431/head:pull/24431
$ git checkout pull/24431

Update a local copy of the PR:
$ git checkout pull/24431
$ git pull https://git.openjdk.org/jdk.git pull/24431/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24431

View PR using the GUI difftool:
$ git pr show -t 24431

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24431.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 4, 2025

👋 Welcome back sparasa! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 4, 2025
@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@vamsi-parasa The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Apr 4, 2025
@mlbridge
Copy link

mlbridge bot commented Apr 4, 2025

Webrevs

bool Assembler::is_demotable(bool no_flags, int dst_enc, int nds_enc) {
return (!no_flags && dst_enc == nds_enc);
}

Copy link
Member

@jatin-bhateja jatin-bhateja Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vamsi-parasa , We are missing a case where dst_enc can be equal to src_enc; in that case, we can still demote EVEX to REX/REX2 encoding, along with a change in primary opcode if needed.

This will apply to all the commutative operations (ADD/ AND / OR / XOR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal. This feature will be enabled in another PR using the JBS entry https://bugs.openjdk.org/browse/JDK-8354348.

@@ -13738,6 +13815,14 @@ int Assembler::simd_prefix_and_encode(XMMRegister dst, XMMRegister nds, XMMRegis
}
}

bool Assembler::is_demotable(bool no_flags, int dst_enc, int nds_enc, int src_enc) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src_enc is not being used in this method so could be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the src_enc removed in the updated code.

@@ -17059,7 +17182,7 @@ void Assembler::esetzucc(Condition cc, Register dst) {
assert(0 <= cc && cc < 16, "illegal cc");
InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
// Encoding Format : eevex_prefix (4 bytes) | opcode_cc | modrm
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes);
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes); //TODO: check this
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be demoted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the demotion disabled for esetzucc instruction in the updated code.

InstructionAttr *attributes, bool no_flags, bool use_prefixq) {
// Demote RegRegReg instructions
if (!no_flags && dst_enc == nds_enc) {
return use_prefixq? prefixq_and_encode(dst_enc, src_enc) : prefix_and_encode(dst_enc, src_enc);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit pick, need space before ? as below:
use_prefixq ? prefixq_and_encode

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the space fixed in the updated code.

InstructionAttr *attributes, bool no_flags) {
InstructionAttr *attributes, bool no_flags, bool use_prefixq) {
// Demote RegRegReg instructions
if (!no_flags && dst_enc == nds_enc) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be replaced by call to is_demotable().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the updated code refactored to use is_demotable().

int Assembler::evex_prefix_and_encode_ndd(int dst_enc, int nds_enc, VexSimdPrefix pre, VexOpcode opc,
InstructionAttr *attributes, bool no_flags, bool use_prefixq) {
// Demote RegReg and RegRegImm instructions
if (!no_flags && dst_enc == nds_enc) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be replaced by call to is_demotable().

@@ -17182,7 +17178,7 @@ void Assembler::esetzucc(Condition cc, Register dst) {
assert(0 <= cc && cc < 16, "illegal cc");
InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
// Encoding Format : eevex_prefix (4 bytes) | opcode_cc | modrm
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes); //TODO: check this
int encode = evex_prefix_and_encode_ndd(0, 0, dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes, false, false, false); // demotion disabled
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding the demote flag, the better way to do this is by adding a single register evex_prefix_and encode_ndd:
evex_prefix_and_encode_ndd(dst->encoding(), VEX_SIMD_F2, /* MAP4 */VEX_OPCODE_0F_3C, &attributes);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

3 participants