Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8270086: ARM32-softfp: Do not load CONSTANT_double using the condy helper methods in the interpreter #4767

Closed

Conversation

@mychris
Copy link
Member

@mychris mychris commented Jul 13, 2021

Hi,

please review this fix for the template interpreter for ARM32-softfp.

With the introduction of condy, the interpreter started to load double constants using the condy_helper, which is unnecessary. Also, the resolve_ldc implementation of the interpreter runtime might not be designed to load constant values. It should only load strings and condy.

void TemplateTable::ldc2_w() {
transition(vtos, vtos);
const Register Rtags = R2_tmp;
const Register Rindex = R3_tmp;
const Register Rcpool = R4_tmp;
const Register Rbase = R5_tmp;
__ get_unsigned_2_byte_index_at_bcp(Rindex, 1);
__ get_cpool_and_tags(Rcpool, Rtags);
const int base_offset = ConstantPool::header_size() * wordSize;
const int tags_offset = Array<u1>::base_offset_in_bytes();
__ add(Rbase, Rcpool, AsmOperand(Rindex, lsl, LogBytesPerWord));
Label Condy, exit;
#ifdef __ABI_HARD__
Label Long;
// get type from tags
__ add(Rtemp, Rtags, tags_offset);
__ ldrb(Rtemp, Address(Rtemp, Rindex));
__ cmp(Rtemp, JVM_CONSTANT_Double);
__ b(Long, ne);
__ ldr_double(D0_tos, Address(Rbase, base_offset));
__ push(dtos);
__ b(exit);
__ bind(Long);
#endif
__ cmp(Rtemp, JVM_CONSTANT_Long);
__ b(Condy, ne);
#ifdef AARCH64
__ ldr(R0_tos, Address(Rbase, base_offset));
#else
__ ldr(R0_tos_lo, Address(Rbase, base_offset + 0 * wordSize));
__ ldr(R1_tos_hi, Address(Rbase, base_offset + 1 * wordSize));
#endif // AARCH64
__ push(ltos);
__ b(exit);
__ bind(Condy);
condy_helper(exit);
__ bind(exit);
}

I refactored the ldc2_w bytecode imlpementation. Now, the ifdef between soft and hard float covers both types, CONSTANT_Long and CONSTANT_Double. I did this, because for the softfp, we can use a conditional cmp, since double and long are both placed on the stack in the same way. The same is also done for CONSTANT_Integer and CONSTANT_Float in the ldc bytecode implementation. Also, I think it is easier to follow the code this way.

Old ldc2_w implementation (before condy was implemented for ARM32):

void TemplateTable::ldc2_w() {
transition(vtos, vtos);
const Register Rtags = R2_tmp;
const Register Rindex = R3_tmp;
const Register Rcpool = R4_tmp;
const Register Rbase = R5_tmp;
__ get_unsigned_2_byte_index_at_bcp(Rindex, 1);
__ get_cpool_and_tags(Rcpool, Rtags);
const int base_offset = ConstantPool::header_size() * wordSize;
const int tags_offset = Array<u1>::base_offset_in_bytes();
__ add(Rbase, Rcpool, AsmOperand(Rindex, lsl, LogBytesPerWord));
#ifdef __ABI_HARD__
Label Long, exit;
// get type from tags
__ add(Rtemp, Rtags, tags_offset);
__ ldrb(Rtemp, Address(Rtemp, Rindex));
__ cmp(Rtemp, JVM_CONSTANT_Double);
__ b(Long, ne);
__ ldr_double(D0_tos, Address(Rbase, base_offset));
__ push(dtos);
__ b(exit);
__ bind(Long);
#endif
#ifdef AARCH64
__ ldr(R0_tos, Address(Rbase, base_offset));
#else
__ ldr(R0_tos_lo, Address(Rbase, base_offset + 0 * wordSize));
__ ldr(R1_tos_hi, Address(Rbase, base_offset + 1 * wordSize));
#endif // AARCH64
__ push(ltos);
#ifdef __ABI_HARD__
__ bind(exit);
#endif
}

Testing: ARMv7-A (hardfp) hotspot tier1, ARMv5TE (softfp) hotspot tier1


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8270086: ARM32-softfp: Do not load CONSTANT_double using the condy helper methods in the interpreter

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4767/head:pull/4767
$ git checkout pull/4767

Update a local copy of the PR:
$ git checkout pull/4767
$ git pull https://git.openjdk.java.net/jdk pull/4767/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4767

View PR using the GUI difftool:
$ git pr show -t 4767

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4767.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jul 13, 2021

👋 Welcome back cgo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jul 13, 2021

@mychris The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 13, 2021

Webrevs

Loading

@dean-long
Copy link
Member

@dean-long dean-long commented Jul 13, 2021

/cc hotspot-runtime

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jul 13, 2021

@dean-long
The hotspot-runtime label was successfully added.

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 22, 2021

@dholmes-ora or @coleenp, could you maybe have a look at this, since you already looked into #4582?

Loading

@dholmes-ora
Copy link
Member

@dholmes-ora dholmes-ora commented Jul 22, 2021

Sorry @mychris I never noticed this PR mail. I have taken a look and the logic seems sounds to me but I'm no expert at writing ARM code so I can't really approve it in detail.

David

Loading

Copy link
Contributor

@shipilev shipilev left a comment

I see that this patch does, and I think it goes in the right direction. Let's follow the form of other arches TemplateTable::ldc2_w does, and what other places in ARM32 template interpreter do?

These are my suggestions:

  • Rename label Condy to NotLong
  • Rename label exit to Done
  • Handle only double path under !_ABI_HARD_PATH

Something like (drafting it blindly):

  Label NotDouble, NotLong, Done;
  __ cmp(Rtemp, JVM_CONSTANT_Double);
  __ b(NotDouble, ne);
#ifdef __ABI_HARD__
  __ ldr_double(D0_tos, Address(Rbase, base_offset));
#else
  __ ldr(R0_tos_lo, Address(Rbase, base_offset + 0 * wordSize));
  __ ldr(R1_tos_hi, Address(Rbase, base_offset + 1 * wordSize));
#endif
  __ push(dtos);
  __ b(Done);

  __ bind(NotDouble);
  __ cmp(Rtemp, JVM_CONSTANT_Long);
  __ b(NotLong, ne);
  __ ldr(R0_tos_lo, Address(Rbase, base_offset + 0 * wordSize));
  __ ldr(R1_tos_hi, Address(Rbase, base_offset + 1 * wordSize));
  __ push(ltos);
  __ b(Done);

  __ bind(NotLong);
  condy_helper(Done);
  __ bind(Done);

Loading

@shipilev
Copy link
Contributor

@shipilev shipilev commented Jul 26, 2021

TBH, I am not even sure that we need to check __ABI_HARD__ here instead of __SOFTFP__ here. Looking at TemplateTable::dload and load_category2_local, I would have expected __SOFTFP__...

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 26, 2021

Thanks for looking into this, Aleksey.

I will look into your suggestions. The reason I did a bigger #ifdef block was, because I think the problem was introduced because the softfp path was mixed into the hardfp one. For me, this makes the softfp path harder to understand, in contrast to my suggested solution. I know that having bigger #ifdef brings other problems with it. Do you think it is better to keep the #ifdef as small as possible and mix the softfp path into the hardfp?

TBH, I am not even sure that we need to check ABI_HARD here instead of SOFTFP here. Looking at TemplateTable::dload and load_category2_local, I would have expected SOFTFP...

I don't quite understand what you mean here. Both, ABI_HARD and SOFTFP should be macros which specify which ABI should be used, so for me, if __SOFTFP__ is 1, __ABI_HARD__ is 0, or the other way around, or am I missing something? Could you elaborate why you think this should be changed?

Loading

@shipilev
Copy link
Contributor

@shipilev shipilev commented Jul 26, 2021

AFAICS, the real problem is that "double" path is not implemented for "!ABI_HARD" path, so it slides right into condy helper, right? Which calls for fixing that part, and that part only. This also avoids touching the "long" path and relying on the "ltos" being fine for doubles. Other parts of template interpreter deal with this already, see TemplateTable::dload.

AFAIU, ABI_HARD and SOFTFP are defining different things. SOFTFP defines whether to use hardware instructions or software emulation to deal with FP. Whereas ABI_HARD covers the ABI parts, i.e. which registers to use to pass FP values between calls. Here we don't call anything, but only do loads. This reasoning seems to be consistent with what TemplateTable::dload does. Seeing how -mfloat-abi takes three values, I can see ABI_HARD and SOFTFP might be not mutually exclusive?

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 26, 2021

AFAIU, ABI_HARD and SOFTFP are defining different things. SOFTFP defines whether to use hardware instructions or software emulation to deal with FP. Whereas ABI_HARD covers the ABI parts, i.e. which registers to use to pass FP values between calls. Here we don't call anything, but only do loads. This reasoning seems to be consistent with what TemplateTable::dload does. Seeing how -mfloat-abi takes three values, I can see ABI_HARD and SOFTFP might be not mutually exclusive?

That makes sense.
I checked with my GCC:

  • -mfloat-abi=soft defines __SOFTFP__
  • -mfloat-abi=softfp defines neither __ARM_PCS_VFP, nor __SOFTFP__ (__ARM_PCS_VFP is used to define __ABI_HARD__ in globalDefinitions_arm.hpp)
  • -mfloat-abi=hard defines __ARM_PCS_VFP

My assumption was that __SOFTFP__ is defined for both, -mfloat-abi=soft and -mfloat-abi=softfp, which is not the case. Thanks for pointing that out.

This suggests that we need to be consistent within the template interpreter and always use __SOFTFP__ if we want to know which registers to use to pass floating point values inside the interpreter. __ABI_HARD__ has to be used if it is required to follow the native ABI.

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 27, 2021

I changed the code as suggested.

Now SOFTFP is used instead of ABI_HARD and the #ifdef is inside the JVM_CONSTANT_Double case only, loading double constants as needed.
Other changes are only renaming labels.

Loading

Copy link
Contributor

@shipilev shipilev left a comment

This looks fine to me, thanks! Assuming it passes tier1 on soft/hard-float targets, we are good to go.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jul 27, 2021

@mychris This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8270086: ARM32-softfp: Do not load CONSTANT_double using the condy helper methods in the interpreter

Reviewed-by: shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 209 new commits pushed to the master branch:

  • 072fe48: 8270901: Typo PHASE_CPP in CompilerPhaseType
  • d7b5cb6: 8271368: [BACKOUT] JDK-8266054 VectorAPI rotate operation optimization
  • ecd4455: 8266510: Nimbus JTree default tree cell renderer does not use selected text color
  • d994b93: 8266054: VectorAPI rotate operation optimization
  • ed1cb24: 8271118: C2: StressGCM should have higher priority than frequency-based policy
  • 9bc52af: 8271323: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java fails with -XX:TieredStopAtLevel=1
  • 752b6df: 8261236: C2: ClhsdbJstackXcompStress test fails when StressGCM is enabled
  • a50161b: Merge
  • f1e15c8: 8271350: runtime/Safepoint tests use OutputAnalyzer::shouldMatch instead of shouldContaint
  • fbe28e4: 8270866: NPE in DocTreePath.getTreePath()
  • ... and 199 more: https://git.openjdk.java.net/jdk/compare/8973867fb9568a3a527b763c9ce10cebdfb306d0...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@shipilev) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

Loading

@openjdk openjdk bot added the ready label Jul 27, 2021
@mychris
Copy link
Member Author

@mychris mychris commented Jul 28, 2021

Yes, hotspot tier1 on ARMv7-A (hard fload) and ARMv5TE (soft float) are passing.
Windows build failure is unrelated.

Loading

@shipilev
Copy link
Contributor

@shipilev shipilev commented Jul 28, 2021

Yes, hotspot tier1 on ARMv7-A (hard fload) and ARMv5TE (soft float) are passing.
Windows build failure is unrelated.

OK, good. I'll be happy to sponsor this.

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 28, 2021

Thanks, but don't we need a second reviewer, or is this change trivial enough?

Loading

@shipilev
Copy link
Contributor

@shipilev shipilev commented Jul 28, 2021

Thanks, but don't we need a second reviewer, or is this change trivial enough?

For ARM-specific code, I don't think you would get more Oracle reviewers. Maybe @bulasevich wants to take a look.

Loading

@BorisUlasevich
Copy link

@BorisUlasevich BorisUlasevich commented Jul 28, 2021

(I am not a formal reviewer)

I like the change.

Though I do catch what was the original motivation, was it an incorrect behavior? Is there any test for that?

And, frankly, I do not quite understand (NotLong && NotDouble) case for ldc2_w bytecode template which is designed to get long or double. Is condy_helper ever called here?

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 28, 2021

Thanks for looking into this.

The motivation is an incorrect behavior. The ldc2_w bytecode is used to load 3 different types of constants, CONSTANT_Long, CONSTANT_Double, and CONSTANT_Dynamic. The condy_helper should only be used to load constants of type CONSTANT_Dynamic. Currently it also loads CONSTANT_Double constants on soft-float targets, which is wrong and should not be done.

For this specific faulty behavior, there is no test case and I don't think there can be one, as loading is done using the wrong path, but still succeeds.

I don't think there are any test cases for loading primitive constants (as this is done while loading the runtime and therefor tested alot without a specific test case). There are some tests for constant dynamics in test/hotspot/jtreg/runtime/condy.

Loading

@BorisUlasevich
Copy link

@BorisUlasevich BorisUlasevich commented Jul 28, 2021

Ok, thanks.
The change is good for me.

Loading

@mychris
Copy link
Member Author

@mychris mychris commented Jul 28, 2021

Thank you.
/integrate

Loading

@openjdk openjdk bot added the sponsor label Jul 28, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jul 28, 2021

@mychris
Your change (at version 13c7824) is now ready to be sponsored by a Committer.

Loading

@shipilev
Copy link
Contributor

@shipilev shipilev commented Jul 28, 2021

/sponsor

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jul 28, 2021

Going to push as commit a066c7b.
Since your change was applied there have been 209 commits pushed to the master branch:

  • 072fe48: 8270901: Typo PHASE_CPP in CompilerPhaseType
  • d7b5cb6: 8271368: [BACKOUT] JDK-8266054 VectorAPI rotate operation optimization
  • ecd4455: 8266510: Nimbus JTree default tree cell renderer does not use selected text color
  • d994b93: 8266054: VectorAPI rotate operation optimization
  • ed1cb24: 8271118: C2: StressGCM should have higher priority than frequency-based policy
  • 9bc52af: 8271323: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java fails with -XX:TieredStopAtLevel=1
  • 752b6df: 8261236: C2: ClhsdbJstackXcompStress test fails when StressGCM is enabled
  • a50161b: Merge
  • f1e15c8: 8271350: runtime/Safepoint tests use OutputAnalyzer::shouldMatch instead of shouldContaint
  • fbe28e4: 8270866: NPE in DocTreePath.getTreePath()
  • ... and 199 more: https://git.openjdk.java.net/jdk/compare/8973867fb9568a3a527b763c9ce10cebdfb306d0...master

Your commit was automatically rebased without conflicts.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jul 28, 2021

@shipilev @mychris Pushed as commit a066c7b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
5 participants