Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp() #9411

Conversation

reinrich
Copy link
Member

@reinrich reinrich commented Jul 7, 2022

The method frame::interpreter_frame_last_sp() is a platform method in the sense that it is not declared in a shared header file. It is declared and defined on some platforms though (x86 and aarch64 I think).

frame::interpreter_frame_last_sp() existed on these platforms before vm continuations (aka loom). Shared code that is part of the vm continuations implementation references it. This breaks the platform abstraction.

Using unextended_sp is problematic too because there are no guarantees by the platform abstraction layer for it. In fact unextended_sp < sp is possible on ppc64 and aarch64.

This fix changes the callers of is_sp_in_continuation()

static inline bool is_sp_in_continuation(const ContinuationEntry* entry, intptr_t* const sp) {
  return entry->entry_sp() > sp;
}

to pass the actual sp. This is correct because the following is true on all platforms:

a.sp() > E->entry_sp() > b.sp() > c.sp()

where a, b, c are stack frames in call order and E is a ContinuationEntry. a is the caller frame of the continuation entry frame that corresponds to E.

is_sp_in_continuation() will then return true for b.sp() and c.sp() and false for a.sp()

Testing: hotspot_loom and jdk_loom on x86_64 and aarch64.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9411/head:pull/9411
$ git checkout pull/9411

Update a local copy of the PR:
$ git checkout pull/9411
$ git pull https://git.openjdk.org/jdk pull/9411/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 9411

View PR using the GUI difftool:
$ git pr show -t 9411

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9411.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 7, 2022

👋 Welcome back rrich! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 7, 2022

@reinrich The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jul 7, 2022
@reinrich
Copy link
Member Author

reinrich commented Jul 7, 2022

/label hotspot-compiler

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jul 7, 2022
@openjdk
Copy link

openjdk bot commented Jul 7, 2022

@reinrich
The hotspot-compiler label was successfully added.

@reinrich reinrich marked this pull request as ready for review July 8, 2022 13:31
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 8, 2022
@mlbridge
Copy link

mlbridge bot commented Jul 8, 2022

Webrevs

@reinrich reinrich changed the title 8289925 Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp() 8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp() Jul 20, 2022
@openjdk openjdk bot changed the title 8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp() 8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp() Jul 20, 2022
@reinrich
Copy link
Member Author

Ping? The change is tiny.

Copy link
Contributor

@fisk fisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@openjdk
Copy link

openjdk bot commented Jul 26, 2022

@reinrich This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8289925: Shared code shouldn't reference the platform specific method frame::interpreter_frame_last_sp()

Reviewed-by: eosterlund, dlong

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 26, 2022
@reinrich
Copy link
Member Author

Looks good.

That was prompt, thanks a lot! :)

@dean-long
Copy link
Member

Are you sure unextended_sp() returns the same thing as interpreter_frame_last_sp() on all platforms? I didn't think that was true for aarch64. Maybe what we need is a new shared API that will return what the continuation code expects, or promote interpreter_frame_last_sp() to be shared. It seems that all platforms implement it.
@theRealAph
@pron

@reinrich
Copy link
Member Author

Hi Dean,

thanks for looking at the PR.

Are you sure unextended_sp() returns the same thing as
interpreter_frame_last_sp() on all platforms? I didn't think that was true for
aarch64.

I'm no sure. The question is difficult to answer because it is a platform detail.

It is even hard to make assumptions about the unextended_sp that are true for
all platforms[1].

This was a major issue in the loom port to ppc. Shared code expects to find call
parameters at the caller's unextended_sp which is not the case on ppc.

IMHO uses of unextended_sp in shared code should be replaced with abstractions
as SharedRuntime::out_preserve_stack_slots().

Maybe what we need is a new shared API that will return what the
continuation code expects, or promote interpreter_frame_last_sp() to be
shared. It seems that all platforms implement it. @theRealAph @pron

I'd think any address within the frame is good for calling
Continuation::get_continuation_entry_for_sp().

Thanks, Richard.

[1] On ppc unextended_sp < sp is possible.
See https://github.com/reinrich/loom/blob/4b79c83284f18dd7193ecfe12e97e07674d34405/src/hotspot/cpu/ppc/continuationFreezeThaw_ppc.inline.hpp#L570-L639

@theRealAph
Copy link
Contributor

Are you sure unextended_sp() returns the same thing as interpreter_frame_last_sp() on all platforms? I didn't think that was true for aarch64. Maybe what we need is a new shared API that will return what the continuation code expects, or promote interpreter_frame_last_sp() to be shared.

Indeed. AArch64 makes a strong distinction between the machine SP and the interpreter's expression SP, and much of the difficulty with that port was because it was difficult to tell which was which in shared and x86 code. I wouldn't be able to guarantee that using unextended_sp() was always safe.

@dean-long
Copy link
Member

I'd think any address within the frame is good for calling
Continuation::get_continuation_entry_for_sp().

Given the fact that Continuation::is_frame_in_continuation() uses f.unextended_sp(), and it is called for interpreted frames, I would tentatively agree, but let's see what @pron says.

@dean-long
Copy link
Member

Looking at this some more, if unextended_sp < sp is possible and unextended_sp can actually point outside the current frame and into the next frame (according to continuationFreezeThaw_ppc.inline.hpp), then using unextended_sp doesn't seem quite right if it's not in the current frame. Maybe using something like ContinuationHelper::InterpretedFrame::frame_bottom() would be better.

@reinrich
Copy link
Member Author

Looking at this some more, if unextended_sp < sp is possible and unextended_sp can actually point outside the current frame and into the next frame (according to continuationFreezeThaw_ppc.inline.hpp), then using unextended_sp doesn't seem quite right if it's not in the current frame.

I think you are right.

Maybe using something like ContinuationHelper::InterpretedFrame::frame_bottom() would be better.

I'm currently out of office and not able to look into the code on a decent display. Will do as soon as I'm back.

Ad hoc I think the frame's actual sp (frame::sp()) should be used.

Using other pointers from the interpreter frame is problematic because the code is also called during deoptimization, where the interpreter frame is still skeletal, i.e. the expression stack pointer is not yet set as I learned when trying to use it.

@dean-long
Copy link
Member

It looks like the interpreter trims the stack before a call:

// Calc a precise SP for the call. The SP value we calculated in
// generate_fixed_frame() is based on the max_stack() value, so we would waste stack space
// if esp is not max. Also, the i2c adapter extends the stack space without restoring
// our pre-calced value, so repeating calls via i2c would result in stack overflow.
// Since esp already points to an empty slot, we just have to sub 1 additional slot
// to meet the abi scratch requirements.
// The max_stack pointer will get restored by means of the GR_Lmax_stack local in
// the return entry of the interpreter.
addi(Rscratch2, R15_esp, Interpreter::stackElementSize - frame::abi_reg_args_size);
clrrdi(Rscratch2, Rscratch2, exact_log2(frame::alignment_in_bytes)); // round towards smaller address
resize_frame_absolute(Rscratch2, Rscratch2, R0);

so it seems like using unextended_sp() is actually fine, and the drawing in continuationFreezeThaw_ppc.inline.hpp needs to be updated.

@reinrich
Copy link
Member Author

The interpreter allocates a new frame reserving space for the maximum expression stack. For a call it is trimmed to the current size of the stack. Consequently unextended_sp < sp is possible if max. stack is large.

@dean-long
Copy link
Member

The interpreter allocates a new frame reserving space for the maximum expression stack. For a call it is trimmed to the current size of the stack. Consequently unextended_sp < sp is possible if max. stack is large.

OK, if I'm not mistaken, aarch64 is doing the same thing. So that could mean that code like Continuation::is_frame_in_continuation() that uses unextended_sp is already broken on some platforms. For example, if an interpreted frame in the carrier thread had a very large max_stack, then is_frame_in_continuation() could return true when it should return false.

@reinrich
Copy link
Member Author

Just wanted to let you know that I'm stuck with just my phone for the next 2 weeks. I'll have a closer look when I'm back.
Thanks, Richard.

@dean-long
Copy link
Member

Sure, no problem.

@reinrich
Copy link
Member Author

The interpreter allocates a new frame reserving space for the maximum expression stack. For a call it is trimmed to the current size of the stack. Consequently unextended_sp < sp is possible if max. stack is large.

OK, if I'm not mistaken, aarch64 is doing the same thing.

I have had a look at aarch64 now. There the interpreter also pushes a new frame
reserving space for the maximum expression stack as on ppc64 but the frame is
not trimmed to the current size of the expression stack as on ppc64. So on
aarch64 unextended_sp < sp seems to be impossible.

So that could mean that code like Continuation::is_frame_in_continuation()
that uses unextended_sp is already broken on some platforms. For example, if
an interpreted frame in the carrier thread had a very large max_stack, then
is_frame_in_continuation() could return true when it should return false.

It is likely not yet broken but for ppc64 it will not work. The usage of
unextended_sp in shared code is problematic because there is no shared
specification for it.

Looking at the callers of is_sp_in_continuation() we find the the following is passed as parameter sp:

  1. frame::interpreter_frame_last_sp()
  2. frame::unextended_sp()
  3. frame::sp() - 2
  4. frame::sp()

Where 3. is a special case for continuation entry frames (see
Continuation::get_continuation_entry_for_entry_frame()). -2 is needed because
is_sp_in_continuation() would return false for the sp of the entry frame.

I'd like to change the callers of is_sp_in_continuation()

static inline bool is_sp_in_continuation(const ContinuationEntry* entry, intptr_t* const sp) {
  return entry->entry_sp() > sp;
}

to pass the actual sp. This is correct because the following is true on all
platforms:

a.sp() > E->entry_sp() > b.sp() > c.sp()

where a, b, c are stack frames in call order and E is a ContinuationEntry.
a is the caller frame of the continuation entry frame that corresponds to E.

is_sp_in_continuation() will then return true for b.sp() and c.sp() and
false for a.sp(). At least on ppc64 is_sp_in_continuation() can return true
for a.unextended_sp() because of the frame trimming described above. That's
why it should not be used. frame::interpreter_frame_last_sp() should not be used
as it is not declared and specified as a shared method.

@reinrich
Copy link
Member Author

hotspot_loom and jdk_loom tests on x86_64 and aarch64 still pass with the last commit fdb1409.
Tests with the ppc64 loom port succeeded as well.

@dean-long
Copy link
Member

Actually, for interpreted --> interpreted, aarch64 and ppc64 seem to do the same trimming. The difference is ppc64 does it in the caller, while aarch64 does it in the callee:

// Padding between locals and fixed part of activation frame to ensure

So this would mean for interpreted --> compiled, ppc64 does trimming and aarch64 does not.

I also noticed that on ppc64, the value of unextended_sp for interpreted frames in inconsistent. Whether or not it is the "max stack" value depends on who calls the frame constructor. Some Loom code and sender_for_compiled_frame() sets unextended_sp the same as sp, while only sender_for_interpreter_frame() uses the "max stack" value. Giving unextended_sp a consistent "canonical" value that is always inside the frame, no matter if the callee is interpreted or compiled, seems like it would make unextended_sp() valid for is_sp_in_continuation() as well.

@reinrich
Copy link
Member Author

Hi Dean,

I'll answer your questions below but actually I don't think it is necessary to
look into these platform details at all. The platform abstraction layer does not
specify unextended_sp >= sp. Consequently the callers of
is_sp_in_continuation() rely on undefined behaviour. The proposed change fixes
this.

Also if ppc64 would be changed (and maybe also aarch64 after
JDK-8289128?) to achieve that
unextended_sp always points into the frame then this would be just a property of
the port and, as such, not sufficient.

A dedicated RFE would be needed to establish unextended_sp >= sp as part of
the platform abstraction. Shared code should assert the property and it should
be covered by tests. I wouldn't be in favour of such a RFE though. It would limit
the degree of freedom without need.

Actually, for interpreted --> interpreted, aarch64 and ppc64 seem to do the same trimming. The difference is ppc64 does it in the caller, while aarch64 does it in the callee:

// Padding between locals and fixed part of activation frame to ensure

So this would mean for interpreted --> compiled, ppc64 does trimming and aarch64 does not.

I see, thanks.

I also noticed that on ppc64, the value of unextended_sp for interpreted
frames in inconsistent. Whether or not it is the "max stack" value depends on
who calls the frame constructor. Some Loom code and
sender_for_compiled_frame() sets unextended_sp the same as sp,

On x64 sender_for_compiled_frame() also sets unextended_sp == sp even though
this is not correct if the sender is an interpreted frame. It's not important
because the value is not used. At least it wasn't before Loom.

For the ppc64 Loom port I was forced to construct frames with unextended_sps
similar to x64 (e.g. pointing to the call parameters if interpreted) to satisfy
Loom shared code assumptions. Luckily it worked but IMHO that shared Loom code
breaks the platform abstraction and has to be fixed.

while only sender_for_interpreter_frame() uses the "max stack" value.

For good reason because if the sender is compiled then the sender_sp has to be
used as unextended_sp because that's the pointer to find 'stuff' in compiled
frames. If the sender happens to be an interpreted frame then the sender_sp will
be the "max stack" value. It would be really confusing though to use something
else. Especially if doing it only on ppc64 but not aarch64.

Giving unextended_sp a consistent "canonical" value that is always inside the
frame, no matter if the callee is interpreted or compiled, seems like it would
make unextended_sp() valid for is_sp_in_continuation() as well.

Not really as it would still rely on an undefined property which, for instance,
may be invalid on riscv.

Thanks, Richard.

@dean-long
Copy link
Member

Thinking about this some more, while sp() may work for all platforms, it seems like frame::id() is what the shared Loom code should really be using. Then we would have:

a.id() > E->to_frame().id() > b.id() > c.id()

The Loom code shouldn't really be assuming that stacks grow in a particular direction. That is what frame:is_older() is for. Unfortunately, I found that aarch64 has a problem with this. Given a frame f, we can't assert on aarch64 that

f.sender().is_older(f.id())

because aarch64 uses unextended_sp() for frame::id().

@reinrich
Copy link
Member Author

Thinking about this some more, while sp() may work for all platforms, it seems
like frame::id() is what the shared Loom code should really be using. Then we
would have:

a.id() > E->to_frame().id() > b.id() > c.id()

The Loom code shouldn't really be assuming that stacks grow in a particular
direction. That is what frame:is_older() is for.

With that we would have s.th. like

bool Continuation::is_frame_in_continuation(const ContinuationEntry* entry, const frame& f) {
  return entry->to_frame().is_older(f.id());
}

and replace all uses of is_sp_in_continuation() with it. Not sure if performance
could be a problem then.

!f.is_older(entry->frame_id());

with a platform dependent definition of frame_id() could be faster then.

Unfortunately, I found that aarch64 has a problem with this. Given a frame f,
we can't assert on aarch64 that

f.sender().is_older(f.id())

because aarch64 uses unextended_sp() for frame::id().

Yes indeed there seems to be an issue with is_older() on aarch64 because of the
trimming of interpreted frames.

Actually I assumed hotspot does not abstract over stack growth direction because
it did not back in the days when we've ported it to pa-risc where stacks grow
towards higher addresses (yes that's long ago :))

So there might be is_older() now but looking for instance at class StackOverflow
it is clear from the diagrams and the code that stack growth towards lower
addresses is assumed. I'm sure there's still much more shared code in hotspot
with that assumption.

Another example from Loom is the shared code related to JavaThread::_cont_fastpath
where growth towards lower addresses is assumed too.

We could duplicate is_sp_in_continuation() on the platforms or add a static
assertion on a new global definition of the stack growth direction if you
like. Not sure if it is worth it though. I'd tend towards keeping
is_sp_in_continuation() as it is and maybe create an JBS item for stack growth
issues in Loom.

@dean-long
Copy link
Member

Yes, there would be quite a bit of fan-out if we decided to go with frame::id(). There are a few places where we could call is_sp_in_continuation() with a frame instead of an id. Also, as you pointed out, there is still _cont_fastpath that uses an sp. That field as well as entry->_parent_cont_fastpath should probably be using frame::id(). A new JBS enhancement for that makes sense.

@reinrich
Copy link
Member Author

Ok. I have created https://bugs.openjdk.org/browse/JDK-8294284.

@dean-long
Copy link
Member

Ok. I have created https://bugs.openjdk.org/browse/JDK-8294284.

Thanks!

@dean-long
Copy link
Member

I think it would be good to also have this looked at by someone from hotspot-runtime. @coleenp, could you take a look?

@reinrich
Copy link
Member Author

Thanks for the review @dean-long

@reinrich
Copy link
Member Author

reinrich commented Oct 4, 2022

I think this ready to be integrated. I'll do so tomorrow if there are no further comments until then.
Thanks again for your reviews @fisk and @dean-long

@reinrich
Copy link
Member Author

reinrich commented Oct 5, 2022

/integrate

@openjdk
Copy link

openjdk bot commented Oct 5, 2022

Going to push as commit ee6c391.
Since your change was applied there have been 5 commits pushed to the master branch:

  • bd90c4c: 8282900: runtime/stringtable/StringTableCleaningTest.java verify unavailable at this moment
  • 979efd4: 8289004: investigate if SharedRuntime::get_java_tid parameter should be a JavaThread*
  • b9eeec2: 8294310: compare.sh fails on macos after JDK-8293550
  • 13a5000: 8294151: JFR: Unclear exception message when dumping stopped in memory recording
  • 8ebebbc: 8294368: Java incremental builds broken on Windows after JDK-8293116

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 5, 2022
@openjdk openjdk bot closed this Oct 5, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 5, 2022
@openjdk
Copy link

openjdk bot commented Oct 5, 2022

@reinrich Pushed as commit ee6c391.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@reinrich reinrich deleted the dont_use_interpreter_frame_last_sp_in_shared_code branch October 31, 2022 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
4 participants