New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce size of code emitted for stack loads/stores from Air on ARM #10263
Reduce size of code emitted for stack loads/stores from Air on ARM #10263
Conversation
22f6fb0
to
7e54028
Compare
EWS run on previous version of this PR (hash 7e54028) |
7e54028
to
4909b66
Compare
EWS run on previous version of this PR (hash 4909b66) |
if (addr.isValidForm(opcode, width)) | ||
return addr; | ||
|
||
if (isARM64() || isARM_THUMB2()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not have arch condition here. If this is only allowed for these archs, then isValidForm should say false to wrong archs instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed those in da11283
GPRReg reg = extendedOffsetAddrRegister(); | ||
jit.move(CCallHelpers::TrustedImmPtr(offsetFromFP), reg); | ||
if (isARM64() || isARM_THUMB2()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
ASSERT(m_map[tmp].spillSlot->byteSize() == bytesForWidth(Width64)); | ||
m_jit->loadDouble(callFrameAddr(Air::MoveDouble, *m_jit, offset), reg.fpr()); | ||
auto src = callFrameAddr(Air::MoveDouble, *m_jit, offset, m_code.frameSize(), Width64); | ||
ASSERT(src.isAddr()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be an invariant because this patch isValidForm
rejected BaseIndex addresses for Air::MoveDouble. I've changed isValidIndexForm
in da11283 to only reject Air::MoveDouble and Air::MoveFloat on ARM_THUMB2 (because the MacroAssemblerARMv7 will try to use the addressTempRegister to materialize them.) Hence, I've replaced the ASSERTs with branches to handle BaseIndex addresses for loadDouble/storeDouble/loadVector/storeVector (those can and do occur on ARM64.)
} else { | ||
ASSERT(m_map[tmp].spillSlot->byteSize() == bytesForWidth(Width128)); | ||
m_jit->loadVector(callFrameAddr(Air::MoveDouble, *m_jit, offset), reg.fpr()); | ||
auto src = callFrameAddr(Air::MoveDouble, *m_jit, offset, m_code.frameSize(), Width128); | ||
ASSERT(src.isAddr()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
@@ -1396,7 +1412,7 @@ class Arg { | |||
case CallArg: | |||
return isValidAddrForm(opcode, offset(), width); | |||
case Index: | |||
return isValidIndexForm(scale(), offset(), width); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isValidIndexForm should also take op ode instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in da11283
This is really great! |
4909b66
to
da11283
Compare
Do I have to do something to kick off a new EWS job for da11283? IIRC last time I force-pushed to this branch it just started a new job automatically. |
@eugeneia it should start automatically, but if it fails for some reason, you can force push again and normally it does the trick. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r=me
} else { | ||
ASSERT(m_map[tmp].spillSlot->byteSize() == bytesForWidth(Width128)); | ||
m_jit->loadVector(callFrameAddr(Air::MoveDouble, *m_jit, offset), reg.fpr()); | ||
auto src = callFrameAddr(Air::MoveDouble, *m_jit, offset, m_code.frameSize(), Width128); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is pre-existing issue, but we should use MoveVector
(but it does not matter in terms of correctness I think).
} else { | ||
ASSERT(m_map[tmp].spillSlot->byteSize() == bytesForWidth(Width128)); | ||
m_jit->storeVector(reg.fpr(), callFrameAddr(Air::MoveDouble, *m_jit, offset)); | ||
auto dest = callFrameAddr(Air::MoveDouble, *m_jit, offset, m_code.frameSize(), Width128); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
return addr; | ||
|
||
// Try finding a valid offset from the stack pointer register instead. | ||
// (this trick is also done in Air::lowerStackArgs:126) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this line since this line number etc. is fragile.
@@ -571,7 +571,7 @@ class LowerToAir { | |||
Value* right = address->child(1); | |||
|
|||
auto tryIndex = [&] (Value* index, Value* base) -> Arg { | |||
std::optional<unsigned> scale = scaleForShl(index, offset, width); | |||
std::optional<unsigned> scale = scaleForShl(Air::Move, index, offset, width); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Move
correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so? I chose Air::Move
because earlier in the function there is a comment:
// This function currently is currently only used for loads/stores, so
// using Air::Move is appropriate.
(and a similar choice).
Not sure though. I find it tricky to do it these mappings from Air opcodes to desired isValidForm
behavior.
@@ -598,7 +598,7 @@ class LowerToAir { | |||
// amount is greater than 1, then there isn't really anything smart that we could do here. | |||
// We avoid using baseless indexes because their encoding isn't particularly efficient. | |||
if (m_locked.contains(left) || !address->child(1)->isInt32(1) | |||
|| !Arg::isValidIndexForm(1, offset, width)) | |||
|| !Arg::isValidIndexForm(Air::Move, 1, offset, width)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Move
correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
@@ -613,7 +613,7 @@ class LowerToAir { | |||
case WasmAddress: { | |||
WasmAddressValue* wasmAddress = address->as<WasmAddressValue>(); | |||
Value* pointer = wasmAddress->child(0); | |||
if (!Arg::isValidIndexForm(1, offset, width) || m_locked.contains(pointer)) | |||
if (!Arg::isValidIndexForm(Air::Move, 1, offset, width) || m_locked.contains(pointer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Move
correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
@@ -2325,7 +2325,7 @@ class LowerToAir { | |||
} | |||
|
|||
auto tryShl = [&] (Value* shl, Value* other) -> bool { | |||
std::optional<unsigned> scale = scaleForShl(shl, offset); | |||
std::optional<unsigned> scale = scaleForShl(leaOpcode, shl, offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is leaOpcode
correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think so? Not sure what the other candidates would be?
da11283
to
b17b385
Compare
EWS run on previous version of this PR (hash b17b385) |
b17b385
to
0a55d14
Compare
EWS run on current version of this PR (hash 0a55d14) |
https://bugs.webkit.org/show_bug.cgi?id=252469 Reviewed by Yusuke Suzuki. This changes callFrameAddr (AirAllocateRegistersAndStackAndGenerateCode.cpp) and its users to (which ever succeeds first) 1. generate an address offset from GPRInfo::callFrameRegister (unchanged) 2. generate an address offset from MacroAssembler::stackPointerRegister 3. load the offset into extendedOffsetAddrRegister and generate a BaseIndex [GPRInfo::callFrameRegister, extendedOffsetAddrRegister] 4. resort to computing the absolute address in extendedOffsetAddrRegister (this was the previous fallback after 1.) For 3. to work reliably we need to change Air::Arg::isValidForm to reject BaseIndex addressing when it is not supported on ARM. The result of these changes is a significant reduction in code size due to avoiding the fallback behavior in many cases (4). Code generated for the JetStream2 WASM benchmarks is ~30% smaller on ARMv7 and ~40% smaller on ARM64. * Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackAndGenerateCode.cpp: (JSC::B3::Air::callFrameAddr): (JSC::B3::Air::GenerateAndAllocateRegisters::flush): (JSC::B3::Air::GenerateAndAllocateRegisters::alloc): * Source/JavaScriptCore/b3/air/AirArg.h: (JSC::B3::Air::Arg::isValidAddrForm): (JSC::B3::Air::Arg::isValidForm const): Canonical link: https://commits.webkit.org/260680@main
0a55d14
to
6d88e26
Compare
Committed 260680@main (6d88e26): https://commits.webkit.org/260680@main Reviewed commits have been landed. Closing PR #10263 and removing active labels. |
6d88e26
0a55d14
π§ͺ jsc-armv7-tests