Add ExtendType to Air::Arg Index to fully utilize address computation in memory instruction for ARM64

https://bugs.webkit.org/show_bug.cgi?id=227970

Reviewed by Saam Barati.

Recognizing the address-computation patterns of memory instructions, e.g., Load
Register (LDR), Store Register (STR), etc., can benefit the instruction selector.
For that purpose, the Air operand BaseIndex, containing base, index, and scale, is
supplied to Air opcodes. However, the <extend> option of the index register was not
fully leveraged in the previous implementation.

To fill that gap, this patch adds a new member, MacroAssembler::Extend, to the current
design of BaseIndex to trigger zero/sign extension of the index register. This is
enabled for Store/Load instructions with a valid index register and shift amount.

A more ideal approach might be to introduce a decorator (Index@EXT) on the Air operand
to provide an extension opportunity for specific forms of Air opcodes.
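
For reference, the BaseIndex shape after this patch (as added in AbstractMacroAssembler.h
below) is:

enum class Extend : uint8_t { ZExt32, SExt32, None };

BaseIndex(RegisterID base, RegisterID index, Scale scale, int32_t offset = 0,
    Extend extend = Extend::None);

On ARM64, Extend::ZExt32 and Extend::SExt32 select the UXTW/SXTW extended-register
forms, while Extend::None keeps the previous UXTX behavior.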

Load Register (LDR) calculates an address from a base register value and an
offset register value, loads a word from memory, and writes it to a register.
The offset register value can optionally be shifted and extended.
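
Roughly, in C-like terms, the extended-register form computes the address as follows,
with the choice between sign and zero extension given by the <extend> option:

uint64_t address = base + (uint64_t)(int64_t)(int32_t)index * ((uint64_t)1 << scale); // SXTW
uint64_t address = base + (uint64_t)(uint32_t)index * ((uint64_t)1 << scale);         // UXTW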

Given B3 IR:
Int @0 = ArgumentReg(%x0)
Int @1 = Z/SExt32(Trunc(ArgumentReg(%x1)))
Int @2 = scale
Int @3 = Shl(@1, @2)
Int @4 = Add(@0, @3)
Int @5 = Load(@4, ControlDependent|Reads:Top)
Void@6 = Return(@5, Terminal)

// Old optimized AIR
Move               %x1, %x1, @1
Move (%x0,%x1,2^scale), %x0, @5
Ret                %x0,      @6

// New optimized AIR
Move (%x0,%x1,2^scale), %x0, @5
Ret                %x0,      @6
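
A minimal sketch of how this lowering can be expressed against the MacroAssembler
(the jit object and the *GPR names are illustrative, not taken from the commit;
a 64-bit element stride, i.e. scale = 3, is assumed):

// baseGPR holds @0; indexGPR holds the register whose low 32 bits are extended by @1.
MacroAssembler::BaseIndex address(baseGPR, indexGPR, MacroAssembler::TimesEight, 0,
    MacroAssembler::Extend::SExt32); // Extend::ZExt32 for the ZExt32 form
jit.load64(address, destGPR);        // one ldr with a sxtw-extended index

The separate Move that used to materialize the extension disappears because load64()
now passes indexExtendType(address) to the assembler instead of a fixed UXTX.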

Store Register (STR) calculates an address from a base register value and an
offset register value, and stores a 32-bit word or a 64-bit doubleword to the
calculated address, from a register.

Given B3 IR:
Int @0 = value
Int @1 = ArgumentReg(%x0)
Int @2 = Z/SExt32(Trunc(ArgumentReg(%x1)))
Int @3 = scale
Int @4 = Shl(@2, @3)
Int @5 = Add(@1, @4)
Void@6 = Store(@0, @5, ControlDependent|Writes:Top)
Void@7 = Return(@0, Terminal)

// Old optimized AIR
Move32    %x1,               %x1, @2
Store32  %xzr, (%x0,%x1,2^scale), @6
Move       $0,               %x0, @7
Ret32     %x0,                    @7

// New optimized AIR
Store32  %xzr, (%x0,%x1,2^scale), @6
Move       $0,               %x0, @7
Ret32     %x0,                    @7
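
The store side is analogous; a sketch under the same illustrative assumptions
(MacroAssemblerARM64 instance jit, 32-bit element stride, i.e. scale = 2):

MacroAssembler::BaseIndex address(baseGPR, indexGPR, MacroAssembler::TimesFour, 0,
    MacroAssembler::Extend::ZExt32); // zero-extend the 32-bit index
jit.store32(valueGPR, address);      // one str with a uxtw-extended index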

* assembler/AbstractMacroAssembler.h:
(JSC::AbstractMacroAssembler::BaseIndex::BaseIndex):
* assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::indexExtendType):
(JSC::MacroAssemblerARM64::load64):
(JSC::MacroAssemblerARM64::load32):
(JSC::MacroAssemblerARM64::load16):
(JSC::MacroAssemblerARM64::load16SignedExtendTo32):
(JSC::MacroAssemblerARM64::load8):
(JSC::MacroAssemblerARM64::load8SignedExtendTo32):
(JSC::MacroAssemblerARM64::store64):
(JSC::MacroAssemblerARM64::store32):
(JSC::MacroAssemblerARM64::store16):
(JSC::MacroAssemblerARM64::store8):
(JSC::MacroAssemblerARM64::loadDouble):
(JSC::MacroAssemblerARM64::loadFloat):
(JSC::MacroAssemblerARM64::storeDouble):
(JSC::MacroAssemblerARM64::storeFloat):
* b3/B3LowerToAir.cpp:
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::index):
(JSC::B3::Air::Arg::asBaseIndex const):
* b3/testb3.h:
* b3/testb3_2.cpp:
(testLoadZeroExtendIndexAddress):
(testLoadSignExtendIndexAddress):
(testStoreZeroExtendIndexAddress):
(testStoreSignExtendIndexAddress):
(addBitTests):


Canonical link: https://commits.webkit.org/239755@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@280013 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Yijia Huang committed Jul 17, 2021
1 parent a407c84 commit 656ae80220b8eb273c4f1b17aa5a7f72faf69f35
Showing 7 changed files with 507 additions and 53 deletions.
@@ -1,3 +1,100 @@
2021-07-16 Yijia Huang <yijia_huang@apple.com>

Add ExtendType to Air::Arg Index to fully utilize address computation in memory instruction for ARM64
https://bugs.webkit.org/show_bug.cgi?id=227970

Reviewed by Saam Barati.

Recognizing the address-computation patterns of memory instructions, e.g., Load
Register (LDR), Store Register (STR), etc., can benefit the instruction selector.
For that purpose, the Air operand BaseIndex, containing base, index, and scale, is
supplied to Air opcodes. However, the <extend> option of the index register was not
fully leveraged in the previous implementation.

To fill that gap, this patch adds a new member, MacroAssembler::Extend, to the current
design of BaseIndex to trigger zero/sign extension of the index register. This is
enabled for Store/Load instructions with a valid index register and shift amount.

A more ideal approach might be to introduce a decorator (Index@EXT) on the Air operand
to provide an extension opportunity for specific forms of Air opcodes.

Load Register (LDR) calculates an address from a base register value and an
offset register value, loads a word from memory, and writes it to a register.
The offset register value can optionally be shifted and extended.

Given B3 IR:
Int @0 = ArgumentReg(%x0)
Int @1 = Z/SExt32(Trunc(ArgumentReg(%x1)))
Int @2 = scale
Int @3 = Shl(@1, @2)
Int @4 = Add(@0, @3)
Int @5 = Load(@4, ControlDependent|Reads:Top)
Void@6 = Return(@5, Terminal)

// Old optimized AIR
Move               %x1, %x1, @1
Move (%x0,%x1,2^scale), %x0, @5
Ret                %x0,      @6

// New optimized AIR
Move (%x0,%x1,2^scale), %x0, @5
Ret                %x0,      @6

Store Register (STR) calculates an address from a base register value and an
offset register value, and stores a 32-bit word or a 64-bit doubleword to the
calculated address, from a register.

Given B3 IR:
Int @0 = value
Int @1 = ArgumentReg(%x0)
Int @2 = Z/SExt32(Trunc(ArgumentReg(%x1)))
Int @3 = scale
Int @4 = Shl(@2, @3)
Int @5 = Add(@1, @4)
Void@6 = Store(@0, @5, ControlDependent|Writes:Top)
Void@7 = Return(@0, Terminal)

// Old optimized AIR
Move32    %x1,               %x1, @2
Store32  %xzr, (%x0,%x1,2^scale), @6
Move       $0,               %x0, @7
Ret32     %x0,                    @7

// New optimized AIR
Store32  %xzr, (%x0,%x1,2^scale), @6
Move       $0,               %x0, @7
Ret32     %x0,                    @7

* assembler/AbstractMacroAssembler.h:
(JSC::AbstractMacroAssembler::BaseIndex::BaseIndex):
* assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::indexExtendType):
(JSC::MacroAssemblerARM64::load64):
(JSC::MacroAssemblerARM64::load32):
(JSC::MacroAssemblerARM64::load16):
(JSC::MacroAssemblerARM64::load16SignedExtendTo32):
(JSC::MacroAssemblerARM64::load8):
(JSC::MacroAssemblerARM64::load8SignedExtendTo32):
(JSC::MacroAssemblerARM64::store64):
(JSC::MacroAssemblerARM64::store32):
(JSC::MacroAssemblerARM64::store16):
(JSC::MacroAssemblerARM64::store8):
(JSC::MacroAssemblerARM64::loadDouble):
(JSC::MacroAssemblerARM64::loadFloat):
(JSC::MacroAssemblerARM64::storeDouble):
(JSC::MacroAssemblerARM64::storeFloat):
* b3/B3LowerToAir.cpp:
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::index):
(JSC::B3::Air::Arg::asBaseIndex const):
* b3/testb3.h:
* b3/testb3_2.cpp:
(testLoadZeroExtendIndexAddress):
(testLoadSignExtendIndexAddress):
(testStoreZeroExtendIndexAddress):
(testStoreSignExtendIndexAddress):
(addBitTests):

2021-07-16 Saam Barati <sbarati@apple.com>

Grab the lock in FTL::Thunks::keyForSlowPathCallThunk
@@ -124,7 +124,13 @@ class AbstractMacroAssembler : public AbstractMacroAssemblerBase {
TimesEight,
ScalePtr = isAddress64Bit() ? TimesEight : TimesFour,
};


enum class Extend : uint8_t {
ZExt32,
SExt32,
None
};

struct BaseIndex;

static RegisterID withSwappedRegister(RegisterID original, RegisterID left, RegisterID right)
@@ -210,18 +216,23 @@ class AbstractMacroAssembler : public AbstractMacroAssemblerBase {
//
// Describes a complex addressing mode.
struct BaseIndex {
BaseIndex(RegisterID base, RegisterID index, Scale scale, int32_t offset = 0)
BaseIndex(RegisterID base, RegisterID index, Scale scale, int32_t offset = 0, Extend extend = Extend::None)
: base(base)
, index(index)
, scale(scale)
, offset(offset)
, extend(extend)
{
#if !CPU(ARM64)
ASSERT(extend == Extend::None);
#endif
}

RegisterID base;
RegisterID index;
Scale scale;
int32_t offset;
Extend extend;

BaseIndex withOffset(int32_t additionalOffset)
{
@@ -1367,6 +1367,17 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
}

// Memory access operations:
Assembler::ExtendType indexExtendType(BaseIndex address)
{
switch (address.extend) {
case Extend::ZExt32:
return Assembler::UXTW;
case Extend::SExt32:
return Assembler::SXTW;
case Extend::None:
return Assembler::UXTX;
}
}

void load64(ImplicitAddress address, RegisterID dest)
{
@@ -1380,12 +1391,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load64(BaseIndex address, RegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 3)) {
m_assembler.ldr<64>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldr<64>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldr<64>(dest, address.base, memoryTempRegister);
}

@@ -1479,12 +1490,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load32(BaseIndex address, RegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 2)) {
m_assembler.ldr<32>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldr<32>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldr<32>(dest, address.base, memoryTempRegister);
}

@@ -1526,12 +1537,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load16(BaseIndex address, RegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 1)) {
m_assembler.ldrh(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldrh(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldrh(dest, address.base, memoryTempRegister);
}

@@ -1570,12 +1581,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load16SignedExtendTo32(BaseIndex address, RegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 1)) {
m_assembler.ldrsh<32>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldrsh<32>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldrsh<32>(dest, address.base, memoryTempRegister);
}

@@ -1601,12 +1612,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load8(BaseIndex address, RegisterID dest)
{
if (!address.offset && !address.scale) {
m_assembler.ldrb(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldrb(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldrb(dest, address.base, memoryTempRegister);
}

@@ -1635,12 +1646,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void load8SignedExtendTo32(BaseIndex address, RegisterID dest)
{
if (!address.offset && !address.scale) {
m_assembler.ldrsb<32>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldrsb<32>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldrsb<32>(dest, address.base, memoryTempRegister);
}

@@ -1674,12 +1685,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void store64(RegisterID src, BaseIndex address)
{
if (!address.offset && (!address.scale || address.scale == 3)) {
m_assembler.str<64>(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.str<64>(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.str<64>(src, address.base, memoryTempRegister);
}

@@ -1781,12 +1792,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void store32(RegisterID src, BaseIndex address)
{
if (!address.offset && (!address.scale || address.scale == 2)) {
m_assembler.str<32>(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.str<32>(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.str<32>(src, address.base, memoryTempRegister);
}

@@ -1848,12 +1859,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void store16(RegisterID src, BaseIndex address)
{
if (!address.offset && (!address.scale || address.scale == 1)) {
m_assembler.strh(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.strh(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.strh(src, address.base, memoryTempRegister);
}

@@ -1876,12 +1887,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void store8(RegisterID src, BaseIndex address)
{
if (!address.offset && !address.scale) {
m_assembler.strb(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.strb(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.strb(src, address.base, memoryTempRegister);
}

@@ -2189,12 +2200,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void loadDouble(BaseIndex address, FPRegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 3)) {
m_assembler.ldr<64>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldr<64>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldr<64>(dest, address.base, memoryTempRegister);
}

@@ -2216,12 +2227,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void loadFloat(BaseIndex address, FPRegisterID dest)
{
if (!address.offset && (!address.scale || address.scale == 2)) {
m_assembler.ldr<32>(dest, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.ldr<32>(dest, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.ldr<32>(dest, address.base, memoryTempRegister);
}

@@ -2473,12 +2484,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void storeDouble(FPRegisterID src, BaseIndex address)
{
if (!address.offset && (!address.scale || address.scale == 3)) {
m_assembler.str<64>(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.str<64>(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.str<64>(src, address.base, memoryTempRegister);
}

@@ -2494,12 +2505,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
void storeFloat(FPRegisterID src, BaseIndex address)
{
if (!address.offset && (!address.scale || address.scale == 2)) {
m_assembler.str<32>(src, address.base, address.index, Assembler::UXTX, address.scale);
m_assembler.str<32>(src, address.base, address.index, indexExtendType(address), address.scale);
return;
}

signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, Assembler::UXTX, address.scale);
m_assembler.add<64>(memoryTempRegister, memoryTempRegister, address.index, indexExtendType(address), address.scale);
m_assembler.str<32>(src, address.base, memoryTempRegister);
}
