Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve rotate then insert selected bits on Z #3212

Closed
fjeremic opened this issue Nov 19, 2018 · 3 comments · Fixed by #4298
Closed

Improve rotate then insert selected bits on Z #3212

fjeremic opened this issue Nov 19, 2018 · 3 comments · Fixed by #4298

Comments

@fjeremic
Copy link
Contributor

fjeremic commented Nov 19, 2018

While working on a miscellaneous item in OpenJ9 I had to use an RISBG instruction to perform an and and a shift. Since the RISBG instructions are slightly confusing in PoPs I wrote a quick Java program and compiled it to validate that I used the correct instruction. Here is the Java program I used and how I invoked it:

> cat Test.java
class Test
{
    public static void main(String[] args)
    {
        foo(5L);
        foo(5L);
    }

    public static long foo(long x)
    {
        return ((long)(x & 0xF0) >> 4L);
    }
}

>./bin/java -Xjit:'{Test.foo*}(count=0,tracefull,log=log.trace)' Test

It generated the following trees:

------------------------------
 n8n      (  0)  lreturn                                                                              [     0x3ff96804ce0] bci=[-1,7,11] rc=0 vc=130 vn=- li=2 udi=- nc=1
 n7n      (  0)    lshr (in GPR_0018) (highWordZero X>=0 cannotOverflow )                             [     0x3ff96804c90] bci=[-1,6,11] rc=0 vc=130 vn=- li=2 udi=- nc=2 flg=0x5100
 n5n      (  0)      land (in GPR_0016) (highWordZero X>=0 cannotOverflow )                           [     0x3ff96804bf0] bci=[-1,4,11] rc=0 vc=130 vn=- li=2 udi=- nc=2 flg=0x5100
 n10n     (  0)        ==>lRegLoad (in GPR_0016) (cannotOverflow SeenRealReference )
 n4n      (  0)        lconst 240 (highWordZero X!=0 X>=0 )                                           [     0x3ff96804ba0] bci=[-1,1,11] rc=0 vc=130 vn=- li=2 udi=- nc=0 flg=0x4104
 n6n      (  0)      iconst 4 (X!=0 X>=0 )                                                            [     0x3ff96804c40] bci=[-1,5,11] rc=0 vc=130 vn=- li=2 udi=- nc=0 flg=0x104
------------------------------

 [     0x3ff968930e0]   RISBGN  GPR_0016,GPR_0016,56,187,0
 [     0x3ff968af000]   SRAG    GPR_0018,GPR_0016,4
 [     0x3ff968af8e0]   ASSOCREGS
 [     0x3ff968af190]   RET
 POST:
 {GPR2:GPR_0018:R}

This sequence can be improved. I think the SRAG instruction is redundant as we can shift right and zero out everything else just with the RISBG instruction. This sequence should be improved to eliminate the SRAG instruction.

@dchopra001
Copy link
Contributor

dchopra001 commented Nov 19, 2018

I just took a quick look. The RISBGN instruction is generated by the land evaluator which tries to avoid doing a land operation (expensive on Z) by trying to do a rotate + insert operation.

This is only possible if the second child of the land operation is a constant and its binary representation contains a single sequence of consecutive 1s. For example, in the above example the binary representation of lconst 240 is 0x0F0 = 0000 1111 0000. This analysis is done by the following helper routine:

TR::Register *
OMR::Z::TreeEvaluator::tryToReplaceLongAndWithRotateInstruction(TR::Node * node, TR::CodeGenerator * cg)
{
if (cg->getS390ProcessorInfo()->supportsArch(TR_S390ProcessorInfo::TR_z10))
{
TR::Node * firstChild = node->getFirstChild();
TR::Node * secondChild = node->getSecondChild();
TR::Compilation *comp = cg->comp();
// Given the right case, transform land into RISBG as it may be better than discrete pair of ands
if (node->getOpCodeValue() == TR::land && secondChild->getOpCode().isLoadConst())

Where I think we should be capturing this new opportunity is inside lshr's evaluator, which calls the genericLongShiftSingle helper routine. Specifically, we can add a new case here:

if (secondChild->getOpCode().isLoadConst())
{
int32_t value = secondChild->getInt();
if (cg->getS390ProcessorInfo()->supportsArch(TR_S390ProcessorInfo::TR_z10))
{
// Generate RISBG for lshl + i2l sequence
if (node->getOpCodeValue() == TR::lshl)
{
if (firstChild->getOpCodeValue() == TR::i2l && firstChild->isSingleRefUnevaluated() && (firstChild->isNonNegative() || firstChild->getFirstChild()->isNonNegative()))
{
srcReg = cg->evaluate(firstChild->getFirstChild());
auto mnemonic = cg->getS390ProcessorInfo()->supportsArch(TR_S390ProcessorInfo::TR_zEC12) ? TR::InstOpCode::RISBGN : TR::InstOpCode::RISBG;
generateRIEInstruction(cg, mnemonic, node, trgReg, srcReg, (int8_t)(32-value), (int8_t)((63-value)|0x80), (int8_t)value);
node->setRegister(trgReg);
cg->decReferenceCount(firstChild->getFirstChild());
cg->decReferenceCount(firstChild);
cg->decReferenceCount(secondChild);
return trgReg;
}
}
}

Where if the lshr's first child is an AND operation (if (node->getOpCodeValue() == TR::land)), then we can evaluate the entire tree with a single RISBG(N) instruction. Most likely we can still use the tryToReplaceLongAndWithRotateInstruction routine but with a new field specifying the shift amount. We might also need to be careful with the type of shift (left vs right, logical vs arithmetic).

AidanHa added a commit to AidanHa/omr that referenced this issue Sep 12, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
@AidanHa
Copy link
Contributor

AidanHa commented Sep 12, 2019

I realized the solution proposed above will only work if the value we want to shift (Eg: 4 in the above example) is less than 256, as the RISBG instruction only allows 8 bits for the "rotate" operand.

AidanHa added a commit to AidanHa/omr that referenced this issue Sep 12, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
@fjeremic
Copy link
Contributor Author

I realized the solution proposed above will only work if the value we want to shift (Eg: 4 in the above example) is less than 256, as the RISBG instruction only allows 8 bits for the "rotate" operand.

Not quite, the RISBG instruction I5 field is 8 bits in width, but the documentation says only bits 2-7 are used (inclusive) so there are really only 6 bits for the shift which equates to a maximum rotate of 63. This makes sense because general purpose registers on z/Architecture are 64-bit wide, so one would never need to rotate more than 63 bits, since rotate will be a mod 64 operation.

AidanHa added a commit to AidanHa/omr that referenced this issue Sep 12, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 16, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 17, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 17, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 24, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 24, 2019
Add a case in genericLongShiftSingle that would generate a RISBG command for a LSHR->LAND pattern
Add a similar case for LSHL->LAND pattern
Add a similar case for the unsigned variants of the instruction(LUSHR and LUSHL)
Rename tryToReplaceLongAndWithRotateInstruction() to 'tryToReplaceShiftLandWithRotateInstruction()
Allocate target register properly in tryToReplaceShiftLandWithRotateInstruction()

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 30, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>

a

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Sep 30, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Oct 1, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Oct 1, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Oct 1, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Oct 4, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
oneturkmen pushed a commit to oneturkmen/omr that referenced this issue Oct 15, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
AidanHa added a commit to AidanHa/omr that referenced this issue Nov 4, 2019
Add a case in genericLongShiftSingle that would generate
a RISBG command for a LSHR->LAND pattern

Add a similar case for LSHL->LAND pattern

Add a similar case for the unsigned variants of the
instruction(LUSHR and LUSHL)

Rename tryToReplaceLongAndWithRotateInstruction() to
tryToReplaceShiftLandWithRotateInstruction()

Allocate target register properly in
tryToReplaceShiftLandWithRotateInstruction()

Add respective Tril test for these changes

Remove LongAndRotateAsTest as those tests are
now a strict subset of the added tests in
ShiftAndRotateTest.

Fixes eclipse#3212

Signed-off-by: Aidan Ha <qbha@edu.uwaterloo.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants