[MIPS] Sign-extend subwords when expanding atomic max/min #89246

Closed


jdmitrovic-syrmia

Rework of the #77072 PR.

For the subsequent SLT instruction to compare correctly, the subwords being compared must be sign-extended.

In addition, each subword must stay at the same bit position within the word that it occupied before sign extension.

Resolves #61881. Also, downstream bugs rust-lang/rust#100650 and rust-lang/rust#123772 are fixed.
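
As a rough illustration of the intended transformation, the following C sketch models a shift-down, sign-extend, shift-back sequence (`srav`, `seh`, `sllv` in the halfword test output) on a 32-bit register image. The function name and the plain-C modelling are assumptions made for illustration; this is not the code emitted by `MipsExpandPseudo.cpp`.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the halfword stored at bit offset
 * `shift` (0 or 16) of `word` and return it at the same offset. Bits above
 * the halfword become copies of its sign bit; bits below it are cleared. */
uint32_t sign_extend_halfword_in_place(uint32_t word, unsigned shift)
{
    /* Move the halfword down to bits 15..0 (the role of srav). */
    uint32_t low = (word >> shift) & 0xFFFFu;
    /* Sign-extend bits 15..0 to 32 bits (the role of seh). */
    uint32_t extended = (low ^ 0x8000u) - 0x8000u;
    /* Move the value back to its original offset (the role of sllv). */
    return extended << shift;
}
```

For example, `sign_extend_halfword_in_place(0x00008000u, 0)` yields `0xFFFF8000`, which a signed 32-bit comparison treats as -32768.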


Contributor

brad0 commented Apr 19, 2024

cc @yingopq @topperc

Contributor

brad0 commented Apr 19, 2024

cc @wzssyqa


⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 3e64f8a4e74cdcaf5920879c86e7e0a827f6ec13 2e331854112b792feccb4eb2d536c2a27204874a -- llvm/lib/Target/Mips/MipsExpandPseudo.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
index 9bfef2a393..89d8a92dca 100644
--- a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
+++ b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
@@ -499,8 +499,7 @@ bool MipsExpandPseudo::expandAtomicBinOpSubword(
           BuildMI(loopMBB, DL, TII->get(Mips::AND), Incr)
               .addReg(Incr)
               .addReg(Mask);
-          BuildMI(loopMBB, DL, TII->get(Mips::CLZ), Scratch4)
-              .addReg(Mask);
+          BuildMI(loopMBB, DL, TII->get(Mips::CLZ), Scratch4).addReg(Mask);
           BuildMI(loopMBB, DL, TII->get(Mips::SLLV), OldVal)
               .addReg(OldVal)
               .addReg(Scratch4);

@@ -1118,6 +1118,8 @@ define i16 @test_max_16(ptr nocapture %ptr, i16 signext %val) {
; MIPSEL-NEXT: srav $7, $7, $10
; MIPSEL-NEXT: seh $2, $2
; MIPSEL-NEXT: seh $7, $7
; MIPSEL-NEXT: sllv $2, $2, $10
Contributor

Sorry, I don't fully understand this.
seh already sign-extends, and its result is good enough for slt.
Why do we need to extend them to 32-bit values?

Contributor

Also, the sllv here may make the result incorrect.
The return value should be a signed int16, but with the sllv it will be (signed int16) << $10.
Note that $10 here holds the offset of the int16 within the word; it may be 0 or 16.

I guess the reason we do this is that we only have ll, with no llb/llh.
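
To make the role of the offset register concrete, here is a small sketch of how the bit offset of an aligned subword inside its containing 32-bit word depends on address and endianness. The helper name and parameters are hypothetical; the arithmetic only restates the byte-order rules, not the exact instruction sequence LLVM emits.

```c
#include <stdint.h>

/* Hypothetical helper: bit offset of a naturally aligned subword
 * (1 or 2 bytes) inside the aligned 32-bit word that contains it. */
unsigned subword_shift(uintptr_t addr, unsigned size_bytes, int big_endian)
{
    unsigned byte_off = (unsigned)(addr & 3u);
    if (big_endian) {
        /* On big-endian the lowest address holds the most significant
         * byte, so the byte offset is mirrored within the word. */
        byte_off ^= 4u - size_bytes;
    }
    return byte_off * 8u;
}
```

For a halfword this gives 0 or 16, matching the two possible values of `$10`. The second halfword of a word sits at offset 16 on little-endian and at offset 0 on big-endian.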

Author

> Sorry, I don't fully understand this. seh already sign-extends, and its result is good enough for slt. Why do we need to extend them to 32-bit values?

Because slt compares its operands as signed 32-bit integers. When comparing subwords, the sign of each subword has to be taken into account: if the subword does not occupy the most significant bits of the word, its sign bit is not the word's sign bit, and the comparison gives an unexpected result.

> Also, the sllv here may make the result incorrect.
> The return value should be a signed int16, but with the sllv it will be (signed int16) << $10.
> Note that $10 here holds the offset of the int16 within the word; it may be 0 or 16.

Correct, $10 contains the offset. The code after my changes needs the subwords to be placed at the provided offset inside the word. That is what #77072 didn't do: the subword was shifted to the LSB position and left there, causing unexpected behavior in the subsequent code.
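
The effect described here can be reproduced with a small model. In this sketch `slt` stands in for the MIPS instruction and `place_halfword` for a masked operand that has not been sign-extended yet; both helpers are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Model of MIPS slt: 1 if a < b when both are read as signed 32-bit
 * values, else 0. The unsigned-to-signed cast assumes the usual
 * two's-complement conversion. */
int slt(uint32_t a, uint32_t b)
{
    return (int32_t)a < (int32_t)b;
}

/* Put a 16-bit value at bit offset `shift` with every other bit cleared,
 * i.e. a masked operand without any sign extension. */
uint32_t place_halfword(int16_t v, unsigned shift)
{
    return (uint32_t)(uint16_t)v << shift;
}
```

With the halfwords -1 and 1 at offset 16, `slt(place_halfword(-1, 16), place_halfword(1, 16))` returns 1, as expected. At offset 0 the same operands become `0x0000FFFF` and `0x00000001`, so `slt` returns 0 and a signed max would wrongly pick -1.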

Contributor

I see, thanks.
However, there seems to be another problem, introduced by the previous patch (not by your current one):

if ptr points into something like

struct xx {
    int16_t a;
    int16_t b;
};

our code will overwrite the other halfword.
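
One common way to avoid touching the neighbouring halfword is to merge the new value into the old word under a mask before the store-conditional. The following is a minimal sketch of that idea with a hypothetical helper name; it is not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: replace only the halfword at bit offset `shift`
 * (0 or 16) of `old_word` with `new_half`, leaving the other half intact. */
uint32_t merge_halfword(uint32_t old_word, uint16_t new_half, unsigned shift)
{
    uint32_t mask = (uint32_t)0xFFFFu << shift;
    /* Keep the bits outside the mask, take the bits inside it from new_half. */
    return (old_word & ~mask) | (((uint32_t)new_half << shift) & mask);
}
```

For instance, `merge_halfword(0x56781234u, 0x6789, 16)` gives `0x67891234`: the low halfword `0x1234` is preserved.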

Contributor

Another question, does atomicrmw need to support unaligned access?

Author

> Another question, does atomicrmw need to support unaligned access?

I don't believe so. According to the documentation, the alignment field is always present in in-memory IR, and a default alignment is supplied when the field is omitted.

However, I'm unsure how your questions tie in with this PR.

Contributor

wzssyqa commented Apr 19, 2024

Just for reference: there is

sllv    $7, $5, $10
...
srav    $7, $7, $10
seh     $7, $7

Maybe they are not needed at all. The ABI requires arguments passed in registers to be properly sign-extended.

Contributor

wzssyqa left a comment

It seems to make the result wrong.

Contributor

wzssyqa commented Apr 19, 2024

test_max_16.s.gz

Maybe this asm code is helpful.

Contributor

wzssyqa commented Apr 19, 2024

I also found that the current code does not work on big-endian.

#include <stdio.h>

short test_max_16 (short *a, short b);

struct xx {
        short a;
        short b;
} s = {0x1234, 0x5678};

int main() {
        short p = test_max_16 (&(s.b), 0x6789);
        printf("%hx, %hx\n", s.b, p);
}

Output:

5678, 5678
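
For comparison, a portable reference model of a 16-bit signed atomic max can be written with C11 atomics. This sketch assumes that `test_max_16` wraps `atomicrmw max` and returns the previous value; `fetch_max_16` is a hypothetical name, and it takes an `_Atomic short` rather than a plain `short`.

```c
#include <stdatomic.h>

/* Hypothetical reference model: atomically store max(*p, v) into *p and
 * return the value that was there before. */
short fetch_max_16(_Atomic short *p, short v)
{
    short old = atomic_load(p);
    /* Only attempt a store while v is larger; a failed compare-exchange
     * reloads `old`, so the condition is re-checked on every retry. */
    while (old < v && !atomic_compare_exchange_weak(p, &old, v)) {
    }
    return old;
}
```

Under that assumption, a correct implementation would leave 0x6789 in `s.b` and return 0x5678, so the test program would print `6789, 5678` rather than `5678, 5678`.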

Contributor

wzssyqa commented Apr 20, 2024

syq-89246.patch

@jdmitrovic-syrmia I've worked out a patch and tested it on both big-endian and little-endian.
Could you review it?

Contributor

yingopq commented Apr 22, 2024

> syq-89246.patch
>
> @jdmitrovic-syrmia I've worked out a patch and tested it on both big-endian and little-endian. Could you review it?

The slt instruction requires both operands to be sign-extended.

Contributor

wzssyqa commented Apr 22, 2024

> > syq-89246.patch
> > @jdmitrovic-syrmia I've worked out a patch and tested it on both big-endian and little-endian. Could you review it?
>
> The slt instruction requires both operands to be sign-extended.

Sure, so I sign-extend StoreVal. Incr is already sign-extended, since it is passed in a register and the ABI requires such arguments to be sign-extended.

Contributor

yingopq commented Apr 22, 2024

> > > syq-89246.patch
> > > @jdmitrovic-syrmia I've worked out a patch and tested it on both big-endian and little-endian. Could you review it?
> >
> > The slt instruction requires both operands to be sign-extended.
>
> Sure, so I sign-extend StoreVal. Incr is already sign-extended, since it is passed in a register and the ABI requires such arguments to be sign-extended.

Do you mean Incr is sign-extended after the function MachineBasicBlock *MipsTargetLowering::emitAtomicBinaryPartword?

Contributor

wzssyqa commented Apr 22, 2024

> Do you mean Incr is sign-extended after the function MachineBasicBlock *MipsTargetLowering::emitAtomicBinaryPartword?

I mean that the caller of test_max_16 (or a similar function) must guarantee that argument 1 (counting from 0, of type short) is properly sign-extended.

Contributor

wzssyqa commented Apr 22, 2024

syq-89246-v2.patch.gz

This patch fixes some problems on big-endian.

Contributor

wzssyqa commented Apr 22, 2024

test.c.gz

I tested with this C code.

Contributor

wzssyqa commented Apr 22, 2024

@jdmitrovic-syrmia, could you review #89575?

@wzssyqa wzssyqa closed this Apr 22, 2024
Development

Successfully merging this pull request may close these issues.

[MIPS] Incorrect expansion of sub-word signed atomic max
4 participants