Skip to content

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Sep 19, 2025

The simm32 base case only uses lui+addiw when necessary after 3d2650b

The worst case 8 instruction sequence doesn't leave a full 32 bits for the LUI+ADDI(W) after the 3 12-bit ADDI and SLLI pairs are created. So we will never generate LUI+ADDIW in the worst case sequence.

@llvmbot
Copy link
Member

llvmbot commented Sep 19, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

The simm32 base case only uses lui+addiw when necessary after 3d2650b

The worst case 8 instruction sequence doesn't leave a full 32 bits for the LUI+ADDI(W) after the 3 12-bit ADDI and SLLI pairs are created. So we will never generate LUI+ADDIW in the worst case sequence.


Full diff: https://github.com/llvm/llvm-project/pull/159829.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp (+14-13)
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp
index 787a38ffb7be6..0cd1d9c37279d 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp
@@ -111,18 +111,18 @@ static void generateInstSeqImpl(int64_t Val, const MCSubtargetInfo &STI,
   assert(IsRV64 && "Can't emit >32-bit imm for non-RV64 target");
 
   // In the worst case, for a full 64-bit constant, a sequence of 8 instructions
-  // (i.e., LUI+ADDIW+SLLI+ADDI+SLLI+ADDI+SLLI+ADDI) has to be emitted. Note
-  // that the first two instructions (LUI+ADDIW) can contribute up to 32 bits
+  // (i.e., LUI+ADDI+SLLI+ADDI+SLLI+ADDI+SLLI+ADDI) has to be emitted. Note
+  // that the first two instructions (LUI+ADDI) can contribute up to 32 bits
   // while the following ADDI instructions contribute up to 12 bits each.
   //
   // On the first glance, implementing this seems to be possible by simply
-  // emitting the most significant 32 bits (LUI+ADDIW) followed by as many left
-  // shift (SLLI) and immediate additions (ADDI) as needed. However, due to the
-  // fact that ADDI performs a sign extended addition, doing it like that would
-  // only be possible when at most 11 bits of the ADDI instructions are used.
-  // Using all 12 bits of the ADDI instructions, like done by GAS, actually
-  // requires that the constant is processed starting with the least significant
-  // bit.
+  // emitting the most significant 32 bits (LUI+ADDI(W)) followed by as many
+  // left shift (SLLI) and immediate additions (ADDI) as needed. However, due to
+  // the fact that ADDI performs a sign extended addition, doing it like that
+  // would only be possible when at most 11 bits of the ADDI instructions are
+  // used. Using all 12 bits of the ADDI instructions, like done by GAS,
+  // actually requires that the constant is processed starting with the least
+  // significant bit.
   //
   // In the following, constants are processed from LSB to MSB but instruction
   // emission is performed from MSB to LSB by recursively calling
@@ -344,8 +344,9 @@ InstSeq generateInstSeq(int64_t Val, const MCSubtargetInfo &STI) {
 
   // Perform optimization with BSETI in the Zbs extension.
   if (Res.size() > 2 && STI.hasFeature(RISCV::FeatureStdExtZbs)) {
-    // Create a simm32 value for LUI+ADDIW by forcing the upper 33 bits to zero.
-    // Xor that with original value to get which bits should be set by BSETI.
+    // Create a simm32 value for LUI+ADDI(W) by forcing the upper 33 bits to
+    // zero. Xor that with original value to get which bits should be set by
+    // BSETI.
     uint64_t Lo = Val & 0x7fffffff;
     uint64_t Hi = Val ^ Lo;
     assert(Hi != 0);
@@ -372,8 +373,8 @@ InstSeq generateInstSeq(int64_t Val, const MCSubtargetInfo &STI) {
 
   // Perform optimization with BCLRI in the Zbs extension.
   if (Res.size() > 2 && STI.hasFeature(RISCV::FeatureStdExtZbs)) {
-    // Create a simm32 value for LUI+ADDIW by forcing the upper 33 bits to one.
-    // Xor that with original value to get which bits should be cleared by
+    // Create a simm32 value for LUI+ADDI(W) by forcing the upper 33 bits to
+    // one. Xor that with original value to get which bits should be cleared by
     // BCLRI.
     uint64_t Lo = Val | 0xffffffff80000000;
     uint64_t Hi = Val ^ Lo;

@topperc topperc merged commit f437309 into llvm:main Sep 19, 2025
7 of 9 checks passed
@topperc topperc deleted the pr/addiw branch September 19, 2025 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants