Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Compress unrelaxable lui when RVC attribute is present #74715

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

nemanjai
Copy link
Member

@nemanjai nemanjai commented Dec 7, 2023

If the object being linked allows compressed instructions, relaxation is on and we are unable to relax the HI20 relocation, emit c.lui instead of lui if the immediate fits.

Nemanja Ivanovic added 2 commits December 7, 2023 14:15
If the object being linked allows compressed instructions,
relaxation is on and we are unable to relax the HI20
relocation, emit c.lui instead of lui if the immediate
fits.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-lld-elf

Author: Nemanja Ivanovic (nemanjai)

Changes

If the object being linked allows compressed instructions, relaxation is on and we are unable to relax the HI20 relocation, emit c.lui instead of lui if the immediate fits.


Full diff: https://github.com/llvm/llvm-project/pull/74715.diff

2 Files Affected:

  • (modified) lld/ELF/Arch/RISCV.cpp (+29)
  • (added) lld/test/ELF/riscv-clui-relax-hi20.s (+46)
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 898e3e45b9e72..ffe1ece274f66 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -650,8 +650,36 @@ static void relaxHi20Lo12(const InputSection &sec, size_t i, uint64_t loc,
   if (!gp)
     return;
 
+  bool noRelaxation = false;
   if (!isInt<12>(r.sym->getVA(r.addend) - gp->getVA()))
+    noRelaxation = true;
+
+  if (noRelaxation) {
+    if (r.type != R_RISCV_HI20 || !(config->eflags & EF_RISCV_RVC) ||
+        !isInt<18>(r.sym->getVA(r.addend)))
+      return;
+    // We have a HI20 reloc that fits in 18 bits but not in 12 bits
+    // and compressed instructions are enabled.
+    // Try to compress the lui to a c.lui.
+    const unsigned bits = config->wordsize * 8;
+    uint32_t rd = extractBits(read32le(sec.content().data() + r.offset), 11, 7);
+    uint32_t newInsn = 0x6001 | (rd << 7); // c.lui
+    int64_t imm = SignExtend64(r.sym->getVA(r.addend) + 0x800, bits) >> 12;
+
+    // The high immediate may be outside the range if the low needs to be
+    // negative. So the isInt<18> check above is not sufficient. Also,
+    // rd != 2 for c.lui.
+    if (rd == 2 || !isInt<6>(imm))
+      return;
+    if (imm == 0)
+      newInsn &= 0xDFFF;
+    uint16_t imm17 = extractBits(imm, 17, 17) << 12;
+    uint16_t imm16_12 = extractBits(imm, 16, 12) << 2;
+    remove = 2;
+    sec.relaxAux->relocTypes[i] = R_RISCV_RVC_LUI;
+    sec.relaxAux->writes.push_back(newInsn | imm17 | imm16_12);
     return;
+  }
 
   switch (r.type) {
   case R_RISCV_HI20:
@@ -830,6 +858,7 @@ void elf::riscvFinalizeRelax(int passes) {
           case R_RISCV_RELAX:
             // Used by relaxTlsLe to indicate the relocation is ignored.
             break;
+          case R_RISCV_RVC_LUI:
           case R_RISCV_RVC_JUMP:
             skip = 2;
             write16le(p, aux.writes[writesIdx++]);
diff --git a/lld/test/ELF/riscv-clui-relax-hi20.s b/lld/test/ELF/riscv-clui-relax-hi20.s
new file mode 100644
index 0000000000000..ef6284fd116b7
--- /dev/null
+++ b/lld/test/ELF/riscv-clui-relax-hi20.s
@@ -0,0 +1,46 @@
+# REQUIRES: riscv
+# RUN: rm -rf %t && split-file %s %t && cd %t
+
+# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax,+c a.s -o rv32.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax,+c a.s -o rv64.o
+
+# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv32.o lds -o rv32
+# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv64.o lds -o rv64
+# RUN: llvm-objdump -td -M no-aliases rv32 | FileCheck %s
+# RUN: llvm-objdump -td -M no-aliases rv64 | FileCheck %s
+
+# CHECK: 00002004 l       .data  {{0+}} foo
+# CHECK: 00020004 l       .far   {{0+}} bar
+# CHECK: 00001800 g       .sdata {{0+}} __global_pointer$
+
+# CHECK: <_start>:
+# CHECK: 09 65         c.lui   a0, 2
+# CHECK: 13 05 45 00   addi    a0, a0, 4
+# CHECK: 37 05 02 00   lui   a0, 32
+# CHECK: 13 05 45 00   addi    a0, a0, 4
+
+
+#--- a.s
+.global _start
+_start:
+  lui a0, %hi(foo)
+  addi a0, a0, %lo(foo)
+  lui a0, %hi(bar)
+  addi a0, a0, %lo(bar)
+
+.section .sdata,"aw"
+  .zero 32
+.section .data,"aw"
+foo:
+  .word 0
+.section .far,"aw"
+bar:
+  .word 0
+
+#--- lds
+SECTIONS {
+  .text : {*(.text) }
+  .sdata 0x1000 : { }
+  .data 0x2004 : { }
+  .far 0x20004 : { }
+}

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-lld

Author: Nemanja Ivanovic (nemanjai)

Changes

If the object being linked allows compressed instructions, relaxation is on and we are unable to relax the HI20 relocation, emit c.lui instead of lui if the immediate fits.


Full diff: https://github.com/llvm/llvm-project/pull/74715.diff

2 Files Affected:

  • (modified) lld/ELF/Arch/RISCV.cpp (+29)
  • (added) lld/test/ELF/riscv-clui-relax-hi20.s (+46)
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 898e3e45b9e72..ffe1ece274f66 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -650,8 +650,36 @@ static void relaxHi20Lo12(const InputSection &sec, size_t i, uint64_t loc,
   if (!gp)
     return;
 
+  bool noRelaxation = false;
   if (!isInt<12>(r.sym->getVA(r.addend) - gp->getVA()))
+    noRelaxation = true;
+
+  if (noRelaxation) {
+    if (r.type != R_RISCV_HI20 || !(config->eflags & EF_RISCV_RVC) ||
+        !isInt<18>(r.sym->getVA(r.addend)))
+      return;
+    // We have a HI20 reloc that fits in 18 bits but not in 12 bits
+    // and compressed instructions are enabled.
+    // Try to compress the lui to a c.lui.
+    const unsigned bits = config->wordsize * 8;
+    uint32_t rd = extractBits(read32le(sec.content().data() + r.offset), 11, 7);
+    uint32_t newInsn = 0x6001 | (rd << 7); // c.lui
+    int64_t imm = SignExtend64(r.sym->getVA(r.addend) + 0x800, bits) >> 12;
+
+    // The high immediate may be outside the range if the low needs to be
+    // negative. So the isInt<18> check above is not sufficient. Also,
+    // rd != 2 for c.lui.
+    if (rd == 2 || !isInt<6>(imm))
+      return;
+    if (imm == 0)
+      newInsn &= 0xDFFF;
+    uint16_t imm17 = extractBits(imm, 17, 17) << 12;
+    uint16_t imm16_12 = extractBits(imm, 16, 12) << 2;
+    remove = 2;
+    sec.relaxAux->relocTypes[i] = R_RISCV_RVC_LUI;
+    sec.relaxAux->writes.push_back(newInsn | imm17 | imm16_12);
     return;
+  }
 
   switch (r.type) {
   case R_RISCV_HI20:
@@ -830,6 +858,7 @@ void elf::riscvFinalizeRelax(int passes) {
           case R_RISCV_RELAX:
             // Used by relaxTlsLe to indicate the relocation is ignored.
             break;
+          case R_RISCV_RVC_LUI:
           case R_RISCV_RVC_JUMP:
             skip = 2;
             write16le(p, aux.writes[writesIdx++]);
diff --git a/lld/test/ELF/riscv-clui-relax-hi20.s b/lld/test/ELF/riscv-clui-relax-hi20.s
new file mode 100644
index 0000000000000..ef6284fd116b7
--- /dev/null
+++ b/lld/test/ELF/riscv-clui-relax-hi20.s
@@ -0,0 +1,46 @@
+# REQUIRES: riscv
+# RUN: rm -rf %t && split-file %s %t && cd %t
+
+# RUN: llvm-mc -filetype=obj -triple=riscv32-unknown-elf -mattr=+relax,+c a.s -o rv32.o
+# RUN: llvm-mc -filetype=obj -triple=riscv64-unknown-elf -mattr=+relax,+c a.s -o rv64.o
+
+# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv32.o lds -o rv32
+# RUN: ld.lld --relax-gp --undefined=__global_pointer$ rv64.o lds -o rv64
+# RUN: llvm-objdump -td -M no-aliases rv32 | FileCheck %s
+# RUN: llvm-objdump -td -M no-aliases rv64 | FileCheck %s
+
+# CHECK: 00002004 l       .data  {{0+}} foo
+# CHECK: 00020004 l       .far   {{0+}} bar
+# CHECK: 00001800 g       .sdata {{0+}} __global_pointer$
+
+# CHECK: <_start>:
+# CHECK: 09 65         c.lui   a0, 2
+# CHECK: 13 05 45 00   addi    a0, a0, 4
+# CHECK: 37 05 02 00   lui   a0, 32
+# CHECK: 13 05 45 00   addi    a0, a0, 4
+
+
+#--- a.s
+.global _start
+_start:
+  lui a0, %hi(foo)
+  addi a0, a0, %lo(foo)
+  lui a0, %hi(bar)
+  addi a0, a0, %lo(bar)
+
+.section .sdata,"aw"
+  .zero 32
+.section .data,"aw"
+foo:
+  .word 0
+.section .far,"aw"
+bar:
+  .word 0
+
+#--- lds
+SECTIONS {
+  .text : {*(.text) }
+  .sdata 0x1000 : { }
+  .data 0x2004 : { }
+  .far 0x20004 : { }
+}

Copy link

github-actions bot commented Dec 7, 2023

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

noRelaxation = true;

if (noRelaxation) {
if (r.type != R_RISCV_HI20 || !(config->eflags & EF_RISCV_RVC) ||
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second condition needs to change to something like the above updated way of determining if RVC is enabled:
!(getEFlags(sec.file) & EF_RISCV_RVC)

@@ -650,8 +650,36 @@ static void relaxHi20Lo12(const InputSection &sec, size_t i, uint64_t loc,
if (!gp)
return;

bool noRelaxation = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this variable? Can we use !isInt<12>(r.sym->getVA(r.addend) - gp->getVA()) in the one place we use the variabe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. This was added in anticipation of another reason we might decide not to relax that is added in this PR

return;
if (imm == 0)
newInsn &= 0xDFFF;
uint16_t imm17 = extractBits(imm, 17, 17) << 12;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't imm already shifted right by 12. Is this extracting the correct bit or am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. That does indeed seem wrong. It should be

uint16_t imm17 = extractBits(imm, 5, 5) << 12;
uint16_t imm16_12 = extractBits(imm, 4, 0) << 2;

Nemanja Ivanovic added 2 commits December 11, 2023 16:40
- Extract the correct bits from the immediate
- Add a test for an offset that fits in 18 bits, but the upper is not 6 bits
uint16_t imm16_12 = extractBits(imm, 4, 0) << 2;
remove = 2;
sec.relaxAux->relocTypes[i] = R_RISCV_RVC_LUI;
sec.relaxAux->writes.push_back(newInsn | imm17 | imm16_12);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know the final address of the global at this point? We're in the middle of shrinking code here, but I don't think the layout hasn't been finalized. That's why R_RISCV_LO12_I and R_RISCV_LO12_S are changed to INTERNAL_R_RISCV_GPREL_I/INTERNAL_R_RISCV_GPREL_S and the rewrite occurs in RISCV::relocate

I wonder if you only need to change the opcode to 0x6001 | (rd << 7) here and let RISCV::relocate finish the rewrite?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final instruction is indeed written in RISCV::relocate(). While we technically don't need to add the immediate here (i.e. we could just write a zero), I figured there is no harm in putting it in there - I don't know if there is any code path where this is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants