[lld] Add thunks for hexagon #111217

Open
wants to merge 2 commits into base: main

androm3da
Member

This draft is a starting point for adding thunks to lld for hexagon.

Unfortunately there's at least one known issue: mid-packet jumps are calculating the wrong offset to the thunk. I think I've misunderstood how to correctly account for the original addend and the recalculated addend.

Other potential issues:

  • the symbols generated for thunks don't match qcld aka hexagon-link. Should that be a concern at all, or can the symbols for each linker be defined independently?
  • the symbols generated for thunks aren't necessarily unique: should they be?
  • some small-ish cleanup/TODO/FIXMEs

Happy to accept any feedback while I try to work through the remaining known issues.
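
To make the mid-packet issue concrete: Hexagon PC-relative branches are resolved against the address of the first word of the packet, not the word that holds the branch, so a branch in the second or third slot needs a correction equal to its distance from the packet start. The sketch below is not the patch's code. `packetOffset` and `kWords` are hypothetical names, and the only architectural fact it relies on is the parse field in bits [15:14] of each instruction word (0b11 ends a packet, 0b00 marks a duplex, which is always last). It scans backward from the relocated word to find where the packet begins.

```cpp
#include <cstddef>
#include <cstdint>

// Parse bits occupy bits [15:14] of every 32-bit Hexagon instruction word.
constexpr uint32_t kParseMask = 3u << 14;
constexpr uint32_t kEndOfPacket = 3u << 14; // 0b11: last word of a packet
constexpr uint32_t kDuplex = 0u << 14;      // 0b00: duplex, always last

constexpr bool endsPacket(uint32_t word) {
  return (word & kParseMask) == kEndOfPacket ||
         (word & kParseMask) == kDuplex;
}

// Hypothetical helper: byte distance from the start of the enclosing packet
// to the instruction word at index idx. A packet holds at most 4 words, so
// at most 3 predecessors are inspected.
constexpr uint32_t packetOffset(const uint32_t *words, size_t idx) {
  uint32_t off = 0;
  for (int i = 0; i < 3 && idx > 0; ++i) {
    if (endsPacket(words[idx - 1]))
      break; // the previous word closes the preceding packet
    --idx;
    off += 4;
  }
  return off;
}

// Illustrative words: a one-word packet followed by a three-word packet
// { if (p0) call ...; if (!p0) call ...; r2 = add(r0,r1) }.
constexpr uint32_t kWords[] = {0x5bfffff8, 0x5ddf60f6, 0x5d204020,
                               0xf300c102};

static_assert(packetOffset(kWords, 1) == 0, "first slot starts the packet");
static_assert(packetOffset(kWords, 2) == 4, "second slot is 4 bytes in");
static_assert(packetOffset(kWords, 3) == 8, "third slot is 8 bytes in");
```

Under this reading, the branch target computed for the second call in the packet has to be taken relative to the address of the first call, which is presumably the adjustment the "original addend" encodes.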

@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Brian Cain (androm3da)

Full diff: https://github.com/llvm/llvm-project/pull/111217.diff

4 Files Affected:

  • (modified) lld/ELF/Arch/Hexagon.cpp (+38)
  • (modified) lld/ELF/Thunks.cpp (+99-1)
  • (added) lld/test/ELF/hexagon-thunks-packets.s (+122)
  • (added) lld/test/ELF/hexagon-thunks.s (+44)
diff --git a/lld/ELF/Arch/Hexagon.cpp b/lld/ELF/Arch/Hexagon.cpp
index d689fc2a152101..6d6d49c3b539d9 100644
--- a/lld/ELF/Arch/Hexagon.cpp
+++ b/lld/ELF/Arch/Hexagon.cpp
@@ -10,6 +10,7 @@
 #include "Symbols.h"
 #include "SyntheticSections.h"
 #include "Target.h"
+#include "Thunks.h"
 #include "lld/Common/ErrorHandler.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/Support/Endian.h"
@@ -30,6 +31,10 @@ class Hexagon final : public TargetInfo {
                      const uint8_t *loc) const override;
   RelType getDynRel(RelType type) const override;
   int64_t getImplicitAddend(const uint8_t *buf, RelType type) const override;
+  bool needsThunk(RelExpr expr, RelType type, const InputFile *file,
+                  uint64_t branchAddr, const Symbol &s,
+                  int64_t a) const override;
+  bool inBranchRange(RelType type, uint64_t src, uint64_t dst) const override;
   void relocate(uint8_t *loc, const Relocation &rel,
                 uint64_t val) const override;
   void writePltHeader(uint8_t *buf) const override;
@@ -57,6 +62,8 @@ Hexagon::Hexagon(Ctx &ctx) : TargetInfo(ctx) {
   tlsGotRel = R_HEX_TPREL_32;
   tlsModuleIndexRel = R_HEX_DTPMOD_32;
   tlsOffsetRel = R_HEX_DTPREL_32;
+
+  needsThunks = true;
 }
 
 uint32_t Hexagon::calcEFlags() const {
@@ -239,6 +246,37 @@ static uint32_t findMaskR16(uint32_t insn) {
 
 static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); }
 
+bool Hexagon::inBranchRange(RelType type, uint64_t src, uint64_t dst) const {
+  int64_t offset = dst - src;
+  switch (type) {
+  case llvm::ELF::R_HEX_B22_PCREL:
+  case llvm::ELF::R_HEX_PLT_B22_PCREL:
+  case llvm::ELF::R_HEX_GD_PLT_B22_PCREL:
+  case llvm::ELF::R_HEX_LD_PLT_B22_PCREL:
+    return llvm::isInt<22>(offset >> 2);
+  case llvm::ELF::R_HEX_B15_PCREL:
+    return llvm::isInt<15>(offset >> 2);
+    break;
+  case llvm::ELF::R_HEX_B13_PCREL:
+    return llvm::isInt<13>(offset >> 2);
+    break;
+  case llvm::ELF::R_HEX_B9_PCREL:
+    return llvm::isInt<9>(offset >> 2);
+  default:
+    return true;
+  }
+  llvm::report_fatal_error(StringRef(
+      "unsupported relocation type used in isInRange: " + toString(type)));
+}
+
+bool Hexagon::needsThunk(RelExpr expr, RelType type, const InputFile *file,
+                         uint64_t branchAddr, const Symbol &s,
+                         int64_t a) const {
+  // FIXME: needsThunk() should return false for !inRange() &&
+  // !enoughSpaceForThunk() ?
+  return !ctx.target->inBranchRange(type, branchAddr, s.getVA(a));
+}
+
 void Hexagon::relocate(uint8_t *loc, const Relocation &rel,
                        uint64_t val) const {
   switch (rel.type) {
diff --git a/lld/ELF/Thunks.cpp b/lld/ELF/Thunks.cpp
index ef97530679469d..c27422a5bc66b0 100644
--- a/lld/ELF/Thunks.cpp
+++ b/lld/ELF/Thunks.cpp
@@ -351,6 +351,47 @@ class AVRThunk : public Thunk {
   void addSymbols(ThunkSection &isec) override;
 };
 
+static const uint32_t MASK_END_PACKET = 3 << 14;
+static const uint32_t END_OF_PACKET = 3 << 14;
+static const uint32_t END_OF_DUPLEX = 0 << 14;
+
+static std::optional<int32_t> getRealAddend(const InputSection &isec,
+                                            const Relocation &rel) {
+  const ArrayRef<uint8_t> SectContents = isec.content();
+  if (SectContents.size() < sizeof(int32_t))
+    // FIXME: assert?  emit a diagnostic?
+    return std::nullopt;
+  int32_t offset = rel.offset;
+
+  // Search as many as 4 instructions:
+  for (int i = 0; i < 4; i++) {
+    uint32_t instWord = 0;
+    const ArrayRef<uint8_t> InstWordContents = SectContents.drop_front(offset);
+    ::memcpy(&instWord, InstWordContents.data(), sizeof(instWord));
+    if (((instWord & MASK_END_PACKET) == END_OF_PACKET) ||
+        ((instWord & MASK_END_PACKET) == END_OF_DUPLEX))
+      break;
+    offset += sizeof(instWord);
+  }
+  return offset - rel.offset;
+}
+// Hexagon CPUs need thunks for <<FIXME TBD>>
+// when their destination is out of range [0, 0x_?].
+class HexagonThunk : public Thunk {
+public:
+  HexagonThunk(Ctx &ctx, const InputSection &isec, Relocation &rel,
+               Symbol &dest)
+      : Thunk(ctx, dest, 0), RealAddend(getRealAddend(isec, rel)),
+        RelOffset(rel.offset) {
+    alignment = 4;
+  }
+  std::optional<int32_t> RealAddend;
+  int32_t RelOffset;
+  uint32_t size() override { return ctx.arg.isPic ? 12 : 8; }
+  void writeTo(uint8_t *buf) override;
+  void addSymbols(ThunkSection &isec) override;
+};
+
 // MIPS LA25 thunk
 class MipsThunk final : public Thunk {
 public:
@@ -1352,6 +1393,39 @@ bool PPC64LongBranchThunk::isCompatibleWith(const InputSection &isec,
   return rel.type == R_PPC64_REL24 || rel.type == R_PPC64_REL14;
 }
 
+// Hexagon Target Thunks
+static uint64_t getHexagonThunkDestVA(const Symbol &s, int64_t a) {
+  uint64_t v = s.isInPlt() ? s.getPltVA() : s.getVA(a);
+  return SignExtend64<32>(v); // FIXME: sign extend to 64-bit?
+}
+
+void HexagonThunk::writeTo(uint8_t *buf) {
+  uint64_t s = getHexagonThunkDestVA(destination, addend);
+  uint64_t p = getThunkTargetSym()->getVA();
+
+  if (ctx.arg.isPic) {
+    write32(buf + 0, 0x00004000); // {  immext(#0)
+    ctx.target->relocateNoSym(buf, R_HEX_B32_PCREL_X, s - p);
+    write32(buf + 4, 0x6a49c00e); //    r14 = add(pc,##0) }
+    ctx.target->relocateNoSym(buf + 4, R_HEX_6_PCREL_X, s - p);
+
+    write32(buf + 8, 0x528ec000); // {  jumpr r14 }
+  } else {
+    write32(buf + 0, 0x00004000); //  { immext
+    ctx.target->relocateNoSym(buf, R_HEX_B32_PCREL_X, s - p);
+    write32(buf + 4, 0x5800c000); //    jump <> }
+    ctx.target->relocateNoSym(buf + 4, R_HEX_B22_PCREL_X, s - p);
+  }
+}
+void HexagonThunk::addSymbols(ThunkSection &isec) {
+  Symbol *enclosing = isec.getEnclosingSymbol(RelOffset);
+  StringRef src = enclosing ? enclosing->getName() : isec.name;
+
+  addSymbol(saver().save("__trampoline_for_" + destination.getName() +
+                         "_from_" + src),
+            STT_FUNC, 0, isec);
+}
+
 Thunk::Thunk(Ctx &ctx, Symbol &d, int64_t a)
     : ctx(ctx), destination(d), addend(a), offset(0) {
   destination.thunkAccessed = true;
@@ -1513,6 +1587,27 @@ static Thunk *addThunkAVR(Ctx &ctx, RelType type, Symbol &s, int64_t a) {
   }
 }
 
+static Thunk *addThunkHexagon(Ctx &ctx, const InputSection &isec,
+                              Relocation &rel, Symbol &s) {
+  switch (rel.type) {
+  case R_HEX_B9_PCREL:
+  case R_HEX_B13_PCREL:
+  case R_HEX_B15_PCREL:
+  case R_HEX_B22_PCREL:
+  case R_HEX_PLT_B22_PCREL:
+  case R_HEX_GD_PLT_B22_PCREL:
+#if 0
+    // FIXME: we don't need this for extended rels?
+  case R_HEX_B32_PCREL_X:
+  case R_HEX_6_PCREL_X:
+  case R_HEX_B22_PCREL_X:
+#endif
+    return make<HexagonThunk>(ctx, isec, rel, s);
+  default:
+    fatal("unrecognized relocation type " + toString(rel.type));
+  }
+}
+
 static Thunk *addThunkMips(Ctx &ctx, RelType type, Symbol &s) {
   if ((s.stOther & STO_MIPS_MICROMIPS) && isMipsR6())
     return make<MicroMipsR6Thunk>(ctx, s);
@@ -1578,8 +1673,11 @@ Thunk *elf::addThunk(Ctx &ctx, const InputSection &isec, Relocation &rel) {
     return addThunkPPC32(ctx, isec, rel, s);
   case EM_PPC64:
     return addThunkPPC64(ctx, rel.type, s, a);
+  case EM_HEXAGON:
+    return addThunkHexagon(ctx, isec, rel, s);
   default:
-    llvm_unreachable("add Thunk only supported for ARM, AVR, Mips and PowerPC");
+    llvm_unreachable(
+        "add Thunk only supported for ARM, AVR, Hexagon, Mips and PowerPC");
   }
 }
 
diff --git a/lld/test/ELF/hexagon-thunks-packets.s b/lld/test/ELF/hexagon-thunks-packets.s
new file mode 100644
index 00000000000000..e50558763119ed
--- /dev/null
+++ b/lld/test/ELF/hexagon-thunks-packets.s
@@ -0,0 +1,122 @@
+# REQUIRES: hexagon
+# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-linux-musl %s -o %t.o
+# RUN: ld.lld %t.o -o %t
+# RUN: llvm-objdump -d %t 2>&1 | FileCheck --check-prefix=CHECK-NONPIC %s
+# RUN: llvm-mc -filetype=obj --position-independent \
+# RUN:         -triple=hexagon-unknown-linux-musl %s -o %t.o
+# RUN: ld.lld --pie %t.o -o %t
+# RUN: llvm-objdump -d %t 2>&1 | FileCheck --check-prefix=CHECK-PIC %s
+
+# Packets with pc-relative relocations are more interesting because
+# the offset must be relative to the start of the source, destination
+# packets and not necessarily the instruction word containing the jump/call.
+
+# CHECK:  Disassembly of section .text:
+
+# CHECK-NONPIC: 000200b4 <__trampoline_for_myfn_a_from_.text.thunk>:
+# CHECK-NONPIC: { immext(#0x800040)
+# CHECK-NONPIC:   jump 0x820118 }
+# CHECK-NONPIC: 000200bc <__trampoline_for_myfn_a_from_.text.thunk>:
+# CHECK-NONPIC: { immext(#0x800040)
+# CHECK-NONPIC:   jump 0x820118 }
+
+# CHECK-PIC:    00010150 <__trampoline_for_myfn_a_from_.text.thunk>:
+# CHECK-PIC:    { immext(#0x800040)
+# CHECK-PIC:      r14 = add(pc,##0x80006c) }
+# CHECK-PIC:    { jumpr r14 }
+
+# CHECK-NONPIC: 000200c4 <myfn_b>:
+# CHECK-NONPIC: { jumpr r31 }
+# CHECK-PIC:    00010168 <myfn_b>:
+# CHECK-PIC:    { jumpr r31 }
+    .globl myfn_b
+    .type  myfn_b, @function
+myfn_b:
+    jumpr r31
+    .size  myfn_b, .-myfn_b
+
+# CHECK-PIC:    0001016c <main>:
+    .globl main
+    .type  main, @function
+main:
+    { r0 = #0
+      call myfn_a }
+# CHECK-PIC:      { call 0x10150
+# CHECK-NONPIC:   { call 0x200b4
+# CHECK:            r0 = #0x0 }
+    call myfn_a
+# CHECK-PIC:    call 0x10150
+# CHECK-NONPIC: call 0x200b4
+    call myfn_b
+# CHECK-PIC:    call 0x10168
+# CHECK-NONPIC: call 0x200c4
+
+    { r2 = add(r0, r1)
+      if (p0) call #myfn_b
+      if (!p0) call #myfn_a }
+# CHECK-PIC:     { if (p0) call 0x10168
+# CHECK-PIC:       if (!p0) call 0x10150
+# CHECK-NONPIC:  { if (p0) call 0x200bc
+# CHECK-NONPIC:    if (!p0) call 0x200b4
+# CHECK:           r2 = add(r0,r1) }
+
+    { r2 = add(r0, r1)
+      if (p0) call #myfn_a
+      if (!p0) call #myfn_a }
+# CHECK-PIC:  { if (p0) call 0x10150
+# CHECK-PIC:    if (!p0) call 0x10150
+# CHECK-NONPIC:  { if (p0) call 0x200b4
+# CHECK-NONPIC:    if (!p0) call 0x200b4
+# CHECK:           r2 = add(r0,r1) }
+
+    { r2 = add(r0, r1)
+      r1 = r4
+      r4 = r5
+      if (r0 == #0) jump:t #myfn_a }
+# CHECK-PIC:     { if (r0==#0) jump:t 0x10150
+# CHECK-NONPIC:  { if (r0==#0) jump:t 0x200b4
+# CHECK:           r2 = add(r0,r1)
+# CHECK:           r1 = r4; r4 = r5 }
+
+    { r2 = add(r0, r1)
+      r4 = r5
+      if (r0 <= #0) jump:t #myfn_a
+      p1 = cmp.eq(r0, #0); if (p1.new) jump:nt #myfn_a }
+# CHECK-NONPIC:  { if (r0<=#0) jump:t 0x200b4
+# CHECK-NONPIC:    p1 = cmp.eq(r0,#0x0); if (p1.new) jump:nt 0x200b4
+# CHECK-PIC:     { if (r0<=#0) jump:t 0x10150
+# CHECK-PIC:       p1 = cmp.eq(r0,#0x0); if (p1.new) jump:nt 0x10150
+# CHECK:           r2 = add(r0,r1)
+# CHECK:           r4 = r5 }
+
+    {r0 = #0; jump #myfn_a}
+# CHECK-PIC:    { r0 = #0x0 ; jump 0x10150 }
+# CHECK-NONPIC: { r0 = #0x0 ; jump 0x200b4 }
+    {r0 = #0; jump #myfn_b}
+# CHECK-PIC:    { r0 = #0x0 ; jump 0x10168 }
+# CHECK-NONPIC: { r0 = #0x0 ; jump 0x200c4 }
+    jumpr r31
+    .size   main, .-main
+
+    .section .text.foo
+    .skip 0x800000
+
+    .globl myfn_a
+    .type  myfn_a, @function
+myfn_a:
+    {r0 = #0; jump #myfn_b}
+    jumpr r31
+    .size  myfn_a, .-myfn_a
+
+# CHECK-NONPIC: 00820118 <myfn_a>:
+# CHECK-NONPIC: { r0 = #0x0 ; jump 0x820120 }
+# CHECK-NONPIC: { jumpr r31 }
+
+# CHECK-NONPIC: 00820120 <__trampoline_for_myfn_b_from_.text.thunk>:
+# CHECK-NONPIC: { immext(#0xff7fff80)
+# CHECK-NONPIC:   jump 0x200c4 }
+
+# CHECK-PIC:    008101c4 <__trampoline_for_myfn_b_from_.text.thunk>:
+# CHECK-PIC:    { immext(#0xff7fff80)
+# CHECK-PIC:      r14 = add(pc,##0xff7fffa4) } // fixme??
+# CHECK-PIC:    { jumpr r14 }
diff --git a/lld/test/ELF/hexagon-thunks.s b/lld/test/ELF/hexagon-thunks.s
new file mode 100644
index 00000000000000..1c5ba86842d8c1
--- /dev/null
+++ b/lld/test/ELF/hexagon-thunks.s
@@ -0,0 +1,44 @@
+# REQUIRES: hexagon
+# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o
+# RUN: ld.lld %t.o -o %t
+# RUN: llvm-objdump -d %t 2>&1 | FileCheck --check-prefix=CHECK-NONPIC %s
+# RUN: llvm-mc -filetype=obj --position-independent \
+# RUN:         -triple=hexagon-unknown-elf %s -o %t.o
+
+# RUN: ld.lld --pie %t.o -o %t
+# RUN: llvm-objdump -d %t 2>&1 | FileCheck --check-prefix=CHECK-PIC %s
+
+    .globl main
+    .type  main, @function
+main:
+    call myfn
+    jumpr r31
+    .size   main, .-main
+
+    .org 0x800000
+
+    .globl myfn
+    .type  myfn, @function
+myfn:
+    jumpr r31
+    .size  myfn, .-myfn
+
+# CHECK:  Disassembly of section .text:
+
+# CHECK-NONPIC:  000200b4 <__trampoline_for_myfn_from_.text.thunk>:
+# CHECK-NONPIC:  { immext(#0x800000)
+# CHECK-NONPIC:    jump 0x8200bc }
+# CHECK-PIC:     00010150 <__trampoline_for_myfn_from_.text.thunk>:
+# CHECK-PIC:     { immext(#0x800000)
+# CHECK-PIC:       r14 = add(pc,##0x80000c) }
+# CHECK-PIC:     { jumpr r14 }
+
+# CHECK-NONPIC:  000200bc <main>:
+# CHECK-NONPIC:    call 0x200b4
+# CHECK-PIC:     0001015c <main>:
+# CHECK-PIC:       call 0x10150
+# CHECK:           jumpr r31
+
+# CHECK-NONPIC:  008200bc <myfn>:
+# CHECK-PIC:     0081015c <myfn>:
+# CHECK:           jumpr r31

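The range test in `inBranchRange` above can be tried in isolation. The following is a minimal standalone sketch, not lld code: `fitsSigned` is a stand-in for `llvm::isInt<N>`, and `inRange` is a hypothetical name. It shows why the `.org 0x800000` in hexagon-thunks.s is just enough to force a thunk for a `R_HEX_B22_PCREL` call: the field holds a signed 22-bit word offset, so the reachable byte range is [-0x800000, 0x7fffff].

```cpp
#include <cstdint>

// Stand-in for llvm::isInt<N>: true if x fits in an N-bit signed integer.
template <unsigned N> constexpr bool fitsSigned(int64_t x) {
  return x >= -(int64_t{1} << (N - 1)) && x < (int64_t{1} << (N - 1));
}

// Same shape as the check in the patch for a B<N>_PCREL relocation: the
// encoded field holds the word offset, i.e. the byte offset shifted right
// by 2. The shift of a negative value is assumed to be arithmetic, which
// holds on mainstream compilers and is guaranteed from C++20 on.
template <unsigned N> constexpr bool inRange(uint64_t src, uint64_t dst) {
  int64_t offset = static_cast<int64_t>(dst - src);
  return fitsSigned<N>(offset >> 2);
}

// Forward limit for B22: the last reachable word is at +0x7ffffc.
static_assert(inRange<22>(0, 0x7ffffc), "just inside the forward limit");
static_assert(!inRange<22>(0, 0x800000), "8 MiB forward needs a thunk");
// Backward limit for B22: -0x800000 is still encodable.
static_assert(inRange<22>(0x800000, 0), "8 MiB backward is in range");
static_assert(!inRange<22>(0x800004, 0), "one word further is not");
// Addresses from the hexagon-thunks.s expectations: main calls myfn.
static_assert(!inRange<22>(0x200bc, 0x8200bc), "main -> myfn needs a thunk");
```

The narrower relocations follow the same pattern: B15 reaches [-0x10000, 0xffff], B13 reaches [-0x4000, 0x3fff], and B9 reaches [-0x400, 0x3ff] bytes.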
@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-backend-hexagon

@androm3da
Member Author

For reference, here's how the Hexagon SDK's hexagon-link handles this interesting test case:

$ /local/mnt/workspace/Qualcomm/Hexagon_SDK/5.4.1.1/tools/HEXAGON_Tools/8.7.06/Tools/bin/hexagon-llvm-mc \
    -filetype=obj -triple=hexagon-unknown-linux-musl \
    ../llvm-project/lld/test/ELF/hexagon-thunks-packets.s -o hexagon-thunks-packets.s.tmp.o && \
  /local/mnt/workspace/Qualcomm/Hexagon_SDK/5.4.1.1/tools/HEXAGON_Tools/8.7.06/Tools/bin/hexagon-link \
    -o hexagon-thunks-packets.s.tmp hexagon-thunks-packets.s.tmp.o && \
  /local/mnt/workspace/Qualcomm/Hexagon_SDK/5.4.1.1/tools/HEXAGON_Tools/8.7.06/Tools/bin/hexagon-llvm-objdump \
    -d --print-imm-hex hexagon-thunks-packets.s.tmp

hexagon-thunks-packets.s.tmp:	file format elf32-hexagon

Disassembly of section .text:

00000000 <myfn_b>:
       0:	00 c0 9f 52	529fc000 { 	jumpr r31 } 

00000004 <main>:
       4:	28 40 00 5a	5a004028 { 	call 0x54 <trampoline_1_for_myfn_a>
       8:	00 c0 00 78	7800c000   	r0 = #0x0 } 
       c:	24 c0 00 5a	5a00c024 { 	call 0x54 <trampoline_1_for_myfn_a> } 
      10:	f8 ff ff 5b	5bfffff8 { 	call 0x0 <myfn_b> } 
      14:	f6 60 df 5d	5ddf60f6 { 	if (p0) call 0x0 <myfn_b>
      18:	20 40 20 5d	5d204020   	if (!p0) call 0x54 <trampoline_1_for_myfn_a>
      1c:	02 c1 00 f3	f300c102   	r2 = add(r0,r1) } 
      20:	1a 40 00 5d	5d00401a { 	if (p0) call 0x54 <trampoline_1_for_myfn_a>
      24:	1a 40 20 5d	5d20401a   	if (!p0) call 0x54 <trampoline_1_for_myfn_a>
      28:	02 c1 00 f3	f300c102   	r2 = add(r0,r1) } 
      2c:	14 50 80 61	61805014 { 	if (r0==#0) jump:t 0x54
      30:	02 41 00 f3	f3004102   	r2 = add(r0,r1)
      34:	54 30 41 30	30413054   	r1 = r4; 	r4 = r5 } 
      38:	0e 50 c0 61	61c0500e { 	if (r0<=#0) jump:t 0x54
      3c:	0e 40 00 12	1200400e   	p1 = cmp.eq(r0,#0x0); if (p1.new) jump:nt 0x54 <trampoline_1_for_myfn_a>
      40:	02 41 00 f3	f3004102   	r2 = add(r0,r1)
      44:	04 c0 65 70	7065c004   	r4 = r5 } 
      48:	06 c0 00 16	1600c006 { 	r0 = #0x0 ; jump 0x54 <trampoline_1_for_myfn_a> } 
      4c:	da c0 30 16	1630c0da { 	r0 = #0x0 ; jump 0x0 <myfn_b> } 
      50:	00 c0 9f 52	529fc000 { 	jumpr r31 } 

00000054 <trampoline_1_for_myfn_a>:
      54:	00 40 08 00	00084000 { 	immext(#0x800000)
      58:	10 c0 00 58	5800c010   	jump 0x80005c <myfn_a> } 
		...

0080005c <myfn_a>:
  80005c:	04 c0 00 16	1600c004 { 	r0 = #0x0 ; jump 0x800064 <trampoline_2_for_myfn_b> } 
  800060:	00 c0 9f 52	529fc000 { 	jumpr r31 } 

00800064 <trampoline_2_for_myfn_b>:
  800064:	fe 7f f7 0f	0ff77ffe { 	immext(#0xff7fff80)
  800068:	38 c0 00 58	5800c038   	jump 0x0 <myfn_b> } 

@androm3da
Member Author

@quic-akaryaki has fixed the remaining test failures with this patch. Thanks Alexey!

I'll audit it later this week and probably mark it ready for review then.

Co-authored-by: Alexey Karyakin <akaryaki@quicinc.com>
@androm3da androm3da marked this pull request as ready for review November 8, 2024 02:54
@androm3da androm3da changed the title draft: [lld] Add thunks for hexagon [lld] Add thunks for hexagon Nov 8, 2024
default:
return true;
}
llvm::report_fatal_error(StringRef(
Member

This is not right. needsThunk should reject non-branch relocations, and unknown relocations should be llvm_unreachable here.

Member Author

This is inconsistent, yes. Maybe this should be unreachable instead? The default case above should handle the unknown relocations and there should be no way to execute this code.
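
To make the shape under discussion concrete, here is a rough sketch. It is not the code in this patch. The enum and function names are hypothetical stand-ins, and the bit widths assume the usual Hexagon encoding in which a B<N>_PCREL relocation holds an N-bit signed word offset (so the byte range is the word range scaled by 4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the relocation types; not the lld definitions.
enum class Rel { B22_PCREL, B15_PCREL, B13_PCREL, B9_PCREL, Abs32 };

// Number of signed offset bits for a branch relocation, or 0 for a
// non-branch relocation.
inline unsigned branchBits(Rel r) {
  switch (r) {
  case Rel::B22_PCREL: return 22;
  case Rel::B15_PCREL: return 15;
  case Rel::B13_PCREL: return 13;
  case Rel::B9_PCREL:  return 9;
  default:             return 0; // not a branch
  }
}

// True when a branch from packet address `src` cannot reach `dst` directly.
inline bool needsThunkSketch(Rel r, uint64_t src, uint64_t dst) {
  unsigned bits = branchBits(r);
  if (bits == 0)
    return false; // never create thunks for non-branch relocations
  int64_t off = static_cast<int64_t>(dst) - static_cast<int64_t>(src);
  int64_t limit = int64_t{1} << (bits + 1); // word offset scaled by 4
  return off < -limit || off > limit - 4;
}
```

Under these assumptions, the call at 0x4 to myfn_a at 0x80005c is just out of B22 range (offset 0x800058 against a maximum of 0x7ffffc), while the call at 0x10 to myfn_b at 0x0 is not.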

}
}
if (ctx.arg.emachine == EM_HEXAGON) {
return -getHexagonPacketOffset(isec, rel);
Member

Member Author

fixed, thank you.
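
For readers following the packet-offset adjustment in this hunk: Hexagon PC-relative branches are relative to the start of the enclosing packet, so a branch that is not the first word of its packet needs its distance into the packet accounted for. Below is a rough sketch of how that distance can be derived from the parse bits. The function names are hypothetical, and the real getHexagonPacketOffset may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch; the real getHexagonPacketOffset may differ.

// Parse bits live in bits 15:14 of every instruction word. 0b11 marks the
// last word of a packet and 0b00 marks a duplex, which also ends a packet.
inline bool endsPacket(uint32_t word) {
  uint32_t parse = (word >> 14) & 0x3;
  return parse == 0x3 || parse == 0x0;
}

// Byte distance from the start of the enclosing packet to words[index].
inline uint32_t packetOffsetSketch(const uint32_t *words, size_t index) {
  size_t start = index;
  // Walk back until the preceding word closes the previous packet.
  while (start > 0 && !endsPacket(words[start - 1]))
    --start;
  return static_cast<uint32_t>((index - start) * 4);
}
```

Applied to the hexagon-link dump earlier in this thread, the conditional call at 0x18 sits 4 bytes into the packet that starts at 0x14, which is consistent with its target being computed from 0x14 rather than from 0x18.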

: Thunk(ctx, dest, 0), RelOffset(rel.offset) {
alignment = 4;
}
uint32_t RelOffset;
Member

lowercase

Member Author

fixed.

Symbol *enclosing = isec.getEnclosingSymbol(RelOffset);
StringRef src = enclosing ? enclosing->getName() : isec.name;

addSymbol(saver().save("__trampoline_for_" + destination.getName() +
Member

why not a hexagon specific name?

Member Author

I thought that it might make sense to preserve the symbol name generated by hexagon-link - but I don't think anything should depend on it? In which case, I'm happy to change it.

How about __hexagon_trampoline_for_ ... instead?

case R_HEX_GD_PLT_B22_PCREL:
return make<HexagonThunk>(ctx, isec, rel, s);
default:
fatal("unrecognized relocation type " + toString(rel.type));
Member

llvm_unreachable

ensure this is indeed unreachable

Member Author

This has been changed to follow the pattern used by other architectures; hopefully that's satisfactory.

ensure this is indeed unreachable

Sorry, I'm not sure I understand what would be sufficient. Ensure it by reviewing the set of relocations for which we could add a thunk? Or by increasing test coverage here? Or ... ?

# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o
# RUN: ld.lld %t.o -o %t
# RUN: llvm-objdump -d %t 2>&1 | FileCheck --check-prefix=CHECK-NONPIC %s
# RUN: llvm-mc -filetype=obj --position-independent \
Member

Delete --position-independent; it's only for MIPS.

Member Author

fixed, thanks.
