[PowerPC] Support local-dynamic TLS relocation on AIX #66316

Merged
merged 31 commits into from
Mar 1, 2024

Conversation

orcguru
Contributor

@orcguru orcguru commented Sep 14, 2023

Adds support for the TLS local-dynamic model on AIX by generating the following code sequence:

.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
.tc mh[TC],mh[TC]@ml # Module handle for the caller
lwz 3,mh[TC](2) $$ For 64-bit: ld 3,mh[TC](2)
bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
#r3 = &TLS for module
lwz 4,foo[TC](2) $$ For 64-bit: ld 4,foo[TC](2)
add 5,3,4 # Compute &foo
.rename mh[TC], "_$TLSML" # Symbol for the module handle must have the name "_$TLSML"
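
For context, here is a minimal source-level sketch of code that would request this access model. It assumes the `tls_model("local-dynamic")` variable attribute supported by GCC and Clang; the names `foo`, `bar` and `bumpBoth` are made up for illustration, and the instructions actually emitted depend on the target and code model.

```cpp
// Hypothetical example; foo, bar and bumpBoth are illustrative names only.
// Both variables are defined in the same module, so under the local-dynamic
// model a single module-handle lookup (.__tls_get_mod on AIX) can serve both
// accesses; each variable is then reached by adding its own offset.
static thread_local int foo __attribute__((tls_model("local-dynamic"))) = 1;
static thread_local int bar __attribute__((tls_model("local-dynamic"))) = 2;

int bumpBoth() {
  foo += 10; // &foo = module TLS base + offset of foo
  bar += 20; // &bar = module TLS base + offset of bar
  return foo + bar;
}
```

Each thread sees its own copies of `foo` and `bar`, so the returned sum depends only on how often the calling thread has invoked `bumpBoth()`.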

@llvmbot
Collaborator

llvmbot commented Sep 14, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-mc

Changes: To enable TLS local-dynamic on AIX, this patch needs to generate the following code sequence, as described in the docs:

.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
.tc mh[TC],mh[TC]@ml # Module handle for the caller
lwz 3,mh[TC](2) $$ For 64-bit: ld 3,mh[TC](2)
bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
#r3 = &TLS for module
lwz 4,foo[TC](2) $$ For 64-bit: ld 4,foo[TC](2)
add 5,3,4 # Compute &foo
.rename mh[TC], "_$TLSML" # Symbol for the module handle must have the name "_$TLSML"

Patch migrated from https://reviews.llvm.org/D157673

Patch is 134.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66316.diff

16 Files Affected:

  • (modified) llvm/include/llvm/MC/MCExpr.h (+2)
  • (modified) llvm/lib/MC/MCExpr.cpp (+4)
  • (modified) llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp (+6-3)
  • (modified) llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp (+4)
  • (modified) llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp (+71-19)
  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+23-7)
  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.h (+13-1)
  • (modified) llvm/lib/Target/PowerPC/PPCInstr64Bit.td (+12-1)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.td (+14-1)
  • (modified) llvm/lib/Target/PowerPC/PPCTLSDynamicCall.cpp (+61-12)
  • (modified) llvm/test/CodeGen/PowerPC/aix-tls-gd-double.ll (+53-37)
  • (modified) llvm/test/CodeGen/PowerPC/aix-tls-gd-int.ll (+56-40)
  • (modified) llvm/test/CodeGen/PowerPC/aix-tls-gd-longlong.ll (+157-141)
  • (added) llvm/test/CodeGen/PowerPC/aix-tls-local-dynamic.ll (+364)
  • (modified) llvm/test/CodeGen/PowerPC/aix-tls-xcoff-reloc-large.ll (+154-132)
  • (modified) llvm/test/CodeGen/PowerPC/aix-tls-xcoff-reloc.ll (+155-133)

diff --git a/llvm/include/llvm/MC/MCExpr.h b/llvm/include/llvm/MC/MCExpr.h
index 67836292874f5f7..f3bc0491fd2f193 100644
--- a/llvm/include/llvm/MC/MCExpr.h
+++ b/llvm/include/llvm/MC/MCExpr.h
@@ -301,6 +301,8 @@ class MCSymbolRefExpr : public MCExpr {
     VK_PPC_AIX_TLSGDM, // symbol@m
     VK_PPC_AIX_TLSIE, // symbol@ie
     VK_PPC_AIX_TLSLE, // symbol@le
+    VK_PPC_AIX_TLSLD, // symbol@ld
+    VK_PPC_AIX_TLSML, // symbol@ml
     VK_PPC_GOT_TLSLD, // symbol@got@tlsld
     VK_PPC_GOT_TLSLD_LO, // symbol@got@tlsld@l
     VK_PPC_GOT_TLSLD_HI, // symbol@got@tlsld@h
diff --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index 73e6569f96e4630..bc1bb9b80630546 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -331,6 +331,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) {
     return "ie";
   case VK_PPC_AIX_TLSLE:
     return "le";
+  case VK_PPC_AIX_TLSLD:
+    return "ld";
+  case VK_PPC_AIX_TLSML:
+    return "ml";
   case VK_PPC_GOT_TLSLD: return "got@tlsld";
   case VK_PPC_GOT_TLSLD_LO: return "got@tlsld@l";
   case VK_PPC_GOT_TLSLD_HI: return "got@tlsld@h";
diff --git a/llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp b/llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp
index a804dd823daa4da..22cd2fc03ef7c39 100644
--- a/llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp
+++ b/llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp
@@ -231,12 +231,15 @@ class PPCTargetAsmStreamer : public PPCTargetStreamer {
       MCSymbolXCOFF *TCSym =
           cast<MCSectionXCOFF>(Streamer.getCurrentSectionOnly())
               ->getQualNameSymbol();
-      // On AIX, we have a region handle (symbol@m) and the variable offset
-      // (symbol@{gd|ie|le}) for TLS variables, depending on the TLS model.
+      // On AIX, we have a region handle (symbol@m), module handle
+      // (__TLSML[TC]@ml) and the variable offset (symbol@{gd|ie|le|ld}) for TLS
+      // variables, depending on the TLS model.
       if (Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGD ||
           Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGDM ||
           Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSIE ||
-          Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLE)
+          Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLE ||
+          Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLD ||
+          Kind == MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSML)
         OS << "\t.tc " << TCSym->getName() << "," << XSym->getName() << "@"
            << MCSymbolRefExpr::getVariantKindName(Kind) << '\n';
       else
diff --git a/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp b/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
index 065daf42fe6eb0c..f4998e9b9dcba86 100644
--- a/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
+++ b/llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
@@ -116,6 +116,10 @@ std::pair<uint8_t, uint8_t> PPCXCOFFObjectWriter::getRelocTypeAndSignSize(
       return {XCOFF::RelocationType::R_TLS_IE, SignAndSizeForFKData};
     case MCSymbolRefExpr::VK_PPC_AIX_TLSLE:
       return {XCOFF::RelocationType::R_TLS_LE, SignAndSizeForFKData};
+    case MCSymbolRefExpr::VK_PPC_AIX_TLSLD:
+      return {XCOFF::RelocationType::R_TLS_LD, SignAndSizeForFKData};
+    case MCSymbolRefExpr::VK_PPC_AIX_TLSML:
+      return {XCOFF::RelocationType::R_TLSML, SignAndSizeForFKData};
     case MCSymbolRefExpr::VK_None:
       return {XCOFF::RelocationType::R_POS, SignAndSizeForFKData};
     }
diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
index 4b97e3e1a09152f..1c34dc3fefceaba 100644
--- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -613,12 +613,23 @@ void PPCAsmPrinter::LowerPATCHPOINT(StackMaps &SM, const MachineInstr &MI) {
     EmitToStreamer(*OutStreamer, MCInstBuilder(PPC::NOP));
 }

-/// This helper function creates the TlsGetAddr MCSymbol for AIX. We will
-/// create the csect and use the qual-name symbol instead of creating just the
-/// external symbol.
+/// This helper function creates the TlsGetAddr/TlsGetMod MCSymbol for AIX. We
+/// will create the csect and use the qual-name symbol instead of creating just
+/// the external symbol.
 static MCSymbol *createMCSymbolForTlsGetAddr(MCContext &Ctx, unsigned MIOpc) {
-  StringRef SymName =
-      MIOpc == PPC::GETtlsTpointer32AIX ? ".__get_tpointer" : ".__tls_get_addr";
+  StringRef SymName;
+  switch (MIOpc) {
+  default:
+    SymName = ".__tls_get_addr";
+    break;
+  case PPC::GETtlsTpointer32AIX:
+    SymName = ".__get_tpointer";
+    break;
+  case PPC::GETtlsMOD32AIX:
+  case PPC::GETtlsMOD64AIX:
+    SymName = ".__tls_get_mod";
+    break;
+  }
   return Ctx
       .getXCOFFSection(SymName, SectionKind::getText(),
                        XCOFF::CsectProperties(XCOFF::XMC_PR, XCOFF::XTY_ER))
@@ -660,14 +671,15 @@ void PPCAsmPrinter::EmitTlsCall(const MachineInstr *MI,
          "GETtls[ld]ADDR[32] must read GPR3");

   if (Subtarget->isAIXABI()) {
-    // On AIX, the variable offset should already be in R4 and the region handle
-    // should already be in R3.
-    // For TLSGD, which currently is the only supported access model, we only
-    // need to generate an absolute branch to .__tls_get_addr.
+    // On AIX, for TLSGD the variable offset should already be in R4 and the
+    // region handle should already be in R3, need to generate an absolute
+    // branch to .__tls_get_addr. For TLSLD the module handle should already be
+    // in R3, need to generate branch to .__tls_get_mod.
     Register VarOffsetReg = Subtarget->isPPC64() ? PPC::X4 : PPC::R4;
     (void)VarOffsetReg;
-    assert(MI->getOperand(2).isReg() &&
-           MI->getOperand(2).getReg() == VarOffsetReg &&
+    assert((MI->getNumExplicitOperands() < 3 ||
+            (MI->getOperand(2).isReg() &&
+             MI->getOperand(2).getReg() == VarOffsetReg)) &&
            "GETtls[ld]ADDR[32] must read GPR4");
     EmitAIXTlsCallHelper(MI);
     return;
@@ -710,6 +722,8 @@ static MCSymbol *getMCSymbolForTOCPseudoMO(const MachineOperand &MO,
     return AP.GetJTISymbol(MO.getIndex());
   case MachineOperand::MO_BlockAddress:
     return AP.GetBlockAddressSymbol(MO.getBlockAddress());
+  case MachineOperand::MO_ExternalSymbol:
+    return AP.OutContext.getOrCreateSymbol(MO.getSymbolName());
   default:
     llvm_unreachable("Unexpected operand type to get symbol.");
   }
@@ -757,6 +771,16 @@ getTOCEntryTypeForMO(const MachineOperand &MO) {
     llvm_unreachable("Unexpected operand type to get TOC type.");
   }
 }
+
+// On AIX, TLS-local-dynamic requires that symbol for the module handle must
+// have the name "_$TLSML". This symbol is used as one TOC symbol reference
+// itself with ML relocation type, thus it has "[TC]" attached to its name.
+static inline bool isSpecialAIXSymbolTLSML(const MachineOperand &MO,
+                                           const bool IsAIX) {
+  return IsAIX && MO.isSymbol() &&
+         (std::strcmp(MO.getSymbolName(), "_$TLSML[TC]") == 0);
+}
+
 /// EmitInstruction -- Print out a single PowerPC MI in Darwin syntax to
 /// the current output stream.
 ///
@@ -846,6 +870,15 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
       return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGDM;
     if (MO.getTargetFlags() & PPCII::MO_TLSGD_FLAG)
       return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSGD;
+    if (MO.getTargetFlags() & PPCII::MO_TLSLD_FLAG) {
+      if (isSpecialAIXSymbolTLSML(MO, IsAIX))
+        // FIXME: On AIX the ML relocation type is only valid for a reference to
+        // a TOC symbol from the symbol itself, and right now its only user is
+        // symbol "_$TLSML". Use symbol name to decide that R_TLSML is expected.
+        return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSML;
+      if (IsAIX)
+        return MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSLD;
+    }
     return MCSymbolRefExpr::VariantKind::VK_None;
   };

@@ -964,7 +997,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::LWZ);

     const MachineOperand &MO = MI->getOperand(1);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand for LWZtoc.");

     // Map the operand to its corresponding MCSymbol.
@@ -1053,7 +1087,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::LD);

     const MachineOperand &MO = MI->getOperand(1);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand!");

     // Map the operand to its corresponding MCSymbol.
@@ -1091,7 +1126,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::ADDIS);

     const MachineOperand &MO = MI->getOperand(2);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand for ADDIStocHA.");

     // Map the machine operand to its corresponding MCSymbol.
@@ -1124,7 +1160,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::LWZ);

     const MachineOperand &MO = MI->getOperand(1);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand for LWZtocL.");

     // Map the machine operand to its corresponding MCSymbol.
@@ -1156,7 +1193,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::ADDIS8);

     const MachineOperand &MO = MI->getOperand(2);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand for ADDIStocHA8!");

     const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this);
@@ -1166,7 +1204,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     const bool GlobalToc =
         MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal());
     if (GlobalToc || MO.isJTI() || MO.isBlockAddress() ||
-        (MO.isCPI() && TM.getCodeModel() == CodeModel::Large))
+        (MO.isCPI() && TM.getCodeModel() == CodeModel::Large) ||
+        isSpecialAIXSymbolTLSML(MO, IsAIX))
       MOSymbol = lookUpOrCreateTOCEntry(MOSymbol, getTOCEntryTypeForMO(MO), VK);

     VK = IsAIX ? MCSymbolRefExpr::VK_PPC_U : MCSymbolRefExpr::VK_PPC_TOC_HA;
@@ -1195,8 +1234,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     TmpInst.setOpcode(PPC::LD);

     const MachineOperand &MO = MI->getOperand(1);
-    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() ||
-            MO.isBlockAddress()) &&
+    assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() ||
+            isSpecialAIXSymbolTLSML(MO, IsAIX)) &&
            "Invalid operand for LDtocL!");

     LLVM_DEBUG(assert(
@@ -1362,6 +1401,8 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
   case PPC::GETtlsADDRPCREL:
   case PPC::GETtlsADDR32AIX:
   case PPC::GETtlsADDR64AIX:
+  case PPC::GETtlsMOD32AIX:
+  case PPC::GETtlsMOD64AIX:
     // Transform: %r3 = GETtlsADDRNNAIX %r3, %r4 (for NN == 32/64).
     // Into: BLA .__tls_get_addr()
     // Unlike on Linux, there is no symbol or relocation needed for this call.
@@ -2710,6 +2751,15 @@ void PPCAIXAsmPrinter::emitEndOfAsmFile(Module &M) {
       MCSymbol *S = OutContext.getOrCreateSymbol(Name);
       TCEntry = cast<MCSectionXCOFF>(
           getObjFileLowering().getSectionForTOCEntry(S, TM));
+    } else if (I.first.second ==
+               MCSymbolRefExpr::VariantKind::VK_PPC_AIX_TLSML) {
+      // AIX assembler expects TC storage-mapping class for the "_$TLSML"
+      // symbol.
+      MCSection *MCSect = getObjFileLowering().getContext().getXCOFFSection(
+          cast<MCSymbolXCOFF>(I.first.first)->getSymbolTableName(),
+          SectionKind::getData(),
+          XCOFF::CsectProperties(XCOFF::XMC_TC, XCOFF::XTY_SD));
+      TCEntry = cast<MCSectionXCOFF>(MCSect);
     } else {
       TCEntry = cast<MCSectionXCOFF>(
           getObjFileLowering().getSectionForTOCEntry(I.first.first, TM));
@@ -2826,6 +2876,8 @@ void PPCAIXAsmPrinter::emitInstruction(const MachineInstr *MI) {
                    MMI->hasDebugInfo());
     break;
   }
+  case PPC::GETtlsMOD32AIX:
+  case PPC::GETtlsMOD64AIX:
   case PPC::GETtlsTpointer32AIX:
   case PPC::GETtlsADDR64AIX:
   case PPC::GETtlsADDR32AIX: {
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 95f2243178c8a10..9297e84cedd7375 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -1771,9 +1771,13 @@ const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
   case PPCISD::ADDIS_TLSGD_HA: return "PPCISD::ADDIS_TLSGD_HA";
   case PPCISD::ADDI_TLSGD_L: return "PPCISD::ADDI_TLSGD_L";
   case PPCISD::GET_TLS_ADDR: return "PPCISD::GET_TLS_ADDR";
+  case PPCISD::GET_TLS_MOD:
+    return "PPCISD::GET_TLS_MOD";
   case PPCISD::GET_TPOINTER: return "PPCISD::GET_TPOINTER";
   case PPCISD::ADDI_TLSGD_L_ADDR: return "PPCISD::ADDI_TLSGD_L_ADDR";
   case PPCISD::TLSGD_AIX: return "PPCISD::TLSGD_AIX";
+  case PPCISD::TLSLD_AIX:
+    return "PPCISD::TLSLD_AIX";
   case PPCISD::ADDIS_TLSLD_HA: return "PPCISD::ADDIS_TLSLD_HA";
   case PPCISD::ADDI_TLSLD_L: return "PPCISD::ADDI_TLSLD_L";
   case PPCISD::GET_TLSLD_ADDR: return "PPCISD::GET_TLSLD_ADDR";
@@ -3412,13 +3416,25 @@ SDValue PPCTargetLowering::LowerGlobalTLSAddressAIX(SDValue Op,
     return DAG.getNode(PPCISD::ADD_TLS, dl, PtrVT, TLSReg, VariableOffset);
   }

-  // Only Local-Exec, Initial-Exec and General-Dynamic TLS models are currently
-  // supported models. If Local- or Initial-exec are not possible or specified,
-  // all GlobalTLSAddress nodes are lowered using the general-dynamic model.
-  // We need to generate two TOC entries, one for the variable offset, one for
-  // the region handle. The global address for the TOC entry of the region
-  // handle is created with the MO_TLSGDM_FLAG flag and the global address
-  // for the TOC entry of the variable offset is created with MO_TLSGD_FLAG.
+  if (Model == TLSModel::LocalDynamic) {
+    // For local-dynamic on AIX, we need to generate two TOC entries, one for
+    // the variable offset, the other for the module handle. The module handle
+    // is encapsulated inside the TLSLD_AIX pseudo node, and will be expanded by
+    // PPCTLSDynamicCall.
+    SDValue VariableOffsetTGA =
+        DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSLD_FLAG);
+    SDValue VariableOffset = getTOCEntry(DAG, dl, VariableOffsetTGA);
+    return DAG.getNode(PPCISD::TLSLD_AIX, dl, PtrVT, VariableOffset);
+  }
+
+  // The Local-Exec, Initial-Exec, Local-Dynamic, and General-Dynamic TLS models
+  // are currently supported access models. If Local- or Initial-exec or
+  // local-dynamic is not possible or specified, all GlobalTLSAddress nodes are
+  // lowered using the general-dynamic model. We need to generate two TOC
+  // entries, one for the variable offset, one for the region handle. The global
+  // address for the TOC entry of the region handle is created with the
+  // MO_TLSGDM_FLAG flag and the global address for the TOC entry of the
+  // variable offset is created with MO_TLSGD_FLAG.
   SDValue VariableOffsetTGA =
       DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSGD_FLAG);
   SDValue RegionHandleTGA =
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.h b/llvm/lib/Target/PowerPC/PPCISelLowering.h
index 7c62e370f1536a4..cffe5c73fcd6d01 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.h
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.h
@@ -370,11 +370,23 @@ namespace llvm {
     /// G8RC = TLSGD_AIX, TOC_ENTRY, TOC_ENTRY
     /// Op that combines two register copies of TOC entries
     /// (region handle into R3 and variable offset into R4) followed by a
-    /// GET_TLS_ADDR node which will be expanded to a call to __get_tls_addr.
+    /// GET_TLS_ADDR node which will be expanded to a call to __tls_get_addr.
     /// This node is used in 64-bit mode as well (in which case the result is
     /// G8RC and inputs are X3/X4).
     TLSGD_AIX,

+    /// %x3 = GET_TLS_MOD _$TLSML - For the AIX local-dynamic TLS model,
+    /// produces a call to __tls_get_mod(_$TLSML@ml).
+    GET_TLS_MOD,
+
+    /// [GP|G8]RC = TLSLD_AIX, TOC_ENTRY(variable offset)
+    /// Op that internally creates TOC entry for the "_$TLSML" symbol, generates
+    /// GET_TLS_MOD node which will be expanded into a call to __tls_get_mod,
+    /// and then add the variable offset with the result from the call.
+    /// This node is used in both 32-bit and 64-bit modes. The only difference
+    /// is register class.
+    TLSLD_AIX,
+
     /// G8RC = ADDIS_TLSLD_HA %x2, Symbol - For the local-dynamic TLS
     /// model, produces an ADDIS8 instruction that adds the GOT base
     /// register to sym@got@tlsld@ha.
diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
index fd436d422298355..51e1cd544452adf 100644
--- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
@@ -1584,12 +1584,19 @@ def GETtlsldADDRPCREL : GETtlsldADDRPseudo <"#GETtlsldADDRPCREL">;
 // so we don't need to mark it with a size of 8 bytes. Finally, the assembly
 // manual mentions this exact set of registers as the clobbered set, others
 // are guaranteed not to be clobbered.
-let Defs = [X0,X4,X5,X11,LR8,CR0] in
+let Defs = [X0,X4,X5,X11,LR8,CR0] in {
 def GETtlsADDR64AIX :
   PPCEmitTimePseudo<(outs g8rc:$rD),(ins g8rc:$offset, g8rc:$handle),
                     "GETtlsADDR64AIX",
                     [(set i64:$rD...

@orcguru orcguru requested a review from bzEq September 18, 2023 08:05
@orcguru
Contributor Author

orcguru commented Sep 21, 2023

@bzEq @ecnelises If there are no major concerns, can we land this before the weekend? Thank you!

@@ -32,7 +33,7 @@ entry:
; RELOC-NEXT: Section (index: 1) .text {
; RELOC-NEXT: Relocation {
; RELOC-NEXT: Virtual Address: 0x16
-; RELOC-NEXT: Symbol: .TIInit (17)
+; RELOC-NEXT: Symbol: TIInit (19)
Collaborator

I know this is outside this patch's scope, but it would make more sense to match the symbol index with a regular expression instead of a hard-coded number. We were trying to fix this, but I guess we didn't change them all.
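
To illustrate the suggestion: in a FileCheck test the index would be written as an inline regex such as `{{[0-9]+}}`. The sketch below shows the same idea with `std::regex`; the helper name `matchesSymbolLine` is hypothetical, and the symbol name is used as a regex fragment, so it should not contain metacharacters.

```cpp
#include <regex>
#include <string>

// Returns true when Line contains "Symbol: <Name> (<index>)" for any decimal
// index, so the check keeps passing when symbol table indices shift.
bool matchesSymbolLine(const std::string &Line, const std::string &Name) {
  const std::regex Pattern("Symbol: " + Name + R"( \([0-9]+\))");
  return std::regex_search(Line, Pattern);
}
```

With this pattern, both `Symbol: TIInit (17)` and `Symbol: TIInit (19)` are accepted, which is why such a check would not need updating when a patch adds symbols.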

Collaborator

@chenzheng1030 chenzheng1030 left a comment

Some comments may need follow-up, so I am posting them first. I will continue reviewing the code generation part.

We need a front-end patch to remove the error reported in clang/lib/Frontend/CompilerInvocation.cpp and clang/lib/Sema/SemaDeclAttr.cpp, right?

@@ -240,15 +241,35 @@ entry:
; SYM-NEXT: }
; SYM-NEXT: Symbol {
; SYM-NEXT: Index: 3
; SYM-NEXT: Name:
; SYM-NEXT: Name: .__tls_get_addr
Collaborator

The comment "I know this is not this patch's scope. But it may make more sense to use a RE to handle the symbol index. We were trying to fix this, but I guess we didn't change them all." is more appropriate here.

@orcguru
Contributor Author

orcguru commented Sep 21, 2023

> Some comments may need follow up, so I posted them first. I will continue to review the code generation part.
>
> We need a front end patch to eliminate the error report in clang/lib/Frontend/CompilerInvocation.cpp and clang/lib/Sema/SemaDeclAttr.cpp, right?

I tried to parallelize the review effort, but GitHub didn't show the delta for me: #66972

; SMALL32-NEXT: stw 0, 40(1)
; SMALL32-NEXT: bla .__tls_get_addr[PR]
; SMALL32-NEXT: lwz 6, L..C4(2) # target-flags(ppc-tlsld) @TIInit
Collaborator

Can we use r4 here? That would match the assembler manual and save a register.

Contributor

__tls_get_mod() clobbers r4.

Contributor Author

I pushed an updated version that addresses this issue and will continue to improve this point.

Contributor Author

The new approach invalidates d400de0; please ignore that commit.

Contributor Author

For the small code model, we now reuse r4. For the large code model, however, the machine scheduler decides to move the ADDIS before the .__tls_get_mod call, so I did not try to move the ADDIS in PPCTLSDynamicCall.cpp.

// PPCTLSDynamicCall.
SDValue VariableOffsetTGA =
DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, PPCII::MO_TLSLD_FLAG);
SDValue VariableOffset = getTOCEntry(DAG, dl, VariableOffsetTGA);
Collaborator

Is the TOC entry for the module handle missing here?

Collaborator

+1. We can add another flag, PPCII::MO_TLSLDM_FLAG. (There may not be enough bits left in PPCII; if so, we can use a distinct value or extend MO_ACCESS_MASK to free more bits.)

That way, it seems we would not need the "strange" function isSpecialAIXSymbolTLSML() in the PPCAsmPrinter file.
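
As a rough sketch of the two designs being contrasted here, the snippet below compares a name-based check with a flag-based one. The `Operand` struct and both flag constants are stand-ins invented for this example; they are not LLVM's `MachineOperand` or the real `PPCII` values.

```cpp
#include <cstring>

// Hypothetical stand-in for a machine operand; not the LLVM class.
struct Operand {
  const char *SymbolName; // external symbol name, or nullptr
  unsigned Flags;         // target flags
};

// Assumed flag values, for illustration only.
constexpr unsigned MO_TLSLD_FLAG_SKETCH = 1u << 0;
constexpr unsigned MO_TLSLDM_FLAG_SKETCH = 1u << 1;

// Name-based detection, in the spirit of isSpecialAIXSymbolTLSML().
bool isModuleHandleByName(const Operand &Op) {
  return Op.SymbolName && std::strcmp(Op.SymbolName, "_$TLSML[TC]") == 0;
}

// Flag-based detection, as the reviewers propose with a dedicated flag.
bool isModuleHandleByFlag(const Operand &Op) {
  return (Op.Flags & MO_TLSLDM_FLAG_SKETCH) != 0;
}
```

The flag-based variant does not depend on the spelling of the symbol name, but it needs a free bit in the target flags.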

Contributor

I agree, I was thinking we need the TOC entry for the module handle.

Contributor Author

There is a blocker that limits the target flags to 12 bits:

llvm/include/llvm/CodeGen/MachineOperand.h
...
  unsigned SubReg_TargetFlags : 12;
...
  void setTargetFlags(unsigned F) {
    assert(!isReg() && "Register operands can't have target flags");
    SubReg_TargetFlags = F;
    assert(SubReg_TargetFlags == F && "Target flags out of range");
  } 

PPCII::TOF is already full, so I'm afraid this will have to be a FIXME...
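
The 12-bit limit can be reproduced in isolation. The sketch below uses a hypothetical `OperandFlags` struct that only mimics the bit-field width quoted above; it is not the real `MachineOperand`.

```cpp
// Hypothetical stand-in for the flag storage: a 12-bit unsigned bit-field.
struct OperandFlags {
  unsigned TargetFlags : 12;
};

// Returns true when F survives a round trip through the 12-bit field. This is
// the condition that the second assert in setTargetFlags checks.
bool fitsInTargetFlags(unsigned F) {
  OperandFlags Op{};
  Op.TargetFlags = F; // bits above bit 11 are dropped on assignment
  return Op.TargetFlags == F;
}
```

Any flag value of `1u << 12` or above fails this check, so a new flag only fits if a bit within the low 12 is still free.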

Contributor Author

I still have not figured out how to remove the strange function.

What I tried was to getOrCreate a GV named "_$TLSML" with ExternalLinkage and call getTOCEntry on that GV. However, that produces a symbol table entry for the symbol, which we don't want.

For asm mode, I can add a hack in MCAsmStreamer::emitXCOFFSymbolLinkageWithVisibility so that it does not emit the ".extern _Renamed..5f24__TLSML; rename _Renamed..5f24__TLSML ......".

For obj mode, things are more complex, and I am not able to remove the symbol table entry cleanly without resorting to a hack.

Contributor Author

So I did have another version that essentially removes the assert check, with some side effects:

(1) It generates an extra external symbol reference, which appears harmless with a simple HelloWorld. (Note: xlC does not generate this symbol reference.)

        .extern _Renamed..5f24__TLSML[UA]
        .rename _Renamed..5f24__TLSML[UA],"_$TLSML"

(2) Inside the XCOFFObjectWriter::recordRelocation() logic I have to apply a hack to get the correct symbol, because there are now two symbols: _Renamed..5f24__TLSML[UA] and _Renamed..5f24__TLSML[TC]. Without the hack it chooses the [UA] one, which causes a linker failure.

Attached the draft patch: ReplaceAssertHack.diff.txt

I think the current version has the advantage that it does not generate any extra external symbol reference or relocation entry, although it looks ugly because of those asserts.

Could we fix the hack implementation in a future version (I will open an issue and work on it afterwards) and move on with the current approach for now?

@amy-kwan @stephenpeckham @bzEq @chenzheng1030 Appreciate your comments. Thank you!

Contributor

We discussed this a bit offline, and we should add a FIXME (thanks for adding it!).

Collaborator

4b932d8 has been committed; it redesigns the target flags of machine operands on PPC, so we should now be able to add new flags.

@bzEq
Collaborator

bzEq commented Sep 21, 2023

> I tried to parallelize the review effort, but github didn't show the delta for me: #66972

Let's include the front-end change in this PR.

tingwang and others added 24 commits February 29, 2024 18:17
(1) Removed the TLSLDAIX argument so that duplicated nodes can be eliminated.
(2) Try to reuse the clobbered r4 by moving TLSLDAIX ahead of the LoadOffsetToc node.
(3) Added FIXME comments.
(4) Added a test case to show that duplicated .__tls_get_mod calls can be eliminated.

The two tests below are not updated yet because of an environment issue. I will fix
them in the next update.
Failed Tests:
  LLVM :: CodeGen/PowerPC/aix-tls-xcoff-reloc-large.ll
  LLVM :: CodeGen/PowerPC/aix-tls-xcoff-reloc.ll

The "_$TLSML" symbol did not lower through getTOCEntry().

The pseudo node "TLSLDAIX" now takes the _$TLSML GV node, and the
ppc-tls-dynamic-call pass is updated to fine-tune the relative order
between the LoadOffset@toc node and the .__tls_get_mod node.

Since `getSectionForExternalReference` explicitly sets XCOFF::XTY_SD for
the `_$TLSML` symbol reference, and only symbol references with CSType
`XCOFF::XTY_ER` are added into `UndefinedCsects`, the checks inside
`executePostLayoutBinding` are redundant.

Update test case to make sure external reference to the `_$TLSML` symbol
is not observed.
@orcguru orcguru merged commit 5b05870 into llvm:main Mar 1, 2024
2 of 4 checks passed
orcguru added a commit that referenced this pull request Mar 5, 2024
@efriedma-quic
Collaborator

It looks like a few of the regression tests are failing on the reverse-iteration buildbot (https://lab.llvm.org/buildbot/#/builders/54/builds/9683)

@orcguru
Contributor Author

orcguru commented Apr 23, 2024

> It looks like a few of the regression tests are failing on the reverse-iteration buildbot (https://lab.llvm.org/buildbot/#/builders/54/builds/9683)

Thank you! Posted: #89714

Labels: backend:PowerPC, clang:frontend, clang, mc