Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PowerPC][NFC] Add base test case for small-local-dynamic-tls on AIX #84711

Merged
merged 4 commits into from
Mar 24, 2024

Conversation

orcguru
Copy link
Contributor

@orcguru orcguru commented Mar 11, 2024

No description provided.

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 11, 2024

@llvm/pr-subscribers-backend-powerpc

Author: Felix (Ting Wang) (orcguru)

Changes

Patch is 177.00 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/84711.diff

8 Files Affected:

  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-char.ll (+336)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-double.ll (+339)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-float.ll (+339)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-int.ll (+375)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-largeaccess.ll (+363)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-longlong.ll (+368)
  • (added) llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-short.ll (+336)
  • (added) llvm/test/CodeGen/PowerPC/aix-tls-ld-ldst-O0.ll (+1031)
diff --git a/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-char.ll b/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-char.ll
new file mode 100644
index 00000000000000..6fb8683330a303
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-char.ll
@@ -0,0 +1,336 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
+; RUN: llc  -verify-machineinstrs -mcpu=pwr7 -ppc-asm-full-reg-names \
+; RUN:      -mtriple powerpc64-ibm-aix-xcoff < %s \
+; RUN:      | FileCheck %s --check-prefix=SMALL-LOCAL-DYNAMIC-SMALLCM64
+; RUN: llc  -verify-machineinstrs -mcpu=pwr7 -ppc-asm-full-reg-names \
+; RUN:      -mtriple powerpc64-ibm-aix-xcoff --code-model=large \
+; RUN:      < %s | FileCheck %s \
+; RUN:      --check-prefix=SMALL-LOCAL-DYNAMIC-LARGECM64
+
+@ThreadLocalVarInit = thread_local(localdynamic) global i8 1, align 1
+@VarInit = local_unnamed_addr global i8 87, align 1
+@IThreadLocalVarInit = internal thread_local(localdynamic) global i8 1, align 1
+declare nonnull ptr @llvm.threadlocal.address.p0(ptr nonnull) #1
+@c = thread_local(localdynamic) global [87 x i8] zeroinitializer, align 1
+
+define nonnull ptr @AddrTest1() local_unnamed_addr #0 {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: AddrTest1:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C1(r2) # target-flags(ppc-tlsld) @c
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    add r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r3, r3, 1
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: AddrTest1:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C1@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C1@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    add r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r3, r3, 1
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @c)
+  %arrayidx = getelementptr inbounds [87 x i8], ptr %0, i64 0, i64 1
+  ret ptr %arrayidx
+}
+
+define void @storeITLInit(i8 noundef zeroext %x) {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: storeITLInit:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mr r6, r3
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C2(r2) # target-flags(ppc-tlsld) @IThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stbx r6, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: storeITLInit:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mr r6, r3
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r7, L..C2@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C2@l(r7)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stbx r6, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @IThreadLocalVarInit)
+  store i8 %x, ptr %0, align 1
+  ret void
+}
+
+define void @storeTLInit(i8 noundef zeroext %x) {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: storeTLInit:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mr r6, r3
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C3(r2) # target-flags(ppc-tlsld) @ThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stbx r6, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: storeTLInit:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mr r6, r3
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r7, L..C3@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C3@l(r7)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stbx r6, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @ThreadLocalVarInit)
+  store i8 %x, ptr %0, align 1
+  ret void
+}
+
+define zeroext i8 @loadITLInit() {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: loadITLInit:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C2(r2) # target-flags(ppc-tlsld) @IThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: loadITLInit:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C2@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C2@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @IThreadLocalVarInit)
+  %1 = load i8, ptr %0, align 1
+  ret i8 %1
+}
+
+define zeroext i8 @loadITLInit2() {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: loadITLInit2:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C2(r2) # target-flags(ppc-tlsld) @IThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C4(r2) # @VarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbz r4, 0(r4)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    add r3, r4, r3
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    clrldi r3, r3, 56
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: loadITLInit2:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C2@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C2@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r4, L..C4@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C4@l(r4)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbz r4, 0(r4)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    add r3, r4, r3
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    clrldi r3, r3, 56
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @IThreadLocalVarInit)
+  %1 = load i8, ptr %0, align 1
+  %2 = load i8, ptr @VarInit, align 1
+  %add = add i8 %2, %1
+  ret i8 %add
+}
+
+define zeroext i8 @loadTLInit() {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: loadTLInit:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C3(r2) # target-flags(ppc-tlsld) @ThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: loadTLInit:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C3@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C3@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @ThreadLocalVarInit)
+  %1 = load i8, ptr %0, align 1
+  ret i8 %1
+}
+
+define zeroext i8 @loadTLInit2() {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: loadTLInit2:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C3(r2) # target-flags(ppc-tlsld) @ThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C4(r2) # @VarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbz r4, 0(r4)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    add r3, r4, r3
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    clrldi r3, r3, 56
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: loadTLInit2:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C3@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C3@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbzx r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r4, L..C4@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C4@l(r4)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbz r4, 0(r4)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    add r3, r4, r3
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    clrldi r3, r3, 56
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @ThreadLocalVarInit)
+  %1 = load i8, ptr %0, align 1
+  %2 = load i8, ptr @VarInit, align 1
+  %add = add i8 %2, %1
+  ret i8 %add
+}
+
+define void @loadStore1(i8 noundef zeroext %x) {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: loadStore1:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C2(r2) # target-flags(ppc-tlsld) @IThreadLocalVarInit
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    lbzx r5, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r5, r5, 9
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stbx r5, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: loadStore1:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C2@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r3, L..C0@l(r3)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r4, L..C2@l(r6)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    lbzx r5, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r5, r5, 9
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stbx r5, r3, r4
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    blr
+entry:
+  %0 = tail call align 1 ptr @llvm.threadlocal.address.p0(ptr align 1 @IThreadLocalVarInit)
+  %1 = load i8, ptr %0, align 1
+  %add = add i8 %1, 9
+  store i8 %add, ptr %0, align 1
+  ret void
+}
diff --git a/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-double.ll b/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-double.ll
new file mode 100644
index 00000000000000..3dfc0aa6d4aff1
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-small-local-dynamic-tls-double.ll
@@ -0,0 +1,339 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
+; RUN: llc  -verify-machineinstrs -mcpu=pwr7 -ppc-asm-full-reg-names \
+; RUN:      -mtriple powerpc64-ibm-aix-xcoff < %s \
+; RUN:      | FileCheck %s --check-prefix=SMALL-LOCAL-DYNAMIC-SMALLCM64
+; RUN: llc  -verify-machineinstrs -mcpu=pwr7 -ppc-asm-full-reg-names \
+; RUN:      -mtriple powerpc64-ibm-aix-xcoff --code-model=large \
+; RUN:      < %s | FileCheck %s \
+; RUN:      --check-prefix=SMALL-LOCAL-DYNAMIC-LARGECM64
+
+@ThreadLocalVarInit = thread_local(localdynamic) global double 1.000000e+00, align 8
+@VarInit = local_unnamed_addr global double 8.700000e+01, align 8
+@IThreadLocalVarInit = internal thread_local(localdynamic) global double 1.000000e+00, align 8
+declare nonnull ptr @llvm.threadlocal.address.p0(ptr nonnull) #1
+@f = thread_local(localdynamic) global [87 x double] zeroinitializer, align 8
+
+define nonnull ptr @AddrTest1() local_unnamed_addr #0 {
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-LABEL: AddrTest1:
+; SMALL-LOCAL-DYNAMIC-SMALLCM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r3, L..C0(r2) # target-flags(ppc-tlsldm) @"_$TLSML"
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    std r0, 64(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    bla .__tls_get_mod[PR]
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r4, L..C1(r2) # target-flags(ppc-tlsld) @f
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    add r3, r3, r4
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r3, r3, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    addi r1, r1, 48
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    ld r0, 16(r1)
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    mtlr r0
+; SMALL-LOCAL-DYNAMIC-SMALLCM64-NEXT:    blr
+;
+; SMALL-LOCAL-DYNAMIC-LARGECM64-LABEL: AddrTest1:
+; SMALL-LOCAL-DYNAMIC-LARGECM64:       # %bb.0: # %entry
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    mflr r0
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    stdu r1, -48(r1)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r3, L..C0@u(r2)
+; SMALL-LOCAL-DYNAMIC-LARGECM64-NEXT:    addis r6, L..C1@u(r2)
+; SMALL-LOCAL-DYNAMIC-LA...
[truncated]

@orcguru
Copy link
Contributor Author

orcguru commented Mar 13, 2024

@chenzheng1030 Per your request, I have reduced some test cases. I also added missing one which will demonstrate "fixup_ppc_half16ds" related code change. Please let me know if test cases need be reduced further. Thank you!

Copy link
Collaborator

@chenzheng1030 chenzheng1030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://llvm.org/docs/TestingGuide.html#best-practices-for-regression-tests

I think we should avoid any unnamed instructions in the input IR.

Since the files are too big, could you please simply show how do you arrange these tests? Thanks a lot.

@orcguru
Copy link
Contributor Author

orcguru commented Mar 22, 2024

See https://llvm.org/docs/TestingGuide.html#best-practices-for-regression-tests

I think we should avoid any unnamed instructions in the input IR.

Since the files are too big, could you please simply show how do you arrange these tests? Thanks a lot.

Updated. Thank you!

Copy link
Collaborator

@chenzheng1030 chenzheng1030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! LGTM.

@orcguru orcguru merged commit 90a7fc3 into llvm:main Mar 24, 2024
4 checks passed
@orcguru orcguru deleted the add_test branch March 24, 2024 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants