Skip to content

[TBAA] Assume struct member TBAA does not alias errno#200367

Open
nikic wants to merge 1 commit into
llvm:mainfrom
nikic:errno-struct-tbaa
Open

[TBAA] Assume struct member TBAA does not alias errno#200367
nikic wants to merge 1 commit into
llvm:mainfrom
nikic:errno-struct-tbaa

Conversation

@nikic
Copy link
Copy Markdown
Contributor

@nikic nikic commented May 29, 2026

Assume that TBAA that refers to a struct member access, assume it does not alias errno even if the member is compatible with int.

This is subject to the usual libc LTO issue where the heuristic might not be strictly correct if the errno implementation is visible (as errno may be part of a struct).

In particular, this is important to make sure that things like string length/capacity are not considered clobbered by allocation calls that may write to errno. See #197199 for where this came up.

Assume that TBAA that refers to a struct member access, assume it
does not alias errno even if the member is compatible with int.

This is subject to the usual libc LTO issue where the heuristic
might not be strictly correct if the errno implementation is
visible (as errno may be part of a struct).

In particular, this is important to make sure that things like
string length/capacity are not considered clobbered by allocation
calls that may write to errno. See llvm#197199 for where this came up.
@llvmorg-github-actions llvmorg-github-actions Bot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels May 29, 2026
@llvmorg-github-actions
Copy link
Copy Markdown

llvmorg-github-actions Bot commented May 29, 2026

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Changes

Assume that TBAA that refers to a struct member access, assume it does not alias errno even if the member is compatible with int.

This is subject to the usual libc LTO issue where the heuristic might not be strictly correct if the errno implementation is visible (as errno may be part of a struct).

In particular, this is important to make sure that things like string length/capacity are not considered clobbered by allocation calls that may write to errno. See #197199 for where this came up.


Full diff: https://github.com/llvm/llvm-project/pull/200367.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp (+5)
  • (modified) llvm/test/Transforms/InstCombine/may-alias-errno.ll (+21-2)
diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index c7d263a75b33a..04160af71f2e8 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -395,6 +395,11 @@ AliasResult TypeBasedAAResult::aliasErrno(const MemoryLocation &Loc,
   if (!N)
     return AliasResult::MayAlias;
 
+  // Assume that struct member accesses never alias errno, even if the member
+  // is an integer.
+  if (N->getOperand(0) != N->getOperand(1))
+    return AliasResult::NoAlias;
+
   // There cannot be any alias with errno if TBAA proves the given memory
   // location does not alias errno.
   const auto *ErrnoTBAAMD = M->getNamedMetadata("llvm.errno.tbaa");
diff --git a/llvm/test/Transforms/InstCombine/may-alias-errno.ll b/llvm/test/Transforms/InstCombine/may-alias-errno.ll
index 40fab8024b362..8a4c0fe1293ec 100644
--- a/llvm/test/Transforms/InstCombine/may-alias-errno.ll
+++ b/llvm/test/Transforms/InstCombine/may-alias-errno.ll
@@ -3,8 +3,8 @@
 
 ; sinf clobbering errno, but %p cannot alias errno per C/C++ strict aliasing rules via TBAA.
 ; Can do constant store-to-load forwarding.
-define float @does_not_alias_errno(ptr %p, float %f) {
-; CHECK-LABEL: define float @does_not_alias_errno(
+define float @does_not_alias_errno_float_tbaa(ptr %p, float %f) {
+; CHECK-LABEL: define float @does_not_alias_errno_float_tbaa(
 ; CHECK-SAME: ptr [[P:%.*]], float [[F:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*:]]
 ; CHECK-NEXT:    store float 0.000000e+00, ptr [[P]], align 4, !tbaa [[TBAA4:![0-9]+]]
@@ -18,6 +18,21 @@ entry:
   ret float %0
 }
 
+define i32 @does_not_alias_errno_struct_tbaa(ptr %p, float %f) {
+; CHECK-LABEL: define i32 @does_not_alias_errno_struct_tbaa(
+; CHECK-SAME: ptr [[P:%.*]], float [[F:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    store i32 0, ptr [[P]], align 4, !tbaa [[TBAA6:![0-9]+]]
+; CHECK-NEXT:    [[CALL:%.*]] = call float @sinf(float [[F]])
+; CHECK-NEXT:    ret i32 0
+;
+entry:
+  store i32 0, ptr %p, align 4, !tbaa !6
+  %call = call float @sinf(float %f)
+  %val = load i32, ptr %p, align 4, !tbaa !6
+  ret i32 %val
+}
+
 ; sinf clobbering errno, but %p is alloca memory, wich can never aliases errno.
 ; Can do constant store-to-load forwarding.
 define float @does_not_alias_errno_2(float %f) {
@@ -161,6 +176,8 @@ declare void @escape(ptr %p)
 !3 = !{!"Simple C/C++ TBAA"}
 !4 = !{!5, !5, i64 0}
 !5 = !{!"float", !2, i64 0}
+!6 = !{!7, !1, i64 0}
+!7 = !{!"my_struct", !1, i64 0}
 ;.
 ; CHECK: [[TBAA0]] = !{[[META1:![0-9]+]], [[META1]], i64 0}
 ; CHECK: [[META1]] = !{!"int", [[META2:![0-9]+]], i64 0}
@@ -168,4 +185,6 @@ declare void @escape(ptr %p)
 ; CHECK: [[META3]] = !{!"Simple C/C++ TBAA"}
 ; CHECK: [[TBAA4]] = !{[[META5:![0-9]+]], [[META5]], i64 0}
 ; CHECK: [[META5]] = !{!"float", [[META2]], i64 0}
+; CHECK: [[TBAA6]] = !{[[META7:![0-9]+]], [[META1]], i64 0}
+; CHECK: [[META7]] = !{!"my_struct", [[META1]], i64 0}
 ;.

Copy link
Copy Markdown
Contributor

@antoniofrighetto antoniofrighetto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on my side, would also like to hear @efriedma-quic on this.

@cardigan1008
Copy link
Copy Markdown
Member

cardigan1008 commented May 29, 2026

Hi, there is a related miscompilation case:

@errno = global i32 7, align 4

define i32 @errno_metadata_exact_member(ptr %p, float %f) {
entry:
  store i32 0, ptr %p, align 4, !tbaa !3
  %call = call float @sinf(float %f)
  %val = load i32, ptr %p, align 4, !tbaa !3
  ret i32 %val
}

define float @sinf(float %f) memory(errnomem: write) {
entry:
  store i32 44, ptr @errno, align 4, !tbaa !3
  ret float %f
}

!llvm.errno.tbaa = !{!3}
!0 = !{!"Simple C/C++ TBAA"}
!1 = !{!"omnipotent char", !0, i64 0}
!2 = !{!"int", !1, i64 0}
!3 = !{!4, !2, i64 0}
!4 = !{!"errno_struct", !2, i64 0}

define i32 @main(i32 %argc, ptr %argv) {
entry:
  %r = call i32 @errno_metadata_exact_member(ptr @errno, float 1.000000e+00)
  ret i32 %r
}

Transofrmed into:

@errno = global i32 7, align 4

define i32 @errno_metadata_exact_member(ptr %p, float %f) {
entry:
  store i32 0, ptr %p, align 4, !tbaa !0
  %call = call float @sinf(float %f)
  ret i32 0
}

; Function Attrs: memory(errnomem: write)
define float @sinf(float %f) #0 {
entry:
  store i32 44, ptr @errno, align 4, !tbaa !0
  ret float %f
}

attributes #0 = { memory(errnomem: write) }

!llvm.errno.tbaa = !{!0}

!0 = !{!1, !2, i64 0}
!1 = !{!"errno_struct", !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}

define i32 @main(i32 %argc, ptr %argv) {
entry:
  %r = call i32 @errno_metadata_exact_member(ptr @errno, float 1.000000e+00)
  ret i32 %r
}

Executing them with lli gets 44 and 0.

Found with Archer. Please let me know if anything is wrong.

@antoniofrighetto
Copy link
Copy Markdown
Contributor

!llvm.errno.tbaa = !{!3}
!0 = !{!"Simple C/C++ TBAA"}
!1 = !{!"omnipotent char", !0, i64 0}
!2 = !{!"int", !1, i64 0}
!3 = !{!4, !2, i64 0}
!4 = !{!"errno_struct", !2, i64 0}

The frontend should be guaranteed to not emit llvm.errno.tbaa metadata as a struct-member tag (more at: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects). I wonder if the Verifier should explicitly reject cases as such.

Copy link
Copy Markdown
Contributor

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should fix this by changing the way we construct errno tbaa? If we represent errno as a member of a struct __libc_errno, the correct aliasing implications should fall out automatically.

This is subject to the usual libc LTO issue where the heuristic might not be strictly correct if the errno implementation is visible (as errno may be part of a struct).

Should we make a list somewhere of known libc-LTO issues? I know we've discussed malloc, and now errno; I'm not sure if there's anything else, and I don't want to lose track.

@nikic
Copy link
Copy Markdown
Contributor Author

nikic commented Jun 1, 2026

Maybe we should fix this by changing the way we construct errno tbaa? If we represent errno as a member of a struct __libc_errno, the correct aliasing implications should fall out automatically.

Interesting idea. I wonder, could we also use that to fix the libc LTO issue? What I'm thinking is that we use __libc_errno normally, and int directly with -ffreestanding (as a proxy for that being libc). This means that if both modules get linked together, we'll have both in the !llvm.errno.tbaa, which means that the more liberal int will win.

@efriedma-quic
Copy link
Copy Markdown
Contributor

The primary issue with errno and LTO isn't tbaa; it's the fact that we can't identify errno after __llvm_libc_errno or whatever is inlined. Any solution for that probably also solves the tbaa issue.

@nikic
Copy link
Copy Markdown
Contributor Author

nikic commented Jun 1, 2026

Yes, I was talking only about the specific case here, not the more general problem (such as assumptions about globals). But I guess only handling a subset of the issue isn't really worthwhile.

If we don't want to make a -ffreestanding distinction for !llvm.errno.tbaa though, then I think the current implementation in this PR is preferable over changing !llvm.errno.tbaa to get a particular optimization outcome.

@efriedma-quic
Copy link
Copy Markdown
Contributor

If the errno variable is visible, I can't think of very many solutions for representing that in the IR... basically either:

  • We drop all errno-related markings. That includes the memory attribute, and the llvm.errno.tbaa metadata.
  • We need to mark up the declaration of the variable used to store errno.

For either of those solutions, it doesn't really matter what !llvm.errno.tbaa looks like.


That said, I'm not necessarily opposed to this patch... I was just hoping to avoid the special case, to reduce confusion for anyone reading the IR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:analysis Includes value tracking, cost tables and constant folding llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants