Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TBAA] Handle bitfields when generating !tbaa.struct metadata. #82922

Merged
merged 12 commits into from
Feb 27, 2024

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Feb 25, 2024

At the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719.

This patch fixes that by skipping by combining adjacent named bitfields in fields with correct size.

If this understanding is correct, I can also extend the verifier to check that !tbaa.struct fields aren't overlapping.

Fixes #82586

At the moment, clang generates what I believe are incorrect !tbaa.struct
fields for named bitfields. At the moment, the base type size is used
for named bifields (e.g. sizeof(int)) instead of the bifield width per
field. This results in overalpping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields
from !tbaa.struct as in added in dc85719.

This patch fixes that by skipping all bitfields, not only unnamed ones
(note that CollectFields has a TODO to support bitfields). As bitfields
specify their widths in bits, while !tbaa metadata uses bytes for sizes
and offsets, I don't think we would be able to generate correct metadata
for them in general.

If this understanding is correct, I can also extend the verifier to
check that !tbaa.struct fields aren't overlapping.

Fixes llvm#82586
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Feb 25, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 25, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Florian Hahn (fhahn)

Changes

At the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719.

This patch fixes that by skipping all bitfields, not only unnamed ones (note that CollectFields has a TODO to support bitfields). As bitfields specify their widths in bits, while !tbaa metadata uses bytes for sizes and offsets, I don't think we would be able to generate correct metadata for them in general.

If this understanding is correct, I can also extend the verifier to check that !tbaa.struct fields aren't overlapping.

Fixes #82586


Full diff: https://github.com/llvm/llvm-project/pull/82922.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+1-1)
  • (modified) clang/test/CodeGen/tbaa-struct.cpp (+1-1)
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index dc288bc3f6157a..43a1aee3d73823 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -298,7 +298,7 @@ CodeGenTBAA::CollectFields(uint64_t BaseOffset,
     unsigned idx = 0;
     for (RecordDecl::field_iterator i = RD->field_begin(),
          e = RD->field_end(); i != e; ++i, ++idx) {
-      if ((*i)->isZeroSize(Context) || (*i)->isUnnamedBitfield())
+      if ((*i)->isZeroSize(Context) || (*i)->isBitField())
         continue;
       uint64_t Offset = BaseOffset +
                         Layout.getFieldOffset(idx) / Context.getCharWidth();
diff --git a/clang/test/CodeGen/tbaa-struct.cpp b/clang/test/CodeGen/tbaa-struct.cpp
index ff5521fcf3f604..17c9d6bf6a7260 100644
--- a/clang/test/CodeGen/tbaa-struct.cpp
+++ b/clang/test/CodeGen/tbaa-struct.cpp
@@ -130,7 +130,7 @@ void copy8(NamedBitfields *a1, NamedBitfields *a2) {
 // CHECK-OLD: [[TS3]] = !{i64 0, i64 8, !{{.*}}, i64 0, i64 2, !{{.*}}, i64 4, i64 8, !{{.*}}}
 // CHECK-OLD: [[TS4]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 1, i64 1, [[TAG_CHAR]], i64 2, i64 1, [[TAG_CHAR]]}
 // CHECK-OLD: [[TS5]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 4, i64 1, [[TAG_CHAR]], i64 5, i64 1, [[TAG_CHAR]]}
-// CHECK-OLD: [[TS6]] = !{i64 0, i64 4, [[TAG_INT]], i64 1, i64 4, [[TAG_INT]], i64 2, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE:!.+]]}
+// CHECK-OLD: [[TS6]] = !{i64 2, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE:!.+]]}
 // CHECK-OLD: [[TAG_DOUBLE]] = !{[[DOUBLE:!.+]], [[DOUBLE]], i64 0}
 // CHECK-OLD  [[DOUBLE]] = !{!"double", [[CHAR]], i64 0}
 

@efriedma-quic
Copy link
Collaborator

This seems like it messes up the metadata in a different way: we have no representation of the bitfield at all, so it looks like it's padding.

I think we want to treat multiple adjacent bitfields as a single "field". So NamedBitfields from the testcase would have three fields in the LLVM TBAA data: one "field" containing both bitfields, followed by a field for the char, followed by a field for the double. You should be able to use CGBitFieldInfo for this, I think?

Copy link

github-actions bot commented Feb 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

fhahn added a commit that referenced this pull request Feb 26, 2024
@fhahn fhahn changed the title [TBAA] Skip all bitfields when generating !tbaa.struct metadata. [TBAA] Handle bitfields when generating !tbaa.struct metadata. Feb 26, 2024
@fhahn
Copy link
Contributor Author

fhahn commented Feb 26, 2024

This seems like it messes up the metadata in a different way: we have no representation of the bitfield at all, so it looks like it's padding.

I think we want to treat multiple adjacent bitfields as a single "field". So NamedBitfields from the testcase would have three fields in the LLVM TBAA data: one "field" containing both bitfields, followed by a field for the char, followed by a field for the double. You should be able to use CGBitFieldInfo for this, I think?

Thanks, updated the patch as suggested.

if ((*i)->isBitField() && !(*i)->isUnnamedBitfield()) {
unsigned CurrentBitFieldSize = 0;
unsigned CurrentBitFieldOffset = CGRL.getBitFieldInfo(*i).Offset;
while (i != e && (*i)->isBitField() && !(*i)->isUnnamedBitfield()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this handles non-zero-length unnamed bitfields correctly; they count as part of a series of bitfields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, test showing that added in 54cff50, thanks!

const CGBitFieldInfo &Info = CGRL.getBitFieldInfo(*i);
if (CurrentBitFieldSize + CurrentBitFieldOffset != Info.Offset)
break;
CurrentBitFieldSize += Info.Size;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pull the initial offset and total length of the whole run bitfields out of the CGBitFieldInfo, instead of trying to add up the sizes of the individual fields? I think it should be StorageOffset/StorageSize.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated the patch to add a field for the first bitfield in the run using its StorageSize, and skip bitfields where the offset in the run isn't 0 (Offset).

fhahn added a commit that referenced this pull request Feb 26, 2024
Extra tests with unnamed bitfields for
#82922.
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whitespace is weird in a few places; otherwise looks fine.

@fhahn
Copy link
Contributor Author

fhahn commented Feb 27, 2024

Whitespace is weird in a few places; otherwise looks fine.

Argh yes, should be fixed in the latest version. Still need to figure out how to make sure all changes are formatted once the branch contains merges (before the format checker complains and shows the commits)

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fhahn fhahn merged commit d2a9df2 into llvm:main Feb 27, 2024
4 checks passed
@fhahn fhahn deleted the clang-tbaa-struct-named-bitfields branch February 27, 2024 20:09
@fhahn
Copy link
Contributor Author

fhahn commented Feb 27, 2024

Thanks!

fhahn added a commit to fhahn/llvm-project that referenced this pull request Mar 1, 2024
Additional test for llvm#82922.

(cherry-picked from f290c00)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Mar 1, 2024
Extra tests with unnamed bitfields for
llvm#82922.

(cherry-picked from 54cff50)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Mar 1, 2024
…82922)

At the moment, clang generates what I believe are incorrect !tbaa.struct
fields for named bitfields. At the moment, the base type size is used
for named bifields (e.g. sizeof(int)) instead of the bifield width per
field. This results in overalpping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields
from !tbaa.struct as in added in dc85719.

This patch fixes that by skipping by combining adjacent bitfields
in fields with correct sizes.

Fixes llvm#82586

(cherry-picked from d2a9df2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[TBAA][DSE] bitfield store is incorrectly optimized out
3 participants