
Commit 6ddc48e

[CIR] Update Accumulate Bits Algorithm (#1688)
This PR addresses the feedback from llvm/llvm-project#142041 (comment). Our algorithm for accumulating bitfields has diverged from CodeGen since Clang 19. There is one key difference: in CIR, we use the function `getBitfieldStorageType`, which checks whether the bit width of the current accumulation run is a valid fundamental width (i.e., a power of two: 8, 16, 32, 64). If it is, it returns a CIR type of that size; otherwise, it returns an array with the closest alignment. For example, given the following struct:

```c
struct S {
  int a : 4;
  int b : 27;
  int c : 17;
  int d : 2;
  int e : 15;
  unsigned f;
};
```

The CodeGen output is:

```llvm
%struct.S = type { i64, i16, i32 }
```

whereas the new CIR algorithm produces:

```mlir
!cir.record<struct "S" {!cir.array<!u8i x 7>, !u16i, !u32i}>
```

In CIR, the algorithm accumulates up to field `d`, resulting in 50 accumulated bits. Since 50 is not a fundamental width, the closest alignment is 56 bits, which leads to the type `!cir.array<!u8i x 7>`. The algorithm stops before accumulating field `e` because including it would exceed the register size (64), which is not ideal.

At this point, it's unclear whether this divergence from CodeGen represents an improvement. If we wanted to match CodeGen exactly, we would need to replace the use of `getBitfieldStorageType` with `getUIntNType`. The difference is that `getUIntNType` always returns the closest power-of-two integer type instead of falling back to an array when the size is not a fundamental width. With this change, CIR would match CodeGen's layout exactly. It would require the following small code change:

```diff
diff --git a/clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp b/clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp
index 7c1802b..17538b191738 100644
--- a/clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp
+++ b/clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp
@@ -616,7 +616,7 @@ CIRRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
       if (!InstallBest) {
         // Determine if accumulating the just-seen span will create an expensive
         // access unit or not.
-        mlir::Type Type = getBitfieldStorageType(astContext.toBits(AccessSize));
+        mlir::Type Type = getUIntNType(astContext.toBits(AccessSize));
         if (!astContext.getTargetInfo().hasCheapUnalignedBitFieldAccess())
           llvm_unreachable("NYI");
@@ -674,12 +674,12 @@ CIRRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
         // remain there after a stable sort.
         mlir::Type Type;
         if (BestClipped) {
-          assert(getSize(getBitfieldStorageType(
+          assert(getSize(getUIntNType(
                      astContext.toBits(AccessSize))) > AccessSize &&
                  "Clipped access need not be clipped");
           Type = getByteArrayType(AccessSize);
         } else {
-          Type = getBitfieldStorageType(astContext.toBits(AccessSize));
+          Type = getUIntNType(astContext.toBits(AccessSize));
           assert(getSize(Type) == AccessSize &&
                  "Unclipped access must be clipped");
         }
```

You can see a comparison between the two functions at https://godbolt.org/z/qjx1MaEWG. I'm currently unsure whether using one function over the other has performance implications.

Regarding the ARM error I mentioned in the meeting: it was an `assert` I had forgotten to update. It's now fixed; sorry for the confusion.
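For readers skimming the commit message, here is a minimal standalone sketch of the rounding difference described above. `fundamentalOrByteArrayBits` and `powerOfTwoBits` are hypothetical stand-ins for `getBitfieldStorageType` and `getUIntNType` (the real functions return CIR types, not bit widths); the arithmetic mirrors the 50-bit example for `struct S`:

```cpp
#include <cstdint>
#include <iostream>

// Stand-in for getBitfieldStorageType's size selection: a fundamental
// width (8/16/32/64) is kept as an integer type; anything else falls
// back to a byte array, i.e. rounds up to the next multiple of 8 bits.
std::uint64_t fundamentalOrByteArrayBits(std::uint64_t bits) {
  if (bits == 8 || bits == 16 || bits == 32 || bits == 64)
    return bits;               // e.g. !u32i
  return (bits + 7) / 8 * 8;   // e.g. !cir.array<!u8i x N>
}

// Stand-in for getUIntNType's size selection: always round up to the
// next power of two, which matches CodeGen's layout.
std::uint64_t powerOfTwoBits(std::uint64_t bits) {
  std::uint64_t p = 8;
  while (p < bits)
    p *= 2;
  return p;
}

int main() {
  std::uint64_t run = 4 + 27 + 17 + 2; // fields a..d of struct S: 50 bits
  std::cout << fundamentalOrByteArrayBits(run) << '\n'; // 56 -> 7-byte array
  std::cout << powerOfTwoBits(run) << '\n';             // 64 -> i64, as CodeGen
}
```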
1 parent 1ca3376 commit 6ddc48e

7 files changed: +280 additions, -140 deletions

clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp

Lines changed: 187 additions & 91 deletions
```diff
@@ -3,6 +3,7 @@
 #include "CIRGenModule.h"
 #include "CIRGenTypes.h"
 
+#include "TargetInfo.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Decl.h"
@@ -77,8 +78,9 @@ struct CIRRecordLowering final {
   void accumulateVPtrs();
   void accumulateVBases();
   void accumulateFields();
-  void accumulateBitFields(RecordDecl::field_iterator Field,
-                           RecordDecl::field_iterator FieldEnd);
+  RecordDecl::field_iterator
+  accumulateBitFields(RecordDecl::field_iterator Field,
+                      RecordDecl::field_iterator FieldEnd);
 
   mlir::Type getVFPtrType();
 
@@ -509,113 +511,207 @@ void CIRRecordLowering::fillOutputFields() {
   }
 }
 
-void CIRRecordLowering::accumulateBitFields(
-    RecordDecl::field_iterator Field, RecordDecl::field_iterator FieldEnd) {
-  // Run stores the first element of the current run of bitfields. FieldEnd is
-  // used as a special value to note that we don't have a current run. A
-  // bitfield run is a contiguous collection of bitfields that can be stored in
-  // the same storage block. Zero-sized bitfields and bitfields that would
-  // cross an alignment boundary break a run and start a new one.
-  RecordDecl::field_iterator Run = FieldEnd;
-  // Tail is the offset of the first bit off the end of the current run. It's
-  // used to determine if the ASTRecordLayout is treating these two bitfields as
-  // contiguous. StartBitOffset is offset of the beginning of the Run.
-  uint64_t StartBitOffset, Tail = 0;
-  if (isDiscreteBitFieldABI()) {
-    llvm_unreachable("NYI");
-  }
+RecordDecl::field_iterator
+CIRRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
+                                       RecordDecl::field_iterator FieldEnd) {
 
-  // Check if OffsetInRecord (the size in bits of the current run) is better
-  // as a single field run. When OffsetInRecord has legal integer width, and
-  // its bitfield offset is naturally aligned, it is better to make the
-  // bitfield a separate storage component so as it can be accessed directly
-  // with lower cost.
-  auto IsBetterAsSingleFieldRun = [&](uint64_t OffsetInRecord,
-                                      uint64_t StartBitOffset,
-                                      uint64_t nextTail = 0) {
-    if (!cirGenTypes.getModule().getCodeGenOpts().FineGrainedBitfieldAccesses)
-      return false;
+  if (isDiscreteBitFieldABI())
     llvm_unreachable("NYI");
-    // if (OffsetInRecord < 8 || !llvm::isPowerOf2_64(OffsetInRecord) ||
-    //     !DataLayout.fitsInLegalInteger(OffsetInRecord))
-    //   return false;
-    // Make sure StartBitOffset is naturally aligned if it is treated as an
-    // IType integer.
-    // if (StartBitOffset %
-    //         astContext.toBits(getAlignment(getUIntNType(OffsetInRecord))) !=
-    //     0)
-    //   return false;
-    return true;
-  };
 
-  // The start field is better as a single field run.
-  bool StartFieldAsSingleRun = false;
+  CharUnits RegSize =
+      bitsToCharUnits(astContext.getTargetInfo().getRegisterWidth());
+  unsigned CharBits = astContext.getCharWidth();
+
+  // Data about the start of the span we're accumulating to create an access
+  // unit from. Begin is the first bitfield of the span. If Begin is FieldEnd,
+  // we've not got a current span. The span starts at the BeginOffset character
+  // boundary. BitSizeSinceBegin is the size (in bits) of the span -- this might
+  // include padding when we've advanced to a subsequent bitfield run.
+  RecordDecl::field_iterator Begin = FieldEnd;
+  CharUnits BeginOffset;
+  uint64_t BitSizeSinceBegin;
+
+  // The (non-inclusive) end of the largest acceptable access unit we've found
+  // since Begin. If this is Begin, we're gathering the initial set of bitfields
+  // of a new span. BestEndOffset is the end of that acceptable access unit --
+  // it might extend beyond the last character of the bitfield run, using
+  // available padding characters.
+  RecordDecl::field_iterator BestEnd = Begin;
+  CharUnits BestEndOffset;
+  bool BestClipped; // Whether the representation must be in a byte array.
+
   for (;;) {
-    // Check to see if we need to start a new run.
-    if (Run == FieldEnd) {
-      // If we're out of fields, return.
-      if (Field == FieldEnd)
+    // AtAlignedBoundary is true if Field is the (potential) start of a new
+    // span (or the end of the bitfields). When true, LimitOffset is the
+    // character offset of that span and Barrier indicates whether the new
+    // span cannot be merged into the current one.
+    bool AtAlignedBoundary = false;
+    bool Barrier = false; // a barrier can be a zero Bit Width or non bit member
+    if (Field != FieldEnd && Field->isBitField()) {
+      uint64_t BitOffset = getFieldBitOffset(*Field);
+      if (Begin == FieldEnd) {
+        // Beginning a new span.
+        Begin = Field;
+        BestEnd = Begin;
+
+        assert((BitOffset % CharBits) == 0 && "Not at start of char");
+        BeginOffset = bitsToCharUnits(BitOffset);
+        BitSizeSinceBegin = 0;
+      } else if ((BitOffset % CharBits) != 0) {
+        // Bitfield occupies the same character as previous bitfield, it must be
+        // part of the same span. This can include zero-length bitfields, should
+        // the target not align them to character boundaries. Such non-alignment
+        // is at variance with the standards, which require zero-length
+        // bitfields be a barrier between access units. But of course we can't
+        // achieve that in the middle of a character.
+        assert(BitOffset ==
+                   astContext.toBits(BeginOffset) + BitSizeSinceBegin &&
+               "Concatenating non-contiguous bitfields");
+      } else {
+        // Bitfield potentially begins a new span. This includes zero-length
+        // bitfields on non-aligning targets that lie at character boundaries
+        // (those are barriers to merging).
+        if (Field->isZeroLengthBitField())
+          Barrier = true;
+        AtAlignedBoundary = true;
+      }
+    } else {
+      // We've reached the end of the bitfield run. Either we're done, or this
+      // is a barrier for the current span.
+      if (Begin == FieldEnd)
         break;
-      // Any non-zero-length bitfield can start a new run.
-      if (!Field->isZeroLengthBitField()) {
-        Run = Field;
-        StartBitOffset = getFieldBitOffset(*Field);
-        Tail = StartBitOffset + Field->getBitWidthValue();
-        StartFieldAsSingleRun =
-            IsBetterAsSingleFieldRun(Tail - StartBitOffset, StartBitOffset);
+
+      Barrier = true;
+      AtAlignedBoundary = true;
+    }
+
+    // InstallBest indicates whether we should create an access unit for the
+    // current best span: fields [Begin, BestEnd) occupying characters
+    // [BeginOffset, BestEndOffset).
+    bool InstallBest = false;
+    if (AtAlignedBoundary) {
+      // Field is the start of a new span or the end of the bitfields. The
+      // just-seen span now extends to BitSizeSinceBegin.
+
+      // Determine if we can accumulate that just-seen span into the current
+      // accumulation.
+      CharUnits AccessSize = bitsToCharUnits(BitSizeSinceBegin + CharBits - 1);
+      if (BestEnd == Begin) {
+        // This is the initial run at the start of a new span. By definition,
+        // this is the best seen so far.
+        BestEnd = Field;
+        BestEndOffset = BeginOffset + AccessSize;
+        // Assume clipped until proven not below.
+        BestClipped = true;
+        if (!BitSizeSinceBegin)
+          // A zero-sized initial span -- this will install nothing and reset
+          // for another.
+          InstallBest = true;
+      } else if (AccessSize > RegSize) {
+        // Accumulating the just-seen span would create a multi-register access
+        // unit, which would increase register pressure.
+        InstallBest = true;
+      }
+
+      if (!InstallBest) {
+        // Determine if accumulating the just-seen span will create an expensive
+        // access unit or not.
+        mlir::Type Type = getUIntNType(astContext.toBits(AccessSize));
+        if (!astContext.getTargetInfo().hasCheapUnalignedBitFieldAccess())
+          llvm_unreachable("NYI");
+
+        if (!InstallBest) {
+          // Find the next used storage offset to determine what the limit of
+          // the current span is. That's either the offset of the next field
+          // with storage (which might be Field itself) or the end of the
+          // non-reusable tail padding.
+          CharUnits LimitOffset;
+          for (auto Probe = Field; Probe != FieldEnd; ++Probe)
+            if (!isEmptyFieldForLayout(astContext, *Probe)) {
+              // A member with storage sets the limit.
+              assert((getFieldBitOffset(*Probe) % CharBits) == 0 &&
+                     "Next storage is not byte-aligned");
+              LimitOffset = bitsToCharUnits(getFieldBitOffset(*Probe));
+              goto FoundLimit;
+            }
+          LimitOffset = cxxRecordDecl ? astRecordLayout.getNonVirtualSize()
+                                      : astRecordLayout.getDataSize();
+        FoundLimit:
+          CharUnits TypeSize = getSize(Type);
+          if (BeginOffset + TypeSize <= LimitOffset) {
+            // There is space before LimitOffset to create a naturally-sized
+            // access unit.
+            BestEndOffset = BeginOffset + TypeSize;
+            BestEnd = Field;
+            BestClipped = false;
+          }
+          if (Barrier) {
+            // The next field is a barrier that we cannot merge across.
+            InstallBest = true;
+          } else if (cirGenTypes.getModule()
+                         .getCodeGenOpts()
+                         .FineGrainedBitfieldAccesses) {
+            llvm_unreachable("NYI");
+          } else {
+            // Otherwise, we're not installing. Update the bit size
+            // of the current span to go all the way to LimitOffset, which is
+            // the (aligned) offset of next bitfield to consider.
+            BitSizeSinceBegin = astContext.toBits(LimitOffset - BeginOffset);
+          }
+        }
       }
-      ++Field;
-      continue;
     }
 
-    // If the start field of a new run is better as a single run, or if current
-    // field (or consecutive fields) is better as a single run, or if current
-    // field has zero width bitfield and either UseZeroLengthBitfieldAlignment
-    // or UseBitFieldTypeAlignment is set to true, or if the offset of current
-    // field is inconsistent with the offset of previous field plus its offset,
-    // skip the block below and go ahead to emit the storage. Otherwise, try to
-    // add bitfields to the run.
-    uint64_t nextTail = Tail;
-    if (Field != FieldEnd)
-      nextTail += Field->getBitWidthValue();
-
-    if (!StartFieldAsSingleRun && Field != FieldEnd &&
-        !IsBetterAsSingleFieldRun(Tail - StartBitOffset, StartBitOffset,
-                                  nextTail) &&
-        (!Field->isZeroLengthBitField() ||
-         (!astContext.getTargetInfo().useZeroLengthBitfieldAlignment() &&
-          !astContext.getTargetInfo().useBitFieldTypeAlignment())) &&
-        Tail == getFieldBitOffset(*Field)) {
-      Tail = nextTail;
+    if (InstallBest) {
+      assert((Field == FieldEnd || !Field->isBitField() ||
+              (getFieldBitOffset(*Field) % CharBits) == 0) &&
+             "Installing but not at an aligned bitfield or limit");
+      CharUnits AccessSize = BestEndOffset - BeginOffset;
+      if (!AccessSize.isZero()) {
+        // Add the storage member for the access unit to the record. The
+        // bitfields get the offset of their storage but come afterward and
+        // remain there after a stable sort.
+        mlir::Type Type;
+        if (BestClipped) {
+          assert(getSize(getUIntNType(astContext.toBits(AccessSize))) >
+                     AccessSize &&
+                 "Clipped access need not be clipped");
+          Type = getByteArrayType(AccessSize);
+        } else {
+          Type = getUIntNType(astContext.toBits(AccessSize));
+          assert(getSize(Type) == AccessSize &&
+                 "Unclipped access must be clipped");
+        }
+        members.push_back(StorageInfo(BeginOffset, Type));
+        for (; Begin != BestEnd; ++Begin)
+          if (!Begin->isZeroLengthBitField())
+            members.push_back(MemberInfo(
+                BeginOffset, MemberInfo::InfoKind::Field, nullptr, *Begin));
+      }
+      // Reset to start a new span.
+      Field = BestEnd;
+      Begin = FieldEnd;
+    } else {
+      assert(Field != FieldEnd && Field->isBitField() &&
+             "Accumulating past end of bitfields");
+      assert(!Barrier && "Accumulating across barrier");
+      // Accumulate this bitfield into the current (potential) span.
+      BitSizeSinceBegin += Field->getBitWidthValue();
       ++Field;
-      continue;
     }
-
-    // We've hit a break-point in the run and need to emit a storage field.
-    auto Type = getBitfieldStorageType(Tail - StartBitOffset);
-
-    // Add the storage member to the record and set the bitfield info for all of
-    // the bitfields in the run. Bitfields get the offset of their storage but
-    // come afterward and remain there after a stable sort.
-    members.push_back(StorageInfo(bitsToCharUnits(StartBitOffset), Type));
-    for (; Run != Field; ++Run)
-      members.push_back(MemberInfo(bitsToCharUnits(StartBitOffset),
-                                   MemberInfo::InfoKind::Field, nullptr, *Run));
-    Run = FieldEnd;
-    StartFieldAsSingleRun = false;
   }
+
+  return Field;
 }
 
 void CIRRecordLowering::accumulateFields() {
   for (RecordDecl::field_iterator field = recordDecl->field_begin(),
                                   fieldEnd = recordDecl->field_end();
        field != fieldEnd;) {
     if (field->isBitField()) {
-      RecordDecl::field_iterator start = field;
-      // Iterate to gather the list of bitfields.
-      for (++field; field != fieldEnd && field->isBitField(); ++field)
-        ;
-      accumulateBitFields(start, field);
+      field = accumulateBitFields(field, fieldEnd);
+      assert((field == fieldEnd || !field->isBitField()) &&
+             "Failed to accumulate all the bitfields");
     } else if (!field->isZeroSize(astContext)) {
       members.push_back(MemberInfo{bitsToCharUnits(getFieldBitOffset(*field)),
                                    MemberInfo::InfoKind::Field,
```
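As a side note on the arithmetic in the new loop: `bitsToCharUnits(BitSizeSinceBegin + CharBits - 1)` is a round-up-to-whole-characters conversion, and a span is installed rather than grown once its access unit would span more than one register. Below is a small model of just that arithmetic in plain C++, not the CIR API; `accessSizeInChars` is an illustrative name:

```cpp
#include <cassert>
#include <cstdint>

// Round a bit count up to whole characters, mirroring the expression
// bitsToCharUnits(BitSizeSinceBegin + CharBits - 1) in the diff above.
std::uint64_t accessSizeInChars(std::uint64_t bits, std::uint64_t charBits) {
  return (bits + charBits - 1) / charBits;
}

int main() {
  // struct S from the commit message: fields a..d cover 50 bits, so the
  // candidate access unit is 7 chars wide on an 8-bit-char target.
  assert(accessSizeInChars(50, 8) == 7);
  // Growing the span past a 64-bit register (RegSize of 8 chars here) is
  // what triggers InstallBest in the loop above instead of further growth.
  assert(accessSizeInChars(50 + 15, 8) == 9); // 9 > 8: install, don't grow
  return 0;
}
```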

clang/lib/CIR/CodeGen/TargetInfo.cpp

Lines changed: 36 additions & 0 deletions
```diff
@@ -25,6 +25,42 @@ static QualType useFirstFieldIfTransparentUnion(QualType Ty) {
   return Ty;
 }
 
+bool clang::CIRGen::isEmptyRecordForLayout(const ASTContext &Context,
+                                           QualType T) {
+  const RecordType *RT = T->getAs<RecordType>();
+  if (!RT)
+    return false;
+
+  const RecordDecl *RD = RT->getDecl();
+
+  // If this is a C++ record, check the bases first.
+  if (const CXXRecordDecl *CXXRD = dyn_cast<CXXRecordDecl>(RD)) {
+    if (CXXRD->isDynamicClass())
+      return false;
+
+    for (const auto &I : CXXRD->bases())
+      if (!isEmptyRecordForLayout(Context, I.getType()))
+        return false;
+  }
+
+  for (const auto *I : RD->fields())
+    if (!isEmptyFieldForLayout(Context, I))
+      return false;
+
+  return true;
+}
+
+bool clang::CIRGen::isEmptyFieldForLayout(const ASTContext &Context,
+                                          const FieldDecl *FD) {
+  if (FD->isZeroLengthBitField())
+    return true;
+
+  if (FD->isUnnamedBitField())
+    return false;
+
+  return isEmptyRecordForLayout(Context, FD->getType());
+}
+
 namespace {
 
 /// The default implementation for ABI specific
```
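To make the new helpers concrete, here are a few illustrative declarations (not part of the patch) showing what they should classify, assuming the behavior mirrors classic CodeGen's `isEmptyRecordForLayout`/`isEmptyFieldForLayout`:

```cpp
struct Empty {};                        // no bases, no fields -> empty
struct ZeroWidth { int : 0; };          // only a zero-width bit-field -> empty
struct Unnamed { int : 3; };            // unnamed but non-zero width -> not empty
struct Stored { int x; };               // a field with storage -> not empty
struct Derived : Empty {};              // only empty bases and fields -> empty
struct Dynamic { virtual ~Dynamic(); }; // dynamic class -> not empty
```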

clang/lib/CIR/CodeGen/TargetInfo.h

Lines changed: 10 additions & 0 deletions
```diff
@@ -25,6 +25,16 @@
 
 namespace clang::CIRGen {
 
+/// isEmptyFieldForLayout - Return true if the field is "empty", that is,
+/// either a zero-width bit-field or an isEmptyRecordForLayout.
+bool isEmptyFieldForLayout(const ASTContext &Context, const FieldDecl *FD);
+
+/// isEmptyRecordForLayout - Return true if a structure contains only empty
+/// base classes (per isEmptyRecordForLayout) and fields (per
+/// isEmptyFieldForLayout). Note, C++ record fields are considered empty
+/// if the [[no_unique_address]] attribute would have made them empty.
+bool isEmptyRecordForLayout(const ASTContext &Context, QualType T);
+
 class CIRGenFunction;
 class CIRGenModule;
 class CIRGenBuilderTy;
```
