Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang] Convert __builtin_dynamic_object_size into a calculation #80256

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bwendling
Copy link
Collaborator

If we're able to, we convert a call to __builtin_dynamic_object_size into a calculation. This is a cleaner implementation, and has the potential to improve optimizations and, especially, inlining, because optimizing around an 'llvm.objectsize' call isn't trivial.

If we're able to, we convert a call to __builtin_dynamic_object_size
into a calculation. This is a cleaner implementation, and has the
potential to improve optimizations and, especially, inlining, because
optimizing around an 'llvm.objectsize' call isn't trivial.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Feb 1, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 1, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Bill Wendling (bwendling)

Changes

If we're able to, we convert a call to __builtin_dynamic_object_size into a calculation. This is a cleaner implementation, and has the potential to improve optimizations and, especially, inlining, because optimizing around an 'llvm.objectsize' call isn't trivial.


Full diff: https://github.com/llvm/llvm-project/pull/80256.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+149-6)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+7-6)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+6)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index f3ab5ad7b08ec..2c109821de43c 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -26,6 +26,7 @@
 #include "clang/AST/Decl.h"
 #include "clang/AST/OSLog.h"
 #include "clang/AST/OperationKinds.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/Basic/TargetBuiltins.h"
 #include "clang/Basic/TargetInfo.h"
 #include "clang/Basic/TargetOptions.h"
@@ -1051,6 +1052,145 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
   return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned));
 }
 
+namespace {
+
+/// \p StructBaseExpr returns the base \p Expr with a structure or union type.
+struct StructBaseExpr : public ConstStmtVisitor<StructBaseExpr, const Expr *> {
+  StructBaseExpr() = default;
+
+  //===--------------------------------------------------------------------===//
+  //                            Visitor Methods
+  //===--------------------------------------------------------------------===//
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  const Expr *Visit(const Expr *E) {
+    QualType Ty = E->getType();
+    if (Ty->isStructureType() || Ty->isUnionType())
+      return E;
+
+    return ConstStmtVisitor<StructBaseExpr, const Expr *>::Visit(E);
+  }
+
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; }
+
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+    return Visit(E->getBase());
+  }
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+    return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+    return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+    return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) {
+    return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) {
+    return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+/// The offset of a field from the beginning of the record.
+llvm::Value *
+CodeGenFunction::tryEmitObjectSizeCalculation(const Expr *E, unsigned Type,
+                                              llvm::IntegerType *ResType) {
+  if ((Type & 0x01) != 0)
+    // We handle only the whole object size.
+    return nullptr;
+
+  E = E->IgnoreParenImpCasts();
+
+  const Expr *Base = StructBaseExpr().Visit(E);
+  if (!Base)
+    return nullptr;
+
+  const RecordDecl *RD = Base->getType()->getAsRecordDecl();
+  if (!RD)
+    return nullptr;
+
+  // Get the full size of the struct.
+  ASTContext &Ctx = getContext();
+  const RecordDecl *OuterRD = RD->getOuterLexicalRecordContext();
+  const clang::Type *RT = OuterRD->getTypeForDecl();
+  CharUnits RecordSize = Ctx.getTypeSizeInChars(RT);
+
+  Value *Res = nullptr;
+
+  if (const auto *U = dyn_cast<UnaryOperator>(E);
+      U && (U->getOpcode() == UO_AddrOf || U->getOpcode() == UO_Deref))
+    E = U->getSubExpr()->IgnoreParenImpCasts();
+
+  if (const auto *ASE = dyn_cast<ArraySubscriptExpr>(E)) {
+    const Expr *Idx = ASE->getIdx();
+    Base = ASE->getBase()->IgnoreParenImpCasts();
+
+    if (const auto *ME = dyn_cast<MemberExpr>(Base);
+        ME && ME->getType()->isConstantArrayType()) {
+      // The simple case:
+      //
+      //     struct s {
+      //         int arr[42];
+      //         char c;
+      //         /* others */
+      //     };
+      //
+      //     __builtin_dynamic_object_size(&p->arr[idx], 0);
+      //
+      // We can translate the __builtin_dynamic_object_call into:
+      //
+      //     sizeof(struct s) - offsetof(arr) - (idx * sizeof(int))
+      //
+      bool IsSigned = Idx->getType()->isSignedIntegerType();
+      Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+      IdxInst = Builder.CreateIntCast(IdxInst, ResType, IsSigned);
+
+      const ConstantArrayType *CAT = cast<ConstantArrayType>(ME->getType());
+      CharUnits ElemSize = Ctx.getTypeSizeInChars(CAT->getElementType());
+      Value *ElemSizeInst = Builder.getInt32(ElemSize.getQuantity());
+      ElemSizeInst = Builder.CreateIntCast(ElemSizeInst, ResType, IsSigned);
+
+      // idx * sizeof(<arr element type>)
+      Res = Builder.CreateMul(IdxInst, ElemSizeInst, "", !IsSigned, IsSigned);
+
+      // sizeof(struct s)
+      Value *RecordSizeInst = Builder.getInt32(RecordSize.getQuantity());
+      RecordSizeInst = Builder.CreateIntCast(RecordSizeInst, ResType,
+                                             RT->isSignedIntegerType());
+
+      // offsetof(arr)
+      int64_t Offset = 0;
+      getFieldOffsetInBits(OuterRD, cast<FieldDecl>(ME->getMemberDecl()),
+                           Offset);
+
+      CharUnits OffsetVal = Ctx.toCharUnitsFromBits(Offset);
+      Value *OffsetInst = Builder.getInt64(OffsetVal.getQuantity());
+      OffsetInst = Builder.CreateIntCast(OffsetInst, ResType, IsSigned);
+
+      // sizeof(struct s) - offsetof(arr) ...
+      RecordSizeInst = Builder.CreateSub(RecordSizeInst, OffsetInst, "",
+                                         !IsSigned, IsSigned);
+      // ... - (idx * sizeof(<arr element type>))
+      RecordSizeInst =
+          Builder.CreateSub(RecordSizeInst, Res, "", !IsSigned, IsSigned);
+
+      // Don't allow the index or result to be negative.
+      Value *Cmp = Builder.CreateAnd(Builder.CreateIsNotNeg(Res),
+                                     Builder.CreateIsNotNeg(RecordSizeInst));
+
+      Res = Builder.CreateSelect(Cmp, RecordSizeInst,
+                                 ConstantInt::get(ResType, 0, IsSigned));
+    }
+  }
+
+  return Res;
+}
+
 /// Returns a Value corresponding to the size of the given expression.
 /// This Value may be either of the following:
 ///   - A llvm::Argument (if E is a param with the pass_object_size attribute on
@@ -1083,18 +1223,21 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type,
     }
   }
 
+  // LLVM can't handle Type=3 appropriately, and __builtin_object_size shouldn't
+  // evaluate E for side-effects. In either case, we shouldn't lower to
+  // @llvm.objectsize.
+  if (Type == 3 || (!EmittedE && E->HasSideEffects(getContext())))
+    return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
   if (IsDynamic) {
     // Emit special code for a flexible array member with the "counted_by"
     // attribute.
     if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType))
       return V;
-  }
 
-  // LLVM can't handle Type=3 appropriately, and __builtin_object_size shouldn't
-  // evaluate E for side-effects. In either case, we shouldn't lower to
-  // @llvm.objectsize.
-  if (Type == 3 || (!EmittedE && E->HasSideEffects(getContext())))
-    return getDefaultBuiltinObjectSizeResult(Type, ResType);
+    if (Value *V = tryEmitObjectSizeCalculation(E, Type, ResType))
+      return V;
+  }
 
   Value *Ptr = EmittedE ? EmittedE : EmitScalarExpr(E);
   assert(Ptr->getType()->isPointerTy() &&
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 4a2f3caad6588..ce7b3040e94c1 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -4064,15 +4064,16 @@ static Address emitArraySubscriptGEP(CodeGenFunction &CGF, Address addr,
 }
 
 /// The offset of a field from the beginning of the record.
-static bool getFieldOffsetInBits(CodeGenFunction &CGF, const RecordDecl *RD,
-                                 const FieldDecl *FD, int64_t &Offset) {
-  ASTContext &Ctx = CGF.getContext();
+bool CodeGenFunction::getFieldOffsetInBits(const RecordDecl *RD,
+                                           const FieldDecl *FD,
+                                           int64_t &Offset) {
+  ASTContext &Ctx = getContext();
   const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
   unsigned FieldNo = 0;
 
   for (const Decl *D : RD->decls()) {
     if (const auto *Record = dyn_cast<RecordDecl>(D))
-      if (getFieldOffsetInBits(CGF, Record, FD, Offset)) {
+      if (getFieldOffsetInBits(Record, FD, Offset)) {
         Offset += Layout.getFieldOffset(FieldNo);
         return true;
       }
@@ -4108,11 +4109,11 @@ static std::optional<int64_t> getOffsetDifferenceInBits(CodeGenFunction &CGF,
     return std::optional<int64_t>();
 
   int64_t FD1Offset = 0;
-  if (!getFieldOffsetInBits(CGF, FD1OuterRec, FD1, FD1Offset))
+  if (!CGF.getFieldOffsetInBits(FD1OuterRec, FD1, FD1Offset))
     return std::optional<int64_t>();
 
   int64_t FD2Offset = 0;
-  if (!getFieldOffsetInBits(CGF, FD2OuterRec, FD2, FD2Offset))
+  if (!CGF.getFieldOffsetInBits(FD2OuterRec, FD2, FD2Offset))
     return std::optional<int64_t>();
 
   return std::make_optional<int64_t>(FD1Offset - FD2Offset);
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 143ad64e8816b..5a0725a76149a 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -3093,6 +3093,9 @@ class CodeGenFunction : public CodeGenTypeCache {
                                       const FieldDecl *FAMDecl,
                                       const FieldDecl *CountDecl);
 
+  bool getFieldOffsetInBits(const RecordDecl *RD, const FieldDecl *FD,
+                            int64_t &Offset);
+
   llvm::Value *EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
                                        bool isInc, bool isPre);
   ComplexPairTy EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
@@ -4892,6 +4895,9 @@ class CodeGenFunction : public CodeGenTypeCache {
                                      llvm::Value *EmittedE,
                                      bool IsDynamic);
 
+  llvm::Value *tryEmitObjectSizeCalculation(const Expr *E, unsigned Type,
+                                            llvm::IntegerType *ResType);
+
   llvm::Value *emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
                                            llvm::IntegerType *ResType);
 

@bwendling
Copy link
Collaborator Author

This is a first step in improving our __bdos implementation. It doesn't address the "sub-object" discussion from the previous PR (#78526), that will come later on. This patch takes care of a common use of __bdos that can't be calculated by the front-end (i.e. performing a __bdos on an array with an variable index). I'll be expanding the cases this can handle in the future.

@nikic nikic removed their request for review February 1, 2024 09:22
@nikic
Copy link
Contributor

nikic commented Feb 1, 2024

(Dropping myself as reviewer as I'm not really familiar with clang.)

Looks like this is missing test coverage?

Comment on lines +1083 to +1094
const Expr *VisitCastExpr(const CastExpr *E) {
return Visit(E->getSubExpr());
}
const Expr *VisitParenExpr(const ParenExpr *E) {
return Visit(E->getSubExpr());
}
const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) {
return Visit(E->getSubExpr());
}
const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) {
return Visit(E->getSubExpr());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through both casts and dereferences seems like it won't be correct:

struct A { int ***p; } a;
int n = __builtin_dynamic_object_size(***a.p, 0);

... looks like it'll base the result on the size of A, which is unrelated.

I think you should be more conservative about the kinds of casts you'll walk through here: bitcasts and pointer casts seem potentially OK to step through, but things like lvalue-to-rvalue conversions will be problematic.

But looking at what you're doing with this below, I think it might be possible to avoid doing this walk entirely.

Copy link
Collaborator Author

@bwendling bwendling Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Visitor is only to get the structure that the member resides in. The thing about __builtin_dynamic_object_size(**a.p, 0) is that we're not trying to calculate what p points to, but the size of the struct from p to the end (note, using ***a.p in the __bdos isn't valid because it needs a pointer for the first argument). It's counter-intuitive, but it's not (currently) supported to ask __bdos the size of what p points to is.

Comment on lines +1145 to +1148
// We can translate the __builtin_dynamic_object_call into:
//
// sizeof(struct s) - offsetof(arr) - (idx * sizeof(int))
//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we only need to do this for Type == 1 (for 0 and 2, we can just use the LLVM intrinsic, and you're bailing out for 3). In that case, I don't think we need to care about sizeof s here -- for a pointer to an array subscript expression where we know the array bound, we can use max(sizeof(arr) - sizeof(arr[0]) * idx, 0). Type 1 shouldn't permit overwriting the unrelated c field in the example above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR isn't dealing with the sub-object issue from the previous PR. This just mimics the current behavior of the intrinsic. I'll be extending it to sub-objects further on. The point about type 2 returning zero if the argument can point to multiple objects is messy on the GCC side of things. It's there because of how GCC's internal representation, but doesn't appear to apply to Clang.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants