[clang][UBSan] Add implicit conversion check for bitfields #75481

Zonotora · 2023-12-14T15:02:37Z

This patch implements the implicit truncation and implicit sign change checks for bitfields using UBSan. E.g.,
-fsanitize=implicit-bitfield-truncation and -fsanitize=implicit-bitfield-sign-change.
~~However, right now some unnecessary emits are generated that ideally will be removed in the future. Let me explain.~~

This commit implements the truncation and sign change checks whenever we are emitting a EmitStoreThroughBitfieldLValue. I use the llvm::Value updated through EmitStoreThroughBitfieldLValue to check whether the value changed or not. However, calling EmitStoreThroughBitfieldLValue means we have already evaluated RHS and potentially emitted a truncation or sign check already, which is the case if we have a bitfield of type char and the RHS is of type int. Thus, in this case we may report the truncation or sign change error two times at runtime under the right circumstances which is not ideal. E.g.,

#include <stdio.h>
#include <math.h>

typedef struct {
    unsigned char a:4;
} X;

int main(void) {
    X x;
    unsigned int a = 272;
    x.a = a;
    return 0;
}

~~produces~~

main.c:11:11: runtime error: implicit conversion from type 'unsigned int' of value 272 (32-bit, unsigned) to type 'unsigned char' changed the value to 16 (8-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:11 in
main.c:11:9: runtime error: implicit conversion from type 'unsigned int' of value 16 (32-bit, unsigned) to type 'unsigned char' changed the value to 0 (4-bit bitfield, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:9 in

~~because we truncate the value 272 into a char when evaluating the RHS (resulting in the value 16) and then truncate it again when assigning it to the bitfield.~~

I thought of disabling the relevant sanitizer checks when evaluating the RHS, but to make it work for all cases I need to extract the previous (before casting it to char in the above case) llvm::Value from Visit(E->getRHS()) in order to compare it, which seemed more complicated for a first approach. The current approach is easier to iterate upon and is more isolated, but again, not ideal.

I would appreciate feedback so that I can advance this to a mergeable state. This is my first commit here, so I have probably missed some essentials!

github-actions · 2023-12-14T15:02:55Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2023-12-14T15:03:22Z

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-compiler-rt-sanitizer
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Axel Lundberg (Zonotora)

Changes

This patch implements the implicit truncation and implicit sign change checks for bitfields using UBSan. E.g.,
-fsanitize=implicit-integer-truncation and -fsanitize=implicit-integer-sign-change.
However, right now some unnecessary emits are generated that ideally will be removed in the future. Let me explain.

This commit implements the truncation and sign change checks whenever we are emitting a EmitStoreThroughBitfieldLValue. I use the llvm::Value updated through EmitStoreThroughBitfieldLValue to check whether the value changed or not. However, calling EmitStoreThroughBitfieldLValue means we have already evaluated RHS and potentially emitted a truncation or sign check already, which is the case if we have a bitfield of type char and the RHS is of type int. Thus, in this case we may report the truncation or sign change error two times at runtime under the right circumstances which is not ideal. E.g.,

#include &lt;stdio.h&gt;
#include &lt;math.h&gt;

typedef struct {
    unsigned char a:4;
} X;

int main(void) {
    X x;
    unsigned int a = 272;
    x.a = a;
    return 0;
}

produces

main.c:11:11: runtime error: implicit conversion from type 'unsigned int' of value 272 (32-bit, unsigned) to type 'unsigned char' changed the value to 16 (8-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:11 in
main.c:11:9: runtime error: implicit conversion from type 'unsigned int' of value 16 (32-bit, unsigned) to type 'unsigned char' changed the value to 0 (4-bit bitfield, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:9 in

because we truncate the value 272 into a char when evaluating the RHS (resulting in the value 16) and then truncate it again when assigning it to the bitfield.

I thought of disabling the relevant sanitizer checks when evaluating the RHS, but to make it work for all cases I need to extract the previous (before casting it to char in the above case) llvm::Value from Visit(E->getRHS()) in order to compare it, which seemed more complicated for a first approach. The current approach is easier to iterate upon and is more isolated, but again, not ideal.

I would appreciate feedback so that I can advance this to a mergeable state. This is my first commit here, so I have probably missed some essentials!

Patch is 20.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75481.diff

5 Files Affected:

(modified) clang/lib/CodeGen/CGExprScalar.cpp (+283-26)
(added) clang/test/CodeGen/ubsan-bitfield-conversion.c (+27)
(modified) compiler-rt/lib/ubsan/ubsan_diag.h (+1-1)
(modified) compiler-rt/lib/ubsan/ubsan_handlers.cpp (+6-4)
(modified) compiler-rt/lib/ubsan/ubsan_handlers.h (+1)

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 41ad2ddac30d2d..c93c5d89583d68 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -15,6 +15,7 @@
 #include "CGDebugInfo.h"
 #include "CGObjCRuntime.h"
 #include "CGOpenMPRuntime.h"
+#include "CGRecordLayout.h"
 #include "CodeGenFunction.h"
 #include "CodeGenModule.h"
 #include "ConstantEmitter.h"
@@ -324,6 +325,25 @@ class ScalarExprEmitter
   void EmitIntegerSignChangeCheck(Value *Src, QualType SrcType, Value *Dst,
                                   QualType DstType, SourceLocation Loc);
 
+  /// Emit a check that an [implicit] truncation of a bitfield does not
+  /// discard any bits. It is not UB, so we use the value after truncation.
+  void EmitBitfieldTruncationCheck(Value *Src, QualType SrcType, Value *Dst,
+                                   QualType DstType, const CGBitFieldInfo &Info,
+                                   SourceLocation Loc);
+
+  /// Emit a check that an [implicit] conversion of a bitfield does not change
+  /// the sign of the value. It is not UB, so we use the value after conversion.
+  /// NOTE: Src and Dst may be the exact same value! (point to the same thing)
+  void EmitBitfieldSignChangeCheck(Value *Src, QualType SrcType, Value *Dst,
+                                   QualType DstType, const CGBitFieldInfo &Info,
+                                   SourceLocation Loc);
+
+  /// Emit a check that an [implicit] conversion of a bitfield. It is not UB,
+  /// so we use the value after conversion.
+  void EmitBitfieldConversionCheck(Value *Src, QualType SrcType, Value *Dst,
+                                   QualType DstType, const CGBitFieldInfo &Info,
+                                   SourceLocation Loc);
+
   /// Emit a conversion from the specified type to the specified destination
   /// type, both of which are LLVM scalar types.
   struct ScalarConversionOpts {
@@ -1094,6 +1114,27 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
                 {Src, Dst});
 }
 
+static llvm::Value *EmitIsNegativeTestHelper(Value *V, QualType VType,
+                                             const char *Name,
+                                             CGBuilderTy &Builder) {
+  // NOTE: zero value is considered to be non-negative.
+  // Is this value a signed type?
+  bool VSigned = VType->isSignedIntegerOrEnumerationType();
+  llvm::Type *VTy = V->getType();
+  if (!VSigned) {
+    // If the value is unsigned, then it is never negative.
+    // FIXME: can we encounter non-scalar VTy here?
+    return llvm::ConstantInt::getFalse(VTy->getContext());
+  }
+  // Get the zero of the same type with which we will be comparing.
+  llvm::Constant *Zero = llvm::ConstantInt::get(VTy, 0);
+  // %V.isnegative = icmp slt %V, 0
+  // I.e is %V *strictly* less than zero, does it have negative value?
+  return Builder.CreateICmp(llvm::ICmpInst::ICMP_SLT, V, Zero,
+                            llvm::Twine(Name) + "." + V->getName() +
+                                ".negativitycheck");
+};
+
 // Should be called within CodeGenFunction::SanitizerScope RAII scope.
 // Returns 'i1 false' when the conversion Src -> Dst changed the sign.
 static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
@@ -1118,30 +1159,12 @@ EmitIntegerSignChangeCheckHelper(Value *Src, QualType SrcType, Value *Dst,
   assert(((SrcBits != DstBits) || (SrcSigned != DstSigned)) &&
          "either the widths should be different, or the signednesses.");
 
-  // NOTE: zero value is considered to be non-negative.
-  auto EmitIsNegativeTest = [&Builder](Value *V, QualType VType,
-                                       const char *Name) -> Value * {
-    // Is this value a signed type?
-    bool VSigned = VType->isSignedIntegerOrEnumerationType();
-    llvm::Type *VTy = V->getType();
-    if (!VSigned) {
-      // If the value is unsigned, then it is never negative.
-      // FIXME: can we encounter non-scalar VTy here?
-      return llvm::ConstantInt::getFalse(VTy->getContext());
-    }
-    // Get the zero of the same type with which we will be comparing.
-    llvm::Constant *Zero = llvm::ConstantInt::get(VTy, 0);
-    // %V.isnegative = icmp slt %V, 0
-    // I.e is %V *strictly* less than zero, does it have negative value?
-    return Builder.CreateICmp(llvm::ICmpInst::ICMP_SLT, V, Zero,
-                              llvm::Twine(Name) + "." + V->getName() +
-                                  ".negativitycheck");
-  };
-
   // 1. Was the old Value negative?
-  llvm::Value *SrcIsNegative = EmitIsNegativeTest(Src, SrcType, "src");
+  llvm::Value *SrcIsNegative =
+      EmitIsNegativeTestHelper(Src, SrcType, "src", Builder);
   // 2. Is the new Value negative?
-  llvm::Value *DstIsNegative = EmitIsNegativeTest(Dst, DstType, "dst");
+  llvm::Value *DstIsNegative =
+      EmitIsNegativeTestHelper(Dst, DstType, "dst", Builder);
   // 3. Now, was the 'negativity status' preserved during the conversion?
   //    NOTE: conversion from negative to zero is considered to change the sign.
   //    (We want to get 'false' when the conversion changed the sign)
@@ -1236,6 +1259,221 @@ void ScalarExprEmitter::EmitIntegerSignChangeCheck(Value *Src, QualType SrcType,
                 {Src, Dst});
 }
 
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the truncation Src -> Dst was lossy.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+                 std::pair<llvm::Value *, SanitizerMask>>
+EmitBitfieldTruncationCheckHelper(Value *Src, QualType SrcType, Value *Dst,
+                                  QualType DstType, CGBuilderTy &Builder) {
+
+  llvm::Type *SrcTy = Src->getType();
+  llvm::Type *DstTy = Dst->getType();
+  (void)SrcTy; // Only used in assert()
+  (void)DstTy; // Only used in assert()
+
+  // This should be truncation of integral types.
+  assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+         "non-integer llvm type");
+
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+
+  ScalarExprEmitter::ImplicitConversionCheckKind Kind;
+  SanitizerMask Mask;
+  if (!SrcSigned && !DstSigned) {
+    Kind = ScalarExprEmitter::ICCK_UnsignedIntegerTruncation;
+    Mask = SanitizerKind::ImplicitUnsignedIntegerTruncation;
+  } else {
+    Kind = ScalarExprEmitter::ICCK_SignedIntegerTruncation;
+    Mask = SanitizerKind::ImplicitSignedIntegerTruncation;
+  }
+
+  // Since Src already has the same type as Dst, we don't have
+  // to sign extend or truncate the Src value.
+  llvm::Value *CheckV = Builder.CreateICmpEQ(Dst, Src, "bf.truncheck");
+  // If the comparison result is 'i1 false', then the truncation was lossy.
+
+  return std::make_pair(Kind, std::make_pair(CheckV, Mask));
+}
+
+void ScalarExprEmitter::EmitBitfieldTruncationCheck(
+    Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+    const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+  if (!CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitIntegerTruncation))
+    return;
+
+  // TODO: Calculate src width to avoid emitting code
+  // for unecessary cases.
+  unsigned SrcBits = ConvertType(SrcType)->getScalarSizeInBits();
+  unsigned BitfieldWidth = Info.Size;
+  // This must be truncation. Else we do not care.
+  if (SrcBits <= BitfieldWidth)
+    return;
+
+  // If the integer sign change sanitizer is enabled,
+  // and we are truncating from larger unsigned type to smaller signed type,
+  // let that next sanitizer deal with it.
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange) &&
+      (!SrcSigned && DstSigned))
+    return;
+
+  CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+  std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+            std::pair<llvm::Value *, SanitizerMask>>
+      Check = EmitBitfieldTruncationCheckHelper(Src, SrcType, Dst, DstType,
+                                                Builder);
+  // If the comparison result is 'i1 false', then the truncation was lossy.
+
+  llvm::Constant *StaticArgs[] = {
+      CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
+      CGF.EmitCheckTypeDescriptor(DstType),
+      llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+      llvm::ConstantInt::get(Builder.getInt32Ty(), Info.Size)};
+  CGF.EmitCheck(Check.second, SanitizerHandler::ImplicitConversion, StaticArgs,
+                {Src, Dst});
+}
+
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the conversion Src -> Dst changed the sign.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+                 std::pair<llvm::Value *, SanitizerMask>>
+EmitBitfieldSignChangeCheckHelper(Value *Src, QualType SrcType,
+                                  unsigned SrcBits, Value *Dst,
+                                  QualType DstType, unsigned DstBits,
+                                  CGBuilderTy &Builder) {
+  llvm::Type *SrcTy = Src->getType();
+  llvm::Type *DstTy = Dst->getType();
+  (void)SrcTy; // Only used in assert()
+  (void)DstTy; // Only used in assert()
+
+  // This should be truncation of integral types.
+  assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+         "non-integer llvm type");
+
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+
+  assert(((SrcBits != DstBits) || (SrcSigned != DstSigned)) &&
+         "either the widths should be different, or the signednesses.");
+
+  // 1. Was the old Value negative?
+  llvm::Value *SrcIsNegative =
+      EmitIsNegativeTestHelper(Src, SrcType, "bf.src", Builder);
+  // 2. Is the new Value negative?
+  llvm::Value *DstIsNegative =
+      EmitIsNegativeTestHelper(Dst, DstType, "bf.dst", Builder);
+  // 3. Now, was the 'negativity status' preserved during the conversion?
+  //    NOTE: conversion from negative to zero is considered to change the sign.
+  //    (We want to get 'false' when the conversion changed the sign)
+  //    So we should just equality-compare the negativity statuses.
+  llvm::Value *Check = nullptr;
+  Check =
+      Builder.CreateICmpEQ(SrcIsNegative, DstIsNegative, "bf.signchangecheck");
+  // If the comparison result is 'false', then the conversion changed the sign.
+  return std::make_pair(
+      ScalarExprEmitter::ICCK_IntegerSignChange,
+      std::make_pair(Check, SanitizerKind::ImplicitIntegerSignChange));
+}
+
+void ScalarExprEmitter::EmitBitfieldSignChangeCheck(
+    Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+    const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+  if (!CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange))
+    return;
+
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+  unsigned SrcBits = ConvertType(SrcType)->getScalarSizeInBits();
+  unsigned DstBits = Info.Size;
+
+  // Now, we do not need to emit the check in *all* of the cases.
+  // We can avoid emitting it in some obvious cases where it would have been
+  // dropped by the opt passes (instcombine) always anyways.
+  // If it's a cast between effectively the same type, no check.
+  // NOTE: this is *not* equivalent to checking the canonical types.
+  if (SrcSigned == DstSigned && SrcBits == DstBits)
+    return;
+  // At least one of the values needs to have signed type.
+  // If both are unsigned, then obviously, neither of them can be negative.
+  if (!SrcSigned && !DstSigned)
+    return;
+  // If the conversion is to *larger* *signed* type, then no check is needed.
+  // Because either sign-extension happens (so the sign will remain),
+  // or zero-extension will happen (the sign bit will be zero.)
+  if ((DstBits > SrcBits) && DstSigned)
+    return;
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+      (SrcBits > DstBits) && SrcSigned) {
+    // If the signed integer truncation sanitizer is enabled,
+    // and this is a truncation from signed type, then no check is needed.
+    // Because here sign change check is interchangeable with truncation check.
+    return;
+  }
+  // That's it. We can't rule out any more cases with the data we have.
+
+  CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+  std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+            std::pair<llvm::Value *, SanitizerMask>>
+      Check;
+
+  // Each of these checks needs to return 'false' when an issue was detected.
+  ImplicitConversionCheckKind CheckKind;
+  llvm::SmallVector<std::pair<llvm::Value *, SanitizerMask>, 2> Checks;
+  // So we can 'and' all the checks together, and still get 'false',
+  // if at least one of the checks detected an issue.
+
+  Check = EmitBitfieldSignChangeCheckHelper(Src, SrcType, SrcBits, Dst, DstType,
+                                            DstBits, Builder);
+  CheckKind = Check.first;
+  Checks.emplace_back(Check.second);
+
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+      (SrcBits > DstBits) && !SrcSigned && DstSigned) {
+    // If the signed integer truncation sanitizer was enabled,
+    // and we are truncating from larger unsigned type to smaller signed type,
+    // let's handle the case we skipped in that check.
+    Check =
+        EmitBitfieldTruncationCheckHelper(Src, SrcType, Dst, DstType, Builder);
+    CheckKind = ICCK_SignedIntegerTruncationOrSignChange;
+    Checks.emplace_back(Check.second);
+    // If the comparison result is 'i1 false', then the truncation was lossy.
+  }
+
+  llvm::Constant *StaticArgs[] = {
+      CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
+      CGF.EmitCheckTypeDescriptor(DstType),
+      llvm::ConstantInt::get(Builder.getInt8Ty(), CheckKind),
+      llvm::ConstantInt::get(Builder.getInt32Ty(), Info.Size)};
+  // EmitCheck() will 'and' all the checks together.
+  CGF.EmitCheck(Checks, SanitizerHandler::ImplicitConversion, StaticArgs,
+                {Src, Dst});
+}
+
+void ScalarExprEmitter::EmitBitfieldConversionCheck(
+    Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+    const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+  if (!CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitConversion))
+    return;
+
+  // We only care about int->int conversions here.
+  // We ignore conversions to/from pointer and/or bool.
+  if (!PromotionIsPotentiallyEligibleForImplicitIntegerConversionCheck(SrcType,
+                                                                       DstType))
+    return;
+
+  assert(!DstType->isBooleanType() && "we should not get here with booleans.");
+
+  EmitBitfieldTruncationCheck(Src, SrcType, Dst, DstType, Info, Loc);
+  EmitBitfieldSignChangeCheck(Src, SrcType, Dst, DstType, Info, Loc);
+}
+
 Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
                                          QualType DstType, llvm::Type *SrcTy,
                                          llvm::Type *DstTy,
@@ -2845,9 +3083,12 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
   }
 
   // Store the updated result through the lvalue.
-  if (LV.isBitField())
+  if (LV.isBitField()) {
+    Value *Src = value;
     CGF.EmitStoreThroughBitfieldLValue(RValue::get(value), LV, &value);
-  else
+    EmitBitfieldConversionCheck(Src, E->getType(), value, E->getType(),
+                                LV.getBitFieldInfo(), E->getExprLoc());
+  } else
     CGF.EmitStoreThroughLValue(RValue::get(value), LV);
 
   // If this is a postinc, return the value read from memory, otherwise use the
@@ -3372,9 +3613,17 @@ LValue ScalarExprEmitter::EmitCompoundAssignLValue(
   // specially because the result is altered by the store, i.e., [C99 6.5.16p1]
   // 'An assignment expression has the value of the left operand after the
   // assignment...'.
-  if (LHSLV.isBitField())
+  if (LHSLV.isBitField()) {
+    Value *Src = Result;
+    QualType SrcType = E->getRHS()->getType();
+    QualType DstType = E->getLHS()->getType();
+    if (ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(E->getRHS())) {
+      SrcType = ICE->getSubExpr()->getType();
+    }
     CGF.EmitStoreThroughBitfieldLValue(RValue::get(Result), LHSLV, &Result);
-  else
+    EmitBitfieldConversionCheck(Src, SrcType, Result, DstType,
+                                LHSLV.getBitFieldInfo(), E->getExprLoc());
+  } else
     CGF.EmitStoreThroughLValue(RValue::get(Result), LHSLV);
 
   if (CGF.getLangOpts().OpenMP)
@@ -4510,7 +4759,15 @@ Value *ScalarExprEmitter::VisitBinAssign(const BinaryOperator *E) {
     // 'An assignment expression has the value of the left operand after
     // the assignment...'.
     if (LHS.isBitField()) {
+      Value *Src = RHS;
+      QualType SrcType = E->getRHS()->getType();
+      QualType DstType = E->getLHS()->getType();
+      if (ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(E->getRHS())) {
+        SrcType = ICE->getSubExpr()->getType();
+      }
       CGF.EmitStoreThroughBitfieldLValue(RValue::get(RHS), LHS, &RHS);
+      EmitBitfieldConversionCheck(Src, SrcType, RHS, DstType,
+                                  LHS.getBitFieldInfo(), E->getExprLoc());
     } else {
       CGF.EmitNullabilityCheck(LHS, RHS, E->getExprLoc());
       CGF.EmitStoreThroughLValue(RValue::get(RHS), LHS);
diff --git a/clang/test/CodeGen/ubsan-bitfield-conversion.c b/clang/test/CodeGen/ubsan-bitfield-conversion.c
new file mode 100644
index 00000000000000..4b2940f106c576
--- /dev/null
+++ b/clang/test/CodeGen/ubsan-bitfield-conversion.c
@@ -0,0 +1,27 @@
+// RUN: %clang -fsanitize=implicit-integer-truncation -target x86_64-linux -S -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang -fsanitize=implicit-integer-sign-change -target x86_64-linux -S -emit-llvm -o - %s | FileCheck %s
+typedef struct _xx {
+  int x1:3;
+  char x2:2;
+} xx, *pxx;
+
+xx vxx;
+
+// CHECK-LABEL: define{{.*}} void @foo1
+void foo1(int x) {
+  vxx.x1 = x;
+  // CHECK: call void @__ubsan_handle_implicit_conversion
+}
+
+// CHECK: declare void @__ubsan_handle_implicit_conversion
+
+// CHECK-LABEL: define{{.*}} void @foo2
+void foo2(int x) {
+  vxx.x2 = x;
+  // CHECK: call void @__ubsan_handle_implicit_conversion
+  // TODO: Ideally we should only emit once (emit is generated
+  //       when evaluating RHS integer->char and when storing
+  //       value in bitfield)
+  // CHECK: call void @__ubsan_handle_implicit_conversion
+}
+
diff --git a/compiler-rt/lib/ubsan/ubsan_diag.h b/compiler-rt/lib/ubsan/ubsan_diag.h
index b444e971b22838..1e60651e272a70 100644
--- a/compiler-rt/lib/ubsan/ubsan_diag.h
+++ b/compiler-rt/lib/ubsan/ubsan_diag.h
@@ -177,7 +177,7 @@ class Diag {
   };
 
 private:
-  static const unsigned MaxArgs = 8;
+  static const unsigned MaxArgs = 9;
   static const unsigned MaxRanges = 1;
 
   /// The arguments which have been added to this diagnostic so far.
diff --git a/compiler-rt/lib/ubsan/ubsan_handlers.cpp b/compiler-rt/lib/ubsan/ubsan_handlers.cpp
index 0f16507d5d88f1..1232f60836b090 100644
--- a/compiler-rt/lib/ubsan/ubsan_handlers.cpp
+++ b/compiler-rt/lib/ubsan/ubsan_handlers.cpp
@@ -594,14 +594,16 @@ static void handleImplicitConversion(ImplicitConversionData *Data,
 
   ScopedReport R(Opts, Loc, ET);
 
-  // FIXME: is it possible to dump the values as hex with fixed width?
+  unsigned DstBits =
+      Data->BitfieldBits ? Data->BitfieldBits : DstTy.getIntegerBitWidth();
 
+  // FIXME: is it possible to dump the values as hex with fixed width?
   Diag(Loc, DL_Error, ET,
        "implicit conversion from type %0 of value %1 (%2-bit, %3signed) to "
-       "type %4 changed the value to %5 (%6-bit, %7signed)")
+       "type %4 changed the value to %5 (%6-bit%7, %8signed)")
       << SrcTy << Value(SrcTy, Src) << SrcTy.getIntegerBitWidth()
-      << (SrcSigned ? "" : "un") << DstTy << Value(DstTy, Dst)
-      << DstTy.getIntegerBitWidth() << (DstSigned ? "" : "un");
+      << (SrcSigned ? "" : "un") << DstTy << Value(DstTy, Dst) << DstBits
+      << (Data->BitfieldBits ? " bitfield" : "") << (DstSigned ? "" : "un");
 }
 
 void __ubsan::__ubsan_handle_implicit_conversion(ImplicitConversionData *Data,
diff --git a/compiler-rt/lib...
[truncated]

Zonotora · 2023-12-15T18:34:49Z

Hi, @vitalybuka @zygoloid @AaronBallman you might have something to say about this commit

vitalybuka · 2023-12-15T18:49:16Z

Is is UB?

vitalybuka · 2023-12-15T18:50:30Z

clang/lib/CodeGen/CGExprScalar.cpp

@@ -1094,6 +1114,27 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
                {Src, Dst});
 }

+static llvm::Value *EmitIsNegativeTestHelper(Value *V, QualType VType,


Please extract lambda to function into a separate patch

I still see it here, does it needs rebase, or that patch needs to be landed?

vitalybuka · 2023-12-15T18:55:10Z

@LebedevRI who added implicit-integer-truncation

vitalybuka · 2023-12-15T18:57:18Z

Is is UB?

I guess it follows existing convention implicit-integer-truncation is convenience check and implicit-signed-integer-truncation is UB

Zonotora · 2023-12-16T18:52:10Z

Is is UB?

Don't think so? To quote [1]:

Issues caught by these sanitizers are not undefined behavior, but are often unintentional.

where "these sanitizers" refer to implicit-unsigned-integer-truncation, implicit-signed-integer-truncation and implicit-integer-sign-change.

[1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

Zonotora · 2024-01-02T20:30:22Z

Feedback @vitalybuka @AaronBallman @LebedevRI? 😃

efriedma-quic · 2024-01-02T21:03:52Z

Please add test coverage for compound assignment and increment/decrement.

This seems like a reasonable extension of the existing integer truncation checks, but we might want to consider giving it a unique name. Otherwise, people using integer truncation checks will have trouble upgrading their compiler.

Zonotora · 2024-01-02T21:25:25Z

Please add test coverage for compound assignment and increment/decrement.

This seems like a reasonable extension of the existing integer truncation checks, but we might want to consider giving it a unique name. Otherwise, people using integer truncation checks will have trouble upgrading their compiler.

Sure 👍 Fair enough, what name do you suggest? Replace "integer" with "bitfield" as in implicit-unsigned-bitfield-truncation, implicit-signed-bitfield-truncation and implicit-bitfield-sign-change? Or just have a single flag implicit-bitfield-conversion for now?

Zonotora · 2024-02-07T14:55:08Z

Hi again, I have now finally gotten time and updated the patch so that the unnecessary emits I mentioned in the initial commit are avoided. The current patch introduces a number of new fsanitizer flags to separate integer conversions from bitfield conversions. E.g.,

-fsanitize=implicit-unsigned-bitfield-truncation
-fsanitize=implicit-signed-bitfield-truncation
-fsanitize=implicit-bitfield-sign-change
-fsanitize=implicit-bitfield-truncation
-fsanitize=implicit-bitfield-arithmetic-value-change
-fsanitize=implicit-bitfield-conversion
-fsanitize=implicit-integer-conversion <---- This used to be -fsanitize=implicit-conversion

-fsanitize=implicit-conversion will now represent -fsanitize=implicit-integer-conversion and -fsanitize=implicit-bitfield-conversion.

Previously the following:

typedef struct {
    unsigned char a:4;
} X;

int main(void) {
    X x;
    unsigned int a = 272;
    x.a = a;
    return 0;
}

emitted an implicit integer conversion error in the assignment of x.a = a with the -fsanitize=implicit-integer-conversion. This is no longer the case as the assignment involves bitfields. To get the emission error, one would have to include the -fsanitize=implicit-bitfield-conversion flag instead.

I have compiled clang with the -fsanitizer flag -fsanitize=implicit-bitfield-conversion without any problems. What are your thoughts on this new change? @vitalybuka @AaronBallman @LebedevRI @efriedma-quic

vabridgers · 2024-02-09T14:30:41Z

LGTM, but someone else must approve. Only other thing I might add is to try enabling these checks for some open source test cases. Maybe look through some past reviews to see what others have done, or wait for one of the maintainers/owners to comment on which ones might be best.

Thanks for doing this work, Axel!

AaronBallman · 2024-02-13T18:19:57Z

Precommit CI seems to have found some related failures that should be investigated. I'd love to hear opinions from the codegen code owners on this, as well as @zygoloid as UBSan code owner. I'm not opposed to the changes, though I do find it somewhat strange to add them to UBSan given that they're not undefined behavior (even though there's precedent).

efriedma-quic · 2024-03-13T22:57:11Z

compiler-rt/lib/ubsan/ubsan_handlers.h

@@ -147,6 +147,7 @@ struct ImplicitConversionData {
  const TypeDescriptor &FromType;
  const TypeDescriptor &ToType;
  /* ImplicitConversionCheckKind */ unsigned char Kind;
+  unsigned int BitfieldBits;


Do we have an ABI stability policy for UBsan? Is the change allowed under it?

Ping @zygoloid -- I'm not certain if we have an ABI stability policy for sanitizers or not.

clang/test/CodeGen/ubsan-bitfield-conversion.c

Zonotora · 2024-03-18T08:49:47Z

@LebedevRI LGTM?

This patch implements the implicit truncation and implicit sign change checks for bitfields.

Update testcases due to new -fsanitize options regarding bitfields

Zonotora · 2024-03-18T09:10:48Z

Also, I don't have commit access (first PR here), someone has to commit on my behalf!

Zonotora · 2024-03-27T08:48:14Z

Also, I don't have commit access (first PR here), someone has to commit on my behalf!

@AaronBallman

AaronBallman · 2024-03-27T11:33:44Z

Also, I don't have commit access (first PR here), someone has to commit on my behalf!

@AaronBallman

Thank you for mentioning this! I'll commit on your behalf, but I'd like to hear from @zygoloid on the ABI question. It would also be nice if @efriedma-quic signed off on the testing changes (I think they're sufficient, but he may have other cases in mind).

zygoloid · 2024-03-27T17:09:21Z

I don't think we've established an explicit policy, and we've made ABI-breaking changes previously. I think we should avoid breaks within a major release version (don't backport this) to avoid giving packagers headaches: the ubsan runtime is installed in a versioned directory, but I don't think that includes the patch version.

Having an ABI break here seems OK and in line with historic precedent. (But we ought to document our ABI policy for our runtime libraries.)

efriedma-quic

LGTM

Zonotora · 2024-04-03T10:09:05Z

@AaronBallman looks good?

vitalybuka · 2024-04-03T16:34:32Z

This bot started inconsistently crash https://lab.llvm.org/buildbot/#/builders/75/builds/45086
Few builds before are green and tests only, so I suspect this patch.

vitalybuka · 2024-04-03T16:35:55Z

These are consistently broken after the patch:
https://lab.llvm.org/buildbot/#/builders/77/builds/36021
https://lab.llvm.org/buildbot/#/builders/37/builds/33244
https://lab.llvm.org/buildbot/#/builders/19/builds/26087

…75481)" This reverts commit 450f195.

AaronBallman · 2024-04-03T16:39:22Z

Thank you for catching that and doing the revert!

Zonotora · 2024-04-03T17:09:08Z

Will look into this...

…75481)" This reverts commit 1189e87. This reverts commit 450f195.

…87518) Reverts #75481 Breaks multiple bots, see #75481

vitalybuka · 2024-04-03T17:44:31Z

Probably 1189e87 fixed the issue.
I will revert my revert if https://lab.llvm.org/buildbot/#/builders/269/builds/7066 is green.

vitalybuka · 2024-04-03T22:20:05Z

Still had to revert with #87562

) Fix since #75481 got reverted. - Explicitly set BitfieldBits to 0 to avoid uninitialized field member for the integer checks: ```diff - llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)}; + llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first), + llvm::ConstantInt::get(Builder.getInt32Ty(), 0)}; ``` - `Value **Previous` was erroneously `Value *Previous` in `CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now. - Update following: ```diff - if (Kind == CK_IntegralCast) { + if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) { ``` CK_LValueToRValue when going from, e.g., char to char, and CK_IntegralCast otherwise. - Make sure that `Value *Previous = nullptr;` is initialized (see 1189e87) - Add another extensive testcase `ubsan/TestCases/ImplicitConversion/bitfield-conversion.c` --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>

llvmbot added clang Clang issues not falling into any other category compiler-rt clang:codegen compiler-rt:sanitizer labels Dec 14, 2023

vitalybuka requested review from zygoloid, vitalybuka and AaronBallman December 15, 2023 18:48

vitalybuka reviewed Dec 15, 2023

View reviewed changes

vitalybuka requested a review from LebedevRI December 15, 2023 18:54

Zonotora force-pushed the main branch from 50bcbdf to 73a92d1 Compare December 16, 2023 18:37

Zonotora requested a review from vitalybuka January 2, 2024 08:18

Zonotora force-pushed the main branch from 73a92d1 to 49e3c6b Compare February 7, 2024 14:35

llvmbot added clang:frontend Language frontend issues, e.g. anything involving "Sema" compiler-rt:ubsan Undefined behavior sanitizer labels Feb 7, 2024

AaronBallman requested review from rjmccall and efriedma-quic February 13, 2024 18:16

Zonotora force-pushed the main branch from 49e3c6b to 5c3fd87 Compare February 13, 2024 20:28

Zonotora force-pushed the main branch 2 times, most recently from 537ee15 to c70b72b Compare March 12, 2024 21:39

vitalybuka approved these changes Mar 13, 2024

View reviewed changes

efriedma-quic reviewed Mar 13, 2024

View reviewed changes

efriedma-quic reviewed Mar 14, 2024

View reviewed changes

clang/test/CodeGen/ubsan-bitfield-conversion.c Show resolved Hide resolved

Zonotora force-pushed the main branch from c70b72b to 009a197 Compare March 14, 2024 19:15

Zonotora added 3 commits March 18, 2024 10:00

[clang] Extract negativity check lambda to function

42207b6

[clang][UBSan] Add implicit conversion check for bitfields

77fed88

This patch implements the implicit truncation and implicit sign change checks for bitfields.

[clang][UBSan] Update testcases

f160c9e

Update testcases due to new -fsanitize options regarding bitfields

Zonotora force-pushed the main branch from 009a197 to f160c9e Compare March 18, 2024 09:00

efriedma-quic approved these changes Mar 27, 2024

View reviewed changes

AaronBallman merged commit 450f195 into llvm:main Apr 3, 2024
4 checks passed

vitalybuka added a commit that referenced this pull request Apr 3, 2024

Revert "[clang][UBSan] Add implicit conversion check for bitfields (#…

075d427

…75481)" This reverts commit 450f195.

vitalybuka mentioned this pull request Apr 3, 2024

Revert "[clang][UBSan] Add implicit conversion check for bitfields" #87518

Merged

vitalybuka added a commit that referenced this pull request Apr 3, 2024

Revert "[clang][UBSan] Add implicit conversion check for bitfields (#…

2f46953

…75481)" This reverts commit 1189e87. This reverts commit 450f195.

vitalybuka added a commit that referenced this pull request Apr 3, 2024

Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#…

5822ca5

…87518) Reverts #75481 Breaks multiple bots, see #75481

Zonotora mentioned this pull request Apr 5, 2024

Fix "[clang][UBSan] Add implicit conversion check for bitfields" #87761

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[clang][UBSan] Add implicit conversion check for bitfields #75481

[clang][UBSan] Add implicit conversion check for bitfields #75481

Zonotora commented Dec 14, 2023 •

edited

github-actions bot commented Dec 14, 2023

llvmbot commented Dec 14, 2023 •

edited

Zonotora commented Dec 15, 2023

vitalybuka commented Dec 15, 2023

vitalybuka Dec 15, 2023

Zonotora Dec 16, 2023

vitalybuka Mar 13, 2024

vitalybuka commented Dec 15, 2023

vitalybuka commented Dec 15, 2023

Zonotora commented Dec 16, 2023

Zonotora commented Jan 2, 2024

efriedma-quic commented Jan 2, 2024

Zonotora commented Jan 2, 2024

Zonotora commented Feb 7, 2024 •

edited

vabridgers commented Feb 9, 2024 •

edited

AaronBallman commented Feb 13, 2024

efriedma-quic Mar 13, 2024

Zonotora Mar 14, 2024

AaronBallman Mar 27, 2024

Zonotora commented Mar 18, 2024

Zonotora commented Mar 18, 2024

Zonotora commented Mar 27, 2024

AaronBallman commented Mar 27, 2024

zygoloid commented Mar 27, 2024

efriedma-quic left a comment

Zonotora commented Apr 3, 2024

vitalybuka commented Apr 3, 2024

vitalybuka commented Apr 3, 2024

AaronBallman commented Apr 3, 2024

Zonotora commented Apr 3, 2024

vitalybuka commented Apr 3, 2024 •

edited

vitalybuka commented Apr 3, 2024

[clang][UBSan] Add implicit conversion check for bitfields #75481

[clang][UBSan] Add implicit conversion check for bitfields #75481

Conversation

Zonotora commented Dec 14, 2023 • edited

github-actions bot commented Dec 14, 2023

llvmbot commented Dec 14, 2023 • edited

Zonotora commented Dec 15, 2023

vitalybuka commented Dec 15, 2023

vitalybuka Dec 15, 2023

Choose a reason for hiding this comment

Zonotora Dec 16, 2023

Choose a reason for hiding this comment

vitalybuka Mar 13, 2024

Choose a reason for hiding this comment

vitalybuka commented Dec 15, 2023

vitalybuka commented Dec 15, 2023

Zonotora commented Dec 16, 2023

Zonotora commented Jan 2, 2024

efriedma-quic commented Jan 2, 2024

Zonotora commented Jan 2, 2024

Zonotora commented Feb 7, 2024 • edited

vabridgers commented Feb 9, 2024 • edited

AaronBallman commented Feb 13, 2024

efriedma-quic Mar 13, 2024

Choose a reason for hiding this comment

Zonotora Mar 14, 2024

Choose a reason for hiding this comment

AaronBallman Mar 27, 2024

Choose a reason for hiding this comment

Zonotora commented Mar 18, 2024

Zonotora commented Mar 18, 2024

Zonotora commented Mar 27, 2024

AaronBallman commented Mar 27, 2024

zygoloid commented Mar 27, 2024

efriedma-quic left a comment

Choose a reason for hiding this comment

Zonotora commented Apr 3, 2024

vitalybuka commented Apr 3, 2024

vitalybuka commented Apr 3, 2024

AaronBallman commented Apr 3, 2024

Zonotora commented Apr 3, 2024

vitalybuka commented Apr 3, 2024 • edited

vitalybuka commented Apr 3, 2024

Zonotora commented Dec 14, 2023 •

edited

llvmbot commented Dec 14, 2023 •

edited

Zonotora commented Feb 7, 2024 •

edited

vabridgers commented Feb 9, 2024 •

edited

vitalybuka commented Apr 3, 2024 •

edited