-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][UBSan] Add implicit conversion check for bitfields #75481
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write If you have received no comments on your PR for a week, you can request a review If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang-codegen Author: Axel Lundberg (Zonotora) ChangesThis patch implements the implicit truncation and implicit sign change checks for bitfields using UBSan. E.g., This commit implements the truncation and sign change checks whenever we are emitting a #include <stdio.h>
#include <math.h>
typedef struct {
unsigned char a:4;
} X;
int main(void) {
X x;
unsigned int a = 272;
x.a = a;
return 0;
} produces main.c:11:11: runtime error: implicit conversion from type 'unsigned int' of value 272 (32-bit, unsigned) to type 'unsigned char' changed the value to 16 (8-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:11 in
main.c:11:9: runtime error: implicit conversion from type 'unsigned int' of value 16 (32-bit, unsigned) to type 'unsigned char' changed the value to 0 (4-bit bitfield, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior main.c:12:9 in because we truncate the value 272 into a char when evaluating the RHS (resulting in the value 16) and then truncate it again when assigning it to the bitfield. I thought of disabling the relevant sanitizer checks when evaluating the RHS, but to make it work for all cases I need to extract the previous (before casting it to char in the above case) I would appreciate feedback so that I can advance this to a mergeable state. This is my first commit here, so I have probably missed some essentials! Patch is 20.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75481.diff 5 Files Affected:
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 41ad2ddac30d2d..c93c5d89583d68 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -15,6 +15,7 @@
#include "CGDebugInfo.h"
#include "CGObjCRuntime.h"
#include "CGOpenMPRuntime.h"
+#include "CGRecordLayout.h"
#include "CodeGenFunction.h"
#include "CodeGenModule.h"
#include "ConstantEmitter.h"
@@ -324,6 +325,25 @@ class ScalarExprEmitter
void EmitIntegerSignChangeCheck(Value *Src, QualType SrcType, Value *Dst,
QualType DstType, SourceLocation Loc);
+ /// Emit a check that an [implicit] truncation of a bitfield does not
+ /// discard any bits. It is not UB, so we use the value after truncation.
+ void EmitBitfieldTruncationCheck(Value *Src, QualType SrcType, Value *Dst,
+ QualType DstType, const CGBitFieldInfo &Info,
+ SourceLocation Loc);
+
+ /// Emit a check that an [implicit] conversion of a bitfield does not change
+ /// the sign of the value. It is not UB, so we use the value after conversion.
+ /// NOTE: Src and Dst may be the exact same value! (point to the same thing)
+ void EmitBitfieldSignChangeCheck(Value *Src, QualType SrcType, Value *Dst,
+ QualType DstType, const CGBitFieldInfo &Info,
+ SourceLocation Loc);
+
+ /// Emit a check that an [implicit] conversion of a bitfield. It is not UB,
+ /// so we use the value after conversion.
+ void EmitBitfieldConversionCheck(Value *Src, QualType SrcType, Value *Dst,
+ QualType DstType, const CGBitFieldInfo &Info,
+ SourceLocation Loc);
+
/// Emit a conversion from the specified type to the specified destination
/// type, both of which are LLVM scalar types.
struct ScalarConversionOpts {
@@ -1094,6 +1114,27 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
{Src, Dst});
}
+static llvm::Value *EmitIsNegativeTestHelper(Value *V, QualType VType,
+ const char *Name,
+ CGBuilderTy &Builder) {
+ // NOTE: zero value is considered to be non-negative.
+ // Is this value a signed type?
+ bool VSigned = VType->isSignedIntegerOrEnumerationType();
+ llvm::Type *VTy = V->getType();
+ if (!VSigned) {
+ // If the value is unsigned, then it is never negative.
+ // FIXME: can we encounter non-scalar VTy here?
+ return llvm::ConstantInt::getFalse(VTy->getContext());
+ }
+ // Get the zero of the same type with which we will be comparing.
+ llvm::Constant *Zero = llvm::ConstantInt::get(VTy, 0);
+ // %V.isnegative = icmp slt %V, 0
+ // I.e is %V *strictly* less than zero, does it have negative value?
+ return Builder.CreateICmp(llvm::ICmpInst::ICMP_SLT, V, Zero,
+ llvm::Twine(Name) + "." + V->getName() +
+ ".negativitycheck");
+};
+
// Should be called within CodeGenFunction::SanitizerScope RAII scope.
// Returns 'i1 false' when the conversion Src -> Dst changed the sign.
static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
@@ -1118,30 +1159,12 @@ EmitIntegerSignChangeCheckHelper(Value *Src, QualType SrcType, Value *Dst,
assert(((SrcBits != DstBits) || (SrcSigned != DstSigned)) &&
"either the widths should be different, or the signednesses.");
- // NOTE: zero value is considered to be non-negative.
- auto EmitIsNegativeTest = [&Builder](Value *V, QualType VType,
- const char *Name) -> Value * {
- // Is this value a signed type?
- bool VSigned = VType->isSignedIntegerOrEnumerationType();
- llvm::Type *VTy = V->getType();
- if (!VSigned) {
- // If the value is unsigned, then it is never negative.
- // FIXME: can we encounter non-scalar VTy here?
- return llvm::ConstantInt::getFalse(VTy->getContext());
- }
- // Get the zero of the same type with which we will be comparing.
- llvm::Constant *Zero = llvm::ConstantInt::get(VTy, 0);
- // %V.isnegative = icmp slt %V, 0
- // I.e is %V *strictly* less than zero, does it have negative value?
- return Builder.CreateICmp(llvm::ICmpInst::ICMP_SLT, V, Zero,
- llvm::Twine(Name) + "." + V->getName() +
- ".negativitycheck");
- };
-
// 1. Was the old Value negative?
- llvm::Value *SrcIsNegative = EmitIsNegativeTest(Src, SrcType, "src");
+ llvm::Value *SrcIsNegative =
+ EmitIsNegativeTestHelper(Src, SrcType, "src", Builder);
// 2. Is the new Value negative?
- llvm::Value *DstIsNegative = EmitIsNegativeTest(Dst, DstType, "dst");
+ llvm::Value *DstIsNegative =
+ EmitIsNegativeTestHelper(Dst, DstType, "dst", Builder);
// 3. Now, was the 'negativity status' preserved during the conversion?
// NOTE: conversion from negative to zero is considered to change the sign.
// (We want to get 'false' when the conversion changed the sign)
@@ -1236,6 +1259,221 @@ void ScalarExprEmitter::EmitIntegerSignChangeCheck(Value *Src, QualType SrcType,
{Src, Dst});
}
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the truncation Src -> Dst was lossy.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+ std::pair<llvm::Value *, SanitizerMask>>
+EmitBitfieldTruncationCheckHelper(Value *Src, QualType SrcType, Value *Dst,
+ QualType DstType, CGBuilderTy &Builder) {
+
+ llvm::Type *SrcTy = Src->getType();
+ llvm::Type *DstTy = Dst->getType();
+ (void)SrcTy; // Only used in assert()
+ (void)DstTy; // Only used in assert()
+
+ // This should be truncation of integral types.
+ assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+ "non-integer llvm type");
+
+ bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+ bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+
+ ScalarExprEmitter::ImplicitConversionCheckKind Kind;
+ SanitizerMask Mask;
+ if (!SrcSigned && !DstSigned) {
+ Kind = ScalarExprEmitter::ICCK_UnsignedIntegerTruncation;
+ Mask = SanitizerKind::ImplicitUnsignedIntegerTruncation;
+ } else {
+ Kind = ScalarExprEmitter::ICCK_SignedIntegerTruncation;
+ Mask = SanitizerKind::ImplicitSignedIntegerTruncation;
+ }
+
+ // Since Src already has the same type as Dst, we don't have
+ // to sign extend or truncate the Src value.
+ llvm::Value *CheckV = Builder.CreateICmpEQ(Dst, Src, "bf.truncheck");
+ // If the comparison result is 'i1 false', then the truncation was lossy.
+
+ return std::make_pair(Kind, std::make_pair(CheckV, Mask));
+}
+
+void ScalarExprEmitter::EmitBitfieldTruncationCheck(
+ Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+ const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+ if (!CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitIntegerTruncation))
+ return;
+
+ // TODO: Calculate src width to avoid emitting code
+ // for unecessary cases.
+ unsigned SrcBits = ConvertType(SrcType)->getScalarSizeInBits();
+ unsigned BitfieldWidth = Info.Size;
+ // This must be truncation. Else we do not care.
+ if (SrcBits <= BitfieldWidth)
+ return;
+
+ // If the integer sign change sanitizer is enabled,
+ // and we are truncating from larger unsigned type to smaller signed type,
+ // let that next sanitizer deal with it.
+ bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+ bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+ if (CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange) &&
+ (!SrcSigned && DstSigned))
+ return;
+
+ CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+ std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+ std::pair<llvm::Value *, SanitizerMask>>
+ Check = EmitBitfieldTruncationCheckHelper(Src, SrcType, Dst, DstType,
+ Builder);
+ // If the comparison result is 'i1 false', then the truncation was lossy.
+
+ llvm::Constant *StaticArgs[] = {
+ CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
+ CGF.EmitCheckTypeDescriptor(DstType),
+ llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), Info.Size)};
+ CGF.EmitCheck(Check.second, SanitizerHandler::ImplicitConversion, StaticArgs,
+ {Src, Dst});
+}
+
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the conversion Src -> Dst changed the sign.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+ std::pair<llvm::Value *, SanitizerMask>>
+EmitBitfieldSignChangeCheckHelper(Value *Src, QualType SrcType,
+ unsigned SrcBits, Value *Dst,
+ QualType DstType, unsigned DstBits,
+ CGBuilderTy &Builder) {
+ llvm::Type *SrcTy = Src->getType();
+ llvm::Type *DstTy = Dst->getType();
+ (void)SrcTy; // Only used in assert()
+ (void)DstTy; // Only used in assert()
+
+ // This should be truncation of integral types.
+ assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+ "non-integer llvm type");
+
+ bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+ bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+
+ assert(((SrcBits != DstBits) || (SrcSigned != DstSigned)) &&
+ "either the widths should be different, or the signednesses.");
+
+ // 1. Was the old Value negative?
+ llvm::Value *SrcIsNegative =
+ EmitIsNegativeTestHelper(Src, SrcType, "bf.src", Builder);
+ // 2. Is the new Value negative?
+ llvm::Value *DstIsNegative =
+ EmitIsNegativeTestHelper(Dst, DstType, "bf.dst", Builder);
+ // 3. Now, was the 'negativity status' preserved during the conversion?
+ // NOTE: conversion from negative to zero is considered to change the sign.
+ // (We want to get 'false' when the conversion changed the sign)
+ // So we should just equality-compare the negativity statuses.
+ llvm::Value *Check = nullptr;
+ Check =
+ Builder.CreateICmpEQ(SrcIsNegative, DstIsNegative, "bf.signchangecheck");
+ // If the comparison result is 'false', then the conversion changed the sign.
+ return std::make_pair(
+ ScalarExprEmitter::ICCK_IntegerSignChange,
+ std::make_pair(Check, SanitizerKind::ImplicitIntegerSignChange));
+}
+
+void ScalarExprEmitter::EmitBitfieldSignChangeCheck(
+ Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+ const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+ if (!CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange))
+ return;
+
+ bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+ bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+ unsigned SrcBits = ConvertType(SrcType)->getScalarSizeInBits();
+ unsigned DstBits = Info.Size;
+
+ // Now, we do not need to emit the check in *all* of the cases.
+ // We can avoid emitting it in some obvious cases where it would have been
+ // dropped by the opt passes (instcombine) always anyways.
+ // If it's a cast between effectively the same type, no check.
+ // NOTE: this is *not* equivalent to checking the canonical types.
+ if (SrcSigned == DstSigned && SrcBits == DstBits)
+ return;
+ // At least one of the values needs to have signed type.
+ // If both are unsigned, then obviously, neither of them can be negative.
+ if (!SrcSigned && !DstSigned)
+ return;
+ // If the conversion is to *larger* *signed* type, then no check is needed.
+ // Because either sign-extension happens (so the sign will remain),
+ // or zero-extension will happen (the sign bit will be zero.)
+ if ((DstBits > SrcBits) && DstSigned)
+ return;
+ if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+ (SrcBits > DstBits) && SrcSigned) {
+ // If the signed integer truncation sanitizer is enabled,
+ // and this is a truncation from signed type, then no check is needed.
+ // Because here sign change check is interchangeable with truncation check.
+ return;
+ }
+ // That's it. We can't rule out any more cases with the data we have.
+
+ CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+ std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+ std::pair<llvm::Value *, SanitizerMask>>
+ Check;
+
+ // Each of these checks needs to return 'false' when an issue was detected.
+ ImplicitConversionCheckKind CheckKind;
+ llvm::SmallVector<std::pair<llvm::Value *, SanitizerMask>, 2> Checks;
+ // So we can 'and' all the checks together, and still get 'false',
+ // if at least one of the checks detected an issue.
+
+ Check = EmitBitfieldSignChangeCheckHelper(Src, SrcType, SrcBits, Dst, DstType,
+ DstBits, Builder);
+ CheckKind = Check.first;
+ Checks.emplace_back(Check.second);
+
+ if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+ (SrcBits > DstBits) && !SrcSigned && DstSigned) {
+ // If the signed integer truncation sanitizer was enabled,
+ // and we are truncating from larger unsigned type to smaller signed type,
+ // let's handle the case we skipped in that check.
+ Check =
+ EmitBitfieldTruncationCheckHelper(Src, SrcType, Dst, DstType, Builder);
+ CheckKind = ICCK_SignedIntegerTruncationOrSignChange;
+ Checks.emplace_back(Check.second);
+ // If the comparison result is 'i1 false', then the truncation was lossy.
+ }
+
+ llvm::Constant *StaticArgs[] = {
+ CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
+ CGF.EmitCheckTypeDescriptor(DstType),
+ llvm::ConstantInt::get(Builder.getInt8Ty(), CheckKind),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), Info.Size)};
+ // EmitCheck() will 'and' all the checks together.
+ CGF.EmitCheck(Checks, SanitizerHandler::ImplicitConversion, StaticArgs,
+ {Src, Dst});
+}
+
+void ScalarExprEmitter::EmitBitfieldConversionCheck(
+ Value *Src, QualType SrcType, Value *Dst, QualType DstType,
+ const CGBitFieldInfo &Info, SourceLocation Loc) {
+
+ if (!CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitConversion))
+ return;
+
+ // We only care about int->int conversions here.
+ // We ignore conversions to/from pointer and/or bool.
+ if (!PromotionIsPotentiallyEligibleForImplicitIntegerConversionCheck(SrcType,
+ DstType))
+ return;
+
+ assert(!DstType->isBooleanType() && "we should not get here with booleans.");
+
+ EmitBitfieldTruncationCheck(Src, SrcType, Dst, DstType, Info, Loc);
+ EmitBitfieldSignChangeCheck(Src, SrcType, Dst, DstType, Info, Loc);
+}
+
Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
QualType DstType, llvm::Type *SrcTy,
llvm::Type *DstTy,
@@ -2845,9 +3083,12 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
}
// Store the updated result through the lvalue.
- if (LV.isBitField())
+ if (LV.isBitField()) {
+ Value *Src = value;
CGF.EmitStoreThroughBitfieldLValue(RValue::get(value), LV, &value);
- else
+ EmitBitfieldConversionCheck(Src, E->getType(), value, E->getType(),
+ LV.getBitFieldInfo(), E->getExprLoc());
+ } else
CGF.EmitStoreThroughLValue(RValue::get(value), LV);
// If this is a postinc, return the value read from memory, otherwise use the
@@ -3372,9 +3613,17 @@ LValue ScalarExprEmitter::EmitCompoundAssignLValue(
// specially because the result is altered by the store, i.e., [C99 6.5.16p1]
// 'An assignment expression has the value of the left operand after the
// assignment...'.
- if (LHSLV.isBitField())
+ if (LHSLV.isBitField()) {
+ Value *Src = Result;
+ QualType SrcType = E->getRHS()->getType();
+ QualType DstType = E->getLHS()->getType();
+ if (ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(E->getRHS())) {
+ SrcType = ICE->getSubExpr()->getType();
+ }
CGF.EmitStoreThroughBitfieldLValue(RValue::get(Result), LHSLV, &Result);
- else
+ EmitBitfieldConversionCheck(Src, SrcType, Result, DstType,
+ LHSLV.getBitFieldInfo(), E->getExprLoc());
+ } else
CGF.EmitStoreThroughLValue(RValue::get(Result), LHSLV);
if (CGF.getLangOpts().OpenMP)
@@ -4510,7 +4759,15 @@ Value *ScalarExprEmitter::VisitBinAssign(const BinaryOperator *E) {
// 'An assignment expression has the value of the left operand after
// the assignment...'.
if (LHS.isBitField()) {
+ Value *Src = RHS;
+ QualType SrcType = E->getRHS()->getType();
+ QualType DstType = E->getLHS()->getType();
+ if (ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(E->getRHS())) {
+ SrcType = ICE->getSubExpr()->getType();
+ }
CGF.EmitStoreThroughBitfieldLValue(RValue::get(RHS), LHS, &RHS);
+ EmitBitfieldConversionCheck(Src, SrcType, RHS, DstType,
+ LHS.getBitFieldInfo(), E->getExprLoc());
} else {
CGF.EmitNullabilityCheck(LHS, RHS, E->getExprLoc());
CGF.EmitStoreThroughLValue(RValue::get(RHS), LHS);
diff --git a/clang/test/CodeGen/ubsan-bitfield-conversion.c b/clang/test/CodeGen/ubsan-bitfield-conversion.c
new file mode 100644
index 00000000000000..4b2940f106c576
--- /dev/null
+++ b/clang/test/CodeGen/ubsan-bitfield-conversion.c
@@ -0,0 +1,27 @@
+// RUN: %clang -fsanitize=implicit-integer-truncation -target x86_64-linux -S -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang -fsanitize=implicit-integer-sign-change -target x86_64-linux -S -emit-llvm -o - %s | FileCheck %s
+typedef struct _xx {
+ int x1:3;
+ char x2:2;
+} xx, *pxx;
+
+xx vxx;
+
+// CHECK-LABEL: define{{.*}} void @foo1
+void foo1(int x) {
+ vxx.x1 = x;
+ // CHECK: call void @__ubsan_handle_implicit_conversion
+}
+
+// CHECK: declare void @__ubsan_handle_implicit_conversion
+
+// CHECK-LABEL: define{{.*}} void @foo2
+void foo2(int x) {
+ vxx.x2 = x;
+ // CHECK: call void @__ubsan_handle_implicit_conversion
+ // TODO: Ideally we should only emit once (emit is generated
+ // when evaluating RHS integer->char and when storing
+ // value in bitfield)
+ // CHECK: call void @__ubsan_handle_implicit_conversion
+}
+
diff --git a/compiler-rt/lib/ubsan/ubsan_diag.h b/compiler-rt/lib/ubsan/ubsan_diag.h
index b444e971b22838..1e60651e272a70 100644
--- a/compiler-rt/lib/ubsan/ubsan_diag.h
+++ b/compiler-rt/lib/ubsan/ubsan_diag.h
@@ -177,7 +177,7 @@ class Diag {
};
private:
- static const unsigned MaxArgs = 8;
+ static const unsigned MaxArgs = 9;
static const unsigned MaxRanges = 1;
/// The arguments which have been added to this diagnostic so far.
diff --git a/compiler-rt/lib/ubsan/ubsan_handlers.cpp b/compiler-rt/lib/ubsan/ubsan_handlers.cpp
index 0f16507d5d88f1..1232f60836b090 100644
--- a/compiler-rt/lib/ubsan/ubsan_handlers.cpp
+++ b/compiler-rt/lib/ubsan/ubsan_handlers.cpp
@@ -594,14 +594,16 @@ static void handleImplicitConversion(ImplicitConversionData *Data,
ScopedReport R(Opts, Loc, ET);
- // FIXME: is it possible to dump the values as hex with fixed width?
+ unsigned DstBits =
+ Data->BitfieldBits ? Data->BitfieldBits : DstTy.getIntegerBitWidth();
+ // FIXME: is it possible to dump the values as hex with fixed width?
Diag(Loc, DL_Error, ET,
"implicit conversion from type %0 of value %1 (%2-bit, %3signed) to "
- "type %4 changed the value to %5 (%6-bit, %7signed)")
+ "type %4 changed the value to %5 (%6-bit%7, %8signed)")
<< SrcTy << Value(SrcTy, Src) << SrcTy.getIntegerBitWidth()
- << (SrcSigned ? "" : "un") << DstTy << Value(DstTy, Dst)
- << DstTy.getIntegerBitWidth() << (DstSigned ? "" : "un");
+ << (SrcSigned ? "" : "un") << DstTy << Value(DstTy, Dst) << DstBits
+ << (Data->BitfieldBits ? " bitfield" : "") << (DstSigned ? "" : "un");
}
void __ubsan::__ubsan_handle_implicit_conversion(ImplicitConversionData *Data,
diff --git a/compiler-rt/lib...
[truncated]
|
Hi, @vitalybuka @zygoloid @AaronBallman you might have something to say about this commit |
Is is UB? |
@@ -1094,6 +1114,27 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType, | |||
{Src, Dst}); | |||
} | |||
|
|||
static llvm::Value *EmitIsNegativeTestHelper(Value *V, QualType VType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please extract lambda to function into a separate patch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still see it here, does it needs rebase, or that patch needs to be landed?
@LebedevRI who added implicit-integer-truncation |
I guess it follows existing convention |
Don't think so? To quote [1]:
where "these sanitizers" refer to [1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html |
Feedback @vitalybuka @AaronBallman @LebedevRI? 😃 |
Please add test coverage for compound assignment and increment/decrement. This seems like a reasonable extension of the existing integer truncation checks, but we might want to consider giving it a unique name. Otherwise, people using integer truncation checks will have trouble upgrading their compiler. |
Sure 👍 Fair enough, what name do you suggest? Replace "integer" with "bitfield" as in |
Hi again, I have now finally gotten time and updated the patch so that the unnecessary emits I mentioned in the initial commit are avoided. The current patch introduces a number of new fsanitizer flags to separate integer conversions from bitfield conversions. E.g.,
Previously the following: typedef struct {
unsigned char a:4;
} X;
int main(void) {
X x;
unsigned int a = 272;
x.a = a;
return 0;
} emitted an implicit integer conversion error in the assignment of I have compiled clang with the -fsanitizer flag |
LGTM, but someone else must approve. Only other thing I might add is to try enabling these checks for some open source test cases. Maybe look through some past reviews to see what others have done, or wait for one of the maintainers/owners to comment on which ones might be best. Thanks for doing this work, Axel! |
Precommit CI seems to have found some related failures that should be investigated. I'd love to hear opinions from the codegen code owners on this, as well as @zygoloid as UBSan code owner. I'm not opposed to the changes, though I do find it somewhat strange to add them to UBSan given that they're not undefined behavior (even though there's precedent). |
537ee15
to
c70b72b
Compare
@@ -147,6 +147,7 @@ struct ImplicitConversionData { | |||
const TypeDescriptor &FromType; | |||
const TypeDescriptor &ToType; | |||
/* ImplicitConversionCheckKind */ unsigned char Kind; | |||
unsigned int BitfieldBits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an ABI stability policy for UBsan? Is the change allowed under it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @zygoloid -- I'm not certain if we have an ABI stability policy for sanitizers or not.
@LebedevRI LGTM? |
This patch implements the implicit truncation and implicit sign change checks for bitfields.
Update testcases due to new -fsanitize options regarding bitfields
Also, I don't have commit access (first PR here), someone has to commit on my behalf! |
|
Thank you for mentioning this! I'll commit on your behalf, but I'd like to hear from @zygoloid on the ABI question. It would also be nice if @efriedma-quic signed off on the testing changes (I think they're sufficient, but he may have other cases in mind). |
I don't think we've established an explicit policy, and we've made ABI-breaking changes previously. I think we should avoid breaks within a major release version (don't backport this) to avoid giving packagers headaches: the ubsan runtime is installed in a versioned directory, but I don't think that includes the patch version. Having an ABI break here seems OK and in line with historic precedent. (But we ought to document our ABI policy for our runtime libraries.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@AaronBallman looks good? |
This bot started inconsistently crash https://lab.llvm.org/buildbot/#/builders/75/builds/45086 |
These are consistently broken after the patch: |
Thank you for catching that and doing the revert! |
Will look into this... |
Probably 1189e87 fixed the issue. |
Still had to revert with #87562 |
) Fix since #75481 got reverted. - Explicitly set BitfieldBits to 0 to avoid uninitialized field member for the integer checks: ```diff - llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)}; + llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first), + llvm::ConstantInt::get(Builder.getInt32Ty(), 0)}; ``` - `Value **Previous` was erroneously `Value *Previous` in `CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now. - Update following: ```diff - if (Kind == CK_IntegralCast) { + if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) { ``` CK_LValueToRValue when going from, e.g., char to char, and CK_IntegralCast otherwise. - Make sure that `Value *Previous = nullptr;` is initialized (see 1189e87) - Add another extensive testcase `ubsan/TestCases/ImplicitConversion/bitfield-conversion.c` --------- Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This patch implements the implicit truncation and implicit sign change checks for bitfields using UBSan. E.g.,
-fsanitize=implicit-bitfield-truncation
and-fsanitize=implicit-bitfield-sign-change
.However, right now some unnecessary emits are generated that ideally will be removed in the future. Let me explain.This commit implements the truncation and sign change checks whenever we are emitting aEmitStoreThroughBitfieldLValue
. I use thellvm::Value
updated throughEmitStoreThroughBitfieldLValue
to check whether the value changed or not. However, callingEmitStoreThroughBitfieldLValue
means we have already evaluated RHS and potentially emitted a truncation or sign check already, which is the case if we have a bitfield of typechar
and the RHS is of typeint
. Thus, in this case we may report the truncation or sign change error two times at runtime under the right circumstances which is not ideal. E.g.,producesbecause we truncate the value 272 into a char when evaluating the RHS (resulting in the value 16) and then truncate it again when assigning it to the bitfield.I thought of disabling the relevant sanitizer checks when evaluating the RHS, but to make it work for all cases I need to extract the previous (before casting it to char in the above case)llvm::Value
fromVisit(E->getRHS())
in order to compare it, which seemed more complicated for a first approach. The current approach is easier to iterate upon and is more isolated, but again, not ideal.I would appreciate feedback so that I can advance this to a mergeable state. This is my first commit here, so I have probably missed some essentials!