[clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part

Summary:
C and C++ are interesting languages. They are statically typed, but weakly:
implicit conversions are allowed. This is nice, since it lets you write code
that sits between the two extremes of everything being convertible and
nothing being convertible. As usual, this comes with a price:

```
unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.

    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }

  // But yes, sometimes this is intentional.
  // You can either make the conversion explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}
```
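
The arithmetic described in the comments above can be checked at compile time.
This is only a sketch: `addThenDemote` and `maskToUnsignedInt` are hypothetical
helper names that do not appear in the patch, and the expected values assume
an 8-bit `unsigned char`. The explicit `static_cast` stands in for the
implicit demotion so that the sketch itself stays warning-free.

```cpp
// Hypothetical helper mirroring `store = store + 768;` from the example.
constexpr unsigned char addThenDemote(unsigned char store, int addend) {
  // `store` is promoted to int, the sum is computed in int, and the cast
  // stands in for the implicit demotion back to unsigned char.
  return static_cast<unsigned char>(store + addend);
}

// Hypothetical helper mirroring `(~((unsigned int)0)) & val`.
constexpr unsigned long maskToUnsignedInt(unsigned long val) {
  // Only the bits that fit into unsigned int survive the mask, so a later
  // conversion to unsigned int cannot drop anything.
  return (~static_cast<unsigned int>(0)) & val;
}

// Assumes CHAR_BIT == 8.
static_assert(addThenDemote(0, 768) == 0, "768 is 0x300, the low 8 bits are 0");
static_assert(addThenDemote(1, 255) == 0, "256 is 0x100, the low 8 bits are 0");
static_assert(maskToUnsignedInt(5) == 5, "small values pass through the mask");
```

The first assertion is the `0+768=768` case from the comment: the sum does not
fit into `unsigned char`, and `store` silently stays 0.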

Yes, there is a `-Wconversion` diagnostic group, but first, it is rather
noisy, since it warns on everything (unlike sanitizers, which report only
actual issues), and second, there are cases where it does **not** warn.
So a sanitizer is needed. I don't have any motivational numbers, but I know
I have hit this kind of problem 10-20 times, and it was never easy to track down.

The logic to detect whether a truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not the original, signedness) the truncated value
back to its original width, and equality-compare it with the original value.
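
As a source-level sketch of that check (the patch itself emits LLVM IR, so this
is only an analogy): `implicitTruncationIsLossy` is a hypothetical name, and
the out-of-range conversions to signed types are assumed to wrap modulo 2^N,
which is what mainstream compilers do and what C++20 guarantees.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper, not part of clang. Returns true if converting `src`
// to the narrower type Dst would change its value under the scheme above.
template <typename Dst, typename Src>
constexpr bool implicitTruncationIsLossy(Src src) {
  static_assert(std::is_integral<Src>::value && std::is_integral<Dst>::value,
                "only integer-to-integer conversions are modeled");
  static_assert(sizeof(Dst) < sizeof(Src), "Dst must be narrower than Src");
  // Inner cast: the truncation. Outer cast: extend back to the width of Src,
  // sign-extending if Dst is signed and zero-extending otherwise, that is,
  // using the *new* signedness. Then compare with the original value.
  return static_cast<Src>(static_cast<Dst>(src)) != src;
}

static_assert(implicitTruncationIsLossy<std::uint8_t>(768u), "768 -> 0");
static_assert(!implicitTruncationIsLossy<std::uint8_t>(255u), "255 fits");
static_assert(!implicitTruncationIsLossy<std::int8_t>(-1), "-1 fits");
static_assert(implicitTruncationIsLossy<std::int8_t>(128), "128 -> -128");
// Extending with the new (signed) signedness gives back all ones, so this
// sign-changing conversion is not reported as a truncation.
static_assert(!implicitTruncationIsLossy<std::int8_t>(std::uint32_t(0xFFFFFFFFu)),
              "0xFFFFFFFF -> -1 -> 0xFFFFFFFF");
```

The last assertion shows why the signedness used for the extension matters:
with the new signedness, a pure sign change is not reported as a truncation.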

The most non-trivial thing here is the logic to detect whether this
`ImplicitCastExpr` AST node is **actually** an implicit conversion, //or//
part of an explicit cast. This matters because explicit casts are modeled as an outer
`ExplicitCastExpr` with some `ImplicitCastExpr`'s as **direct** children.
https://godbolt.org/g/eE1GkJ

Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set
on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`.
So if that flag is **not** set, then it is an actual implicit conversion.
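
The decision can be modeled with a toy node type. This is a sketch only:
`ToyCastNode` and `wantsTruncationCheck` are hypothetical names and are not
the real clang AST classes. The real code in this patch queries
`ImplicitCastExpr::isPartOfExplicitCast()` inside `VisitCastExpr`.

```cpp
// Toy stand-in for a cast node in the AST (hypothetical, not a clang type).
struct ToyCastNode {
  bool isImplicitCastExpr;  // false for the outer explicit cast node itself
  bool partOfExplicitCast;  // set on implicit casts nested in an explicit cast
};

// A check is wanted only when the sanitizer is on, the node is an implicit
// cast, and that implicit cast was not added on behalf of an explicit cast.
constexpr bool wantsTruncationCheck(bool sanitizerEnabled, ToyCastNode node) {
  return sanitizerEnabled && node.isImplicitCastExpr &&
         !node.partOfExplicitCast;
}

// `consume(val)`: a genuine implicit conversion, so it is checked.
static_assert(wantsTruncationCheck(true, ToyCastNode{true, false}), "");
// `consume((unsigned int)val)`: implicit cast under an explicit one, skipped.
static_assert(!wantsTruncationCheck(true, ToyCastNode{true, true}), "");
// Sanitizer disabled: nothing is checked.
static_assert(!wantsTruncationCheck(false, ToyCastNode{true, false}), "");
```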

As you may have noted, the group is named `-fsanitize=implicit-conversion`,
not just `-fsanitize=implicit-integer-truncation`.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in a sign change, and implicit conversions
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.
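
To illustrate the sign-change case: a same-width conversion changes the value
without dropping any bits, so the truncation check in this patch, which returns
early unless the source is wider than the destination, would not report it.
A minimal sketch, where `reinterpretAsUnsigned` is a hypothetical helper name:

```cpp
#include <limits>

// Hypothetical helper: a same-width conversion from int to unsigned int.
// The explicit cast stands in for the implicit conversion.
constexpr unsigned int reinterpretAsUnsigned(int v) {
  return static_cast<unsigned int>(v);
}

static_assert(sizeof(int) == sizeof(unsigned int),
              "same width, so this is not a truncation");
static_assert(reinterpretAsUnsigned(-1) ==
                  std::numeric_limits<unsigned int>::max(),
              "-1 becomes UINT_MAX: the value changed, no bits were lost");
```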

One thing I know isn't handled is bitfields.

This is the clang part.
The compiler-rt part is D48959.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Fixes google/sanitizers#940. (other than sign-changing implicit conversions)

Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane

Reviewed By: rsmith, vsk, erichkeane

Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D48958

llvm-svn: 338288
LebedevRI committed Jul 30, 2018
1 parent eb4a9bc commit b69ba22
Showing 11 changed files with 829 additions and 37 deletions.
32 changes: 30 additions & 2 deletions clang/docs/ReleaseNotes.rst
@@ -46,7 +46,9 @@ sections with improvements to Clang's support for those languages.
Major New Features
------------------

- ...
- A new Implicit Conversion Sanitizer (``-fsanitize=implicit-conversion``) group
was added. Please refer to the :ref:`release-notes-ubsan` section of the
release notes for the details.

Improvements to Clang's diagnostics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -280,10 +282,36 @@ Static Analyzer

...

.. _release-notes-ubsan:

Undefined Behavior Sanitizer (UBSan)
------------------------------------

* ...
* A new Implicit Conversion Sanitizer (``-fsanitize=implicit-conversion``) group
was added.

Currently, only one type of issues is caught - implicit integer truncation
(``-fsanitize=implicit-integer-truncation``), also known as integer demotion.
While there is a ``-Wconversion`` diagnostic group that catches this kind of
issues, it is both noisy, and does not catch **all** the cases.

.. code-block:: c++

unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
if (consume(val)) // the value may have been silently truncated.
store = store + 768; // before addition, 'store' was promoted to int.
(void)consume((unsigned int)val); // OK, the truncation is explicit.
}

Just like other ``-fsanitize=integer`` checks, these issues are **not**
undefined behaviour. But they are not *always* intentional, and are somewhat
hard to track down. This group is **not** enabled by ``-fsanitize=undefined``,
but the ``-fsanitize=implicit-integer-truncation`` check
is enabled by ``-fsanitize=integer``.

Core Analysis Improvements
==========================
34 changes: 26 additions & 8 deletions clang/docs/UndefinedBehaviorSanitizer.rst
@@ -89,6 +89,12 @@ Available checks are:
- ``-fsanitize=function``: Indirect call of a function through a
function pointer of the wrong type (Darwin/Linux, C++ and x86/x86_64
only).
- ``-fsanitize=implicit-integer-truncation``: Implicit conversion from
integer of larger bit width to smaller bit width, if that results in data
loss. That is, if the demoted value, after casting back to the original
width, is not equal to the original value before the downcast.
Issues caught by this sanitizer are not undefined behavior,
but are often unintentional.
- ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
- ``-fsanitize=nonnull-attribute``: Passing null pointer as a function
parameter which is declared to never be null.
@@ -121,15 +127,21 @@ Available checks are:
unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
``-fsanitize=shift-exponent`` to check only left-hand side or
right-hand side of shift operation, respectively.
- ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
including all the checks added by ``-ftrapv``, and checking for
overflow in signed division (``INT_MIN / -1``).
- ``-fsanitize=signed-integer-overflow``: Signed integer overflow, where the
result of a signed integer computation cannot be represented in its type.
This includes all the checks covered by ``-ftrapv``, as well as checks for
signed division overflow (``INT_MIN/-1``), but not checks for
lossy implicit conversions performed after the computation
(see ``-fsanitize=implicit-conversion``). Both of these two issues are
handled by ``-fsanitize=implicit-conversion`` group of checks.
- ``-fsanitize=unreachable``: If control flow reaches an unreachable
program point.
- ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
overflows. Note that unlike signed integer overflow, unsigned integer
is not undefined behavior. However, while it has well-defined semantics,
it is often unintentional, so UBSan offers to catch it.
- ``-fsanitize=unsigned-integer-overflow``: Unsigned integer overflow, where
the result of an unsigned integer computation cannot be represented in its
type. Unlike signed integer overflow, this is not undefined behavior, but
it is often unintentional. This sanitizer does not check for lossy implicit
conversions performed after such a computation
(see ``-fsanitize=implicit-conversion``).
- ``-fsanitize=vla-bound``: A variable-length array whose bound
does not evaluate to a positive value.
- ``-fsanitize=vptr``: Use of an object whose vptr indicates that it is of
@@ -140,11 +152,17 @@ Available checks are:

You can also use the following check groups:
- ``-fsanitize=undefined``: All of the checks listed above other than
``unsigned-integer-overflow`` and the ``nullability-*`` checks.
``unsigned-integer-overflow``, ``implicit-conversion`` and the
``nullability-*`` group of checks.
- ``-fsanitize=undefined-trap``: Deprecated alias of
``-fsanitize=undefined``.
- ``-fsanitize=integer``: Checks for undefined or suspicious integer
behavior (e.g. unsigned integer overflow).
Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``,
``shift``, ``integer-divide-by-zero``, and ``implicit-integer-truncation``.
- ``-fsanitize=implicit-conversion``: Checks for suspicious behaviours of
implicit conversions.
Currently, only ``-fsanitize=implicit-integer-truncation`` is implemented.
- ``-fsanitize=nullability``: Enables ``nullability-arg``,
``nullability-assign``, and ``nullability-return``. While violating
nullability does not have undefined behavior, it is often unintentional,
9 changes: 7 additions & 2 deletions clang/include/clang/Basic/Sanitizers.def
@@ -131,9 +131,14 @@ SANITIZER_GROUP("undefined", Undefined,
// -fsanitize=undefined-trap is an alias for -fsanitize=undefined.
SANITIZER_GROUP("undefined-trap", UndefinedTrap, Undefined)

// ImplicitConversionSanitizer
SANITIZER("implicit-integer-truncation", ImplicitIntegerTruncation)
SANITIZER_GROUP("implicit-conversion", ImplicitConversion,
ImplicitIntegerTruncation)

SANITIZER_GROUP("integer", Integer,
SignedIntegerOverflow | UnsignedIntegerOverflow | Shift |
IntegerDivideByZero)
ImplicitIntegerTruncation | IntegerDivideByZero | Shift |
SignedIntegerOverflow | UnsignedIntegerOverflow)

SANITIZER("local-bounds", LocalBounds)
SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
3 changes: 2 additions & 1 deletion clang/include/clang/Basic/Sanitizers.h
@@ -84,7 +84,8 @@ SanitizerMask expandSanitizerGroups(SanitizerMask Kinds);
/// Return the sanitizers which do not affect preprocessing.
inline SanitizerMask getPPTransparentSanitizers() {
return SanitizerKind::CFI | SanitizerKind::Integer |
SanitizerKind::Nullability | SanitizerKind::Undefined;
SanitizerKind::ImplicitConversion | SanitizerKind::Nullability |
SanitizerKind::Undefined;
}

} // namespace clang
105 changes: 89 additions & 16 deletions clang/lib/CodeGen/CGExprScalar.cpp
@@ -299,13 +299,31 @@ class ScalarExprEmitter
Value *Src, QualType SrcType, QualType DstType,
llvm::Type *DstTy, SourceLocation Loc);

/// Known implicit conversion check kinds.
/// Keep in sync with the enum of the same name in ubsan_handlers.h
enum ImplicitConversionCheckKind : unsigned char {
ICCK_IntegerTruncation = 0,
};

/// Emit a check that an [implicit] truncation of an integer does not
/// discard any bits. It is not UB, so we use the value after truncation.
void EmitIntegerTruncationCheck(Value *Src, QualType SrcType, Value *Dst,
QualType DstType, SourceLocation Loc);

/// Emit a conversion from the specified type to the specified destination
/// type, both of which are LLVM scalar types.
Value *EmitScalarConversion(Value *Src, QualType SrcTy, QualType DstTy,
SourceLocation Loc);
struct ScalarConversionOpts {
bool TreatBooleanAsSigned;
bool EmitImplicitIntegerTruncationChecks;

Value *EmitScalarConversion(Value *Src, QualType SrcTy, QualType DstTy,
SourceLocation Loc, bool TreatBooleanAsSigned);
ScalarConversionOpts()
: TreatBooleanAsSigned(false),
EmitImplicitIntegerTruncationChecks(false) {}
};
Value *
EmitScalarConversion(Value *Src, QualType SrcTy, QualType DstTy,
SourceLocation Loc,
ScalarConversionOpts Opts = ScalarConversionOpts());

/// Emit a conversion from the specified complex type to the specified
/// destination type, where the destination type is an LLVM scalar type.
@@ -923,18 +941,59 @@ void ScalarExprEmitter::EmitFloatConversionCheck(
SanitizerHandler::FloatCastOverflow, StaticArgs, OrigSrc);
}

/// Emit a conversion from the specified type to the specified destination type,
/// both of which are LLVM scalar types.
Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,
QualType DstType,
SourceLocation Loc) {
return EmitScalarConversion(Src, SrcType, DstType, Loc, false);
void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
Value *Dst, QualType DstType,
SourceLocation Loc) {
if (!CGF.SanOpts.has(SanitizerKind::ImplicitIntegerTruncation))
return;

llvm::Type *SrcTy = Src->getType();
llvm::Type *DstTy = Dst->getType();

// We only care about int->int conversions here.
// We ignore conversions to/from pointer and/or bool.
if (!(SrcType->isIntegerType() && DstType->isIntegerType()))
return;

assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
"clang integer type lowered to non-integer llvm type");

unsigned SrcBits = SrcTy->getScalarSizeInBits();
unsigned DstBits = DstTy->getScalarSizeInBits();
// This must be truncation. Else we do not care.
if (SrcBits <= DstBits)
return;

assert(!DstType->isBooleanType() && "we should not get here with booleans.");

CodeGenFunction::SanitizerScope SanScope(&CGF);

llvm::Value *Check = nullptr;

// 1. Extend the truncated value back to the same width as the Src.
bool InputSigned = DstType->isSignedIntegerOrEnumerationType();
Check = Builder.CreateIntCast(Dst, SrcTy, InputSigned, "anyext");
// 2. Equality-compare with the original source value
Check = Builder.CreateICmpEQ(Check, Src, "truncheck");
// If the comparison result is 'i1 false', then the truncation was lossy.

llvm::Constant *StaticArgs[] = {
CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
CGF.EmitCheckTypeDescriptor(DstType),
llvm::ConstantInt::get(Builder.getInt8Ty(), ICCK_IntegerTruncation)};
CGF.EmitCheck(std::make_pair(Check, SanitizerKind::ImplicitIntegerTruncation),
SanitizerHandler::ImplicitConversion, StaticArgs, {Src, Dst});
}

/// Emit a conversion from the specified type to the specified destination type,
/// both of which are LLVM scalar types.
Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,
QualType DstType,
SourceLocation Loc,
bool TreatBooleanAsSigned) {
ScalarConversionOpts Opts) {
QualType NoncanonicalSrcType = SrcType;
QualType NoncanonicalDstType = DstType;

SrcType = CGF.getContext().getCanonicalType(SrcType);
DstType = CGF.getContext().getCanonicalType(DstType);
if (SrcType == DstType) return Src;
@@ -1083,7 +1142,7 @@ Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,

if (isa<llvm::IntegerType>(SrcTy)) {
bool InputSigned = SrcType->isSignedIntegerOrEnumerationType();
if (SrcType->isBooleanType() && TreatBooleanAsSigned) {
if (SrcType->isBooleanType() && Opts.TreatBooleanAsSigned) {
InputSigned = true;
}
if (isa<llvm::IntegerType>(DstTy))
@@ -1118,6 +1177,10 @@ Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,
}
}

if (Opts.EmitImplicitIntegerTruncationChecks)
EmitIntegerTruncationCheck(Src, NoncanonicalSrcType, Res,
NoncanonicalDstType, Loc);

return Res;
}

@@ -1812,16 +1875,26 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
return Builder.CreateVectorSplat(NumElements, Elt, "splat");
}

case CK_IntegralCast:
case CK_IntegralCast: {
ScalarConversionOpts Opts;
if (CGF.SanOpts.has(SanitizerKind::ImplicitIntegerTruncation)) {
if (auto *ICE = dyn_cast<ImplicitCastExpr>(CE))
Opts.EmitImplicitIntegerTruncationChecks = !ICE->isPartOfExplicitCast();
}
return EmitScalarConversion(Visit(E), E->getType(), DestTy,
CE->getExprLoc(), Opts);
}
case CK_IntegralToFloating:
case CK_FloatingToIntegral:
case CK_FloatingCast:
return EmitScalarConversion(Visit(E), E->getType(), DestTy,
CE->getExprLoc());
case CK_BooleanToSignedIntegral:
case CK_BooleanToSignedIntegral: {
ScalarConversionOpts Opts;
Opts.TreatBooleanAsSigned = true;
return EmitScalarConversion(Visit(E), E->getType(), DestTy,
CE->getExprLoc(),
/*TreatBooleanAsSigned=*/true);
CE->getExprLoc(), Opts);
}
case CK_IntegralToBoolean:
return EmitIntToBoolConversion(Visit(E));
case CK_PointerToBoolean:
1 change: 1 addition & 0 deletions clang/lib/CodeGen/CodeGenFunction.h
@@ -116,6 +116,7 @@ enum TypeEvaluationKind {
SANITIZER_CHECK(DynamicTypeCacheMiss, dynamic_type_cache_miss, 0) \
SANITIZER_CHECK(FloatCastOverflow, float_cast_overflow, 0) \
SANITIZER_CHECK(FunctionTypeMismatch, function_type_mismatch, 0) \
SANITIZER_CHECK(ImplicitConversion, implicit_conversion, 0) \
SANITIZER_CHECK(InvalidBuiltin, invalid_builtin, 0) \
SANITIZER_CHECK(LoadInvalidValue, load_invalid_value, 0) \
SANITIZER_CHECK(MissingReturn, missing_return, 0) \
10 changes: 5 additions & 5 deletions clang/lib/Driver/SanitizerArgs.cpp
@@ -27,22 +27,22 @@ using namespace clang::driver;
using namespace llvm::opt;

enum : SanitizerMask {
NeedsUbsanRt = Undefined | Integer | Nullability | CFI,
NeedsUbsanRt = Undefined | Integer | ImplicitConversion | Nullability | CFI,
NeedsUbsanCxxRt = Vptr | CFI,
NotAllowedWithTrap = Vptr,
NotAllowedWithMinimalRuntime = Vptr,
RequiresPIE = DataFlow | HWAddress | Scudo,
NeedsUnwindTables = Address | HWAddress | Thread | Memory | DataFlow,
SupportsCoverage = Address | HWAddress | KernelAddress | KernelHWAddress |
Memory | Leak | Undefined | Integer | Nullability |
DataFlow | Fuzzer | FuzzerNoLink,
RecoverableByDefault = Undefined | Integer | Nullability,
Memory | Leak | Undefined | Integer | ImplicitConversion |
Nullability | DataFlow | Fuzzer | FuzzerNoLink,
RecoverableByDefault = Undefined | Integer | ImplicitConversion | Nullability,
Unrecoverable = Unreachable | Return,
AlwaysRecoverable = KernelAddress | KernelHWAddress,
LegacyFsanitizeRecoverMask = Undefined | Integer,
NeedsLTO = CFI,
TrappingSupported = (Undefined & ~Vptr) | UnsignedIntegerOverflow |
Nullability | LocalBounds | CFI,
ImplicitConversion | Nullability | LocalBounds | CFI,
TrappingDefault = CFI,
CFIClasses =
CFIVCall | CFINVCall | CFIMFCall | CFIDerivedCast | CFIUnrelatedCast,
4 changes: 2 additions & 2 deletions clang/lib/Driver/ToolChain.cpp
@@ -803,8 +803,8 @@ SanitizerMask ToolChain::getSupportedSanitizers() const {
using namespace SanitizerKind;

SanitizerMask Res = (Undefined & ~Vptr & ~Function) | (CFI & ~CFIICall) |
CFICastStrict | UnsignedIntegerOverflow | Nullability |
LocalBounds;
CFICastStrict | UnsignedIntegerOverflow |
ImplicitConversion | Nullability | LocalBounds;
if (getTriple().getArch() == llvm::Triple::x86 ||
getTriple().getArch() == llvm::Triple::x86_64 ||
getTriple().getArch() == llvm::Triple::arm ||