New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid undefined behavior when doing pointer calculation. #1566
Conversation
Performing `base + offset` pointer arithmetic is only allowed when `base` itself is not nullptr. In other words, the compiler is assumed to allow that `base + offset` is always non-null, which an upcoming compiler release will do in this case. The result is that CommandStream.cpp, which calls this in a loop until the result is nullptr, will never terminate (until it runs junk data and crashes). Avoid this by using intptr_t instead of actual pointers (i.e. char*), which seems to avoid this failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Another reminder that I don't like undefined behaviors :)
@rupprecht what if |
Oh wait... isn't the reason that |
@rupprecht honestly I think this is bug in the optimizer of the upcoming compiler. The compiler is allowed to assume that But then you're essentially saying that a pointer can be Maybe the only true correct solution is to do the equivalent of |
I had the same reaction, and suggested upstream that it might be a miscompile at first, but it's definitely UB (if you're a language lawyer -- which I'm not, but I know a few). I think this is the relevant part of the spec explaining UB for this: http://eel.is/c++draft/expr.add#4 Basically, C++ says that expressions like We've seen test failures the other way -- code written like Here's the LLVM commit that takes advantage of this (i.e. causes test failures for filament): https://reviews.llvm.org/rL369789 By switching to |
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 llvm-svn: 374293
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: http://llvm.org/svn/llvm-project/cfe/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
…defined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * google/filament#1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 llvm-svn: 374293
…ro-offset Summary: Per standard pointer arithmetic is only defined when keeping pointers within the bounds of an array object. Applying non-zero offset to nullptr (or making non-nullptr a nullptr by subtracting pointer's integral value from the pointer itself) is undefined behavior. Since https://reviews.llvm.org/D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null) LLVM middle-end uses those guarantees for transformations.` and mis-compilations have been observed: - https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html - google/filament#1566 To prevent help weed out bugs before they lead to future miscompilations a new UBSAN check has been added: `nullptr-with-nonzero-offset` - https://reviews.llvm.org/D67122 [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour `folly::io::Cursor` does this type of operations when checking if `crtPos_ + N <= crtEnd_`: when it's empty it becomes `nullptr + N <= nullptr`: const uint8_t* crtBegin_{nullptr}; const uint8_t* crtEnd_{nullptr}; const uint8_t* crtPos_{nullptr}; Switch to `uintptr_t` space where the math is no longer UB. Reviewed By: yfeldblum Differential Revision: D33737556 fbshipit-source-id: 588b91ac1387112a6f183edfda5555ca1b7193d8
…NULL (caught by UBSAN nullptr-with-nonzero-offset) According to the standard pointer arithmetic is only defined in these situations: - `nullptr` ± 0 => `nullptr` - P±N and N±P are pointers that point at an element of the same array with index I±N `**nullptr**`** + N≠0 is undefined behavior** - even if you don't dereference the resulting pointer. Since [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null) https://reviews.llvm.org/D66608 LLVM middle-end uses those guarantees for transformations and mis-compilations have been observed: - https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html - google/filament#1566 To prevent help weed out bugs before they lead to future miscompilations a new UBSAN check has been added: `nullptr-with-nonzero-offset` - [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour https://reviews.llvm.org/D67122 This new nullptr-with-nonzero-offset check fails on invocations of this macro. Fix the macro to avoid adding to NULL.
Performing
base + offset
pointer arithmetic is only allowed whenbase
itself is not nullptr. In other words, the compiler is assumed to allow thatbase + offset
is always non-null, which an upcoming compiler release will do in this case. The result is that CommandStream.cpp, which calls this in a loop until the result is nullptr, will never terminate (until it runs junk data and crashes).Avoid this by using intptr_t instead of actual pointers (i.e. char*), which seems to avoid this failure.