-
Notifications
You must be signed in to change notification settings - Fork 350
[BoundsSafety] Implement compiler instrumentation for a soft trap mode #11645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@swift-ci test llvm |
191903a to
8f8cd43
Compare
|
@swift-ci test llvm |
8f8cd43 to
92a1a26
Compare
|
@swift-ci test llvm |
| # Check that reason for bounds check failing can be seen in the stacktrace | ||
| bt | ||
| # CHECK: * frame #{{.*}}`__bounds_safety_soft_trap_s(reason="indexing above upper bound in 'array[index]'") | ||
| # CHECK-NEXT: frame #{{.*}}`__clang_trap_msg$Bounds check failed$indexing above upper bound in 'array[index]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with soft-traps we still want the fake frame in debug-info because the reason parameter to __bounds_safety_soft_trap_s isn't guaranteed to be preserved on higher optimisation levels? But yea, we probably do want to make the frame recogniser aware of the soft-trap. (Probably just need to add it to the regex so the frame recogniser starts walking the backtrace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with soft-traps we still want the fake frame in debug-info because the reason parameter to __bounds_safety_soft_trap_s isn't guaranteed to be preserved on higher optimisation levels?
The fake frame is really only there for __bounds_safety_soft_trap_c case where we just pass an integer constant to represent the reason for trapping. Having the trap reason in the debug info is really useful in this case because currently the integer constant tells us nothing useful.
It's technically not needed for the __bounds_safety_soft_trap_s case as I don't see how the optimizer could remove the constant string being passed. However, it was simpler in the implementation to just always set the fake frame in the debug info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Michael137 Regarding the frame recognizer. I've been talking with @jimingham and we are considering adding an InstrumentationRuntime plug-in that automatically sets the breakpoints on these two symbols and also sets the stop reason appropriately. If we that do still need to change the frame recognizer too?
92a1a26 to
ccf75e6
Compare
|
@swift-ci test llvm |
| /* TO_UPSTREAM(BoundsSafety) ON*/ | ||
| if (LangOpts->hasBoundsSafety() && | ||
| Opts.BoundsSafetySoftTrapFuncName.empty()) { | ||
| // Set the default function name if the driver didn't provide one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does the driver not provide one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the driver never provides one so the default values used here are the only names that are currently used.
My thinking there is the driver may wish to set a platform specific name. E.g. If a platform's C library decided it wanted to provide an implementation the driver may wish to use a different name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel strongly about where the line is drawn between the compiler and driver in terms of responsibilities, but intuitively I would've expected it to be the driver's responsibility to provide the default name. That way when you print the cc1 flags the behaviour is explicit rather than implicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem there is cc1 needs to have a default otherwise when we use cc1 in our test cases we'd have to provide the function name every time.
|
|
||
| #include <ptrcheck.h> | ||
|
|
||
| // FIXME: the `@.src` global contains the absolute path to this source file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Swift we have https://github.com/swiftlang/swift/blob/main/utils/PathSanitizingFileCheck. Perhaps adding something similar in LLVM at some point and integrating it with UTC could be worthwhile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely at some point. I spent way too long battling update_cc_test_checks.py trying to set the right set of filters but I couldn't get it work. It's not a task I want to take on right now though.
| // RUN: %clang -fbounds-safety -fbounds-safety-soft-traps=disabled -### %s 2>&1 | FileCheck --check-prefix=DISABLED %s | ||
| // RUN: %clang -fbounds-safety -fbounds-safety-soft-traps=call_with_str -### %s 2>&1 | FileCheck --check-prefix=CWS %s | ||
| // RUN: %clang -fbounds-safety -fbounds-safety-soft-traps=call_with_code -### %s 2>&1 | FileCheck --check-prefix=CWC %s | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also test that the driver sends the right function name to the compiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the driver isn't actually sending a custom function name to the frontend. Instead the frontend has some default function names set.
|
Mostly looks good to me! I do think we should change to use snake-case in the CLI opts though. |
|
|
||
| CachedGlobalStrings[Value] = GlobalStrAsI8; | ||
| return GlobalStrAsI8; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? Are we concerned about compile-time? For code size, AFAIK the linker would merge strings by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to avoid generating duplicate globals in the LLVM IR (we already do something similar for the fake frame names we put in debug info for trap reasons). Technically the linker should solve this in the final binary but this wouldn't fix bloating the unoptimized IR. Given we can easily solve this during Clang's CodeGen I don't know why we wouldn't do this.
|
@swift-ci test llvm |
1 similar comment
|
@swift-ci test llvm |
This patch implements a "soft trap mode" for `-fbounds-safety` where hard traps are replaced with calls to a runtime function. It is assumed the runtime function can return so code is generated so that execution can proceed past the "soft trap" (i.e. as if the bounds check passed even though it didn't). It is expected implementations of the runtime function log the issue in some way. The motivation behind this is to provide a smoother path to adopting `-fbounds-safety` in existing codebases. Using soft trap mode with an appropriate runtime allows: * Improved stability of the program because bounds safety violations are no longer fatal because execution continues. * The ability to collect all the soft traps that occurred to get an overview of how many issues there are. Note this patch does not implement the actual runtime itself. This will be done in a separate patch. This patch introduces the `-fbounds-safety-soft-traps=` driver and CC1 flag. This allows enabling the soft trap compilation mode. It currently has two different modes which offer different trade-offs. * `call_with_str` - This generates calls that pass a constant C string describing the reason for trapping. This greatly simplifies logging but at the cost of increasing code size because a global string needs to be emitted for every "trap reason". * `call_with_code` - This generates calls that pass an integer constant to identify the reason for trapping. This produces better codesize than `call_with_str` but the reason for trapping is much less descriptive. Currently the integer constant is hard coded to zero because we don't currently have a good stable integer representation of trap reasons. This will be fixed in rdar://162824128. We may want to add a third `soft_trap_instruction` option one day that emits a special trap instruction that uses its immediate to identify itself as a trap that should be treated as soft by program host (e.g. the kernel). Implementing that is out-of-scope for this patch because we'd need a new LLVM IR intrinsic and backend support. This patch also introduces a new `-fbounds-safety-soft-trap-function=` CC1 flag. It is not intended users use this flag. This flag is for Clang driver to communicate to CC1 what the function name should be. Currently the driver doesn't pass anything so CC1 picks these function names: * `__bounds_safety_soft_trap_s` for `call_with_str` * `__bounds_safety_soft_trap_c` for `call_with_code` As implementations of Bounds Safety soft trap runtimes are brought up the driver logic will need add support for them. We will likely need a new driver flag to control which implementation to use (if any). The soft trap function names and API are defined in a new header (`bounds_safety_soft_traps.h`) that ships with the compiler. Having the interface defined in header provides several advantages: * Runtime implementations can include the header to check they conform to the current interface which might evolve over time. * It provides an explicit interface that runtime implementers can implement. In contrast `-ftrap-function` has an implicit interface that basically can't be evolved without causing a silent ABI break. Note this implementation deliberately doesn't re-use `-ftrap-function` and `-ftrap-function-returns` so that these features can be used independently. In particular it allows using `-fbounds-safety` in soft trap mode with trapping UBSan in any mode (see test cases). Note this implementation continues to deliberately use "trap reasons". This is the compiler feature where artificial inline frames are added to debug info on traps to describe the reason for trapping. This means the string representation for reason from trapping is still available in the `call_with_code` case provided debug information is available. This can be seen the in the LLDB tests added. This does mean that technically there is some redundancy here (trap reasons passed as arguments and in the debug info) but this is intentional because it means in the `call_with_str` case without debug info can still do a good job of logging problems. rdar://158088757
c5ef4fd to
6a6e9e8
Compare
In swiftlang#11645 test cases combining `-fbounds-safety` trap mode with UBSan in trapping mode. The generated IR includes a global string array that contains an absolute file path. The test case was previously manually patched so that any path would match to make the test case portable. However, the length of the constant array was not changed which still meant the test wasn't portable. This fixes that by allowing any length for the global arrays rdar://158088757
In #11645 test cases combining `-fbounds-safety` trap mode with UBSan in trapping mode. The generated IR includes a global string array that contains an absolute file path. The test case was previously manually patched so that any path would match to make the test case portable. However, the length of the constant array was not changed which still meant the test wasn't portable. This fixes that by allowing any length for the global arrays rdar://158088757
This patch implements a "soft trap mode" for
-fbounds-safetywherehard traps are replaced with calls to a runtime function. It is assumed
the runtime function can return so code is generated so that execution
can proceed past the "soft trap" (i.e. as if the bounds check passed
even though it didn't). It is expected implementations of the runtime
function log the issue in some way.
The motivation behind this is to provide a smoother path to adopting
-fbounds-safetyin existing codebases. Using soft trap mode with anappropriate runtime allows:
are no longer fatal because execution continues.
overview of how many issues there are.
Note this patch does not implement the actual runtime itself. This will
be done in a separate patch.
This patch introduces the
-fbounds-safety-soft-traps=driver and CC1flag. This allows enabling the soft trap compilation mode. It currently
has two different modes which offer different trade-offs.
call_with_str- This generates calls that pass a constant C stringdescribing the reason for trapping. This greatly simplifies logging
but at the cost of increasing code size because a global string needs
to be emitted for every "trap reason".
call_with_code- This generates calls that pass an integer constantto identify the reason for trapping. This produces better codesize
than
call_with_strbut the reason for trapping is much lessdescriptive. Currently the integer constant is hard coded to zero
because we don't currently have a good stable integer representation
of trap reasons. This will be fixed in rdar://162824128.
We may want to add a third
soft_trap_instructionoption one day thatemits a special trap instruction that uses its immediate to identify
itself as a trap that should be treated as soft by program host (e.g.
the kernel). Implementing that is out-of-scope for this patch because
we'd need a new LLVM IR intrinsic and backend support.
This patch also introduces a new
-fbounds-safety-soft-trap-function=CC1 flag. It is not intended users use this flag. This flag is for Clang
driver to communicate to CC1 what the function name should be. Currently
the driver doesn't pass anything so CC1 picks these function names:
__bounds_safety_soft_trap_sforcall_with_str__bounds_safety_soft_trap_cforcall_with_codeAs implementations of Bounds Safety soft trap runtimes are brought up
the driver logic will need add support for them. We will likely need
a new driver flag to control which implementation to use (if any).
Note this implementation deliberately doesn't re-use
-ftrap-functionand
-ftrap-function-returnsso that these features can be usedindependently. In particular it allows using
-fbounds-safetyin softtrap mode with trapping UBSan in any mode.
Note this implementation continues to deliberately use "trap reasons".
This is the compiler feature where artificial inline frames are added to
debug info on traps to describe the reason for trapping. This means the
string representation for reason from trapping is still available in the
call_with_codecase provided debug information is available. This canbe seen the in the LLDB tests added. This does mean that technically
there is some redundancy here (trap reasons passed as arguments and in
the debug info) but this is intentional because it means in the
call_with_strcase without debug info can still do a good job oflogging problems.
rdar://158088757