Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools,unix: Enable Undefined Behavior Sanitizer (UBSan) for unix port and in CI. #15303

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

projectgus
Copy link
Contributor

This PR enables UBSan on the coverage build by default, so it will be checked in CI.

Includes two supporting commits to fix UB that appeared in test runs. Kudos to @jepler who did a ton of previous work with UBSan in #7237, #7370, #3799, probably others - hence why this PR is pretty small.

Null argument checks

The biggest commit in this PR is to fix cases where UBSan detects a null pointer being passed to a function argument decorated as non-null. In every case this was a libc function (memset, memcpy, etc) where the length argument was zero, so a no-op in principle.

@jepler did a good job of explaining the potential risk in #3799 so I'll quote that here:

Traditionally, memset(NULL, value, 0) has been accepted without causing problems. However, it is not standards-compliant behavior; and for instance Ted Unangst of the OpenBSD project notes that "A smart C compiler may observe a call to memcpy, flag both pointers as valid, and then delete any null checks. Forwards and backwards."
https://www.tedunangst.com/flak/post/zero-size-objects

Since micropython is using -fdelete-null-pointer-checks ("enabled by default on most targets") and it is probably giving good code size improvements, we have to pay a modest price and add a few checks.

There are multiple ways we could handle this:

  1. Current approach in this PR is to add the checks in the code, and incur a code size increase. It's relatively small, as most of the checks can be optimised out again by the compiler, but it's still there and may be an unacceptably high cost.
  2. The approach @jepler chose in various: Fix null pointer errors #7369 was to have a compile time flag that adds these checks via macros, disabled in production builds to save code size. I think this is an OK approach, but worry that it adds similar source code complexity as 1 without adding protection against the optimisation issue described above - i.e. UBSan builds won't see the UB due to the conditional check, so if it does happen then we'll find it the old fashioned way.
  3. Disable all non-null argument checks in UBSan. In practice MicroPython really only encounters these when calling libc functions, so there's a similar benefit and risk as approach 2 but the code stays simpler. This was actually my first choice, there's a branch at https://github.com/projectgus/micropython/tree/feature/ubsan_unix_skip_nonnull that implements it. Then I got nervous about the risk of not checking for it...

@jepler I'd be very interested in your thoughts on this, if you're around. I can see benefits for each approach.

As detected by UBSan.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
A number of places call memset, memcpy, memcmp where one or both pointer
arguments is NULL but the length is also zero.

This is generally safe provided the length is zero, however it is also UB
and there is a risk that an ambitious compiler can decide that because an
argument is passed to the one of these functions then it is not NULL, and
therefore optimise out an existing NULL check.

The checks are added in place, not guarded behind a macro, to avoid
that (maybe remote) possibility. There is a size impact, although it seems
like gcc does a pretty good job of optimising them back out in a lot of
cases.

Alternative to this would be to tell UBSan not to check non-null argument
attributes and accept any associated risk.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Allows UB to be caught during CI passes.

One incident of UB in third party code has to be ignored using a
suppressions file.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@projectgus projectgus added py-core Relates to py/ directory in source port-unix tools Relates to tools/ directory in source, or other tooling labels Jun 19, 2024
@projectgus projectgus changed the title tools,unix: Unable Undefined Behavior Sanitizer (UBSan) for unix port and in CI. tools,unix: Enable Undefined Behavior Sanitizer (UBSan) for unix port and in CI. Jun 19, 2024
@@ -91,6 +91,17 @@ ifndef DEBUG
CFLAGS += -U _FORTIFY_SOURCE
endif

# Optional Undefined Behavior Sanitizer
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a display of pedantry befitting a computer programmer, I've used US English spelling for the proper noun Undefined Behavior Sanitizer but used British English spelling elsewhere. If this is too silly then I can change it.

Copy link

codecov bot commented Jun 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.42%. Comparing base (e9c898c) to head (176fc1c).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #15303      +/-   ##
==========================================
- Coverage   98.42%   98.42%   -0.01%     
==========================================
  Files         161      161              
  Lines       21204    20955     -249     
==========================================
- Hits        20870    20624     -246     
+ Misses        334      331       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

Code size report:

   bare-arm:    -4 -0.007% 
minimal x86:   -95 -0.051% [incl -8(data)]
   unix x64:  +152 +0.018% standard
      stm32:   +64 +0.016% PYBV10
     mimxrt:   +64 +0.018% TEENSY40
        rp2:  +128 +0.015% RPI_PICO_W
       samd:   +64 +0.024% ADAFRUIT_ITSYBITSY_M4_EXPRESS

@projectgus projectgus marked this pull request as draft June 19, 2024 07:21
@projectgus
Copy link
Contributor Author

Few things still needed:

  • coverage 32-bit build breaks when UBSan enabled. Weirdly in CI it prints an (odd) warning for a read overflow. Locally building with UBSan causes a segfault on nlr_raise. Guessing UBSan is inserting checks into at least one of the pseudo-assembler NLR C functions, but so far haven't found a root cause...
  • The unix coverage builds for qemu mips and qemu_arm have toolchains that don't ship libubsan. Might be able to cross-install the Ubuntu packages for the respective architectures, but probably simpler to disable UBSAN except for x86/x64.

@projectgus
Copy link
Contributor Author

projectgus commented Jun 19, 2024

  • coverage 32-bit build breaks when UBSan enabled.

It seems UBSan generates enough code in vm.c that an nlr_jump back to the nlr_push target will crashe on x86. By default with UBSan, an exception jumps and lands on an SSE2 instruction:

outer_dispatch_loop:
        if (nlr_push(&nlr) == 0) {
     283:	83 ec 0c             	sub    $0xc,%esp
     286:	ff 74 24 48          	push   0x48(%esp)
     28a:	e8 fc ff ff ff       	call   28b <mp_execute_bytecode+0x28b>
     28f:	66 0f 6f 8b 00 00 00 	movdqa 0x0(%ebx),%xmm1
     296:	00 
     297:	89 44 24 48          	mov    %eax,0x48(%esp)

Exception handler crashes at 28f. Compiling vm.c with -fno-sanitize works, compiling with -mno-sse doesn't work as UBSan just generates different additional code here and it also crashes.

I think probably the safe way to use UBSan with NLR is going to be falling back to setjmp/longjmp, after all that's the only C standard compliant way to do it... So probably that means leaving the coverage builds as they are, and adding another CI config for UBSan. 🙃

@jepler
Copy link
Contributor

jepler commented Jun 19, 2024

I don't have any detailed comments but thank you for picking up this work and trying to move it forward!

@projectgus
Copy link
Contributor Author

stm32: +64 +0.016% PYBV10
mimxrt: +64 +0.018% TEENSY40
rp2: +128 +0.015% RPI_PICO_W
samd: +64 +0.024% ADAFRUIT_ITSYBITSY_M4_EXPRESS

Hmm OK, this is more code size increase than I saw in my testing as well (GCC version dependent, maybe?) So I think something else to tackle here...

@projectgus
Copy link
Contributor Author

Some notes-to-self for when I get back to this:

  • Damien is on board with changing the coverage build to setjmp/longjmp for NLR, so we don't need another build and test config. Native NLR is still tested in the standard builds.
  • Some builds got smaller from the extra NULL checks, so it may be worth digging into which checks add vs remove code size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
port-unix py-core Relates to py/ directory in source tools Relates to tools/ directory in source, or other tooling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants