Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

nirosys · 2022-06-29T21:47:55Z

Issue #, if available: None

Description of changes:
This PR is part of the effort to cleanup sanitizer errors that are generated when running debug builds. The general gist of this PR is to disable specific sanitizer checks for functions that have been verified to work as expected. The goal is to be able to treat a sanitizer error during unit test execution as a failed unit test, so existing sanitizer errors need to be addressed first. I'm not thrilled at the idea of disabling sanitizer checks, but I would rather do that and get testing to consider sanitizer findings than update mature code just to satisfy the sanitizers. That is very much open for discussion though, and the reason I split this PR apart from the other.

More specifically, this PR addresses two classifications of issues. The first is inherent to how decNumber is implemented and starts off with how the decNumber is defined:

  /* The number of units needed is ceil(DECNUMDIGITS/DECDPUN)         */
  #define DECNUMUNITS ((DECNUMDIGITS+DECDPUN-1)/DECDPUN)

  /* The data structure... */
  typedef struct {
    int32_t digits;      /* Count of digits in the coefficient; >0    */
    int32_t exponent;    /* Unadjusted exponent, unbiased, in         */
                         /* range: -1999999997 through 999999999      */
    uint8_t bits;        /* Indicator bits (see above)                */
                         /* Coefficient, from least significant unit  */
    decNumberUnit lsu[DECNUMUNITS];
    } decNumber;

The struct defines the field lsu as having a concrete size of DECNUMUNITS * sizeof(decNumberUnit). This is not always true though. The library will, when needed, malloc a larger chunk of memory in order to accommodate larger numbers that do not fit into the (as compiled) 12 decNumberUnits. When this occurs and the code walks beyond the 12 decNumberUnits the sanitizer understandably sees this as an overflow.

Having a variable sized array in a struct is fairly normal, and C has a standard way of communicating that by defining the last field in the struct as an array with no length (decNumberUnit lsu[]). Clang's AddressSanitizer is smart enough to see a definition like that and not throw bounds check errors because the unknown length is communicated. Ideally, this is how decNumber would be implemented, since it would communicate the expectation both to maintainers as well as analyzers.

This PR does not attempt to rewrite decNumber to change the struct definition. There are places throughout the library, and in ion-c where decNumbers are declared on the stack and the default space provided allows that to work. Moving to an unsized lsu would require adopting a new allocation mechanism and adequate testing. Given the maturity of the decNumber library I don't consider those changes to be high priority, though any efforts to benefit from static & dynamic analysis, rather than disable them, are definitely preferred.

This issue presents with sanitizer errors like the following:

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_int.c:745:9 in 
/home/runner/work/ion-c/ion-c/ionc/ion_int.c:745:9: runtime error: index 17 out of bounds for type 'uint16_t const[12]'

The second type of errors this PR addresses are errors resulting from signed overflows that would cause unexpected behavior per the C standard. For instance,

C17 - 6.5.7 - Bitwise shift operators

The result of $E_1 << E_2$ is $E_1$ left-shifted $E_2$ bit positions.
If $E_1$ has a signed type and nonnegative value, and $E_1 \times 2^{E_2}$ is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:42 in 
/home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:28: runtime error: left shift of 2586866107476332251 by 6 places cannot be represented in type 'int_fast32_t' (aka 'long')

The sanitizer is validating the standard, but the concrete implementation works for our intentions. Other sanitizer errors addressed in this PR are similar:

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:28 in 
/home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:34: runtime error: signed integer overflow: -5913942817386134049 + -5345388826668433408 cannot be represented in type 'long'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_scanner.c:2096:23 in 
/home/runner/work/ion-c/ion-c/ionc/ion_scanner.c:2096:23: runtime error: left shift of 1416128768 by 8 places cannot be represented in type 'int'

Sanitizer errors after this PR:

$ test/all_tests 2>&1 >/dev/null
all_tests(71113,0x10c802600) malloc: nano zone abandoned due to inability to preallocate reserved vm space.
$

[==========] 2957 tests from 41 test cases ran. (2789 ms total)
[  PASSED  ] 2957 tests.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

When running with sanitizers enabled some of the functionality in decNumber and a few places within ion-c itself, triggers runtime sanitizer checks. I've disabled specific sanitizer checks on the functions that exhibit the behaviors in order to quiet the sanitizers, with the expectation that a sanitizer error should cause unit test failure. I have looked at each of the instances and verified that despite the sanitizer being correct, the code is behaving as expected. That said, there are certainly ways to implement the code without changing functionality, that would not exhibit these behaviors. It would be ideal to revisit each case and eliminate the need for disabling the sanitizer checks.

tgregg

This approach makes sense to me.

tgregg approved these changes Jun 29, 2022

View reviewed changes

zslayton merged commit 312503f into amazon-ion:master Jul 5, 2022

nirosys deleted the rgiliam/decnum_sanitizers branch July 7, 2022 21:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

nirosys commented Jun 29, 2022

tgregg left a comment

Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

Conversation

nirosys commented Jun 29, 2022

tgregg left a comment

Choose a reason for hiding this comment