Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime Sanitizer Issues: Disable sanitizer checks where known behaviors occur #299

Merged
merged 1 commit into from
Jul 5, 2022

Conversation

nirosys
Copy link
Contributor

@nirosys nirosys commented Jun 29, 2022

Issue #, if available: None

Description of changes:
This PR is part of the effort to cleanup sanitizer errors that are generated when running debug builds. The general gist of this PR is to disable specific sanitizer checks for functions that have been verified to work as expected. The goal is to be able to treat a sanitizer error during unit test execution as a failed unit test, so existing sanitizer errors need to be addressed first. I'm not thrilled at the idea of disabling sanitizer checks, but I would rather do that and get testing to consider sanitizer findings than update mature code just to satisfy the sanitizers. That is very much open for discussion though, and the reason I split this PR apart from the other.

More specifically, this PR addresses two classifications of issues. The first is inherent to how decNumber is implemented and starts off with how the decNumber is defined:

  /* The number of units needed is ceil(DECNUMDIGITS/DECDPUN)         */
  #define DECNUMUNITS ((DECNUMDIGITS+DECDPUN-1)/DECDPUN)

  /* The data structure... */
  typedef struct {
    int32_t digits;      /* Count of digits in the coefficient; >0    */
    int32_t exponent;    /* Unadjusted exponent, unbiased, in         */
                         /* range: -1999999997 through 999999999      */
    uint8_t bits;        /* Indicator bits (see above)                */
                         /* Coefficient, from least significant unit  */
    decNumberUnit lsu[DECNUMUNITS];
    } decNumber;

The struct defines the field lsu as having a concrete size of DECNUMUNITS * sizeof(decNumberUnit). This is not always true though. The library will, when needed, malloc a larger chunk of memory in order to accommodate larger numbers that do not fit into the (as compiled) 12 decNumberUnits. When this occurs and the code walks beyond the 12 decNumberUnits the sanitizer understandably sees this as an overflow.

Having a variable sized array in a struct is fairly normal, and C has a standard way of communicating that by defining the last field in the struct as an array with no length (decNumberUnit lsu[]). Clang's AddressSanitizer is smart enough to see a definition like that and not throw bounds check errors because the unknown length is communicated. Ideally, this is how decNumber would be implemented, since it would communicate the expectation both to maintainers as well as analyzers.

This PR does not attempt to rewrite decNumber to change the struct definition. There are places throughout the library, and in ion-c where decNumbers are declared on the stack and the default space provided allows that to work. Moving to an unsized lsu would require adopting a new allocation mechanism and adequate testing. Given the maturity of the decNumber library I don't consider those changes to be high priority, though any efforts to benefit from static & dynamic analysis, rather than disable them, are definitely preferred.

This issue presents with sanitizer errors like the following:

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_int.c:745:9 in 
/home/runner/work/ion-c/ion-c/ionc/ion_int.c:745:9: runtime error: index 17 out of bounds for type 'uint16_t const[12]'

The second type of errors this PR addresses are errors resulting from signed overflows that would cause unexpected behavior per the C standard. For instance,

C17 - 6.5.7 - Bitwise shift operators

The result of $E_1 << E_2$ is $E_1$ left-shifted $E_2$ bit positions.
If $E_1$ has a signed type and nonnegative value, and $E_1 \times 2^{E_2}$ is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:42 in 
/home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:28: runtime error: left shift of 2586866107476332251 by 6 places cannot be represented in type 'int_fast32_t' (aka 'long')

The sanitizer is validating the standard, but the concrete implementation works for our intentions. Other sanitizer errors addressed in this PR are similar:

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:28 in 
/home/runner/work/ion-c/ion-c/ionc/ion_symbol_table.c:2177:34: runtime error: signed integer overflow: -5913942817386134049 + -5345388826668433408 cannot be represented in type 'long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/ion-c/ion-c/ionc/ion_scanner.c:2096:23 in 
/home/runner/work/ion-c/ion-c/ionc/ion_scanner.c:2096:23: runtime error: left shift of 1416128768 by 8 places cannot be represented in type 'int'

Sanitizer errors after this PR:

$ test/all_tests 2>&1 >/dev/null
all_tests(71113,0x10c802600) malloc: nano zone abandoned due to inability to preallocate reserved vm space.
$
[==========] 2957 tests from 41 test cases ran. (2789 ms total)
[  PASSED  ] 2957 tests.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

When running with sanitizers enabled some of the functionality in
decNumber and a few places within ion-c itself, triggers runtime
sanitizer checks. I've disabled specific sanitizer checks on the
functions that exhibit the behaviors in order to quiet the
sanitizers, with the expectation that a sanitizer error should
cause unit test failure. I have looked at each of the instances
and verified that despite the sanitizer being correct, the code
is behaving as expected.

That said, there are certainly ways to implement the code without
changing functionality, that would not exhibit these behaviors.
It would be ideal to revisit each case and eliminate the need for
disabling the sanitizer checks.
Copy link
Contributor

@tgregg tgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense to me.

@zslayton zslayton merged commit 312503f into amazon-ion:master Jul 5, 2022
@nirosys nirosys deleted the rgiliam/decnum_sanitizers branch July 7, 2022 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants