New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the tests run cleanly under UndefinedBehaviorSanitizer #5568
Conversation
249081e
to
95f9f12
Compare
deps/ntlmclient/ntlm.c
Outdated
@@ -260,6 +260,9 @@ static inline bool write_buf( | |||
const unsigned char *buf, | |||
size_t len) | |||
{ | |||
if (!len) | |||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as with the other PR, splitting he ntlmclient changes into a separate commit would make it easier for @ethomson to cherry-pick.
script/sanitizers.supp
Outdated
[undefined] | ||
# This library allows unaligned access on Intel-like processors. Prevent UBSan | ||
# from complaining about that. | ||
fun:sha1_compression_states |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably belongs into #5569, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/apply.c
Outdated
@@ -66,22 +66,24 @@ static int patch_image_init_fromstr( | |||
if (git_pool_init(&out->pool, sizeof(git_diff_line)) < 0) | |||
return -1; | |||
|
|||
for (start = in; start < in + in_len; start = end) { | |||
end = memchr(start, '\n', in_len - (start - in)); | |||
if (in_len) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be easier to read if we just exited early in case !in_len
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay for the guard clause pattern! (Wasn't sure whether to add it here since it's not used consistently in the codebase, but I personally prefer it).
@@ -759,7 +759,8 @@ int git_buf_join( | |||
ssize_t offset_a = -1; | |||
|
|||
/* not safe to have str_b point internally to the buffer */ | |||
assert(str_b < buf->ptr || str_b >= buf->ptr + buf->size); | |||
if (buf->size) | |||
assert(str_b < buf->ptr || str_b >= buf->ptr + buf->size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, what exactly is the undefined behaviour here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to do arithmetic with NULL
(in this case it was buf->ptr
). From https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#ubsan-checks
-fsanitize=pointer-overflow
: Performing pointer arithmetic which overflows, or where either the old or new pointer value is a null pointer (or in C, when they both are).
This is because C sort of explicitly disallows that: https://stackoverflow.com/a/54233071/688337
src/index.c
Outdated
disk_size -= offsetof(struct entry_long, path); | ||
memcpy(mem, &ondisk_ext, path_offset); | ||
path = mem; | ||
path += path_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path = mem + path_offset;
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, my attempt to avoid a cast backfired.
This change makes the tests run cleanly under `-fsanitize=undefined,nullability` and comprises of: * Avoids some arithmetic with NULL pointers (which UBSan does not like). * Avoids an overflow in a shift, due to an uint8_t being implicitly converted to a signed 32-bit signed integer after being shifted by a 32-bit signed integer. * Avoids a unaligned read in libgit2. * Ignores unaligned reads in the SHA1 library, since it only happens on Intel processors, where it is _still_ undefined behavior, but the semantics are moderately well-understood. Of notable omission is `-fsanitize=integer`, since there are lots of warnings in zlib and the SHA1 library which probably don't make sense to fix and I could not figure out how to silence easily. libgit2 itself also has ~100s of warnings which are mostly innocuous (e.g. use of enum constants that only fit on an `uint32_t`, but there is no way to do that in a simple fashion because the data type chosen for enumerated types is implementation-defined), and investigating whether there are worrying warnings would need reducing the noise significantly.
Great. Thanks! |
This change makes the tests run cleanly under
-fsanitize=undefined,nullability
and comprises of:converted to a signed 32-bit signed integer after being shifted by a
32-bit signed integer.
Intel processors, where it is still undefined behavior, but the
semantics are moderately well-understood.
Of notable omission is
-fsanitize=integer
, since there are lots ofwarnings in zlib and the SHA1 library which probably don't make sense to
fix and I could not figure out how to silence easily. libgit2 itself
also has ~100s of warnings which are mostly innocuous (e.g. use of enum
constants that only fit on an
uint32_t
, but there is no way to do thatin a simple fashion because the data type chosen for enumerated types is
implementation-defined), and investigating whether there are worrying
warnings would need reducing the noise significantly.