Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the tests run cleanly under UndefinedBehaviorSanitizer #5568

Merged
merged 1 commit into from Jul 9, 2020

Conversation

lhchavez
Copy link
Contributor

This change makes the tests run cleanly under
-fsanitize=undefined,nullability and comprises of:

  • Avoids some arithmetic with NULL pointers (which UBSan does not like).
  • Avoids an overflow in a shift, due to an uint8_t being implicitly
    converted to a signed 32-bit signed integer after being shifted by a
    32-bit signed integer.
  • Avoids a unaligned read in libgit2.
  • Ignores unaligned reads in the SHA1 library, since it only happens on
    Intel processors, where it is still undefined behavior, but the
    semantics are moderately well-understood.

Of notable omission is -fsanitize=integer, since there are lots of
warnings in zlib and the SHA1 library which probably don't make sense to
fix and I could not figure out how to silence easily. libgit2 itself
also has ~100s of warnings which are mostly innocuous (e.g. use of enum
constants that only fit on an uint32_t, but there is no way to do that
in a simple fashion because the data type chosen for enumerated types is
implementation-defined), and investigating whether there are worrying
warnings would need reducing the noise significantly.

@@ -260,6 +260,9 @@ static inline bool write_buf(
const unsigned char *buf,
size_t len)
{
if (!len)
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as with the other PR, splitting he ntlmclient changes into a separate commit would make it easier for @ethomson to cherry-pick.

[undefined]
# This library allows unaligned access on Intel-like processors. Prevent UBSan
# from complaining about that.
fun:sha1_compression_states
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably belongs into #5569, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

src/apply.c Outdated
@@ -66,22 +66,24 @@ static int patch_image_init_fromstr(
if (git_pool_init(&out->pool, sizeof(git_diff_line)) < 0)
return -1;

for (start = in; start < in + in_len; start = end) {
end = memchr(start, '\n', in_len - (start - in));
if (in_len) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be easier to read if we just exited early in case !in_len

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay for the guard clause pattern! (Wasn't sure whether to add it here since it's not used consistently in the codebase, but I personally prefer it).

@@ -759,7 +759,8 @@ int git_buf_join(
ssize_t offset_a = -1;

/* not safe to have str_b point internally to the buffer */
assert(str_b < buf->ptr || str_b >= buf->ptr + buf->size);
if (buf->size)
assert(str_b < buf->ptr || str_b >= buf->ptr + buf->size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, what exactly is the undefined behaviour here?

Copy link
Contributor Author

@lhchavez lhchavez Jun 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to do arithmetic with NULL (in this case it was buf->ptr). From https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#ubsan-checks

-fsanitize=pointer-overflow: Performing pointer arithmetic which overflows, or where either the old or new pointer value is a null pointer (or in C, when they both are).

This is because C sort of explicitly disallows that: https://stackoverflow.com/a/54233071/688337

src/index.c Outdated
disk_size -= offsetof(struct entry_long, path);
memcpy(mem, &ondisk_ext, path_offset);
path = mem;
path += path_offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path = mem + path_offset;?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, my attempt to avoid a cast backfired.

This change makes the tests run cleanly under
`-fsanitize=undefined,nullability` and comprises of:

* Avoids some arithmetic with NULL pointers (which UBSan does not like).
* Avoids an overflow in a shift, due to an uint8_t being implicitly
  converted to a signed 32-bit signed integer after being shifted by a
  32-bit signed integer.
* Avoids a unaligned read in libgit2.
* Ignores unaligned reads in the SHA1 library, since it only happens on
  Intel processors, where it is _still_ undefined behavior, but the
  semantics are moderately well-understood.

Of notable omission is `-fsanitize=integer`, since there are lots of
warnings in zlib and the SHA1 library which probably don't make sense to
fix and I could not figure out how to silence easily. libgit2 itself
also has ~100s of warnings which are mostly innocuous (e.g. use of enum
constants that only fit on an `uint32_t`, but there is no way to do that
in a simple fashion because the data type chosen for enumerated types is
implementation-defined), and investigating whether there are worrying
warnings would need reducing the noise significantly.
@ethomson
Copy link
Member

ethomson commented Jul 9, 2020

Great. Thanks!

@ethomson ethomson merged commit 325375e into libgit2:master Jul 9, 2020
@lhchavez lhchavez deleted the ubsan branch July 9, 2020 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants