This patch changes the SHA512 code to avoid faults from mis-aligned accesses #90
In particular, such faults arise with R-3.5.1 and digest-0.6.16 on a
R version 3.5.1 (2018-07-02) -- "Feather Spray"
R is free software and comes with ABSOLUTELY NO WARRANTY.
R is a collaborative project with many contributors.
Type 'demo()' for some demos, 'help()' for on-line help, or
*** caught bus error ***
Whether the problem occurs varies with the compiler used, since some
The fix is made to sha2.[ch], which is used for SHA512. There is an
The issue is in the SHA512_Transform function, which is used locally
void SHA512_Transform(SHA512_CTX* context, const sha2_word64* data)
Alignment faults may arise when it is called as
since 'data' may not be 64-bit aligned. It is also called as
which does not produce a fault because context->buffer happens to be
This patch modifies SHA512_Transform so it is declared as
static void SHA512_Transform(SHA512_CTX* context)
with data always coming from context->buffer. The previous call with
MEMCPY_BCOPY(context->buffer, data, SHA512_BLOCK_LENGTH);
The copy should handle any alignment of 'data'.
There is no measurable performance impact.
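The copy-then-transform pattern described above can be sketched in isolation. This is only an illustration under stated assumptions, not the code in sha2.c: demo_ctx, demo_transform and demo_update_block are hypothetical names, the "transform" is a toy XOR fold rather than SHA-512, and the buffer is declared as 64-bit words (a simplification) so that reading it word by word is well defined.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_LENGTH 128  /* bytes in one SHA-512 block */

/* Hypothetical stand-in for SHA512_CTX. The buffer is an array of
   64-bit words, so it is always suitably aligned for word access. */
typedef struct {
    uint64_t state[8];
    uint64_t buffer[BLOCK_LENGTH / 8];
} demo_ctx;

/* Toy "transform" (NOT SHA-512): XORs the 16 words of ctx->buffer
   into state[0]. It reads only from the aligned buffer, never from
   caller-supplied data. */
static void demo_transform(demo_ctx *ctx) {
    for (size_t i = 0; i < BLOCK_LENGTH / 8; i++)
        ctx->state[0] ^= ctx->buffer[i];
}

/* Accepts a block at ANY byte alignment: memcpy copies byte-wise,
   so no misaligned 64-bit load is ever issued on 'data'. */
static void demo_update_block(demo_ctx *ctx, const uint8_t *data) {
    memcpy(ctx->buffer, data, BLOCK_LENGTH);
    demo_transform(ctx);
}
```

Calling demo_update_block with a pointer such as raw + 1 (an odd address) is safe here, whereas casting that pointer to a 64-bit word pointer and dereferencing it is what can raise a bus error on strict-alignment hardware.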
@@          Coverage Diff          @@
##           master    #90   +/-   ##
=====================================
  Coverage     100%   100%
=====================================
  Files          16     16
  Lines        1612   1613    +1
=====================================
+ Hits         1612   1613    +1
Thanks a lot, @radfordneal, and apologies that this fell to the side for a few days.
@hannesmuehleisen You were the one who added the
Otherwise, would either one of you know of another implementation that passes muster on all little-vs-big endian, 32-vs-64 bit OS, ... cases? I presume that that Sparc system Radford uses here is a little unusual as we have not heard from anyone else yet?
This is now on CRAN. As a side effect, it also took out an UBSAN report against the file.
I still have a few UBSAN issues left in
If either of you, or @jimhester who contributed
are possibly false positives. Any comments?
Regarding the note on line 276 of pmurhash.c, the problem is that the macro used has the line c = c>>8 | *ptr++<<24; where c is a uint32_t and ptr is a pointer to uint8_t. Since *ptr++ is of type uint8_t, it is subject to "integer promotion", for which the C99 standard says "If an int can represent all values of the original type, the value is converted to an int". Once the value is converted to int (which is 32 bits in size, and signed), left-shifting the value 254 by 24 bits overflows, which is undefined behaviour.

I think that changing it to c = c>>8 | (uint32_t)*ptr++<<24; would fix the problem. In theory, *ptr++ will be converted first to an int, and then to uint32_t, before being shifted. All these operations are guaranteed to have defined behaviour.

It's unlikely that the problem would actually cause a bug, though it's hard to be sure, since a compiler may be overly enthusiastic about assuming that undefined behaviour can't occur.

Radford
On Thu, 13 Sep 2018 at 12:53, Dirk Eddelbuettel wrote:
I took another look and I may have to tell CRAN that two remaining UBSAN issues (now updated after our newest version got onto CRAN with the changes by @radfordneal):
- all #ifdef for an alignment choice in xxhash.c lines 227 and 240
- in pmurhash.c line 276, where it is hidden away in an upstream macro that has not changed
are possibly false positives. Any comments?
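The cast suggested for pmurhash.c can be tried on its own. The sketch below is not the upstream macro: shift_in_byte is a hypothetical helper that takes the pointer by value, so it does not advance it as *ptr++ would. It only shows that, with the cast, the shift is carried out in uint32_t rather than in a promoted signed int.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the quoted line
   c = c>>8 | (uint32_t)*ptr++<<24;  (without the ++).
   The cast binds tighter than <<, so the byte becomes a uint32_t
   before it is shifted. Without the cast, *ptr is promoted to a
   signed int, and shifting a value of 128 or more left by 24 bits
   overflows a 32-bit int, which is undefined behaviour. */
static uint32_t shift_in_byte(uint32_t c, const uint8_t *ptr) {
    return c >> 8 | (uint32_t)*ptr << 24;
}
```

For example, with c equal to 0x12345678 and a byte of 254 (0xFE), the helper returns 0xFE123456, the same value the uncast expression usually produces in practice, but obtained without relying on undefined behaviour.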
Thanks a lot for suggesting this, Radford! I'll look into this this evening -- easy enough to swap the line and use the UBSAN-configured builds at RHub (though it may actually use the same Docker container I created and may need to update to whatever Brian Ripley uses these days...).